January 19, 2012 at 1:53 pm
I have dataset that consists of 3 fields. Package number, IC_CODE and a description
I am trying to write a query that spilts/partitions data into groups based on the package number and the outputs data where for each package the data is the same.
I have provided some test data
CREATE TABLE #TEST
(
PACKAGE INT
,CODE VARCHAR(20)
,[DESCRIPTION] VARCHAR(20)
)
INSERT INTO #TEST
SELECT 10000,'CSAC','CASING'UNION ALL
SELECT 10000,'BOXC','BOXING'UNION ALL
SELECT 10000,'CSAC','CASING' UNION ALL
SELECT 12000,'BOXC','BOXING'UNION ALL
SELECT 12000,'BOXC','BOXING'UNION ALL
SELECT 12000,'BOXC','BOXING' UNION ALL
SELECT 13000,'BASK','BASKET'UNION ALL
SELECT 13000,'CSAC','CASING'UNION ALL
SELECT 13000,'BOXC','BOXING'
Based on th above data the output should be
Package CODE DESCRIPTION
______________________________
12000 BOXC BOXING
12000 BOXC BOXING
12000 BOXC BOXING
January 19, 2012 at 1:56 pm
eseosaoregie (1/19/2012)
Based on th above data the output should bePackage CODE DESCRIPTION
______________________________
12000 BOXC BOXING
12000 BOXC BOXING
12000 BOXC BOXING
Can you doublecheck your expected results? This doesn't seem right.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 19, 2012 at 2:24 pm
This output is correct because package 120000 has only code associated with it. The other packages have 2 or more different codes. The output is required to be the packages which within them have the same codes
January 19, 2012 at 2:58 pm
Based on the information you gave and the question you asked...
SELECT PACKAGE,CODE,[DESCRIPTION]
FROM #TEST
WHERE PACKAGE = 12000 AND CODE = 'BOXC' AND [DESCRIPTION] = 'BOXING'
So, if there is more to this, please give ALL of the expected results.
Jared
CE - Microsoft
January 19, 2012 at 3:02 pm
eseosaoregie (1/19/2012)
This output is correct because package 120000 has only code associated with it. The other packages have 2 or more different codes. The output is required to be the packages which within them have the same codes
Ah, now I understand. Sorry, first time I looked at it didn't make sense.
Jared, seriously? It's sample data, and that's obviously not the correct algorithm for a full data set.
Gimme a bit on the final solution, thanks for the sample dataset. General principal: Group By on all involved columns, pull having HAVING COUNT(*) > 1, and then using that as a sub-query to re-connect to the primary data to pull back all rows having that count > 1.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 19, 2012 at 3:03 pm
In the example I gave there are only 3 different packages. However what if there were 2000 packages. I like to have a generic query that will select all the packages with only one code associated with it.
January 19, 2012 at 3:11 pm
Evil Kraig F (1/19/2012)
eseosaoregie (1/19/2012)
This output is correct because package 120000 has only code associated with it. The other packages have 2 or more different codes. The output is required to be the packages which within them have the same codesAh, now I understand. Sorry, first time I looked at it didn't make sense.
Jared, seriously? It's sample data, and that's obviously not the correct algorithm for a full data set.
Gimme a bit on the final solution, thanks for the sample dataset. General principal: Group By on all involved columns, pull having HAVING COUNT(*) > 1, and then using that as a sub-query to re-connect to the primary data to pull back all rows having that count > 1.
I just didn't know what was being asked... I could have been nicer about it though. 🙂
So, are we expecting a parameter to filter the grouped results? Are there any aggregates or counts?
Jared
CE - Microsoft
January 19, 2012 at 3:13 pm
eseosaoregie (1/19/2012)
In the example I gave there are only 3 different packages. However what if there were 2000 packages. I like to have a generic query that will select all the packages with only one code associated with it.
I'm sorry, I'm still confused. With the data you gave us and the statement above, you just want to filter on a given code?
Jared
CE - Microsoft
January 19, 2012 at 3:15 pm
1 more guess from me... Are you trying to identify duplicates?
Jared
CE - Microsoft
January 19, 2012 at 3:18 pm
Had to make a little adjustment to the sample script, it didn't copy/paste well for me.
However, remembering now why I questioned the result set, there's two 'multiples' of items. CSAC Casing in 10000 and BOXC Boxing in 12000.
The following code will show you the algorithm and the insert into script I rebuilt to get CRLFs to behave properly. (EDIT: which still won't behave for me when I copy/paste my own code back into my own window. Blasted wierd...)
IF OBJECT_ID('tempdb..#TEST') IS NOT NULL
DROP TABLE #TEST
CREATE TABLE #TEST(
PACKAGE INT,
CODE VARCHAR(20),
[DESCRIPTION] VARCHAR(20))
INSERT INTO #TEST
SELECT 10000,'CSAC','CASING' UNION ALL
SELECT 10000,'BOXC','BOXING' UNION ALL
SELECT 10000,'CSAC','CASING' UNION ALL
SELECT 12000,'BOXC','BOXING' UNION ALL
SELECT 12000,'BOXC','BOXING' UNION ALL
SELECT 12000,'BOXC','BOXING' UNION ALL
SELECT 13000,'BASK','BASKET' UNION ALL
SELECT 13000,'CSAC','CASING' UNION ALL
SELECT 13000,'BOXC','BOXING'
;WITH cte AS
(SELECT
package, code, [description], count(*) AS cnt
FROM
#test
GROUP BY
package, code, [description]
HAVING
COUNT(*) > 1
)
--SELECT * FROM cte
SELECT
t.*
FROM
#TEST AS t
JOIN
cte
ONt.package = cte.package
and t.code = cte.code
and t.[description] = cte.[description]
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 19, 2012 at 3:22 pm
CREATE TABLE #TEST
(
PACKAGE INT
,CODE VARCHAR(20)
,[DESCRIPTION] VARCHAR(20)
)
INSERT INTO #TEST
SELECT 10000,'CSAC','CASING'UNION ALL
SELECT 10000,'BOXC','BOXING'UNION ALL
SELECT 10000,'CSAC','CASING' UNION ALL
SELECT 12000,'BOXC','BOXING'UNION ALL
SELECT 12000,'BOXC','BOXING'UNION ALL
SELECT 12000,'BOXC','BOXING' UNION ALL
SELECT 13000,'BASK','BASKET'UNION ALL
SELECT 13000,'CSAC','CASING'UNION ALL
SELECT 13000,'BOXC','BOXING'
;
WITH RowNumbers AS
(
SELECT PACKAGE, CODE, [DESCRIPTION],
ROW_NUMBER() OVER (PARTITION BY PACKAGE, CODE, [DESCRIPTION] ORDER BY PACKAGE, CODE, [DESCRIPTION]) AS rn
FROM #test
)
SELECT PACKAGE, CODE, [DESCRIPTION]
FROM RowNumbers WHERE rn > 1
Something like this?
EDIT: This is wrong! I'm sorry... it has been a long day. This will give you duplicates leaving 1 if you need to remove duplicates.
Jared
CE - Microsoft
January 19, 2012 at 4:41 pm
thanks, this is what i wanted. Apologies for my poor explanations at the beginning
January 19, 2012 at 4:47 pm
eseosaoregie (1/19/2012)
thanks, this is what i wanted. Apologies for my poor explanations at the beginning
No worries, it happens. I wasn't explaining myself well either. Happy to help, good luck. 🙂
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply