need help and getting unique set

Question

need help and getting unique set

Walton

SSCommitted

Points: 1638
More actions
August 19, 2013 at 1:19 pm

#281356

Hi all,
I need a help on getting the distinct set.
My Data look like this:
SET 1
SET ID Product Attribute
1 A 10
1 A 11
SET 2
SET ID Product Attribute
2 A 10
2 A 12
SET 3
SET ID Product Attribute
3 A 10
3 A 11
Since set ID 1 and 3 has the same product and attribute I just need to get distinct set having the same product and attribute. The result set will look like this
SET ID Product Attribute
1 A 10
1 A 11
2 A 10
2 A 12
Thank you all in advance for your help.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Sean Lange SSC Guru Points: 286601 More actions · Answer 1

Walton (8/19/2013)
Hi all,
I need a help on getting the distinct set.
My Data look like this:
SET 1
SET ID Product Attribute
1 A 10
1 A 11
SET 2
SET ID Product Attribute
2 A 10
2 A 12
SET 3
SET ID Product Attribute
3 A 10
3 A 11
Since set ID 1 and 3 has the same product and attribute I just need to get distinct set having the same product and attribute. The result set will look like this
SET ID Product Attribute
1 A 10
1 A 11
2 A 10
2 A 12
Thank you all in advance for your help.

So what happens to Set 3? The product/attribute combination is both Set 1 and 3 but how to decide which Set to use?

Maybe as simple as using MIN?

You didn't provide anything usable so my code probably doesn't work as is.

select min(setid), Product, Attribute

From SomeTable

group by Product, Attribute

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Walton SSCommitted Points: 1638 More actions · Answer 2

Sean Lange (8/19/2013)
Walton (8/19/2013)
Hi all,
I need a help on getting the distinct set.
My Data look like this:
SET 1
SET ID Product Attribute
1 A 10
1 A 11
SET 2
SET ID Product Attribute
2 A 10
2 A 12
SET 3
SET ID Product Attribute
3 A 10
3 A 11
Since set ID 1 and 3 has the same product and attribute I just need to get distinct set having the same product and attribute. The result set will look like this
SET ID Product Attribute
1 A 10
1 A 11
2 A 10
2 A 12
Thank you all in advance for your help.
So what happens to Set 3? The product/attribute combination is both Set 1 and 3 but how to decide which Set to use?
Maybe as simple as using MIN?
You didn't provide anything usable so my code probably doesn't work as is.
select min(setid), Product, Attribute
From SomeTable
group by Product, Attribute

Hi Sean,

I do not care if I am getting set 1 or set 3 either one would be fine. But the result set should return the complete list for the set. The result set should look like this :

SET ID Product Attribute

1 A 10

1 A 11

2 A 10

2 A 12

But with the solution you provided, result will like this:

SET_ID Product Attribute

1 A 10

1 A 11

2 A 12

Thanks for looking onto it.

Thanks

Sean Lange SSC Guru Points: 286601 More actions · Answer 3

Since the pseudocode doesn't work I assume that means you need or at least want some additional help? You have been around here long enough to know that the next question is...can you post ddl and sample data? If you have somehow managed to miss what it takes to post a complete question please take a few minutes and read the first link in my signature.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Walton SSCommitted Points: 1638 More actions · Answer 4

Hi Sean,

Here are the DDL and sample data.

Thanks a lot for spending your time on it

CREATE TABLE #temp

(set_id INT

,product CHAR(5)

,attribute INT)

--insert set 1

INSERT INTO #temp

(set_id, product, attribute)

VALUES

(1,'A',10)

,(1,'A',11)

--insert set 2

INSERT INTO #temp

(set_id, product, attribute)

VALUES

(2,'A',10)

,(2,'A',12)

--insert set 2

INSERT INTO #temp

(set_id, product, attribute)

VALUES

(3,'A',10)

,(3,'A',11)

DROP TABLE #temp

Dwain Camps SSC Guru Points: 86978 More actions · Answer 5

Not sure this is a particularly efficient way to do this but this might work:

SELECT set_id, product, attribute=CAST(item AS INT)

FROM (

SELECT set_id=MIN(set_id), product, allattributes

FROM (

SELECT set_id, product, attribute

,allattributes=STUFF((

SELECT ',' + STR(attribute, 5)

FROM #temp b

WHERE a.set_id = b.set_id AND a.product = b.product

ORDER BY attribute

FOR XML PATH('')), 1, 1, '')

FROM #temp a) a

GROUP BY product, allattributes) a

CROSS APPLY dbo.DelimitedSplit8K(allattributes, ',') b

DelimitedSplit8K is (as the name implies) a delimited string splitter that can be found here[/url].

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86978 More actions · Answer 6

And here's another way (that I am less confident of because it requires that the SUM of attributes for a set is unique), but maybe it'll give you some additional ideas.

SELECT set_id, product, attribute

FROM (

SELECT set_id, product, attribute, m, c

,n=ROW_NUMBER() OVER (PARTITION BY product, m ORDER BY set_id)

FROM (

SELECT set_id, product, attribute

,m=SUM(attribute) OVER (PARTITION BY product, set_id)

,c=COUNT(attribute) OVER (PARTITION BY product, set_id)

FROM #temp) a) a

WHERE n <= c

ORDER BY set_id, attribute;

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 7

Hi Walton

How many sets will you have?

How many rows will each set have?

Thanks.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

dwain.c (8/19/2013)
Not sure this is a particularly efficient way to do this but this might work:
SELECT set_id, product, attribute=CAST(item AS INT)
FROM (
SELECT set_id=MIN(set_id), product, allattributes
FROM (
SELECT set_id, product, attribute
,allattributes=STUFF((
SELECT ',' + STR(attribute, 5)
FROM #temp b
WHERE a.set_id = b.set_id AND a.product = b.product
ORDER BY attribute
FOR XML PATH('')), 1, 1, '')
FROM #temp a) a
GROUP BY product, allattributes) a
CROSS APPLY dbo.DelimitedSplit8K(allattributes, ',') b
DelimitedSplit8K is (as the name implies) a delimited string splitter that can be found here[/url].

It's about as inefficient as the brute-force method:

SELECT s1.*

FROM #temp s1 -- have to start somewhere.

WHERE s1.set_id = 1

UNION ALL

SELECT s2.*

FROM (SELECT * FROM #temp WHERE set_id = 2) s2

WHERE EXISTS ( -- check for a row mismatch between set 2 and set 1.

SELECT 1

FROM (SELECT * FROM #temp WHERE s2.set_id = 2) s2

FULL OUTER JOIN (SELECT * FROM #temp WHERE set_id = 1) s1

ON s1.Product = s2.Product AND s1.Attribute = s2.Attribute

WHERE (s2.SET_ID IS NULL OR s1.SET_ID IS NULL)

)

UNION ALL

SELECT s3.*

FROM #SET3 s3

WHERE s3.set_id = 3

AND EXISTS ( -- check for a row mismatch between set 3 and set 1.

SELECT 1

FROM (SELECT * FROM #temp WHERE set_id = 3) s3

FULL OUTER JOIN (SELECT * FROM #temp WHERE set_id = 1) s1

ON s1.Product = s3.Product AND s1.Attribute = s3.Attribute

WHERE s3.SET_ID IS NULL OR s1.SET_ID IS NULL

)

AND EXISTS ( -- check for a row mismatch between set 3 and set 2.

SELECT 1

FROM (SELECT * FROM #temp WHERE set_id = 3) s3

FULL OUTER JOIN (SELECT * FROM #temp WHERE set_id = 2) s2

ON s2.Product = s3.Product AND s2.Attribute = s3.Attribute

WHERE s3.SET_ID IS NULL OR s2.SET_ID IS NULL

)

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

mickyT SSChampion Points: 10360 More actions · Answer 9

Hi

These will probably perform like a dog on larger sets, but here is 2 more options

SELECT set_id, product, attribute

FROM #temp t

INNER JOIN

(

SELECT DISTINCT MIN(set_id) minid

FROM #temp

GROUP BY product, attribute

) s ON t.set_id = s.minid

SELECT set_id, product, attribute

FROM #temp t

WHERE set_id in (

SELECT MIN(set_id) minid

FROM #temp

GROUP BY product, attribute

)

Dwain Camps SSC Guru Points: 86978 More actions · Answer 10

I didn't realize that brute force was allowed.

CREATE TABLE #temp

(set_id INT

,product CHAR(5)

,attribute INT)

INSERT INTO #temp (set_id, product, attribute) VALUES (1,'A',10) ,(1,'A',11)

INSERT INTO #temp (set_id, product, attribute) VALUES (2,'A',10),(2,'A',12)

INSERT INTO #temp (set_id, product, attribute) VALUES (3,'A',10),(3,'A',11);

INSERT INTO #temp (set_id, product, attribute) VALUES (4,'A',10),(4,'A',11);

INSERT INTO #temp (set_id, product, attribute) VALUES (5,'A',10),(5,'A',12);

INSERT INTO #temp (set_id, product, attribute) VALUES (6,'A',10),(6,'A',12),(6,'A',13);

DECLARE @setid INT = 1, @maxid INT = (SELECT MAX(set_id) FROM #temp);

-- Gotta start somewhere

SELECT * INTO #temp2 FROM #temp WHERE set_id = @setid;

WHILE @setid < @maxid

BEGIN

SELECT @setid = @setid + 1;

INSERT INTO #temp2

SELECT *

FROM #temp

WHERE set_id = @setid AND EXISTS (

SELECT product, attribute

FROM #temp

WHERE set_id = @setid

EXCEPT

SELECT product, attribute

FROM #temp2

GROUP BY set_id, product, attribute

);

END

SELECT * FROM #temp2 ORDER BY set_id, product, attribute;

GO

DROP TABLE #temp;

DROP TABLE #temp2;

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

need help and getting unique set

Cookies on SQLServerCentral