Compare multiple rows in same column

Question

Compare multiple rows in same column

DataAnalyst011

SSCarpal Tunnel

Points: 4035
More actions
July 14, 2016 at 11:22 am

#400408

Hi,
I was hoping to get some thoughts on the best approach for a query. It would probably be easier to show a scaled version of my dataset. The real columns are varchar(50), but the below will suffice (feel free to replace the select *... section with any suggestions on the best way to return the result set).
Below, I need to return all Persons with 1 and 2 but no 3. I clarify which Persons (letter in this case) should be returned in the comment section.
--return all Persons with 1 and 2 but no 3
/*
a - Has all three
b - Has 1 and 2 but no 3 (RETURN)
c - Has 1
d - Has 1 and 2 but no 3 (RETURN)
e - Has 2 and 3 but no 1
f - Has 3
*/
with t as (
select 1 as Comp, 'a' as Person union all
select 2 as Comp, 'a' as Person union all
select 3 as Comp, 'a' as Person union all
select 1 as Comp, 'b' as Person union all
select 2 as Comp, 'b' as Person union all
select 1 as Comp, 'c' as Person union all
select 1 as Comp, 'd' as Person union all
select 2 as Comp, 'd' as Person union all
select 2 as Comp, 'e' as Person union all
select 3 as Comp, 'e' as Person union all
select 3 as Comp, 'f' as Person
)
select *
from t
Does anyone have any thoughts or suggestions? The only thing I came up with was to pivot 1, 2, and 3 as columns in a subquery so I could use a simple WHERE clause. However, I feel like there has to be a more efficient way to accomplish this.
If I need to generate a create that would create a real table with my sample data please let me know and I'll be glad to do that.
There must be an index on Person/Comp or Person that includes Comp.
There must be fewer than 31 (or 63) possible distinct Comp values.
There cannot be duplicates for any Person/Comp value.

I tried creating a million row test, but I had trouble getting the right distribution of values to provide a valid test (specifically getting rid of duplicates).

The following code works with the sample, but will need to be manipulated in order to work with the live data. Specifically, a zero-based DENSE_RANK needs to be assigned to the distinct Comp values, and you may need to use a CTE/derived table to get unique combinations of Person/Comp.

SELECT t.Person

FROM #t t

GROUP BY t.Person

HAVING SUM(POWER(2, t.Comp)) = 6

It basically creates a bitmap of the possible values and then chooses only the records that match the required bitmap.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Luis Cazares SSC Guru Points: 183706 More actions · Answer 1

Check the following post: http://www.sqlservercentral.com/Forums/FindPost1275453.aspx

You could follow the whole discussion as well.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 2

assuming you have a guarantee that only 1 and 3 or the other possible values and you can't have more than one of either it is simply this:

select person

from table

where comp in (1,3)

group by person

having count(*) = 2

Updated to change > to = ...

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Y.B. SSChampion Points: 11652 More actions · Answer 3

What you originally proposed isn't that complicated...

with t as (

select 1 as Comp, 'a' as Person union all

select 2 as Comp, 'a' as Person union all

select 3 as Comp, 'a' as Person union all

select 1 as Comp, 'b' as Person union all

select 2 as Comp, 'b' as Person union all

select 1 as Comp, 'c' as Person union all

select 1 as Comp, 'd' as Person union all

select 2 as Comp, 'd' as Person union all

select 2 as Comp, 'e' as Person union all

select 3 as Comp, 'e' as Person union all

select 3 as Comp, 'f' as Person

)

SELECT

Person

FROM (

SELECT

person,

MAX(CASE WHEN comp = 1 THEN 'YES' END) AS One,

MAX(CASE WHEN comp = 2 THEN 'YES' END) AS Two,

MAX(CASE WHEN comp = 3 THEN 'YES' END) AS Three

FROM t

GROUP BY person

) x

WHERE One = 'YES' AND Two = 'YES' AND Three IS NULL

SELECT quote FROM brain WHERE original = 1
0 rows returned

GSquared SSC Guru Points: 260824 More actions · Answer 4

SELECT

t.Person

FROM

t

GROUP BY

t.Person

HAVING

SUM(t.Comp) = 3

AND COUNT(*) = 2;

or

SELECT

t.Person

FROM

t

INNER JOIN t AS t2

ON t2.Person = t.Person

AND t2.Comp = 2

AND t.Comp = 1

LEFT JOIN t AS t3

ON t3.Person = t.Person

AND t3.Comp = 3

WHERE

t3.Person IS NULL;

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Luis Cazares SSC Guru Points: 183706 More actions · Answer 5

Y.B. (7/14/2016)
What you originally proposed isn't that complicated...

I suggest that you follow the link I posted. There's a similar version of your code which runs about 4 times slower than the fastests solutions.

select 1 as Comp, 'a' as Person INTO #t union all

select 2 as Comp, 'a' as Person union all

select 3 as Comp, 'a' as Person union all

select 1 as Comp, 'b' as Person union all

select 2 as Comp, 'b' as Person union all

select 1 as Comp, 'c' as Person union all

select 1 as Comp, 'd' as Person union all

select 2 as Comp, 'd' as Person union all

select 2 as Comp, 'e' as Person union all

select 3 as Comp, 'e' as Person union all

select 3 as Comp, 'f' as Person;

--Option 1

SELECT Person FROM #t WHERE Comp = 1

INTERSECT

SELECT Person FROM #t WHERE Comp = 2

EXCEPT

SELECT Person FROM #t WHERE Comp = 3;

--Option 2

SELECT Person FROM #t t1 WHERE Comp = 1

AND EXISTS(

SELECT Person FROM #t t2

WHERE Comp = 2

AND t1.Person = t2.Person

AND NOT EXISTS(

SELECT Person FROM #t t3

WHERE Comp = 3

AND t2.Person = t3.Person));

GO

DROP TABLE #t;

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Y.B. SSChampion Points: 11652 More actions · Answer 6

Luis Cazares (7/14/2016)
Y.B. (7/14/2016)
What you originally proposed isn't that complicated...
I suggest that you follow the link I posted. There's a similar version of your code which runs about 4 times slower than the fastests solutions.
select 1 as Comp, 'a' as Person INTO #t union all
select 2 as Comp, 'a' as Person union all
select 3 as Comp, 'a' as Person union all
select 1 as Comp, 'b' as Person union all
select 2 as Comp, 'b' as Person union all
select 1 as Comp, 'c' as Person union all
select 1 as Comp, 'd' as Person union all
select 2 as Comp, 'd' as Person union all
select 2 as Comp, 'e' as Person union all
select 3 as Comp, 'e' as Person union all
select 3 as Comp, 'f' as Person;
--Option 1
SELECT Person FROM #t WHERE Comp = 1
INTERSECT
SELECT Person FROM #t WHERE Comp = 2
EXCEPT
SELECT Person FROM #t WHERE Comp = 3;
--Option 2
SELECT Person FROM #t t1 WHERE Comp = 1
AND EXISTS(
SELECT Person FROM #t t2
WHERE Comp = 2
AND t1.Person = t2.Person
AND NOT EXISTS(
SELECT Person FROM #t t3
WHERE Comp = 3
AND t2.Person = t3.Person));
GO
DROP TABLE #t;

Busy post but interesting nonetheless. I chose that method primarily for the simplicity in which the logic can be changed for different scenarios. I always try to post the best performing code (to my knowledge) but experience tells me someone can/will always come out with a faster/better solution. Heck most of the great techniques I've learned were from this site. i.e. Using cross tabs over pivots. 😉

SELECT quote FROM brain WHERE original = 1
0 rows returned

drew.allen SSC Guru Points: 77014 More actions · Answer 7

I found a solution that seems to perform very well given certain pre-conditions.