July 14, 2016 at 11:22 am
Hi,
I was hoping to get some thoughts on the best approach for a query. It would probably be easier to show a scaled version of my dataset. The real columns are varchar(50), but the below will suffice (feel free to replace the select *... section with any suggestions on the best way to return the result set).
Below, I need to return all Persons with 1 and 2 but no 3. I clarify which Persons (letter in this case) should be returned in the comment section.
--return all Persons with 1 and 2 but no 3
/*
a - Has all three
b - Has 1 and 2 but no 3 (RETURN)
c - Has 1
d - Has 1 and 2 but no 3 (RETURN)
e - Has 2 and 3 but no 1
f - Has 3
*/
with t as (
select 1 as Comp, 'a' as Person union all
select 2 as Comp, 'a' as Person union all
select 3 as Comp, 'a' as Person union all
select 1 as Comp, 'b' as Person union all
select 2 as Comp, 'b' as Person union all
select 1 as Comp, 'c' as Person union all
select 1 as Comp, 'd' as Person union all
select 2 as Comp, 'd' as Person union all
select 2 as Comp, 'e' as Person union all
select 3 as Comp, 'e' as Person union all
select 3 as Comp, 'f' as Person
)
select *
from t
Does anyone have any thoughts or suggestions? The only thing I came up with was to pivot 1, 2, and 3 as columns in a subquery so I could use a simple WHERE clause. However, I feel like there has to be a more efficient way to accomplish this.
If I need to generate a create that would create a real table with my sample data please let me know and I'll be glad to do that.
July 14, 2016 at 11:34 am
Check the following post: http://www.sqlservercentral.com/Forums/FindPost1275453.aspx
You could follow the whole discussion as well.
July 14, 2016 at 11:38 am
assuming you have a guarantee that only 1 and 3 or the other possible values and you can't have more than one of either it is simply this:
select person
from table
where comp in (1,3)
group by person
having count(*) = 2
Updated to change > to = ...
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 14, 2016 at 12:28 pm
What you originally proposed isn't that complicated...
with t as (
select 1 as Comp, 'a' as Person union all
select 2 as Comp, 'a' as Person union all
select 3 as Comp, 'a' as Person union all
select 1 as Comp, 'b' as Person union all
select 2 as Comp, 'b' as Person union all
select 1 as Comp, 'c' as Person union all
select 1 as Comp, 'd' as Person union all
select 2 as Comp, 'd' as Person union all
select 2 as Comp, 'e' as Person union all
select 3 as Comp, 'e' as Person union all
select 3 as Comp, 'f' as Person
)
SELECT
Person
FROM (
SELECT
person,
MAX(CASE WHEN comp = 1 THEN 'YES' END) AS One,
MAX(CASE WHEN comp = 2 THEN 'YES' END) AS Two,
MAX(CASE WHEN comp = 3 THEN 'YES' END) AS Three
FROM t
GROUP BY person
) x
WHERE One = 'YES' AND Two = 'YES' AND Three IS NULL
July 14, 2016 at 12:52 pm
SELECT
t.Person
FROM
t
GROUP BY
t.Person
HAVING
SUM(t.Comp) = 3
AND COUNT(*) = 2;
or
SELECT
t.Person
FROM
t
INNER JOIN t AS t2
ON t2.Person = t.Person
AND t2.Comp = 2
AND t.Comp = 1
LEFT JOIN t AS t3
ON t3.Person = t.Person
AND t3.Comp = 3
WHERE
t3.Person IS NULL;
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
July 14, 2016 at 1:25 pm
Y.B. (7/14/2016)
What you originally proposed isn't that complicated...
I suggest that you follow the link I posted. There's a similar version of your code which runs about 4 times slower than the fastests solutions.
select 1 as Comp, 'a' as Person INTO #t union all
select 2 as Comp, 'a' as Person union all
select 3 as Comp, 'a' as Person union all
select 1 as Comp, 'b' as Person union all
select 2 as Comp, 'b' as Person union all
select 1 as Comp, 'c' as Person union all
select 1 as Comp, 'd' as Person union all
select 2 as Comp, 'd' as Person union all
select 2 as Comp, 'e' as Person union all
select 3 as Comp, 'e' as Person union all
select 3 as Comp, 'f' as Person;
--Option 1
SELECT Person FROM #t WHERE Comp = 1
INTERSECT
SELECT Person FROM #t WHERE Comp = 2
EXCEPT
SELECT Person FROM #t WHERE Comp = 3;
--Option 2
SELECT Person FROM #t t1 WHERE Comp = 1
AND EXISTS(
SELECT Person FROM #t t2
WHERE Comp = 2
AND t1.Person = t2.Person
AND NOT EXISTS(
SELECT Person FROM #t t3
WHERE Comp = 3
AND t2.Person = t3.Person));
GO
DROP TABLE #t;
July 14, 2016 at 2:28 pm
Luis Cazares (7/14/2016)
Y.B. (7/14/2016)
What you originally proposed isn't that complicated...I suggest that you follow the link I posted. There's a similar version of your code which runs about 4 times slower than the fastests solutions.
select 1 as Comp, 'a' as Person INTO #t union all
select 2 as Comp, 'a' as Person union all
select 3 as Comp, 'a' as Person union all
select 1 as Comp, 'b' as Person union all
select 2 as Comp, 'b' as Person union all
select 1 as Comp, 'c' as Person union all
select 1 as Comp, 'd' as Person union all
select 2 as Comp, 'd' as Person union all
select 2 as Comp, 'e' as Person union all
select 3 as Comp, 'e' as Person union all
select 3 as Comp, 'f' as Person;
--Option 1
SELECT Person FROM #t WHERE Comp = 1
INTERSECT
SELECT Person FROM #t WHERE Comp = 2
EXCEPT
SELECT Person FROM #t WHERE Comp = 3;
--Option 2
SELECT Person FROM #t t1 WHERE Comp = 1
AND EXISTS(
SELECT Person FROM #t t2
WHERE Comp = 2
AND t1.Person = t2.Person
AND NOT EXISTS(
SELECT Person FROM #t t3
WHERE Comp = 3
AND t2.Person = t3.Person));
GO
DROP TABLE #t;
Busy post but interesting nonetheless. I chose that method primarily for the simplicity in which the logic can be changed for different scenarios. I always try to post the best performing code (to my knowledge) but experience tells me someone can/will always come out with a faster/better solution. Heck most of the great techniques I've learned were from this site. i.e. Using cross tabs over pivots. 😉
July 14, 2016 at 3:38 pm
I found a solution that seems to perform very well given certain pre-conditions.
I tried creating a million row test, but I had trouble getting the right distribution of values to provide a valid test (specifically getting rid of duplicates).
The following code works with the sample, but will need to be manipulated in order to work with the live data. Specifically, a zero-based DENSE_RANK needs to be assigned to the distinct Comp values, and you may need to use a CTE/derived table to get unique combinations of Person/Comp.
SELECT t.Person
FROM #t t
GROUP BY t.Person
HAVING SUM(POWER(2, t.Comp)) = 6
It basically creates a bitmap of the possible values and then chooses only the records that match the required bitmap.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply