March 30, 2012 at 4:33 am
This one gives you even more flexibility to filter on/output counts of each product:
SELECT CustomerID, cnt_A, cnt_B, cnt_C
FROM (
SELECT
CustomerID
, SUM(CASE ProductCode WHEN 'A' THEN 1 ELSE 0 END) AS cnt_A
, SUM(CASE ProductCode WHEN 'B' THEN 1 ELSE 0 END) AS cnt_B
, SUM(CASE ProductCode WHEN 'C' THEN 1 ELSE 0 END) AS cnt_C
FROM #Purchase
WHERE ProductCode IN ('A', 'B', 'C')
GROUP BY CustomerID
) t
WHERE cnt_A > 0 AND cnt_B > 0 AND cnt_C = 0
March 30, 2012 at 7:08 am
Thanks for testing.
I think a simpler version might be
SELECT
CustomerID
, SUM(CASE ProductCode WHEN 'A' THEN 1 ELSE 0 END) AS cnt_A
, SUM(CASE ProductCode WHEN 'B' THEN 1 ELSE 0 END) AS cnt_B
, SUM(CASE ProductCode WHEN 'C' THEN 1 ELSE 0 END) AS cnt_C
FROM #Purchase
WHERE ProductCode IN ('A', 'B', 'C')
GROUP BY CustomerID
Having cnt_A > 0 AND cnt_B > 0 AND cnt_C = 0
Or to let the inner/initial query just do the string-to-number translation.
SELECT CustomerID
, SUM(Case_A) AS cnt_A
, SUM(Case_B) AS cnt_B
, SUM(Case_C) AS cnt_C
FROM (
SELECT
CustomerID
, CASE ProductCode WHEN 'A' THEN 1 ELSE 0 END AS Case_A
, CASE ProductCode WHEN 'B' THEN 1 ELSE 0 END AS Case_B
, CASE ProductCode WHEN 'C' THEN 1 ELSE 0 END AS Case_C
FROM #Purchase
WHERE ProductCode IN ('A', 'B', 'C')
) t
WHERE cnt_A <> 0 AND cnt_B <> 0 AND cnt_C = 0
GROUP BY CustomerID
March 30, 2012 at 8:41 am
select customer,
sum(case where product = 'A' then 1 else 0 end) as A,
sum(case where product = 'B' then 1 else 0 end) as B,
sum(case where product = 'C' then 1 else 0 end) as C
from
group by customer
having sum(case where product = 'A' then 1 else 0 end) > 1
and sum(case where product = 'B' then 1 else 0 end) > 1
and sum(case where product = 'C' then 1 else 0 end) = 0
March 30, 2012 at 2:12 pm
Might I suggest another, old school, alternative that seems to run a little bit faster, which uses a sub-query to eliminate those customers who all ready purchased item "C".
Although CPU time difference is not very great, the difference is much bigger for the elapsed time.
SELECT CustomerID
FROM #Purchase
WHERE ProductCode IN ('A','B')
AND CustomerID NOT IN (SELECT DISTINCT CustomerID FROM #Purchase AS P WHERE ProductCode IN ('C'))
GROUP BY CustomerID
HAVING COUNT(DISTINCT ProductCode) = 2
Here are the statistics:
Find Customers that did not buy "C" from article
(6673 row(s) affected)
SQL Server Execution Times:
CPU time = 94 ms, elapsed time = 265 ms.
Find Customers that did not buy "C" using subquery
(6673 row(s) affected)
SQL Server Execution Times:
CPU time = 93 ms, elapsed time = 202 ms.
March 30, 2012 at 4:21 pm
Berl (3/30/2012)
Although CPU time difference is not very great, the difference is much bigger for the elapsed time.
Elapsed time includes wait time and is not a good measure of performance.
March 31, 2012 at 1:37 pm
Dean Cochrane (3/30/2012)
Berl (3/30/2012)
Although CPU time difference is not very great, the difference is much bigger for the elapsed time.
Elapsed time includes wait time and is not a good measure of performance.
Actually, it is. The wait time you speak of also includes I/O time and that can be a major part of a performance problem. I do agree, though, that you really need to take "time to display" out of the picture because it will severly mask problems. It takes a certain amount of time to display a million rows and can make differences between two methods seem much more trivial than they might be.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 31, 2012 at 10:43 pm
Thank you Jeff.
April 2, 2012 at 5:38 am
Imho
SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)
Regards
Mike
April 2, 2012 at 6:31 am
michal.lisinski (4/2/2012)
Imho
SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)
Regards
Mike
This also will select customers who have purchased just one of 'A' or 'B' and not purchased 'C'. the requirement was that they should have purchased both 'A' and 'B' and not purchased 'C'.
April 2, 2012 at 6:35 am
Jeff Moden (3/31/2012)
Dean Cochrane (3/30/2012)
Berl (3/30/2012)
Although CPU time difference is not very great, the difference is much bigger for the elapsed time.
Elapsed time includes wait time and is not a good measure of performance.
Actually, it is. The wait time you speak of also includes I/O time and that can be a major part of a performance problem. I do agree, though, that you really need to take "time to display" out of the picture because it will severly mask problems. It takes a certain amount of time to display a million rows and can make differences between two methods seem much more trivial than they might be.
Well depends, the elapsed time can an ok performance indicator, however when you run the queries for testing back to back in the same batch without clearing the buffers you may get a false sense of performance from elapsed time and many other stats. I have a local test SQL Server which I always test by stopping the instance then starting so I can be sure everything is clear for testing. But the query plans, so I would be carefull in saying one scenario over another is truely best if you compare in a single batch. Now all that said, a huge difference in elapsed time shows a performance change and when I say huge took 30minutes to run first time then 30 seconds with second query.
April 2, 2012 at 7:00 am
Hmm, r u sure? Customer no 6 took product A and no B, customer 7 took product B and no A, both didn't take C, but query doesn't return them.
Regards
Mike
April 2, 2012 at 7:14 am
Jonathan AC Roberts (4/2/2012)
michal.lisinski (4/2/2012)
Imho
SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)
Regards
Mike
This also will select customers who have purchased just one of 'A' or 'B' and not purchased 'C'. the requirement was that they should have purchased both 'A' and 'B' and not purchased 'C'.
April 2, 2012 at 7:15 am
Jonathan AC Roberts (4/2/2012)
michal.lisinski (4/2/2012)
Imho
SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)
Regards
Mike
This also will select customers who have purchased just one of 'A' or 'B' and not purchased 'C'. the requirement was that they should have purchased both 'A' and 'B' and not purchased 'C'.
Hmm, r u sure? Customer no 6 took product A and no B, customer 7 took product B and no A, both didn't take C, but query doesn't return them.
Regards
Mike
April 2, 2012 at 7:32 am
michal.lisinski (4/2/2012)
Jonathan AC Roberts (4/2/2012)
michal.lisinski (4/2/2012)
Imho
SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)
Regards
Mike
This also will select customers who have purchased just one of 'A' or 'B' and not purchased 'C'. the requirement was that they should have purchased both 'A' and 'B' and not purchased 'C'.
Hmm, r u sure? Customer no 6 took product A and no B, customer 7 took product B and no A, both didn't take C, but query doesn't return them.
Regards
Mike
Hi Mike,
I tried the following on your code:
;WITH [#Purchase] AS
(
SELECT 'A' ProductCode, 1 CustomerId UNION ALL
SELECT 'B' ProductCode, 1 CustomerId UNION ALL
SELECT 'A' ProductCode, 2 CustomerId UNION ALL
SELECT 'B' ProductCode, 3 CustomerId UNION ALL
SELECT 'A' ProductCode, 4 CustomerId UNION ALL
SELECT 'A' ProductCode, 4 CustomerId UNION ALL
SELECT 'A' ProductCode, 5 CustomerId UNION ALL
SELECT 'B' ProductCode, 5 CustomerId UNION ALL
SELECT 'C' ProductCode, 5 CustomerId
)
SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)
Customer 1 has bought 'A' and 'B' and not 'C' and he gets found correctly
Customer 2 has bought 'A' and not 'B' and not 'C' and he gets found incorrectly
Customer 3 has bought not 'A' and 'B' and not 'C' and he gets found incorrectly
Customer 4 has bought two 'A' and not 'B' and not 'C' and he doesn't get found correctly
Customer 5 has bought 'A' and 'B' and 'C' and he doesn't get found correctly
May 11, 2012 at 3:21 am
select * from
(
select id,max(case when cd = 'A' then 1 else 0 end) + max(case when cd = 'B' then 2 else 0 end) + max(case when cd = 'C' then 4 else 0 end) status from a
--where cd in ('A','B','C')
group by id
) dta
where
1=1
--and status = 3 -- a,b but not c
--and status = 5--does have a,c but not b
--and status = 6--does have b,c but not a
--and status = 7 --all 3
--and status = 0 --none of them
--and status = 1 -- just A
--and status = 2 -- just b
and status = 4 -- just C
Replace Id with CustomerId and cd with ProductCode. You can generalize the code the way you want..You can use analytic function instead of group by as well.
GulliMeel
Finding top n Worst Performing queries[/url]
Improve the performance of Merge Join(special case)
How to Post Performance Problem -Gail Shaw[/url]
Viewing 15 posts - 76 through 90 (of 166 total)
You must be logged in to reply to this topic. Login to reply