June 6, 2014 at 8:36 am
The Wizard Of Oz (6/6/2014)
ben.norris (6/6/2014)
The Wizard Of Oz (6/6/2014)
There's no dependance on ordering, so no cheating going on. It's a natural part of MIN() and MAX() functions.It would be strange if someone caught SwePeso pulling a fast one, but he's "quite good with his SQL" to put it lightly 😀
Are you sure? It looks to me that if C was between B and A then it wouldn't exclude customers who bought C
If we are ever unsure about how some code will behave, we can always test out assumptions on an actual server:
Below I have 3 customers with A, B and C "ordered" differently, but the MIN() and MAX() functions pull out the correct results irrespective of the orderings.
WITH TestTable
AS (
SELECT 1 AS CustomerID, ProductID
FROM (
VALUES ('A'), ('C'), ('B')
) x(ProductID)
UNION ALL
SELECT 2 AS CustomerID, ProductID
FROM (
VALUES ('A'), ('B'), ('C')
) x(ProductID)
UNION ALL
SELECT 3 AS CustomerID, ProductID
FROM (
VALUES ('C'), ('A'), ('B')
) x(ProductID)
)
SELECT CustomerID, MIN(ProductID) AS MinProductID, MAX(ProductID) AS MaxProductID
FROM TestTable
GROUP BY CustomerID
No you misunderstand, I don't mean the ordering of values within the table. Check the code for example to find customers who bought 'A' and 'C' but not 'B', it doesn't work for that.
June 6, 2014 at 8:43 am
To craig 81366: Thank you, 81366. Certainly the problem can be easily solved with cte. I just wanted to make sure that order by does not work with intersect - or something wrong with my app
2000
June 6, 2014 at 8:45 am
Alexander-449406 (6/6/2014)
I have generated test data with 1 000 000 000 000 000 000 000 rows and according to my tests all the queries proposed run within 1 second. You guys can choose the query that has fewer characters!
That is some sort of joke, right? :ermm: I'm struggling to find what's humorous about it. I take it writing jokes isn't your day job?
June 6, 2014 at 8:45 am
ben.norris (6/6/2014)
No you misunderstand, I don't mean the ordering of values within the table. Check the code for example to find customers who bought 'A' and 'C' but not 'B', it doesn't work for that.
Ohhhhh, I get what you mean now!
Well, the scope of the question was clearly "A and B but not C", and yes SwePeso's code depends on C coming after B, but I don't think that's cheating, considering it gives the right answer to the question 🙂
If you wanted "A and C but not B", you could change the HAVING filters to look for MIN(A), MAX(C) and COUNT(*) = 2 😉
June 6, 2014 at 8:53 am
valeryk2000 (6/6/2014)
To craig 81366: Thank you, 81366. Certainly the problem can be easily solved with cte. I just wanted to make sure that order by does not work with intersect - or something wrong with my app2000
Do you have an implicit type conversion?
The following makes mention of the same kind of problem you're experiencing.
The good news is maybe you just need to install a patch.
June 6, 2014 at 8:57 am
The Wizard Of Oz (6/6/2014)
ben.norris (6/6/2014)
No you misunderstand, I don't mean the ordering of values within the table. Check the code for example to find customers who bought 'A' and 'C' but not 'B', it doesn't work for that.
Ohhhhh, I get what you mean now!
Well, the scope of the question was clearly "A and B but not C", and yes SwePeso's code depends on C coming after B, but I don't think that's cheating, considering it gives the right answer to the question 🙂
If you wanted "A and C but not B", you could change the HAVING filters to look for MIN(A), MAX(C) and COUNT(*) = 2 😉
Not quite. This would exclude customers who also bought product D, or bought multiple of A.
June 6, 2014 at 9:09 am
I would suspect this code to run faster than count distinct. Can you perform tests on a big table?
June 6, 2014 at 9:10 am
craig 81366 (6/6/2014)
The Wizard Of Oz (6/6/2014)
ben.norris (6/6/2014)
No you misunderstand, I don't mean the ordering of values within the table. Check the code for example to find customers who bought 'A' and 'C' but not 'B', it doesn't work for that.
Ohhhhh, I get what you mean now!
Well, the scope of the question was clearly "A and B but not C", and yes SwePeso's code depends on C coming after B, but I don't think that's cheating, considering it gives the right answer to the question 🙂
If you wanted "A and C but not B", you could change the HAVING filters to look for MIN(A), MAX(C) and COUNT(*) = 2 😉
Not quite. This would exclude customers who also bought product D, or bought multiple of A.
D gets filtered out before the HAVING, but you're right about the multiple A's.
Code should actually be:
SELECT CustomerID
FROM #Purchase
WHERE ProductCode IN ('A', 'B', 'C')
GROUP BY CustomerID
HAVING MIN(ProductCode) = 'A'
AND MAX(ProductCode) = 'C'
AND COUNT(DISTINCT ProductCode) = 2;
June 6, 2014 at 9:12 am
valeryk2000 (6/6/2014)
I have two tables with patient ID - one contains several records for one ID (tblLetterFlag), in the second it is a primary key.
Please next time start your own thread instead of tossing a completely different issue into the middle of a discussion about something completely different.
And NEVER sort by ordinal position, use the column name. 😎
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 6, 2014 at 9:12 am
Also, I have an article on a similar topic which may be of interest and it's related to this topic
http://social.technet.microsoft.com/wiki/contents/articles/22165.t-sql-relational-division.aspx
June 6, 2014 at 9:16 am
A few answers have suggested first pivoting the data into a column per product, counting how many of each product were bought.
This makes it rather trivial to write a succint WHERE clause to obtain data matching the desired combination of rules. E.g.
WHERE [A] > 0 AND > 0 AND [C] = 0
More complex rules are also easy to achieve.
E.g. Customers that bought:
* 5 to 9 of A
* Any number of B or C, but not both.
* And did not buy D
This would use the following where clause:
WHERE (A BETWEEN 5 AND 9)
AND (B > 0 OR C > 0)
AND (B = 0 OR C = 0)
AND (D = 0)
As for performance, I tested the following query against the list Jeff tested 2 years ago:
;WITH PivotTable AS (
SELECT CustomerID, [A], , [C]
FROM (
SELECT CustomerID, ProductCode
FROM #Purchase
WHERE ProductCode IN ('A', 'B', 'C')
) AS src
PIVOT
(
COUNT(ProductCode)
FOR ProductCode IN ([A], , [C])
) AS pvt
)
SELECT CustomerID
FROM PivotTable
WHERE [A] > 0 AND > 0 AND [C] = 0
It performed in the same ballpark as the majority of the existing queries, but lagged behind the top performers.
Its main benefit is the ability to stipulate complex rules very succinctly.
June 6, 2014 at 9:16 am
This is the way I would have done it and it feels a lot cleaner and easier to read than the example. IMHO
Arjun S (3/28/2012)
Another way to get the result would be to use Intersect and then combine it with Except. The distinct part is handled implicitly.
--===== Find Customers that bought both "A" AND "B"
SELECT CustomerID
FROM #Purchase
WHERE ProductCode IN ('A')
INTERSECT
SELECT CustomerID
FROM #Purchase
WHERE ProductCode IN ('B')
EXCEPT
--===== Find Customers that bought "C".
SELECT CustomerID
FROM #Purchase
WHERE ProductCode IN ('C')
;
June 6, 2014 at 9:34 am
Ordering by column name does not help
June 6, 2014 at 9:37 am
valeryk2000 (6/6/2014)
Ordering by column name does not help
It isn't a logic thing. The reason you sort by column name is because if you order by ordinal position and your query changes you have to change your order by also.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 6, 2014 at 9:43 am
SELECT CustomerID
FROM #Purchase
WHERE ProductCode IN ('A', 'B', 'C')
GROUP BY CustomerID
HAVING MIN(ProductCode) = 'A'
AND MAX(ProductCode) = 'C'
AND SUM(CASE WHEN ProductCode = 'B' THEN 1 ELSE 0 END) = 0;
Avoid DISTINCT. It is a huge performance killer.
N 56°04'39.16"
E 12°55'05.25"
Viewing 15 posts - 106 through 120 (of 166 total)
You must be logged in to reply to this topic. Login to reply