May 9, 2008 at 4:37 am
I have to build a very strange query that has been bothering me for a few days now and it's starting to get the best of me.
I know it might just have a simple solution, but it has eluded me for a long time and I just can't nail a solution that will return the corect subset on all situations. I'll try do decribe the problem as simple as I can:
Using the standard Client, Order, Products metaphor, let's say I have a table that describes a many-to-many relationship, say ordered Products:
ClientID | ProductID
1 | 30
1 | 35
2 | 30
3 | 30
3 | 31
etc..
Then I have this second table that acts like a "filter table"
ProductID(int) | HasOrdered (bit NULL)
30 | 1
31 | 1
32 | 0
35 | 0
40 | NULL
Based on the data in the second table, I need to get all the clients that have ordered products 30 and 31 but have not ordered products 32 and 35.
However, I also need to select (in the same query) all of the clients that have ordered product 40 (where HasOrdered is NULL) regardless if they have ordered any of other products or not.
The rule is : to select all (clients that have ordered ProductIDs where HasOrdered is null) OR ( clients that have ordered ALL products where HasOrdered =1 AND that have not ordered any of the products where HasOrdered = 0 )
I could never find the corect way of joining all the tables to work in all situations: eg. if there are no rows with HasOrdered=1 or none with HasOrdered =0 or HasOrdered =NULL.
Some situation will always mess up my joins and it will return a wrong subset of clients.
Hope I managed to make sense in the post.
Thanks for anyone dropping a hand to help me !
May 9, 2008 at 6:07 am
Your data does not make sense. The second table does not indicate a client.
And then a tip on posting...
What would help is if you post with some sample data generation.
[font="Courier New"]CREATE TABLE #ClientProduct (ClientID INT, ProductID INT)
INSERT #ClientProduct VALUES (1,30)
INSERT #ClientProduct VALUES (1,35)[/font]
If your sample data includes a case of the records you want returned (and you tell us which ones they are), it will be really for people to get into writing the query rather than having to build something to create the test data.
May 9, 2008 at 6:36 am
Ok, good tip on the SQL scripts.
Let's see :
CREATE TABLE #ClientProduct (ClientID INT, ProductID INT)
INSERT #ClientProduct VALUES (1,30)
INSERT #ClientProduct VALUES (1,35)
INSERT #ClientProduct VALUES (2,31)
INSERT #ClientProduct VALUES (3,40)
INSERT #ClientProduct VALUES (3,41)
INSERT #ClientProduct VALUES (4,30)
INSERT #ClientProduct VALUES (4,41)
CREATE TABLE #ProductFilter (ProductID INT, HasOrdered BIT NULL)
INSERT #ProductFilter VALUES (30,1) -- all clients that have ordered product 30
INSERT #ProductFilter VALUES (41,0) -- but which have not ordered product 41
INSERT #ProductFilter VALUES (42,0) -- and which have not ordered product 42,etc
INSERT #ProductFilter VALUES (40,NULL) -- all clients that ordered 40, regardless of other criteria
In this case, my query needs to return clients 1 and 3:
ID 1 - OK because he ordered product 30 AND he never ordered product 41
ID 2 - rejected because he never ordered product 30
ID 3 - OK because he ordered 40 and the rest of the criteria do not matter
ID 4 - rejected because he ordered product 30, BUT he also ordered product 41
So in this data case I need all clients that:
1. Either have product 40 OR
2. Have product 30 AND don't have either of 41 and 42
It's possible to have multiple rows in the filter too :
DELETE FROM #ProductFilter
INSERT #ProductFilter VALUES (30,1)
INSERT #ProductFilter VALUES (31,1)
INSERT #ProductFilter VALUES (32,1)
INSERT #ProductFilter VALUES (41,0)
INSERT #ProductFilter VALUES (42,0)
INSERT #ProductFilter VALUES (43,0)
INSERT #ProductFilter VALUES (40,NULL)
INSERT #ProductFilter VALUES (50,NULL)
INSERT #ProductFilter VALUES (60,NULL)
means all clients that:
1. Ordered at least one of 40, 50 or 60 OR (doesn't need to order all 3, just one of them is enough)
2. Ordered all of 30,31,32 but none of 41,42,43
Makes more sense now ?:)
May 9, 2008 at 6:39 am
So the product filter table applies to every client?
May 9, 2008 at 6:47 am
Well, in reality there are a few more tables involved, but for the sake of this problem YES, it applies to all clients in the #ClientProduct table.
May 9, 2008 at 6:59 am
Ok, there is probably a more efficient way than this, but keeping it simple and just creating sub-queries (using CTE syntax to make them look nice) for your criteria, this seems to work:
[font="Courier New"]; WITH MustHaveOrdered (ClientID)
AS (SELECT ClientID FROM #ProductFilter PF
INNER JOIN #ClientProduct CP ON CP.ProductID = PF.ProductID WHERE PF.HasOrdered = 1)
, MustHaveNotOrdered (ClientID)
AS (SELECT ClientID FROM #ProductFilter PF
INNER JOIN #ClientProduct CP ON CP.ProductID = PF.ProductID WHERE PF.HasOrdered = 0)
, IncludeAll (ClientID)
AS (SELECT ClientID FROM #ProductFilter PF
INNER JOIN #ClientProduct CP ON CP.ProductID = PF.ProductID WHERE PF.HasOrdered IS NULL)
SELECT DISTINCT
CP.ClientID
FROM
#ClientProduct CP
WHERE
(ClientID IN (SELECT ClientID FROM MustHaveOrdered)
AND ClientID NOT IN (SELECT ClientID FROM MustHaveNotOrdered))
OR
(ClientID IN (SELECT ClientID FROM IncludeAll))[/font]
May 9, 2008 at 7:15 am
It seems to work indeed. Haven't used CTE before :-), looks nice.
Only problem is the MustHaveNotOrdered table. In my real-life case it can mean a list of 1.000.000 rows (ok, so they're not clients :P).
I hate to keep asking for free lunches like that, but can you see a way to join MustHaveOrdered and MustHaveNotOrdered into only 1 CTE?
Oh, and thanks for helping out !
May 9, 2008 at 7:30 am
As a CTE, even if there were the same CTE, they would get translated into individual sub-queries. So, combining them just makes the logic complicated as does not help performance.
Let me do my job for a few minutes then I will change this into some join logic.
May 9, 2008 at 7:53 am
[font="Courier New"]SELECT
CP.ClientID
FROM
#ClientProduct CP
LEFT JOIN #ProductFilter MHO ON MHO.ProductID = CP.ProductID AND MHO.HasOrdered = 1
LEFT JOIN #ProductFilter MNHO ON MNHO.ProductID = CP.ProductID AND MNHO.HasOrdered = 0
LEFT JOIN #ProductFilter IA ON IA.ProductID = CP.ProductID AND IA.HasOrdered IS NULL
GROUP BY
CP.ClientID
HAVING
(COUNT(MHO.ProductID) > 0
AND COUNT(MNHO.ProductID) = 0)
OR (COUNT(IA.ProductID) > 0)[/font]
May 9, 2008 at 7:56 am
Looks great ! Let me test this on the real DB for a while. Sounds just like what I needed !
Forever in debt !;)
Edit:Actually, it needs a bit of editing. unless I'm mistaken it will return clients that ordered AT LEAST one product => COUNT(MHO.ProductID) > 0
whereas I need clients that ordered all
COUNT(MHO.ProductID) = (SELECT count(*) from ProductFilter where HasOrdered =1)
no?
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply