November 26, 2009 at 1:30 pm
Suppose there's a 3 number lottery that pays out with 1, 2, and 3 number matches. For simplicity's sake, suppose that there are only 3 balls to draw from which are drawn without replacement. I want to create a list of possible winning combinations.
I can do this creating cartesian products and using Union statements. I'd like to simplify it using left joins, but it isn't working. Can anybody explain why? Is there a better way (cross joins perhaps)? Here's the code followed by the query results...
CREATE TABLE #Numbers (Number int)
Insert into #Numbers (Number) values(1)
Insert into #Numbers (Number) values(2)
Insert into #Numbers (Number) values(3)
--Approach 1, this correctly returns 7 rows.
SELECT
Na.Number [Ball 1], NULL [Ball 2], NULL [Ball 3]
FROM #Numbers Na
UNION ALL
SELECT
Na.Number, Nb.Number, NULL
FROM #Numbers Na, #Numbers Nb
WHERE
Na.Number < Nb.Number
UNION ALL
SELECT
Na.Number, Nb.Number, Nc.Number
FROM #Numbers Na, #Numbers Nb, #Numbers Nc
WHERE
Na.Number < Nb.Number
AND
Nb.Number < Nc.Number
--Approach 2, this returns only 4 rows.
SELECT Na.Number [Ball 1], Nb.Number [Ball 2], Nc.Number [Ball 3]
FROM #Numbers Na
LEFT JOIN #Numbers Nb
ON Na.Number < Nb.Number
LEFT JOIN #Numbers Nc
ON Nb.Number < Nc.Number
DROP TABLE #Numbers
The results
This is what I want
Ball 1 Ball 2 Ball 3
------------------------
1 NULL NULL
2 NULL NULL
3 NULL NULL
1 2 NULL
1 3 NULL
2 3 NULL
1 2 3
(7 row(s) affected)
This is missing some rows for some reason
Ball 1 Ball 2 Ball 3
-----------------------
1 2 3
1 3 NULL
2 3 NULL
3 NULL NULL
(4 row(s) affected)
So why doesn't the second query return 1 Null Null, 2 Null Null, and 1 2 Null?
Thanks
Stephen Sjodin
November 26, 2009 at 5:09 pm
Reason: By including NULLS in your SELECTS, you are sort of adding extra NULL values to the sets Nb and Nc...
You can achieve the same by doing this:
SELECT Na.Number [Ball 1], Nb.Number [Ball 2], Nc.Number [Ball 3]
FROM #Numbers Na
LEFT JOIN (SELECT Number FROM #Numbers UNION ALL SELECT NULL) Nb
ON Na.Number < ISNULL(Nb.Number,Na.Number+1)
LEFT JOIN (SELECT Number FROM #Numbers UNION ALL SELECT NULL) Nc
ON Nb.Number < ISNULL(Nc.Number,Nb.Number+1)
Here, I am adding the extra NULLS into the Nb and Nc sets to get the desired result.
This is absolutely NOT scalable - don't even think of going above a 100 or so "Balls" unless you have a lot of patience and ram.
Also, this is not faster or better than the UNION solution - just wanted to show WHY...and HOW...
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
November 27, 2009 at 6:13 am
Thanks Mr. Magoo.
I think the problem I'm seeing in approach 2 is due to the left joins' "on" statements. For example, when the filter Na.Number < Nb.Number can be satisfied in more than one condition it only returns the last of the possible Na.Number < Nb.Number results. This can be more clearly seen if you add another ball into the code (Insert into #Numbers (Number) values(4)).
1 2 NULL
1 3 NULL
1 4 NULL -- of the combinations starting with "1", only this row is returned.
I guess when the joins are testing for Nx.Number<Ny.Number and Ny.Number doesn't exist, something is throwing a wrench into the gears.
I can see how your solution works - it explicitly introduces a record with a null value to test against rather than letting SQL figure it out. I've achieved the same results by inserting a record with a Null value for the number and changing my code as follows:
Insert into #Numbers (Number) values(NULL) -- a null record to filter on
SELECT Na.Number [Ball 1], Nb.Number [Ball 2], Nc.Number [Ball 3]
FROM #Numbers Na
LEFT JOIN #Numbers Nb
ON Na.Number < Nb.Number OR Nb.Number IS NULL
LEFT JOIN #Numbers Nc
ON Nb.Number < Nc.Number OR (Nc.Number IS NULL AND Nb.Number IS NOT NULL)
WHERE Na.Number IS NOT NULL
ORDER BY Nc.Number, Nb.Number, Na.Number --remove when comparing with Mr. Magoo's code
I would note however that according to the execution metrics in SQL, your query is considerably faster than mine, even when I use your join construct Nx.Number < ISNULL(Ny.Number, Nx.Number+1). It seems adding the null value inline the way you did with the subquery is better than adding an actual null value to the #Numbers table. I don't know why, but clearly you know your stuff.
Anyway, if the issue is indeed about testing records with values against records with Null values versus testing records with values against records that don't exist at all, is there some provision in TSQL for the latter?
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply