September 24, 2012 at 11:29 am
Let's say your coworker gave you the results of a query, and noted he/she did not include SalesOrderID numbers 43674 and 44295. However, you run that query: (from AdventureWorks)
SELECT [SalesOrderID]
, a.City
,[TotalDue]
FROM [AdventureWorks].[Sales].[SalesOrderHeader] s
INNER JOIN AdventureWorks.Person.Address a
ON s.BillToAddressID = a.AddressID
WHERE City = 'Ottawa'
AND SalesOrderID NOT IN ('43674','44295')
And find that you're grand total for orders = $98689.28, whereas the results the coworker sent has as grand total in sales of $109276.67. So you know he/she made a mistake and excluded the wrong two SalesOrderIDs. Now, obviously, the solution is to call and ask them to open the query and verify the excuded IDs. That's exactly what I did, and issue is resolved. But I got it into my head that it would be cool to be able to somehow figure out the two excluded IDs by summing all two number combinations until you found the sum that matched the difference.
This is more of a pet project or brain teaser I've given myself, but I'm not getting very far. A gentle nudge in the right direction would be nice. π
September 24, 2012 at 11:54 am
A wild guess just for fun! Cue all smarter folks to line up and show how inefficient this query is!
SELECT TAB1.SalesOrderId AS ID1,
TAB1.TotalDue AS DUE1,
TAB2.SalesOrderId AS ID2,
TAB2.TotalDue AS DUE2
FROM SALESTABLE TAB1
CROSS JOIN
SALESTABLE TAB2
WHERE TAB1.SalesOrderID <> TAB2.SalesOrderID AND
TAB1.TotalDue + TAB2.Totaldue = @SPECIFIED_DIFFERENCE
September 24, 2012 at 12:25 pm
You know, it worked. It tool 3.58 minutes, but it worked. π
September 24, 2012 at 12:31 pm
Can TotalDue every be less than 0?
September 24, 2012 at 12:37 pm
I don't have anything to test with, but I believe this could be better.
However, the problem is that it's still using a cartesian product to obtain the result.
SELECT TAB1.SalesOrderId AS ID1,
TAB1.TotalDue AS DUE1,
TAB2.SalesOrderId AS ID2,
TAB2.TotalDue AS DUE2
FROM SALESTABLE TAB1
JOIN SALESTABLE TAB2 ON TAB1.SalesOrderID < TAB2.SalesOrderID --Avoid half of the rows.
WHERE TAB1.TotalDue < @SPECIFIED_DIFFERENCE --This are useless unless you have negatives.
AND TAB2.TotalDue < @SPECIFIED_DIFFERENCE --Same as above.
AND TAB1.TotalDue + TAB2.Totaldue = @SPECIFIED_DIFFERENCE
I'm not sure if the conditions added will help but I'm sure the condition for the join must help the performance.
September 24, 2012 at 12:42 pm
Lynn Pettis (9/24/2012)
Can TotalDue every be less than 0?
No. And there are no nulls.
September 24, 2012 at 1:56 pm
Luis Cazares (9/24/2012)
I don't have anything to test with, but I believe this could be better.However, the problem is that it's still using a cartesian product to obtain the result.
SELECT TAB1.SalesOrderId AS ID1,
TAB1.TotalDue AS DUE1,
TAB2.SalesOrderId AS ID2,
TAB2.TotalDue AS DUE2
FROM SALESTABLE TAB1
JOIN SALESTABLE TAB2 ON TAB1.SalesOrderID < TAB2.SalesOrderID --Avoid half of the rows.
WHERE TAB1.TotalDue < @SPECIFIED_DIFFERENCE --This are useless unless you have negatives.
AND TAB2.TotalDue < @SPECIFIED_DIFFERENCE --Same as above.
AND TAB1.TotalDue + TAB2.Totaldue = @SPECIFIED_DIFFERENCE
I'm not sure if the conditions added will help but I'm sure the condition for the join must help the performance.
What you have here is what I was thinking. If the total due for each order can not be less than zero, then it makes sense to restrict the testing to orders where the total due for each order is less than or equal to the total difference. If you could have a negative total, then this check would not be worthwhile.
September 24, 2012 at 2:15 pm
Or we could put an index on the totaldue column π
SELECT TAB1.SalesOrderId AS ID1,
TAB1.TotalDue AS DUE1,
TAB2.SalesOrderId AS ID2,
TAB2.TotalDue AS DUE2
FROM SALESTABLE TAB1
JOIN
SALESTABLE TAB2
ON TAB1.Totaldue = @SPECIFIED_DIFFERENCE - TAB2.Totaldue
September 24, 2012 at 2:29 pm
And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 β Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 24, 2012 at 6:57 pm
Sean Lange (9/24/2012)
And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.
I can do that! And I can examine triplet combinations (or deeper) as well!
DECLARE @Orders Table
(OrderID VARCHAR(4), Amount MONEY)
INSERT INTO @Orders
SELECT 2122, 5400
UNION ALL SELECT '2123', 5500
UNION ALL SELECT '2124', 1500
UNION ALL SELECT '2125', 5700
UNION ALL SELECT '2126', 4500
UNION ALL SELECT '2126', 1500
UNION ALL SELECT '2127', 5200
UNION ALL SELECT '2129', 1000
DECLARE @Amount MONEY = 7000
;WITH UNIQUEnTuples (n, Tuples, Amount) AS (
SELECT 1
-- Add brackets around the OrderID to make later string comparison unique
, CAST('[' + OrderID + ']' AS VARCHAR(max)) COLLATE Latin1_General_BIN
,Amount
FROM @Orders
UNION ALL
SELECT 1 + n.n, a.OrderID + n.Tuples, n.Amount + t.Amount
FROM @Orders t
CROSS APPLY (
SELECT '[' + t.OrderID + ']') a(OrderID)
JOIN UNIQUEnTuples n ON a.OrderID < n.Tuples
WHERE CHARINDEX(a.OrderID, n.Tuples) = 0 AND n < 3 AND
n.Amount + t.Amount <= @Amount
)
SELECT *
FROM UNIQUEnTuples
WHERE Amount = @Amount -- The sum you're intereseted in identifying
For an explanation of this approach, you can take a look at this article: http://www.sqlservercentral.com/articles/sql+n-Tuples/89809/
I've found that the COLLATE I added improves performance slightly. You can modify n < 3 to account for the depth of nTuples you want to examine.
Edit: Added the @Amount variable so I could use it to limit the rows examined in the recursive leg of the rCTE (potentially improves performance).
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
September 25, 2012 at 8:48 am
Thank you for this solution. I had originally tried to do a recursive cte, but though I'm beginning to understand how they work, I haven't been able to succussfully create my own from scratch. The logic of pulling data from something that doesn't really exist yet is a little much for my linear brain to fully embrace. I'm trying though!
September 25, 2012 at 8:54 am
Hi!
If you are interested in how recursive cte is working, you may check out this blog post by Rob Farley
September 25, 2012 at 8:58 am
dwain.c (9/24/2012)
Sean Lange (9/24/2012)
And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.I can do that! And I can examine triplet combinations (or deeper) as well!
...
When you wrote that excellent article Dwain, did you have any idea how often you would have to quote it? I think you would have said "about once a year". In practice it seems like once a week π
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
September 25, 2012 at 11:32 am
ChrisM@Work (9/25/2012)
dwain.c (9/24/2012)
Sean Lange (9/24/2012)
And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.I can do that! And I can examine triplet combinations (or deeper) as well!
...
When you wrote that excellent article Dwain, did you have any idea how often you would have to quote it? I think you would have said "about once a year". In practice it seems like once a week π
I knew that article from Dwain would have an answer. Of course my point got lost in the shuffle. The OP said something about knowing WHICH IDs made the sum. My point was that because of the possibility of multiple pairs getting the EXACT IDs would be impossible. Of course you would have to use Dwain's technique to find them. π
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 β Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 25, 2012 at 6:24 pm
ChrisM@Work (9/25/2012)
dwain.c (9/24/2012)
Sean Lange (9/24/2012)
And of course if the exact rows matter this can get a LOT more challenging because there may be more than 1 pair where the sum is the total you are looking for.I can do that! And I can examine triplet combinations (or deeper) as well!
...
When you wrote that excellent article Dwain, did you have any idea how often you would have to quote it? I think you would have said "about once a year". In practice it seems like once a week π
Chris - It's interesting that you should say that. When I stumbled upon both the scripts in that article (entirely by accident) I just thought they were neat and banged out the article really quickly. I then later realized they were kinda like a solution looking for a problem. The shortfall of the article (which I remedied to a certain extent in the discussion thread) is that I didn't delve very deeply into their applicability.
And yes, I am a bit surprised at the number of times I've found an opportunity to use them. Apparently there aren't too many methods in SQL to arrive at all combinations like they do (at least not very easily).
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply