April 30, 2012 at 8:31 pm
p.s. Beware of using execution plan cost comparisons. Even the actual execution plans are far from perfect.
--Jeff Moden
Change is inevitable... Change for the better is not.
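A quick sketch of what that warning means in practice (not from the thread; the two queries shown are placeholders): the "% of batch" figures in a plan are built from estimated costs even in an actual execution plan, so wrapping the candidate queries in SET STATISTICS TIME and IO is usually a safer tiebreaker.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SELECT COUNT(*) FROM sys.columns;   -- candidate query #1 would go here
SELECT COUNT(*) FROM sys.objects;   -- candidate query #2 would go here
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;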
April 30, 2012 at 8:32 pm
Mea culpa! I missed your comment about the 0-based tally table. It works fine when I fix that, although my comment about the query plan cost still holds.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
April 30, 2012 at 8:34 pm
Jeff Moden (4/30/2012)
p.s. Beware of using execution plan cost comparisons. Even the actual execution plans are far from perfect.
I know, and I have seen cases where the plan cost doesn't match the actual results.
I really only use it as a first approximation. Then I use my gut to tell me whether I should test further. Or when I'm challenged, of course.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
April 30, 2012 at 8:40 pm
I've taken a bit of the mystery out of the plan costs by converting the tally table used in my first solution to this one.
;WITH Nbrs_2 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY n)-1 As n FROM Nbrs)
,AllDates AS (
SELECT Date, Amount, Category
,ROW_NUMBER() OVER (PARTITION BY (SELECT NULL) ORDER BY Date) As rk
FROM @abc)
SELECT DATEADD(day, n, a1.Date) As Date, a1.Amount, a1.Category
FROM AllDates a1
CROSS APPLY (SELECT n FROM Tally WHERE n < DATEDIFF(day, a1.Date,
ISNULL((SELECT Date FROM AllDates a2 WHERE a1.rk+1 = a2.rk),DATEADD(day,1,a1.Date)))) x
Now this one ties with yours but my second solution comes up lowest.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
April 30, 2012 at 8:48 pm
dwain.c (4/30/2012)
Now this one ties with yours but my second solution comes up lowest.
How did you measure?
--Jeff Moden
Change is inevitable... Change for the better is not.
April 30, 2012 at 9:32 pm
Jeff Moden (4/30/2012)
dwain.c (4/30/2012)
Now this one ties with yours but my second solution comes up lowest.
How did you measure?
Actual execution plan costs.
Before you say again that the actual execution plan costs don't tell the whole story, in your experience how often are they wrong (roughly)?
In my limited experience, I'd say they're right about 85% of the time.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
April 30, 2012 at 9:57 pm
dwain.c (4/30/2012)
Jeff Moden (4/30/2012)
dwain.c (4/30/2012)
Now this one ties with yours but my second solution comes up lowest.
How did you measure?
Actual execution plan costs.
Before you say again that the actual execution plan costs don't tell the whole story, in your experience how often are they wrong (roughly)?
In my limited experience, I'd say they're right about 85% of the time.
More than half the time when compared to a profiler run. You also have to be careful with SET STATISTICS. Sometimes just turning one of those settings on can make a query jump from, say, 30 seconds to over 2 minutes.
--Jeff Moden
Change is inevitable... Change for the better is not.
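One way to keep both the SSMS display time and the SET STATISTICS overhead out of a comparison is to shunt the output into throwaway variables and bracket the run with GETDATE(). This is only a sketch against the @abc test table used in this thread; the variable names are illustrative, not anyone's posted code.
DECLARE @DummyDate     DATETIME,
        @DummyAmount   INT,
        @DummyCategory VARCHAR(10),
        @StartTime     DATETIME;
SET @StartTime = GETDATE();
-- the query under test goes here; assigning to variables keeps the rows off the wire
SELECT @DummyDate     = Date,
       @DummyAmount   = Amount,
       @DummyCategory = Category
FROM @abc;
SELECT DATEDIFF(ms, @StartTime, GETDATE()) AS ElapsedMs;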
April 30, 2012 at 11:06 pm
It seems I have been challenged here.
Rising to the occasion, I can take Jeff's wonderful articles on generating random data and concoct an approach of my own for generating random gaps. I settled on 1000 rows for the test harness (less about 185 rows removed for the gaps). The result is below, where I have standardized on the inline, CTE-based tally table for all the examples.
DECLARE @abc TABLE (Date DATETIME PRIMARY KEY, Amount INT, Category VARCHAR(10))
DECLARE @NumberOfRows INT,
@StartValue INT,
@EndValue INT,
@Range INT
;
--===== Preset the variables to known values
SELECT @NumberOfRows = 1000,
@StartValue = 400,
@EndValue = 1000,
@Range = @EndValue - @StartValue + 1
-- Create some test data
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT TOP (@NumberOfRows) ROW_NUMBER() OVER (ORDER BY n) As n FROM Nbrs)
INSERT INTO @abc
SELECT DATEADD(dd,n,'2009-01-01')
,ABS(CHECKSUM(NEWID())) % @Range + @StartValue
,SUBSTRING('abcdefghijklmnopqrstuvwxyz',ABS(CHECKSUM(NEWID())) % 26 + 1 ,1)
FROM Tally
SELECT @StartValue = 1,
@EndValue = 1000,
@Range = @EndValue - @StartValue + 1
-- Create some gaps in the test data
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT TOP (200) ROW_NUMBER() OVER (ORDER BY n)-1 As n FROM Nbrs)
DELETE a
FROM @abc a
INNER JOIN (
SELECT DATEADD(dd,ABS(CHECKSUM(NEWID())) % @Range + @StartValue, '2009-01-01')
FROM Tally
) x(d) ON Date = d
SELECT * FROM @abc
PRINT 'DWAIN (Padawan learner)''S FIRST QUERY'
SET STATISTICS TIME ON
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY n)-1 As n FROM Nbrs)
,AllDates AS (
SELECT Date, Amount, Category
,ROW_NUMBER() OVER (PARTITION BY (SELECT NULL) ORDER BY Date) As rk
FROM @abc)
SELECT DATEADD(day, n, a1.Date) As Date, a1.Amount, a1.Category
FROM AllDates a1
CROSS APPLY (SELECT n FROM Tally WHERE n < DATEDIFF(day, a1.Date,
ISNULL((SELECT Date FROM AllDates a2 WHERE a1.rk+1 = a2.rk),DATEADD(day,1,a1.Date)))) x
SET STATISTICS TIME OFF
PRINT 'DWAIN (Padawan learner)''S SECOND QUERY (add next 2)'
SET STATISTICS TIME ON
DECLARE @start DATETIME, @end DATETIME
SELECT @start = MIN(Date), @end = MAX(Date) FROM @abc
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY n) As n FROM Nbrs)
,Calendar (d) As (SELECT DATEADD(dd, n-1, @start)
FROM Tally WHERE DATEADD(dd, n-1, @start) BETWEEN @start and @end)
SELECT d
,CASE WHEN Date IS NULL
THEN (SELECT TOP 1 Amount FROM @abc
WHERE d>Date ORDER BY Date DESC) ELSE Amount END As Amount
,CASE WHEN Date IS NULL
THEN (SELECT TOP 1 Category FROM @abc
WHERE d>Date ORDER BY Date DESC) ELSE Category END As Category
FROM Calendar
LEFT OUTER JOIN @abc a ON d=Date
SET STATISTICS TIME OFF
PRINT 'JEFF (Jedi Master)''s SECOND QUERY'
SET STATISTICS TIME ON
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY n) As n FROM Nbrs)
,cteCount AS
(
SELECT t1.Date,
t1.Amount,
t1.Category,
Days = ISNULL((SELECT TOP (1) DATEDIFF(dd,t1.Date,Date)-1 FROM @abc WHERE Date > t1.Date ORDER BY Date),0)
FROM @abc t1
)
SELECT Date = DATEADD(dd,t.N-1,cte.Date),
cte.Amount,
cte.Category
FROM cteCount cte
INNER JOIN Tally t ON t.N-1 BETWEEN 0 AND Days
ORDER BY cte.Date,t.N-1
SET STATISTICS TIME OFF
Drum roll please. The results are in:
DWAIN (Padawan learner)'S FIRST QUERY
(1000 row(s) affected)
SQL Server Execution Times:
CPU time = 23057 ms, elapsed time = 23323 ms.
DWAIN (Padawan learner)'S SECOND QUERY (add next 2)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(1000 row(s) affected)
SQL Server Execution Times:
CPU time = 203 ms, elapsed time = 235 ms.
JEFF (Jedi Master)'s SECOND QUERY
(1000 row(s) affected)
SQL Server Execution Times:
CPU time = 18377 ms, elapsed time = 18485 ms.
It appears that divide and conquer won out by a wide margin! So Jeff, you were right that my first query could be improved upon.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
May 1, 2012 at 5:25 am
A lot better. Well done! There's just one problem: in order to make my code fit your one-based Tally table, you changed my code, and it no longer works as intended. The original code with the real Tally table runs a whole lot faster than the changed version does. I know why... can you see it?
When you post code and say it is mine, make sure you haven't changed the logic of the code. 😉 You changed t.N to t.N-1, and that absolutely kills the performance of the code I wrote.
--Jeff Moden
Change is inevitable... Change for the better is not.
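For reference, one alternative (a sketch only, not code posted by either party): make the CTE-based tally zero-based inside the CTE itself and cap it with TOP, so the outer join keeps the bare t.N predicate of the original. Whether that closes the gap to the real dbo.Tally table would still need its own test against the harness in this thread.
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
-- zero-based and capped here (1000 is an assumed cap sized to the test data),
-- so the join below never needs the t.N-1 expression
,Tally (N) AS (SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY n)-1 FROM Nbrs)
,cteCount AS
(
 SELECT t1.Date,
        t1.Amount,
        t1.Category,
        Days = ISNULL((SELECT TOP (1) DATEDIFF(dd,t1.Date,Date)-1 FROM @abc WHERE Date > t1.Date ORDER BY Date),0)
 FROM @abc t1
)
 SELECT Date = DATEADD(dd,t.N,cte.Date),
        cte.Amount,
        cte.Category
 FROM cteCount cte
 INNER JOIN Tally t ON t.N BETWEEN 0 AND Days
 ORDER BY cte.Date,t.N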
May 1, 2012 at 6:04 am
Here's the code for the last claim... it contains your second attempt, the changes you made to my code, and my original code.
DECLARE @abc TABLE (Date DATETIME PRIMARY KEY, Amount INT, Category VARCHAR(10))
DECLARE @NumberOfRows INT,
@StartValue INT,
@EndValue INT,
@Range INT
;
--===== Preset the variables to known values
SELECT @NumberOfRows = 1000,
@StartValue = 400,
@EndValue = 1000,
@Range = @EndValue - @StartValue + 1
-- Create some test data
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT TOP (@NumberOfRows) ROW_NUMBER() OVER (ORDER BY n) As n FROM Nbrs)
INSERT INTO @abc
SELECT DATEADD(dd,n,'2009-01-01')
,ABS(CHECKSUM(NEWID())) % @Range + @StartValue
,SUBSTRING('abcdefghijklmnopqrstuvwxyz',ABS(CHECKSUM(NEWID())) % 26 + 1 ,1)
FROM Tally
SELECT @StartValue = 1,
@EndValue = 1000,
@Range = @EndValue - @StartValue + 1
-- Create some gaps in the test data
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT TOP (200) ROW_NUMBER() OVER (ORDER BY n)-1 As n FROM Nbrs)
DELETE a
FROM @abc a
INNER JOIN (
SELECT DATEADD(dd,ABS(CHECKSUM(NEWID())) % @Range + @StartValue, '2009-01-01')
FROM Tally
) x(d) ON Date = d
SET STATISTICS TIME OFF
PRINT '============================================================'
PRINT 'DWAIN (Padawan learner)''S SECOND QUERY (add next 2)'
SET STATISTICS TIME ON
DECLARE @start DATETIME, @end DATETIME
SELECT @start = MIN(Date), @end = MAX(Date) FROM @abc
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY n) As n FROM Nbrs)
,Calendar (d) As (SELECT DATEADD(dd, n-1, @start)
FROM Tally WHERE DATEADD(dd, n-1, @start) BETWEEN @start and @end)
SELECT d
,CASE WHEN Date IS NULL
THEN (SELECT TOP 1 Amount FROM @abc
WHERE d>Date ORDER BY Date DESC) ELSE Amount END As Amount
,CASE WHEN Date IS NULL
THEN (SELECT TOP 1 Category FROM @abc
WHERE d>Date ORDER BY Date DESC) ELSE Category END As Category
FROM Calendar
LEFT OUTER JOIN @abc a ON d=Date
SET STATISTICS TIME OFF
PRINT '============================================================'
PRINT 'JEFF (Jedi Master)''s SECOND QUERY as rewritten by Dwain'
PRINT 'Note that the t.N-1 change to the code kills the performance.'
SET STATISTICS TIME ON
;WITH Nbrs_3 (n) AS (SELECT 1 UNION SELECT 0)
,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)
,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)
,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)
,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY n) As n FROM Nbrs)
,cteCount AS
(
SELECT t1.Date,
t1.Amount,
t1.Category,
Days = ISNULL((SELECT TOP (1) DATEDIFF(dd,t1.Date,Date)-1 FROM @abc WHERE Date > t1.Date ORDER BY Date),0)
FROM @abc t1
)
SELECT Date = DATEADD(dd,t.N-1,cte.Date),
cte.Amount,
cte.Category
FROM cteCount cte
INNER JOIN Tally t ON t.N-1 BETWEEN 0 AND Days
ORDER BY cte.Date,t.N-1
SET STATISTICS TIME OFF
PRINT '============================================================'
PRINT 'JEFF (Jedi Master)''s ORIGINAL QUERY'
SET STATISTICS TIME ON;
WITH
cteCount AS
(
SELECT t1.Date,
t1.Amount,
t1.Category,
Days = ISNULL((SELECT TOP (1) DATEDIFF(dd,t1.Date,Date)-1 FROM @abc WHERE Date > t1.Date ORDER BY Date),0)
FROM @abc t1
)
SELECT Date = DATEADD(dd,t.N,cte.Date),
cte.Amount,
cte.Category
FROM cteCount cte
INNER JOIN dbo.Tally t ON t.N BETWEEN 0 AND Days
ORDER BY cte.Date,t.N
;
Here are the results. I believe you're in for a bit of a surprise. 😉
(1000 row(s) affected)
(183 row(s) affected)
============================================================
DWAIN (Padawan learner)'S SECOND QUERY (add next 2)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
(1000 row(s) affected)
SQL Server Execution Times:
CPU time = 828 ms, elapsed time = 995 ms.
============================================================
JEFF (Jedi Master)'s SECOND QUERY as rewritten by Dwain
Note that the t.N-1 change to the code kills the performance.
(1000 row(s) affected)
SQL Server Execution Times:
CPU time = 73547 ms, elapsed time = 75927 ms.
============================================================
JEFF (Jedi Master)'s ORIGINAL QUERY
(1000 row(s) affected)
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 147 ms.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 1, 2012 at 6:59 am
My humblest apologies sir!
Seems I made several mistakes on this thread this day. Clearly the Force was not with me.
Probably should stick to playing with my cats.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
May 1, 2012 at 7:30 am
dwain.c (5/1/2012)
My humblest apologies sir! Seems I made several mistakes on this thread this day. Clearly the Force was not with me.
Probably should stick to playing with my cats.
No, no.... mistakes can be good. They cost nothing here and teach much. You did well. Others would have quit, if they had tried at all. Look at how few tried to solve this problem, and then realize you were one of them.
--Jeff Moden
Change is inevitable... Change for the better is not.