November 5, 2011 at 5:12 am
Hi
Consider this table of account balances:
use tempdb
go
CREATE TABLE balances (
transactionId INT PRIMARY KEY,
dt datetime,
balance money
)
go
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(1,'Sep 5 2011 12:00:00:000AM',43000.0000)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(2,'Sep 5 2011 12:01:00:000AM',42355.0000)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(3,'Sep 9 2011 8:56:00:000AM',42269.0000)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(4,'Sep 9 2011 3:24:14:000PM',42110.7600)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(5,'Sep 12 2011 11:44:33:000AM',42712.7600)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(6,'Sep 15 2011 6:12:01:000PM',41637.7600)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(7,'Sep 16 2011 3:29:06:000PM',40777.7600)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(8,'Sep 19 2011 6:50:00:000PM',42132.2600)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(9,'Sep 20 2011 8:00:00:000PM',42129.6800)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(10,'Sep 21 2011 5:18:26:000PM',43651.8800)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(11,'Sep 21 2011 6:23:00:000PM',44959.5100)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(12,'Sep 21 2011 8:00:00:000PM',44951.3400)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(13,'Sep 25 2011 10:09:01:000PM',45166.3400)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(14,'Sep 26 2011 6:54:26:000AM',46241.3400)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(15,'Sep 26 2011 9:39:15:000AM',47875.3400)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(16,'Sep 27 2011 3:38:33:000PM',49219.5200)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(17,'Sep 29 2011 4:34:18:000PM',52552.0200)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(18,'Oct 3 2011 10:36:54:000AM',53842.0200)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(19,'Oct 4 2011 8:13:08:000PM',50591.2200)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(20,'Oct 10 2011 8:00:00:000PM',50549.0800)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(21,'Oct 11 2011 2:57:22:000PM',50936.0800)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(22,'Oct 11 2011 8:00:00:000PM',50906.8400)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(23,'Oct 13 2011 5:31:54:000PM',53435.2400)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(24,'Oct 17 2011 6:17:58:000PM',52145.2400)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(25,'Oct 17 2011 8:00:00:000PM',52126.3200)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(26,'Oct 18 2011 5:44:22:000PM',56660.6700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(27,'Oct 19 2011 12:13:46:000AM',58436.5700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(28,'Oct 20 2011 9:25:54:000AM',54781.5700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(29,'Oct 20 2011 7:30:42:000PM',56415.5700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(30,'Oct 21 2011 6:28:13:000AM',63725.5700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(31,'Oct 21 2011 1:56:48:000PM',56716.5700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(32,'Oct 21 2011 8:05:57:000PM',54050.5700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(33,'Oct 26 2011 9:16:11:000AM',56458.5700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(34,'Oct 26 2011 10:33:48:000PM',59477.1700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(35,'Oct 27 2011 3:58:23:000AM',53887.1700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(36,'Oct 27 2011 12:29:41:000PM',51565.1700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(37,'Oct 27 2011 1:29:03:000PM',47114.6700)
INSERT INTO [balances] ([transactionId],[dt],[balance])VALUES(38,'Nov 1 2011 7:24:00:000AM',41309.6700)
I want to return, for each month, the starting balance, the ending balance and the percent change. It's a little trickier than I first thought because for each month I need to find the balance of the max date of that month and the balance of the max date of the previous month and then calculate the % change.
Any ideas?
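Just to illustrate the "balance at the max date of each month" part (a sketch only, not a working solution): this returns each calendar month's last transaction and its balance; the % change step would then compare consecutive rows.
-- Sketch: one row per calendar month with that month's last transaction
-- date and its balance (assumes the balances table created above).
SELECT b.dt, b.balance
FROM balances AS b
INNER JOIN (
    SELECT MAX(dt) AS maxDt
    FROM balances
    GROUP BY DATEDIFF(mm, 0, dt)
) AS m ON b.dt = m.maxDt
ORDER BY b.dt;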
November 5, 2011 at 7:23 am
Based on your sample data, what would be the expected result?
November 5, 2011 at 7:23 am
Have you looked at this article by Jeff Moden? It might give you enough info to solve your problem.
November 5, 2011 at 7:40 am
bitbucket-25253 (11/5/2011)
Have you looked at this article by Jeff Moden? It might give you enough info to solve your problem.
Ron, I don't think it's a running total scenario. It seems more like using the max date per month. But since I wasn't sure, I asked for the expected result...
November 5, 2011 at 9:20 am
LutzM (11/5/2011)
Based on your sample data, what would be the expected result?
Hi Lutz,
The expected result is
MONTH startBalance endBalance pctChange
============== ============ ========== =========
September 2011 43000.00 52552.02 22.21
October 2011 52552.02 47114.67 -10.35
November 2011 47114.67 41309.67 -12.32
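Just to spell out the arithmetic: each month's pctChange is its ending balance over the previous month's ending balance (the first month uses its own starting balance). For October, for example:
-- October's pctChange from the figures above: September's ending balance
-- is 52552.02, October's ending balance is 47114.67.
SELECT (47114.67 / 52552.02 - 1) * 100;  -- roughly -10.35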
Thanks for looking!
November 5, 2011 at 10:11 am
LutzM (11/5/2011)
bitbucket-25253 (11/5/2011)
Have you looked at this article by Jeff Moden? It might give you enough info to solve your problem.
Ron, I don't think it's a running total scenario. It seems more like using the max date per month. But since I wasn't sure, I asked for the expected result...
Oops on my part. I jumped to an assumption, and well, you know what that makes of me.
November 5, 2011 at 10:18 am
Here's my first stab at it:
WITH BalanceDates
AS
(
SELECT CurMonthEnd, PrevMonthEnd, MIN(dt) AS FirstTranDate, MAX(dt) AS LastTranDate
FROM (
SELECT DATEADD(MM,DATEDIFF(MM, 0, dt), -1) AS PrevMonthEnd,
DATEADD(MM,DATEDIFF(MM, 0, dt)+1, -1) AS CurMonthEnd,
dt
FROM balances
) AS b
GROUP BY CurMonthEnd, PrevMonthEnd
),
MonthlyBalances
AS
(
SELECT CurMonthEnd, PrevMonthEnd, bmin.balance AS StartingBalance, bmax.balance AS EndingBalance
FROM BalanceDates AS b INNER JOIN
balances AS bmin ON b.FirstTranDate = bmin.dt INNER JOIN
balances AS bmax ON b.LastTranDate = bmax.dt
)
SELECT DATENAME(MM,curr.CurMonthEnd) + ' ' + CONVERT(CHAR(4),DATEPART(YYYY,curr.CurMonthEnd)) AS CalendarMonth,
curr.StartingBalance, curr.EndingBalance,
(curr.EndingBalance - COALESCE(prev.EndingBalance,curr.StartingBalance))/COALESCE(prev.EndingBalance,curr.StartingBalance)*100 AS PctChange
FROM MonthlyBalances AS curr LEFT OUTER JOIN
MonthlyBalances AS prev ON curr.PrevMonthEnd = prev.CurMonthEnd
I doubt this is the most efficient way of doing it though, as it results in the balances table being read six times...
EDIT: Forgot to deal with the precision issues when calculating the PctChange column... will post a fix later when I have time; on my way out the door at the moment...
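The gist of the precision problem, for anyone curious, is that dividing one MONEY value by another only keeps four decimal places. Casting to DECIMAL before dividing is one possible fix (a sketch, not necessarily the best approach):
-- Rough illustration of the MONEY division issue, using values from this thread:
DECLARE @EndBal  money = 47114.67,
        @PrevEnd money = 52552.02;
SELECT (@EndBal / @PrevEnd - 1) * 100 AS MoneyMath,   -- MONEY result: only 4 decimal places survive the division
       (CAST(@EndBal AS decimal(19,6)) / CAST(@PrevEnd AS decimal(19,6)) - 1) * 100 AS DecimalMath;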
November 5, 2011 at 10:39 am
Here's a slightly different approach based on the CROSS APPLY method.
One thing making the code more complex is the "interpretation" of the very first balance as the start balance of its month, whereas for all other months the latest value of the prev. month is used...
;
-- build a table including the first value with a date of the prev. month.
WITH cte AS
(
SELECT dt,balance FROM [balances]
UNION ALL
SELECT TOP 1 DATEADD(mm,DATEDIFF(mm,0,dt),0)-1,balance
FROM [balances]
ORDER BY dt
)
SELECT
DATENAME(MM,mnth) + ' ' + CONVERT(CHAR(4),DATEPART(YYYY,mnth)) AS CalendarMonth,
x.startBalance,
x2.endBalance,
(x2.endBalance/x.startBalance -1) *100 AS pctChange
FROM
( -- create a pseudo-calendar table
SELECT TOP 5
DATEADD(mm,-number,DATEADD(mm,DATEDIFF(mm,0,GETDATE()),0)) AS mnth,
DATEADD(mm,-number+1,DATEADD(mm,DATEDIFF(mm,0,GETDATE()),0)) AS nextmnth
FROM master..spt_values
WHERE type ='P'
) cte
-- 1st cross apply to get the start balance
CROSS APPLY
(
SELECT TOP 1 balance AS startBalance
FROM cte
WHERE dt<mnth ORDER BY dt DESC
)x
-- 2nd cross apply to get the end balance
CROSS APPLY
(
SELECT TOP 1 balance AS endBalance
FROM [balances]
WHERE dt<nextmnth ORDER BY dt DESC
)x2
ORDER BY mnth
November 5, 2011 at 10:42 am
I forgot: you might also want to add an index on dt with balance included to improve performance. Otherwise there'll be additional sort operations that should be avoided.
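Something along these lines (the index name is just a placeholder): dt as the key so the TOP 1 ... ORDER BY dt DESC lookups become ordered seeks, with balance included so no extra lookups are needed.
CREATE NONCLUSTERED INDEX IX_balances_dt
    ON balances (dt)
    INCLUDE (balance);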
November 5, 2011 at 12:54 pm
Awesome Lutz, thanks so much for your help once again! Perfect, elegant solution and thanks for the index tip too.
November 6, 2011 at 12:38 am
You folks know me... I'll be one of the first to use a "Pseudo-Calendar" table if it's warranted. I don't believe it is in this case. We also don't need a CTE that adds one row (and, in Lutz's code, it looks like there's no guarantee which row that might be, because there's no MIN and no ORDER BY).
First, let's set up some data for performance testing. A million rows across 12 years should do it... 😉
/******************************************************************************
This section of code builds a shedload of test data and is NOT a part of the
solution. We're just building test data here.
******************************************************************************/
--===== Declare some obviously named variables to control how much data we want
-- and across which years
DECLARE @StartYear DATETIME,
@EndYear DATETIME,
@NextYear DATETIME,
@Rows INT
;
SELECT @StartYear = '2000',
@EndYear = '2011',
@Rows = 1000000,
@NextYear = DATEADD(yy,1,@EndYear)
;
--=============================================================================
-- Create the test data. This is NOT a part of the solution.
-- Usually takes something less than 12 seconds to generate.
--=============================================================================
--===== Conditionally drop the test table to make reruns easier.
IF OBJECT_ID('tempdb..#Balances','U') IS NOT NULL
DROP TABLE #Balances
;
--===== Create the table and populate it on the fly.
-- This builds a table with random dates and amounts from @StartYear
-- through @EndYear
SELECT TOP (@Rows)
TransactionID = IDENTITY(INT,1,1),
Dt = RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,@StartYear,@NextYear) + CAST(@StartYear AS DATETIME),
Balance = CAST(RAND(CHECKSUM(NEWID()))*100 AS MONEY)
INTO #Balances
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
;
--===== Add a clustered index to the table just to keep it from being a heap
-- and to help with speed a bit by sorting the data.
CREATE CLUSTERED INDEX IX_Balances_Dt
ON #Balances (Dt)
;
ALTER TABLE #Balances
ADD PRIMARY KEY NONCLUSTERED (TransactionID)
;
DO notice, please, that I've made my indexing recommendation in the code. I recommend that the PK on the TransactionID be a NONCLUSTERED PK and that the index on the Dt column be a CLUSTERED index.
I believe Lutz's code was set up to work only with the previous 5 months. I've changed the "5" to 144 to work with 12 years of months, but that's the only change I made. This isn't a slam on Lutz... a lot of folks wouldn't even have been able to come up with a solution no matter how long it took. My main purpose in this is to show that what SQL Server labels as a SCAN is frequently a SEEK in disguise. {EDIT... Holy Moly! Did I ever get THAT backwards. Not sure what was on my mind when I wrote that, but I typed that exactly backwards! My apologies. :blush:} I'll let the code do the rest of the talking...
PRINT '========== Lutz Code 01 ================================================='
SET STATISTICS TIME,IO ON;
WITH cte AS
(
SELECT dt,balance FROM #balances
UNION ALL
SELECT TOP 1 DATEADD(mm,DATEDIFF(mm,0,dt),0)-1,balance
FROM #balances
ORDER BY dt
)
SELECT
DATENAME(MM,mnth) + ' ' + CONVERT(CHAR(4),DATEPART(YYYY,mnth)) AS CalendarMonth,
x.startBalance,
x2.endBalance,
(x2.endBalance/x.startBalance -1) *100 AS pctChange
FROM
( -- create a pseudo-calendar table
SELECT
DATEADD(mm,-number,DATEADD(mm,DATEDIFF(mm,0,GETDATE()),0)) AS mnth,
DATEADD(mm,-number+1,DATEADD(mm,DATEDIFF(mm,0,GETDATE()),0)) AS nextmnth
FROM master..spt_values
WHERE type ='P'
AND Number < 144
) cte
-- 1st cross apply to get the start balance
CROSS APPLY
(
SELECT TOP 1 balance AS startBalance
FROM cte
WHERE dt<mnth ORDER BY dt DESC
)x
-- 2nd cross apply to get the end balance
CROSS APPLY
(
SELECT TOP 1 balance AS endBalance
FROM [#balances]
WHERE dt<nextmnth ORDER BY dt DESC
)x2
ORDER BY mnth
;
SET STATISTICS TIME,IO OFF;
PRINT '========== Moden Code 01 ================================================'
SET STATISTICS TIME,IO ON;
--===== Conditionally drop the test table to make reruns easier.
IF OBJECT_ID('tempdb..#WorkTable','U') IS NOT NULL
DROP TABLE #WorkTable
;
--===== Suppress the auto-display of rowcounts
SET NOCOUNT ON
;
--===== Solve the first part of the problem... get the correct dates
-- and the Balances for those dates. We stuff all of this into
-- a temp table (144 rows) because we want to do a self join later.
-- If we do that on a CTE, the CTE will run twice.
SELECT RowNum = IDENTITY(INT,1,1),
StartDate = dates.StartDate,
Balance = bal.Balance
INTO #WorkTable
FROM (
SELECT MIN(Dt)
FROM #Balances
-- WHERE Dt > 0 --Will turn the scans into seeks but makes no difference. Execution plan lies.
UNION ALL
SELECT MAX(Dt)
FROM #Balances
-- WHERE Dt > 0 --Will turn the scans into seeks but makes no difference. Execution plan lies.
GROUP BY DATEDIFF(mm,0,Dt)
) dates (StartDate)
INNER JOIN #Balances bal
ON dates.StartDate = bal.Dt
ORDER BY dates.StartDate
;
--===== Do an "offset" self-join so se can get starting and ending balances
-- and do the necessary calulations for final display.
SELECT CalendarMonth = DATENAME(mm,hi.StartDate)+' '+DATENAME(yy,hi.StartDate),
StartingBalance = lo.Balance,
EndingBalance = hi.Balance,
PercentChange = (hi.Balance/lo.Balance -1) * 100
FROM #WorkTable lo
INNER JOIN #WorkTable hi
ON lo.RowNum+1 = hi.RowNum
ORDER BY lo.RowNum
SET STATISTICS TIME,IO OFF;
Notice that I violated the rules of "SARGability" with "GROUP BY DATEDIFF(mm,0,Dt)" and it DOES cause an INDEX SCAN. My code should be a lot slower because of that, right? Here are the results... we don't always get this lucky with non-SARGable code but you just don't know until you try on a large amount of data.
========== Lutz Code 01 =================================================
Table '#Balances___________________________________________________________________________________________________________000000000027'.
[font="Arial Black"]Scan count 431, logical reads 254859,[/font] physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. [font="Arial Black"]Scan count 288, logical reads 1149,[/font] physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'spt_values'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
[font="Arial Black"] CPU time = 69860 ms, elapsed time = 73053 ms.[/font]
========== Moden Code 01 ================================================
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
Table '#Balances___________________________________________________________________________________________________________000000000027'.
[font="Arial Black"] Scan count 146, logical reads 2722,[/font] physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
[font="Arial Black"] CPU time = 1625 ms, elapsed time = 1741 ms.[/font]Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#WorkTable__________________________________________________________________________________________________________000000000029'.
[font="Arial Black"]Scan count 2, logical reads 4,[/font] physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
[font="Arial Black"] CPU time = 0 ms, elapsed time = 19 ms.[/font]
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 6, 2011 at 2:36 am
Jeff Moden (11/6/2011)
My main purpose in this is to show that what SQL Server labels as a SCAN is frequently a SEEK in disguise.
The query plan I get looks like this (just the important part shown):
Perhaps we get different plans; anyway, my guess is you were looking to show that for a MAX or a MIN, a scan of an index might start at one end or the other and just return one row (still a scan, not a seek). Query plans do not lie, in general, but the quality of the interpretation does vary a bit 🙂
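For example, with the clustered index on Dt from Jeff's script, both of these can be answered by an ordered scan that stops after the first row it touches (a sketch, just to illustrate):
-- The plan typically shows a clustered index "scan" plus a TOP, yet only
-- one row is actually read: forward order for MIN, backward for MAX.
SELECT MIN(Dt) FROM #Balances;
SELECT MAX(Dt) FROM #Balances;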
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
November 6, 2011 at 7:28 am
That's great stuff Jeff! I love this forum, I keep learning better ways to do things. 🙂
November 6, 2011 at 4:26 pm
SQL Kiwi (11/6/2011)
Jeff Moden (11/6/2011)
My main purpose in this is to show that what SQL Server labels as a SCAN is frequently a SEEK in disguise.
The query plan I get looks like this (just the important part shown):
Perhaps we get different plans; anyway, my guess is you were looking to show that for a MAX or a MIN, a scan of an index might start at one end or the other and just return one row (still a scan, not a seek). Query plans do not lie, in general, but the quality of the interpretation does vary a bit 🙂
How VERY embarrassing. :blush: Thanks for the catch, Paul. I typed that exactly backwards. The whole purpose of my post was to show that INDEX SCANS, even CLUSTERED INDEX SCANS (the same as table scans, in reality), aren't the proverbial "Boogey-Man" that a lot of folks think they are, and that sometimes they can't, and shouldn't necessarily, be avoided. That'll teach me for trying to write something at 2:38 in the morning.
I went back and corrected the previous post and the following is more of what I was trying to get at.
Let's use DBCC TIMEWARP to go back in time and see what I was talking about and why I say the execution plan lies (or, in Paul's words, can be misinterpreted). Here's the code I originally used to build the data. It has the CLUSTERED index on the "Dt" column, but not the PK constraint/index.
/******************************************************************************
This section of code builds a shedload of test data and is NOT a part of the
solution. We're just building test data here.
******************************************************************************/
--===== Declare some obviously named variables to control how much data we want
-- and across which years
DECLARE @StartYear DATETIME,
@EndYear DATETIME,
@NextYear DATETIME,
@Rows INT
;
SELECT @StartYear = '2000',
@EndYear = '2011',
@Rows = 1000000,
@NextYear = DATEADD(yy,1,@EndYear)
;
--=============================================================================
-- Create the test data. This is NOT a part of the solution.
-- Usually takes something less than 12 seconds to generate.
--=============================================================================
--===== Conditionally drop the test table to make reruns easier.
IF OBJECT_ID('tempdb..#Balances','U') IS NOT NULL
DROP TABLE #Balances
;
--===== Create the table and populate it on the fly.
-- This builds a table with random dates and amounts from @StartYear
-- through @EndYear
SELECT TOP (@Rows)
TransactionID = IDENTITY(INT,1,1),
Dt = RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,@StartYear,@NextYear) + CAST(@StartYear AS DATETIME),
Balance = CAST(RAND(CHECKSUM(NEWID()))*100 AS MONEY)
INTO #Balances
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
;
--===== Add a clustered index to the table just to keep it from being a heap
-- and to help with speed a bit by sorting the data.
CREATE CLUSTERED INDEX IX_Balances_Dt
ON #Balances (Dt)
;
Then, if we run the first part of the solution code to find the correct dates...
PRINT '========== Moden Code 01 ================================================'
SET STATISTICS TIME,IO ON;
--===== Conditionally drop the test table to make reruns easier.
IF OBJECT_ID('tempdb..#WorkTable','U') IS NOT NULL
DROP TABLE #WorkTable
;
--===== Suppress the auto-display of rowcounts
SET NOCOUNT ON
;
--===== Solve the first part of the problem... get the correct dates
-- and the Balances for those dates. We stuff all of this into
-- a temp table (144 rows) because we want to do a self join later.
-- If we do that on a CTE, the CTE will run twice.
SELECT RowNum = IDENTITY(INT,1,1),
StartDate = dates.StartDate,
Balance = bal.Balance
INTO #WorkTable
FROM (
SELECT MIN(Dt)
FROM #Balances
-- WHERE Dt > 0 --Will turn the scans into seeks but makes no difference. Execution plan lies.
UNION ALL
SELECT MAX(Dt)
FROM #Balances
-- WHERE Dt > 0 --Will turn the scans into seeks but makes no difference. Execution plan lies.
GROUP BY DATEDIFF(mm,0,Dt)
) dates (StartDate)
INNER JOIN #Balances bal
ON dates.StartDate = bal.Dt
ORDER BY dates.StartDate
;
... we get the following execution plan (just the important part shown):
Notice the CLUSTERED INDEX SCAN on the #Balances table in the graphic above. On a million row table, that will normally concern a fair number of people (including me, initially) and that will drive them (including me, initially :blush:) to trying something more sophisticated (which isn't actually a bad thing... gotta know which way is best).
We CAN get the CLUSTERED INDEX SCAN to change to a CLUSTERED INDEX SEEK on the execution plan by simply uncommenting the "WHERE Dt > 0" lines of code...
PRINT '========== Moden Code 01 ================================================'
SET STATISTICS TIME,IO ON;
--===== Conditionally drop the test table to make reruns easier.
IF OBJECT_ID('tempdb..#WorkTable','U') IS NOT NULL
DROP TABLE #WorkTable
;
--===== Suppress the auto-display of rowcounts
SET NOCOUNT ON
;
--===== Solve the first part of the problem... get the correct dates
-- and the Balances for those dates. We stuff all of this into
-- a temp table (144 rows) because we want to do a self join later.
-- If we do that on a CTE, the CTE will run twice.
SELECT RowNum = IDENTITY(INT,1,1),
StartDate = dates.StartDate,
Balance = bal.Balance
INTO #WorkTable
FROM (
SELECT MIN(Dt)
FROM #Balances
WHERE Dt > 0 --Will turn the scans into seeks but makes no difference. Execution plan lies.
UNION ALL
SELECT MAX(Dt)
FROM #Balances
WHERE Dt > 0 --Will turn the scans into seeks but makes no difference. Execution plan lies.
GROUP BY DATEDIFF(mm,0,Dt)
) dates (StartDate)
INNER JOIN #Balances bal
ON dates.StartDate = bal.Dt
ORDER BY dates.StartDate
;
That will give us a CLUSTERED INDEX SEEK instead of the scan we got before.
But, is it really a SEEK? Yeah... according to SQL Server, it did a SEEK. But only to find the "first" row with a date greater than "0" (1900-01-01) and then it reverts to a SCAN behind the scenes. It still has to read all 1 million rows to do the job. If you look at the properties of the SEEK, you can see it uses RANGE SCAN instead of PREFIX like it does on the other INDEX SEEK for the #Balances table.
That's what I meant by the "execution plan lies" and as Paul said... it's not normally that the execution plan that lies, it's how some interpret it.
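If you want to see it for yourself, comparing the I/O of the two variants tells the story (a quick sketch against the #Balances table built above):
SET STATISTICS IO ON;
-- Shows up as a clustered index SCAN in the plan:
SELECT MAX(Dt) FROM #Balances GROUP BY DATEDIFF(mm,0,Dt);
-- Shows up as a clustered index SEEK in the plan, but it still has to touch
-- every row, so the logical reads come out essentially the same:
SELECT MAX(Dt) FROM #Balances WHERE Dt > 0 GROUP BY DATEDIFF(mm,0,Dt);
SET STATISTICS IO OFF;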
--Jeff Moden
Change is inevitable... Change for the better is not.
November 6, 2011 at 4:31 pm
JonFox (11/6/2011)
That's great stuff Jeff! I love this forum, I keep learning better ways to do things. 🙂
Thanks Jon, but I really have to stop typing at 2 in the morning... I typed what I wanted to say exactly backwards. :blush: To make matters worse, I added the PK as a "final thought" and that made it impossible for anyone to follow what I was saying even if I had managed to type it correctly.
--Jeff Moden
Change is inevitable... Change for the better is not.