May 20, 2010 at 6:49 am
This is an extension to another topic I had posted but I decided to split it off as the problem is different than the original problem of replacing the cursor/while loop with a faster method.
So I am stuck on a problem I am having using the "quirky update" method for replacing a cursor/while loop. The root of the issue is that the update compares row n to row n+1 and updates the contents of row n+1. What I need is a comparison of row n to row n+1 but then update the contents of row n.
Here is the code that displays the problem.
create table #DMVRunningPerformance (
Symbol_vc varchar(255),
DayID_in int,
MV float,
DailyPerf float null,
RunningPerf float null)
CREATE CLUSTERED INDEX [indexdown] ON #DMVRunningPerformance
(
[Symbol_vc] ASC,
[DayID_in] DESC
)
WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
insert into #DMVRunningPerformance
values ('IBM',75006,201048.8987,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75005,200841.5658,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75004,200321.7043,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75003,201120.0467,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75002,201779.8805,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75001,201651.3917,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75000,201651.3917,NULL,NULL)
declare @NewGroup_bt bit=1,
@DailyPerformance float,
@RunningPerformance float,
@NDMarketValue float
update #DMVRunningPerformance
set
@DailyPerformance=CASE WHEN @NewGroup_bt=1 THEN 0 ELSE
(@NDMarketValue-MV)/MV END,
@RunningPerformance=CASE WHEN @NewGroup_bt=1 THEN 0 ELSE
((1 + @RunningPerformance) * (1 + @DailyPerformance)) - 1 END,
DailyPerf=@DailyPerformance,
RunningPerf=@RunningPerformance,
@NDMarketValue=MV,
@NewGroup_bt=0
FROM #DMVRunningPerformance OPTION (MAXDOP 1)
select * from #DMVRunningPerformance
drop table #DMVRunningPerformance
Note that I can't change the order as it would completely change the performance numbers. I am trying to avoid having to roll through the set a second time to shift the numbers up. So to clarify, the MV of row n is compared to MV of row n+1 to get daily performance. This performance number (and the running performance) need to be stored in row n, not row n+ 1.
Thanks in advance,
Robb
May 21, 2010 at 6:48 am
Try this
create table #DMVRunningPerformance (
Symbol_vc varchar(25),
DayID_in int,
MV float,
DailyPerf float null,
RunningPerf float null)
CREATE CLUSTERED INDEX [indexdown] ON #DMVRunningPerformance
(
[Symbol_vc] ASC,
[DayID_in] DESC
)
WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
insert into #DMVRunningPerformance
values ('IBM',75006,201048.8987,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75005,200841.5658,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75004,200321.7043,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75003,201120.0467,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75002,201779.8805,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75001,201651.3917,NULL,NULL)
insert into #DMVRunningPerformance
values ('IBM',75000,201651.3917,NULL,NULL)
declare @NewGroup_bt INT,
@DailyPerformance float,
@RunningPerformance float,
@NDMarketValue float
set @NewGroup_bt =0;
set @RunningPerformance=0;
update B
set
@NDMarketValue=B.MV,
@DailyPerformance = (@NDMarketValue-C.MV)/C.MV ,
B.DailyPerf=@DailyPerformance,
@RunningPerformance= ((1 + @DailyPerformance)*(1 + @RunningPerformance) ) - 1 ,
@NewGroup_bt=@NewGroup_bt+1,
B.RunningPerf=@RunningPerformance
FROM #DMVRunningPerformance b
INNER JOIN #DMVRunningPerformance C
ON b.DAYID_IN=C.DAYID_IN+1
select * from #DMVRunningPerformance
drop table #DMVRunningPerformance
May 21, 2010 at 7:43 am
Robb, can you please confirm that you are running on SQL Server 7 or 2k?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 21, 2010 at 2:41 pm
Running SQL2008
Also Ten Centuries, your solution violates the rules of using the "Quirky Update" by using a Join. I'll add a link to the article on "Quirky Update".
http://www.sqlservercentral.com/articles/T-SQL/68467/[\url]
I don't know that this method is ultimately what I will use as I am skeptical about using it in a production environment, but it is the quickest performing alternative to cursors that I have found.
May 21, 2010 at 3:54 pm
I'm still a little unclear on your requirements. Can you provide the results you are trying to get?
This is a guess and might be the whole thing you're trying to avoid, but what about changing the sort order to ASC on day_id and then using this:
update #DMVRunningPerformance
set
@DailyPerformance=DailyPerf=CASE WHEN @NewGroup_bt=1 THEN 0 ELSE
(MV - @NDMarketValue)/ @NDMarketValue END,
@RunningPerformance=RunningPerf=CASE WHEN @NewGroup_bt=1 THEN 0 ELSE
((1 + @RunningPerformance) * (1 + @DailyPerformance)) - 1 END,
@NDMarketValue=MV,
@NewGroup_bt=0
FROM #DMVRunningPerformance OPTION (MAXDOP 1)
May 24, 2010 at 4:40 am
Robb Melancon (5/21/2010)
Running SQL2008Also Ten Centuries, your solution violates the rules of using the "Quirky Update" by using a Join. I'll add a link to the article on "Quirky Update".
http://www.sqlservercentral.com/articles/T-SQL/68467/[\url]
I don't know that this method is ultimately what I will use as I am skeptical about using it in a production environment, but it is the quickest performing alternative to cursors that I have found.
Hi Robb
A recursive CTE will do the job. Like Seth, I'm unsure of which direction (relative to DayID_in) you wish to run the calculation for RunningPerf. Here's an example which should get you started:
;WITH Calculator AS (
-- Anchor row, tr.Seq = 1
SELECT thisrow.Seq, thisrow.Symbol_vc, thisrow.DayID_in, thisrow.MV,
DailyPerf = CAST((nextrow.MV-thisrow.MV)/thisrow.MV AS FLOAT),
RunningPerf = CAST(NULL AS FLOAT)
FROM #OrderedData thisrow
INNER JOIN #OrderedData nextrow ON nextrow.Seq = thisrow.Seq + 1
WHERE thisrow.Seq = 1
UNION ALL
-- first recursion will be Seq = 2 (thisrow.Seq = 2, lastrow.Seq = 1)
SELECT thisrow.Seq, thisrow.Symbol_vc, thisrow.DayID_in, thisrow.MV,
DailyPerf = CAST((nextrow.MV-thisrow.MV)/thisrow.MV AS FLOAT),
RunningPerf = lastrow.RunningPerf
FROM Calculator lastrow
INNER JOIN #OrderedData thisrow ON thisrow.Seq = lastrow.Seq + 1
INNER JOIN #OrderedData nextrow ON nextrow.Seq = thisrow.Seq + 1
)
SELECT Seq, Symbol_vc, DayID_in, MV, DailyPerf, RunningPerf
FROM Calculator
Note that if you have a unique clustered index on the seq column, as in the example, the SELECT will run in a similar timeframe to a running totals update.
If you want to use the results of the SELECT to update another table, then you would be wise to run the results into another table rather than attempting an UPDATE FROM ... with the rCTE as a table source - performance can be poor.
The SELECT returns 6 of the 7 rows because of the INNER JOIN 'nextrow'. Logically you can't get a value for the last row in any case because it has no next row! However, you can make the row appear in your output by fiddling the join criteria.
Come back if you need help with this. The important point to remember about recursive CTE's is that only the result of the last iteration is 'exposed' to the next.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 24, 2010 at 8:20 am
Thanks for the input guys. The rows need to be calculated in dayid desc order. Here are the expected results.
AssetID_inSymbolDayID_inMV DailyPerf RunningPerf
100IBM75006NULL201048.89870.0010323210.001032321
100IBM75005NULL200841.56580.0025951330.003630133
100IBM75004NULL200321.7043-0.003969482-0.000353759
100IBM75003NULL201120.0467-0.003270067-0.003622669
100IBM75002NULL201779.88050.000637183-0.002987795
100IBM75001NULL201651.39170 -0.002987795
100IBM75000NULL201651.39170 -0.002987795
It's a bit odd because you need to start on the second row (Todays MV - Yesterdays MV divided by Yesterdays MV) to get the first rows performance but the running is done first row to second row. So 75006 Daily is (201047.8987-200841.5658)/200841.5658 and running is 1 + Current Running (which is 0 on day 75006) * 1 + Daily minus 1. I realize the Running Performance seems backwards meaning that the value for the entire period is actually stored on the first day of the period, but this is done for a reason which is more than you probably want to know about calculating market performance for periods.... anyway hope this calrifies a little more.
May 24, 2010 at 8:26 am
Robb, looks straightforward - but can you please edit and if necessary correct correct the line beginning -
So 75006 Daily is...
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 24, 2010 at 8:39 am
Ok, so I got it. My question now is how does this perform relative to doing the same thing with fast forward read only cursors? I can answer that by implementing this in some more tests but just wondering if you guys have a guess...
Here is the completed code:
create table #DMVRunningPerformance (
Seq int,
LongPosition_bt bit null,
AssetID_in int null,
Symbol_vc varchar(255),
DayID_in int,
Status_vc varchar(255) null,
MV float,
DailyPerf float null,
RunningPerf float null)
CREATE CLUSTERED INDEX [indexdown] ON #DMVRunningPerformance
(
[Seq] ASC,
[LongPosition_bt] ASC,
[AssetID_in] ASC,
[DayID_in] DESC
)
WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
insert into #DMVRunningPerformance
values (1,1,100,'IBM',75006,NULL,201048.8987,NULL,NULL)
insert into #DMVRunningPerformance
values (2,1,100,'IBM',75005,NULL,200841.5658,NULL,NULL)
insert into #DMVRunningPerformance
values (3,1,100,'IBM',75004,NULL,200321.7043,NULL,NULL)
insert into #DMVRunningPerformance
values (4,1,100,'IBM',75003,NULL,201120.0467,NULL,NULL)
insert into #DMVRunningPerformance
values (5,1,100,'IBM',75002,NULL,201779.8805,NULL,NULL)
insert into #DMVRunningPerformance
values (6,1,100,'IBM',75001,NULL,201651.3917,NULL,NULL)
insert into #DMVRunningPerformance
values (7,1,100,'IBM',75000,NULL,201651.3917,NULL,NULL)
declare @AssetID_in int=-1,
@NDAssetID_in int=-1,
@LongPosition_bt bit=0,
@Status_vc varchar(255)=NULL,
@NewGroup_bt bit=1,
@DailyPerformance float,
@RunningPerformance float,
@NDMarketValue float
;WITH Calculator AS (
-- Anchor row, tr.Seq = 1
SELECT thisrow.Seq, thisrow.Symbol_vc, thisrow.DayID_in, thisrow.MV,
DailyPerf=CAST((thisrow.MV-nextrow.MV)/nextrow.MV AS FLOAT),
RunningPerf = CAST((thisrow.MV-nextrow.MV)/nextrow.MV AS FLOAT)
FROM #DMVRunningPerformance thisrow
INNER JOIN #DMVRunningPerformance nextrow ON nextrow.Seq = thisrow.Seq + 1
WHERE thisrow.Seq = 1
UNION ALL
-- first recursion will be Seq = 2 (thisrow.Seq = 2, lastrow.Seq = 1)
SELECT thisrow.Seq, thisrow.Symbol_vc, thisrow.DayID_in, thisrow.MV,
DailyPerf = CAST((thisrow.MV-nextrow.MV)/nextrow.MV AS FLOAT),
RunningPerf = ((1 + lastrow.RunningPerf) * (1 + CAST((thisrow.MV-nextrow.MV)/nextrow.MV AS FLOAT))) - 1
FROM Calculator lastrow
INNER JOIN #DMVRunningPerformance thisrow ON thisrow.Seq = lastrow.Seq + 1
INNER JOIN #DMVRunningPerformance nextrow ON nextrow.Seq = thisrow.Seq + 1
)
SELECT Seq, Symbol_vc, DayID_in, MV, DailyPerf, RunningPerf
FROM Calculator
drop table #dmvrunningperformance
May 24, 2010 at 8:47 am
Hi Robb
There were some performance comparisons posted here earlier in the year. IIRC the running totals update did a million rows in about 6 seconds, the rCTE method took a little less than twice that.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 24, 2010 at 9:16 am
So I'm running into the following:
The statement terminated. The maximum recursion 100 has been exhausted before statement completion
And looks like the max is 32,000, did you run this on the million row table?
May 24, 2010 at 9:19 am
Although the max recursion is only a bit over 32K, any change in criteria will restart that recursion. You could easily run it on millions of rows... or it could crap out at 33k, it all depends on the data distribution.
May 24, 2010 at 9:26 am
Robb Melancon (5/24/2010)
So I'm running into the following:The statement terminated. The maximum recursion 100 has been exhausted before statement completion
And looks like the max is 32,000, did you run this on the million row table?
OPTION (MAXRECURSION 0)
I'm still looking for the thread, Robb...and yes, a million rows in about 9 seconds using a recursive CTE. BTW credit to Paul White for discovering the performance improvement returned by ensuring the clustered index is unique.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 25, 2010 at 9:51 am
Hi Robb, here's some stuff for you to play with. Note that the join to the recursive part in the rCTE is critical.
DROP TABLE #Numbers
SELECT TOP 1000000 --000
n = ROW_NUMBER() OVER (ORDER BY a.name),
CalcValue = CAST(NULL AS BIGINT)
INTO #Numbers
FROM master.dbo.syscolumns a, master.dbo.syscolumns b
CREATE UNIQUE CLUSTERED INDEX CIn ON #Numbers ([n] ASC)
-- (1,000,000 row(s) affected)
SET STATISTICS IO ON
SET STATISTICS TIME ON
-- 'Quirky' update
DECLARE @Lastval INT = 0, @CalcValue BIGINT
UPDATE #Numbers SET
@CalcValue = CalcValue = (@Lastval + n),
@Lastval = n
-- (1,000,000 row(s) affected) / CPU time = 2968 ms, elapsed time = 3079 ms.
-- Table #Numbers... Scan count 1, logical reads 3113, physical reads 6, read-ahead reads 3146, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- Recursive CTE
;WITH Calculator (n, CalcValue) AS (
SELECT n.n, CalcValue = CAST(n.n AS BIGINT)
FROM #Numbers n
WHERE n.n = 1
UNION ALL
SELECT n.n, CalcValue = n.n + c.n
FROM #Numbers n
INNER JOIN Calculator c ON n.n = c.n + 1 -- nice
--INNER JOIN Calculator c ON c.n + 1 = n.n -- nice
--INNER JOIN Calculator c ON c.n = n.n - 1 -- slow
--INNER JOIN Calculator c ON n.n - c.n = 1 -- slow
--INNER JOIN Calculator c ON c.n - n.n = -1 -- slow
)
SELECT n, CalcValue
FROM Calculator
OPTION (MAXRECURSION 0)
-- (1,000,000 row(s) affected) / CPU time = 33297 ms, elapsed time = 36161 ms.
-- Table 'Worktable'. Scan count 2, logical reads 6000001, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SET STATISTICS IO Off
SET STATISTICS TIME Off
There's some more here.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply