September 24, 2012 at 10:00 pm
Hi, Im trying to create a running total based on FieldToSum that will restart everytime runningTotal <= 25 and count the number of times this happens in runningCount.
This needs to be done for both forwards and backwards.
This is currently what i have inherited (made some changes based on articles here already which has helped).
There are approximately 40 million records this needs to be done for.
If you run the following code then that will give the answer im after albeit quite slowly.
Im after any sort of help/advice with this. Also this is my first post so if im missing any information please let me know.
Current code
IF OBJECT_ID('TempDB..#runTotal','U') IS NOT NULL
DROP TABLE #runTotal
CREATE TABLE #runTotal
(
runTotalID INT IDENTITY(1,1),
tableID int,
FieldToSum FLOAT,
runningTotal float,
runningCount int
)
IF OBJECT_ID('TempDB..#table','U') IS NOT NULL
DROP TABLE #table
CREATE TABLE #table
(
tableID int NOT NULL,
runningCount int not null,
orderCol int,
FieldToSum float
)
insert #table
select1, 0, 1, 4.733415944 union all
select2, 0, 8, 4.733415944 union all
select3, 0, 6, 4.733415944 union all
select4, 0, 19, 4.733415944 union all
select5, 0, 14, 6.950871733 union all
select6, 0, 3, 6.942781406 union all
select7, 0, 2, 6.956403557 union all
select8, 0, 20, 6.945977457 union all
select9, 0, 11, 6.956403557 union all
select10, 0, 13, 6.956403557 union all
select11, 0, 4, 6.945977457 union all
select12, 0, 10, 6.956403557 union all
select13, 0, 17, 6.945977457 union all
select14, 0, 12, 6.956403557 union all
select15, 0, 15, 6.968313001 union all
select16, 0, 5, 6.968313001 union all
select17, 0, 7, 7.024537923 union all
select18, 0, 18, 7.024537923 union all
select19, 0, 16, 6.968313001 union all
select20, 0, 9, 6.968313001
CREATE UNIQUE CLUSTERED INDEX [Idx1] ON #runTotal(runTotalID )
Declare @flag AS int=0
DECLARE @Counter As INT =1
declare @runningTotal as float
declare @anchor int
WHILE @Flag=0 BEGIN
IF (select COUNT(1) from #table where runningCount=0)<>0
BEGIN
TRUNCATE Table #runTotal
INSERT INTO #runTotal
SELECT tableID, FieldToSum, 0, 0 FROM #table where runningCount=0 ORDER BY orderCol ASC
set @runningTotal = 0
update #runTotal
set @runningTotal = runningTotal = @runningtotal + FieldToSum,
@anchor = runtotalid
from #runTotal WITH (TABLOCKX) OPTION (MAXDOP 1)
UPDATE #table
SET runningCount=@Counter from #table a inner join #runTotal b on a.tableID = b.tableID where b.runningTotal <= 25
END
ELSE
SET @FLAG=1
IF (select COUNT(1) from #table where runningCount=0)<>0
BEGIN
TRUNCATE Table #runTotal
INSERT INTO #runTotal
SELECT tableID, FieldToSum, 0, 0 FROM #table where runningCount=0 ORDER BY orderCol DESC
set @runningTotal = 0
update #runTotal
set @runningTotal = runningTotal = @runningtotal + FieldToSum,
@anchor = runtotalid
from #runTotal WITH (TABLOCKX) OPTION (MAXDOP 1)
UPDATE #table
SET runningCount=@Counter from #table a inner join #runTotal b on a.tableID = b.tableID where b.runningTotal <= 25
END
ELSE
SET @FLAG=1
SET @Counter=@Counter+1
END
September 25, 2012 at 4:45 am
This process uses the 'Quirky Update' without the essential requirements being in place. As such it is unreliable & will almost certainly give incorrect results in the future (or may have already done so).
Quirky Update is where variable & column are updated together:
update #runTotal
set @runningTotal = runningTotal = @runningtotal + FieldToSum
See this:
http://www.sqlservercentral.com/articles/T-SQL/68467/
for more details - the list of requirements for Quirky Update are towards the end.
It may seem a bit complicated - but unfortunately it is! 🙂
September 25, 2012 at 7:00 pm
I appreciate the help with my broken quirky update. I believed ive fixed that and edited my first post to reflect the new code.
Is there anything else anyone can recommend that could help ?
September 25, 2012 at 10:07 pm
You could try the following
I'm sure that this can be improved though and I don't know if it will scale very well as it is recursive.
IF OBJECT_ID('TempDB..#table','U') IS NOT NULL
DROP TABLE #table
CREATE TABLE #table
(
tableID int NOT NULL,
runningCount int not null,
orderCol int,
FieldToSum float
)
IF OBJECT_ID('TempDB..#table2','U') IS NOT NULL
DROP TABLE #table2
CREATE TABLE #table2
(
tableID int NOT NULL,
runningCount int not null,
orderCol int,
FieldToSum float
)
insert #table
select1, 0, 1, 4.733415944 union all
select2, 0, 8, 4.733415944 union all
select3, 0, 6, 4.733415944 union all
select4, 0, 19, 4.733415944 union all
select5, 0, 14, 6.950871733 union all
select6, 0, 3, 6.942781406 union all
select7, 0, 2, 6.956403557 union all
select8, 0, 20, 6.945977457 union all
select9, 0, 11, 6.956403557 union all
select10, 0, 13, 6.956403557 union all
select11, 0, 4, 6.945977457 union all
select12, 0, 10, 6.956403557 union all
select13, 0, 17, 6.945977457 union all
select14, 0, 12, 6.956403557 union all
select15, 0, 15, 6.968313001 union all
select16, 0, 5, 6.968313001 union all
select17, 0, 7, 7.024537923 union all
select18, 0, 18, 7.024537923 union all
select19, 0, 16, 6.968313001 union all
select20, 0, 9, 6.968313001
CREATE UNIQUE INDEX [Idx1] ON #table(orderCol)
;WITH
maxordercol as (SELECT MAX(orderCol) endorder FROM #table),
a AS (
SELECT TableID, orderCol, FieldToSum,
FieldToSum RunTotal,
1 runningCount,
'A' direction
FROM #table
WHERE orderCol = 1
UNION ALL
SELECT t.TableID, t.orderCol, t.FieldToSum,
case when t.FieldToSum + a.runTotal <= 25 then
t.FieldToSum + a.runTotal
else
t.FieldToSum
end RunTotal,
case when t.FieldToSum + a.runTotal <= 25 then
a.runningCount
else
a.runningCount + 1
end runningCount,
'A' direction
FROM a
inner join #table t on t.orderCol = a.orderCol + 1
),
b as (
SELECT TableID, orderCol, FieldToSum,
FieldToSum RunTotal,
1 runningCount,
'D' direction
FROM #table
WHERE orderCol = (select endorder from maxordercol)
UNION ALL
SELECT t.TableID, t.orderCol, t.FieldToSum,
case when t.FieldToSum + b.runTotal <= 25 then
t.FieldToSum + b.runTotal
else
t.FieldToSum
end RunTotal,
case when t.FieldToSum + b.runTotal <= 25 then
b.runningCount
else
b.runningCount + 1
end runningCount,
'D' direction
FROM b
inner join #table t on t.orderCol = b.orderCol - 1
),
c AS (
select tableid, runningCount, ordercol, fieldtosum
from (
select *, row_number() OVER (partition by tableid order by runningCount) seq
from (select * from a union all select * from b) x
) y
where y.seq = 1
)
INSERT INTO #table2
SELECT *
FROM c
OPTION (MAXRECURSION 0)
UPDATE #table
SET runningCount = b.runningCount
FROM #table a INNER JOIN #table2 b ON a.tableid = b.tableid
I tried it over 1200 records as well and the performance was 40ms vs 4200ms
September 25, 2012 at 10:16 pm
Thanks !!
I will give it a go and let you know how it performs over our dataset.
September 26, 2012 at 3:30 am
I will try and explain a little better. Im trying to mark each row based on having a something_runtot up to or equal to 25 - it cant go over this number and i am required to use the marked rows elsewhere (this is where runningCount comes in).
Tables are NOT magnetic tapes than can be run forward and backwards; what dose that mean?
My terminology was a little off in places 🙂
It means I need something_runtot up to or equal to 50 and the current business logic requires half from ascending and half from descending. So i need to get a running total up to or equal to 25, order by orderCol ascending then mark the records and i need to do the same for order by orderCol descending.
So in my current code the first time through the loop it will give the value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol asc and a value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol desc. This needs to be repeated until no more rows are left and no rows are allowed to be used more than once.
Do you have a true sequence number for the something_values you want to total?
Yes i do but when i insert the values i use the identity column to hold the ordering. (Im also not too sure why IDENTITY shouldnt be used).
Can you keep changing it?
I can change it as much as necessary to get better performance, however id prefer not to change from a float currently as thats what the base table is using.
I will give your code a go and see what happens - im in a totally different timezone so it will be a while before i can test it.
As always if my explanations dont make sense please let me know and i will try to rewrite it.
Thanks for the help.
September 26, 2012 at 3:47 am
matak (9/26/2012)
I will try and explain a little better. Im trying to mark each row based on having a something_runtot up to or equal to 25 - it cant go over this number and i am required to use the marked rows elsewhere (this is where runningCount comes in).
Tables are NOT magnetic tapes than can be run forward and backwards; what dose that mean?
My terminology was a little off in places 🙂
It means I need something_runtot up to or equal to 50 and the current business logic requires half from ascending and half from descending. So i need to get a running total up to or equal to 25, order by orderCol ascending then mark the records and i need to do the same for order by orderCol descending.
So in my current code the first time through the loop it will give the value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol asc and a value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol desc. This needs to be repeated until no more rows are left and no rows are allowed to be used more than once.
Do you have a true sequence number for the something_values you want to total?
Yes i do but when i insert the values i use the identity column to hold the ordering. (Im also not too sure why IDENTITY shouldnt be used).
Can you keep changing it?
I can change it as much as necessary to get better performance, however id prefer not to change from a float currently as thats what the base table is using.
I will give your code a go and see what happens - im in a totally different timezone so it will be a while before i can test it.
As always if my explanations dont make sense please let me know and i will try to rewrite it.
Thanks for the help.
Be aware that the SUM.. OVER(ORDER BY ...) sub-clause requires SQL Server 2012
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply