May 7, 2014 at 10:51 pm
Comments posted to this topic are about the item Cumulative sum of previous rows
May 8, 2014 at 1:49 am
Nice article, Ben - does exactly what it says on the tin, although many folks lurking around here would be more familiar with the name Running Total.
The old version of your update is what we call a "Triangular Join". If the partitions are small and hence you're only summing up a few rows it can be fast enough for production code. There are other ways of performing a running totals update including cursor, recursive CTE and the so-called "Quirky Update", and these have been exhaustively researched and compared by Jeff Moden et al in this[/url] and other articles.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 8, 2014 at 4:43 am
Hey Ben,
Speaking of SQL 2012 and using "PARTITION OVER" have you tried specifying the rows relative to the current one and avoiding the UPDATE statement completely?
This will return the same results as at the end of your article.
eg)
SELECT D.demandID
,D.itemID
,D.orderNbr
,D.orderQty
,I.inventoryQty-SUM(D.orderQty) OVER (PARTITION BY D.itemID ORDER BY D.orderNbr RANGE UNBOUNDED PRECEDING) as netInventoryQty
FROM Demand D
INNER JOIN Inventory I
ON I.itemID=D.itemID
ORDER BY D.demandID
"PARTITION BY ItemID " will group them by item number and "ORDER BY orderNbr" will also put them in the correct order for the aggregation.
"RANGE UNBOUNDED PRECEDING" grabs all the preceding rows, grouped by itemID ordered by orderNbr.
I just found out about this new relative reference clause for OVER myself, very neat.
Simon 🙂
May 8, 2014 at 5:11 am
Thanks Chris for your post.
Ben
May 8, 2014 at 5:15 am
I will have to take a closer look at your code. Thanks for posting it.
Ben
May 8, 2014 at 5:59 am
I agree with SimonC, the RANGE clause is the better solution to the Running Total problem for read operations.
May 8, 2014 at 9:05 am
Here's a version using a CTE for an easy-to-read implimenation:
;with d as (
select netInventoryQty, demandID, d.itemID, i.inventoryQty,
SUM(orderQty) OVER ( PARTITION BY d.itemID ORDER BY demandid ) AS 'cumQty'
from demand d
join inventory i on d.itemID = i.itemID
)
update d set netInventoryQty = d.inventoryQty - d.cumqty
May 8, 2014 at 9:06 am
Good article.. Thanks!..
Here is how I would have solve the problem.
UPDATE a
SET a.netInventoryQty=c.inventoryQty-b.cQty
--SELECT *
FROM demand a
INNER JOIN (
select a.demandID, sum(b.orderQty) as cQty
FROM demand a
INNER JOIN demand b ON b.demandID<=a.demandID AND b.itemID=a.itemID
GROUP BY a.demandID
) b on b.demandID=a.demandID
INNER JOIN Inventory c ON c.itemID=a.itemID
I'll have to check out if your method is faster.
Thanks again
May 8, 2014 at 9:08 am
Not sure if you completely read my article, but the solution you posted is what I replaced to speed up the update.
Thanks for your post.
Ben
May 8, 2014 at 9:44 am
I did read your article and my solution is similar but there are differences - mainly that I'm joining to sub query that is grouping - where you originally are using nested query for summing - correct??.. Before I read your article I wouldn't have even thought to have used OVER ( PARTITION BY... -- As it is pretty new to me. 🙂
Thanks
May 8, 2014 at 9:51 am
Good article.
I'm not sure what you meant by this though
I found something new in SQL Server 2012. You can now add an order by to the over( partition by and it will do a cumulative sum of the previous rows.
The ORDER BY clause has been part of the OVER statement since 2005. Or do you mean that previously it wouldn't result in a cumulative sum?
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
May 8, 2014 at 9:52 am
rwatkins 25267 (5/8/2014)
I agree with SimonC, the RANGE clause is the better solution to the Running Total problem for read operations.
I disagree. For running aggregations (including totals), you need to use ROW, not RANGE. RANGE works by the value of the column, and where the value is the same, it shows the same total for both rows. Here's a quick example:
DECLARE @test-2 TABLE (
RowID INT IDENTITY,
FName varchar(20),
Salary smallint);
INSERT INTO @test-2 (FName, Salary)
VALUES ('George', 800),
('Sam', 950),
('Diane', 1100),
('Nicholas', 1250),
('Samuel', 1250),
('Patricia', 1300),
('Brian', 1500),
('Thomas', 1600),
('Fran', 2450),
('Debbie', 2850),
('Mark', 2975),
('James', 3000),
('Cynthia', 3000),
('Christopher', 5000);
SELECT RowID,
FName,
Salary,
SumByRows = SUM(Salary) OVER (ORDER BY Salary
ROWS UNBOUNDED PRECEDING),
SumByRange = SUM(Salary) OVER (ORDER BY Salary
RANGE UNBOUNDED PRECEDING)
FROM @test-2;
Which gives these results:
RowID FName Salary SumByRows SumByRange
----------- -------------------- ------ ----------- -----------
1 George 800 800 800
2 Sam 950 1750 1750
3 Diane 1100 2850 2850
4 Nicholas 1250 4100 5350
5 Samuel 1250 5350 5350
6 Patricia 1300 6650 6650
7 Brian 1500 8150 8150
8 Thomas 1600 9750 9750
9 Fran 2450 12200 12200
10 Debbie 2850 15050 15050
11 Mark 2975 18025 18025
12 James 3000 21025 24025
13 Cynthia 3000 24025 24025
14 Christopher 5000 29025 29025
Note the results for Nicholas/Samuel (RowID 4/5) and James/Cynthia (RowID 12/13). The use of RANGE does not give the proper running total for the first record returned for each pair.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 8, 2014 at 10:01 am
Stefan Krzywicki (5/8/2014)
Good article.I'm not sure what you meant by this though
I found something new in SQL Server 2012. You can now add an order by to the over( partition by and it will do a cumulative sum of the previous rows.
The ORDER BY clause has been part of the OVER statement since 2005. Or do you mean that previously it wouldn't result in a cumulative sum?
Prior to 2012, when using the OVER clause with an aggregate function, you could only specify the PARTITION BY clause, so every row within the partition would have the same value. With 2012, you can now specify the ORDER BY and ROW/RANGE clause. If you use the ORDER BY, you are using the default ROW/RANGE clause, which is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (from the first row in the partition to the current row). You can see more about the changes in my article[/url], or my blog post comparing running totals[/url].
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 8, 2014 at 10:02 am
Good point, Wayne. I had forgotten that there is a definite difference between ROWS and RANGE, despite the similar syntax.
May 8, 2014 at 10:07 am
Stefan Krzywicki (5/8/2014)
Good article.I'm not sure what you meant by this though
I found something new in SQL Server 2012. You can now add an order by to the over( partition by and it will do a cumulative sum of the previous rows.
The ORDER BY clause has been part of the OVER statement since 2005. Or do you mean that previously it wouldn't result in a cumulative sum?
Stefan,
If you create the tables in my article in a sql server 2008 R2 or lower version of sql server and try to run the select:
SELECT demandID
, itemID
, orderQty
, SUM(orderQty) OVER ( PARTITION BY itemID ORDER BY demandid ) AS 'cumQty'
FROM demand
You will get an error "Incorrect syntax near 'order'
So adding the order by with the partition by is something new you can only do in sql server 2012.
Thanks,
Ben
Viewing 15 posts - 1 through 15 (of 26 total)
You must be logged in to reply to this topic. Login to reply