Cumulative sum of previous rows

I was working on trying to improve performance of an update statement at work. It was taking three minutes to run. I noticed an interesting sub query going on in the SQL. The purpose of the code was to net out inventory against demand quantity. I spent a lot of time looking at the code trying to figure out a way I could speed it up.

Finally, I started to look at the over( partition by function in SQL. I found something new in SQL Server 2012. You can now add an order by to the over( partition by and it will do a cumulative sum of the previous rows. I had used the over( partition by with a row_number before, but not to do a cumulative sum over previous rows. When I implemented this new code my three minute update changed to a six second update.

Here is an example. First we create three tables.

CREATE TABLE Inventory
    (
      inventoryID INT IDENTITY(1, 1)
                      PRIMARY KEY
                      NOT NULL
    , itemID INT NOT NULL
    , inventoryQty NUMERIC(18, 6) NOT NULL
    )
CREATE TABLE Items
    (
      itemID INT IDENTITY(1, 1)
                 PRIMARY KEY
                 NOT NULL
    , itemName VARCHAR(50) NOT NULL
    )
CREATE TABLE Demand
    (
      demandID INT IDENTITY(1, 1)
                   PRIMARY KEY
                   NOT NULL
    , itemID INT NOT NULL
    , orderNbr VARCHAR(50) NOT NULL
    , orderQty NUMERIC(18, 6) NOT NULL
    , netInventoryQty NUMERIC(18, 6) NULL
    )

Now we will insert some data:

INSERT INTO Items
    VALUES ( 'Blue Chair' )
INSERT INTO Items
    VALUES ( 'Red Chair' )
INSERT INTO Items
    VALUES ( 'White Chair' )
INSERT INTO Inventory
    VALUES ( 1, 5 )
 --5 Blue Chairs in Inventory
INSERT INTO Inventory
    VALUES ( 2, 3 )
 --3 Red Chairs
INSERT INTO Inventory
    VALUES ( 3, 9 )
 --9 White Chairs
INSERT INTO Demand
    VALUES ( 1, '1001', 1, NULL )
 --order for 1 blue chair
INSERT INTO Demand
    VALUES ( 2, '1002', 1, NULL )
 --order for 1 red Chair
INSERT INTO Demand
    VALUES ( 1, '1003', 2, NULL )
 --order for 2 blue chairs
INSERT INTO Demand
    VALUES ( 3, '1004', 1, NULL )
 --order for 1 white chair
INSERT INTO Demand
    VALUES ( 2, '1005', 3, NULL )
 --order for 3 red chairs
INSERT INTO Demand
    VALUES ( 1, '1006', 4, NULL )
 --order for 4 blue chairs
INSERT INTO Demand
    VALUES ( 2, '1007', 2, NULL )
 --order for 2 red chairs
INSERT INTO Demand
    VALUES ( 3, '1008', 2, NULL )
 --order for 2 white chairs
INSERT INTO Demand
    VALUES ( 1, '1009', 1, NULL ) 
--order for 1 blue chair

Let’s see what we have for results so far:

Next let’s take a look at the update:

UPDATE d
    SET netInventoryQty = i.inventoryQty
        - ( SELECT SUM(orderQty)
                FROM demand d2
                WHERE d2.itemid = d.itemID
                    AND d2.demandid <= d.demandid
          )
    FROM demand d
        JOIN inventory i
        ON d.itemID = i.itemid

Notice that the sub query is summing up previous demand order quantity that have a demandID <= the current demandID. This sub query is trying to do a cumulative sum of the orderQty. When we subtract the cumulative orderQty from the current InventoryQty we can figure out when we will run out of Inventory.

Here are the results of the update:

When the netInventoryQty goes negative that tells us on what demand order we are going to run out of inventory.

Now here is the query that does the cumulative sum over previous rows using the over( partition by order by function.

SELECT demandID
    ,   itemID
    ,   orderQty
    ,   SUM(orderQty) OVER ( PARTITION BY itemID ORDER BY demandid ) AS 'cumQty'
    FROM demand

This gives these results:

You can see how the cumQty column is a cumulative sum of the orderQty for the previous demand orders of the same itemID.

So now I changed my update to use this new query in a join instead of a sub query.

UPDATE d
    SET netInventoryQty = i.inventoryQty - t.cumqty 
--(select sum(orderQty from demand d2 where d2.itemid = d.itemID and d2.demandid <= d.demandid)
    FROM demand d
        JOIN inventory i
        ON d.itemID = i.itemid
        JOIN ( SELECT demandID
                ,   itemID
                ,   SUM(orderQty) OVER ( PARTITION BY itemID ORDER BY demandid ) AS 'cumQty'
                FROM demand
             ) t
        ON t.demandID = d.demandID

We get the same results:

I hope this example might help you if you run across a need to sum up cumulative values. It was the solution I needed to fix some poor performing code and get it to run a lot faster.

Cumulative sum of previous rows

Rate

Share

Categories

Share

Rate

Cumulative sum of previous rows

Rate

Share

Categories

Share

Rate

Related content

SUM()

Convert Rows into Columns

Calculating the Median Value within a Partitioned Set Using T-SQL

Sum and Count

Demystifying the use of CASE in an ORDER BY statement