Blog Post

Comparing an Old Running Total to Window Functions

,

Often I see running totals that are written in SQL using a variety of techniques. Many pieces of code were written in pre-2012 techniques, prior to window functions being introduced.

After SQL Server 2012, we had better ways to write a total. In this case, let’s see how much better. This is based on an article showing how you might convert code from the first query to the second. This is a performance analysis of the two techniques are different scales..

Pre SQL Server 2012

The old way:

SELECT Acc.ID,CONVERT(varchar(50),TransactionDate,101) AS TransactionDate
  , Balance, isnull(RunningTotal,'') AS RunningTotal 
 FROM Accounts Acc  
   LEFT OUTER JOIN (SELECT ID,sum(Balance) AS RunningTotal 
                    FROM (SELECT A.ID AS ID,B.ID AS BID, B.Balance 
                           FROM Accounts A 
                             cross JOIN Accounts B 
                           WHERE B.ID BETWEEN A.ID-4 
                           AND A.ID AND A.ID>4 
                          )T
                    GROUP BY ID ) Bal 
     ON Acc.ID=Bal.ID

What were the statistics on this? After running a few times, with STATISTICS IO ON, I get this:

Table ‘Accounts’. Scan count 37, logical reads 37, physical reads 0

Not bad. I’ve truncated out the other values as they were all 0.

Window Functions

Here is the same query written with a Window function.

SELECT

id

, TransactionDate

, Balance

, CASE WHEN LAG(TransactionDate, 4, null) OVER (ORDER BY TransactionDate) IS NOT NULL

THEN SUM (Balance) OVER (ORDER BY TransactionDate ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)

ELSE 0

END AS runningotal

FROM dbo.accounts

The statistics?

Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0

Table ‘Accounts’. Scan count 1, logical reads 1, physical reads 0

The window function definitely does less work. A lot less. But how does this scale?

Performance Testing

There are numerous ways to create some test data for this. Since I have Redgate SQL Data Generator, I decided to use that. It’s simple and easy, and I added 100,000 rows first.

2024-09_0143

My results of the first query:

2024-09_0144

Lots of reads and scans. Let’s compare this to the window function.

2024-09_0145

Hmmm, both took essentially zero time less than a second. That might lead some developers to think either method is quick enough.

Let’s add 1mm more rows.

2024-09_0146

Now compare. The first takes about 15s with these results.

2024-09_0147

The window function? 3 sec, with these stats.

2024-09_0148

The comparison looks like this. First, let’s look at SSMS time

 

Old, Cross JoinWindow Function
20 rows0 sec0 sec
100,020 rows0 sec0 sec
1,100,020 rows15 sec3 sec

If we look at CPU time, then we see this:

 

Old, Cross JoinWindow Function
20 rows0 ms0 ms
100,020 rows748 ms250 ms
1,100,020 rows5845 ms2919 ms

If we look at the logical reads in total, we see this

 

Old, Cross JoinWindow Function
20 rows371
100,020 rows403,998359
1,100,020 rows6,450,8953945

Clearly the window function is better and the better grows as the size of data grows.

Summary

This post looks at two queries and compares the performance across a few queries. These aren’t the only ones, and you might choose other types of queries, but these are both examples of how you might approach a problem using old tech and new tech.

The window function is not slightly more efficient, but extremely efficient compared to the older style method of using a cross join. As the data scales up, the difference is pronounced. While 1mm rows might not be a great test here, and you may prefer to test at 10mm or 100mm rows to get an idea of load, the fact is the Window function is much quicker and uses less resources.

If you are using older style code to perform T-SQL calculations, make some time to refactor that code (and test it) to use modern window functions.

Setup Code

Here’s the initial setup code:

CREATE TABLE Accounts
(
ID int IDENTITY(1,1),
TransactionDate datetime,
Balance float
)
go
insert into Accounts(TransactionDate,Balance) values ('1/1/2000',100)
insert into Accounts(TransactionDate,Balance) values ('1/2/2000',101)
insert into Accounts(TransactionDate,Balance) values ('1/3/2000',102)
insert into Accounts(TransactionDate,Balance) values ('1/4/2000',103)
insert into Accounts(TransactionDate,Balance) values ('1/5/2000',104)
insert into Accounts(TransactionDate,Balance) values ('1/6/2000',105)
insert into Accounts(TransactionDate,Balance) values ('1/7/2000',106)
insert into Accounts(TransactionDate,Balance) values ('1/8/2000',107)
insert into Accounts(TransactionDate,Balance) values ('1/9/2000',108)
insert into Accounts(TransactionDate,Balance) values ('1/10/2000',109)
insert into Accounts(TransactionDate,Balance) values ('1/11/2000',200)
insert into Accounts(TransactionDate,Balance) values ('1/12/2000',201)
insert into Accounts(TransactionDate,Balance) values ('1/13/2000',202)
insert into Accounts(TransactionDate,Balance) values ('1/14/2000',203)
insert into Accounts(TransactionDate,Balance) values ('1/15/2000',204)
insert into Accounts(TransactionDate,Balance) values ('1/16/2000',205)
insert into Accounts(TransactionDate,Balance) values ('1/17/2000',206)
insert into Accounts(TransactionDate,Balance) values ('1/18/2000',207)
insert into Accounts(TransactionDate,Balance) values ('1/19/2000',208)
insert into Accounts(TransactionDate,Balance) values ('1/20/2000',209)
go

Original post (opens in new tab)
View comments in original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating