Cumulative subtraction

Question

Cumulative subtraction

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply

drew.allen SSC Guru Points: 76985 More actions · Answer 1

Jacob Wilkins - Friday, November 17, 2017 9:42 AM
Seems to be an odd recursive thing, where for each row you compute a SubtractionResult, with SubtractionResult=ColumnValue-(SubtractionResult from previous row, or 0 if no previous row).
Fairly simple to implement in whatever is the most appropriate recursive solution here (likely Quirky Update will be most efficient, with all its usual caveats).
Another way of saying the same thing is that you're computing these terms for each row, with RowN standing in for the column value in question for the row N:
Row1-0, or simply Row1
Row2-Row1
Row3-(Row2-Row1), or Row3-Row2+Row1
Row4-(Row3-(Row2-Row1)), or Row4-Row3+Row2-Row1
...
So, the other way you could do it with the allowance of ORDER BY in windowed aggregate functions in 2012 is to do a SUM of the column value in question for all preceding rows inclusive of the current row, switching the signs of rows with a different (row number in specified order)%2 value than the current row (note that pattern in the expanded forms of each expression above).
Of course, you have to have something that defines the order of each row; the OP will have to adapt to whatever defines order for his data (I've just used ascending numerical order as his sample seems to use).
Something like this:
IF OBJECT_ID ('tempdb.dbo.#somenumbers') IS NOT NULL DROP TABLE #somenumbers;
CREATE TABLE #somenumbers (somenumber INT);
INSERT INTO #somenumbers VALUES (10), (20), (30), (40);
WITH numbered AS ( SELECT rn=ROW_NUMBER() OVER (ORDER BY somenumber ASC), somenumber FROM #somenumbers ), odds_and_evens AS ( SELECT *, odds=CASE WHEN rn%2=0 THEN -1*somenumber ELSE somenumber END, evens=CASE WHEN rn%2=1 THEN -1*somenumber ELSE somenumber END FROM numbered )
SELECT somenumber ,cumulative_subtraction =CASE WHEN rn%2=0 THEN SUM(evens) OVER (ORDER BY rn ASC) ELSE SUM(odds) OVER (ORDER BY rn ASC) END FROM odds_and_evens ORDER BY rn ASC;
EDIT: Tidied up some word choices.

I used a similar approach, but it seems to require fewer scans and logical reads. I also added a sequence number (identity) to use for sorting.

; WITH Mults AS ( SELECT *, 2 * (ROW_NUMBER() OVER(ORDER BY Seq) % 2) - 1 AS mult FROM @t )

SELECT *, mult * SUM(Col1 * mult) OVER(ORDER BY Seq ROWS UNBOUNDED PRECEDING) FROM Mults ;

Here are the results:
/* Drew */ Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table '#B626E99C'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

/* Jacob */ Table 'Worktable'. Scan count 5, logical reads 25, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table '#somenumbers________________________________________________________________________________________________________00000001AA97'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

As you can see, my solution requires no scans or logical reads of the worktable (CTE), but Jacob's requires 5 scans and 25 logical reads.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

drew.allen SSC Guru Points: 76985 More actions · Answer 2

drew.allen - Tuesday, November 21, 2017 9:05 AM
Jacob Wilkins - Friday, November 17, 2017 9:42 AM
Seems to be an odd recursive thing, where for each row you compute a SubtractionResult, with SubtractionResult=ColumnValue-(SubtractionResult from previous row, or 0 if no previous row).
Fairly simple to implement in whatever is the most appropriate recursive solution here (likely Quirky Update will be most efficient, with all its usual caveats).
Another way of saying the same thing is that you're computing these terms for each row, with RowN standing in for the column value in question for the row N:
Row1-0, or simply Row1
Row2-Row1
Row3-(Row2-Row1), or Row3-Row2+Row1
Row4-(Row3-(Row2-Row1)), or Row4-Row3+Row2-Row1
...
So, the other way you could do it with the allowance of ORDER BY in windowed aggregate functions in 2012 is to do a SUM of the column value in question for all preceding rows inclusive of the current row, switching the signs of rows with a different (row number in specified order)%2 value than the current row (note that pattern in the expanded forms of each expression above).
Of course, you have to have something that defines the order of each row; the OP will have to adapt to whatever defines order for his data (I've just used ascending numerical order as his sample seems to use).
Something like this:
IF OBJECT_ID ('tempdb.dbo.#somenumbers') IS NOT NULL DROP TABLE #somenumbers;
CREATE TABLE #somenumbers (somenumber INT);
INSERT INTO #somenumbers VALUES (10), (20), (30), (40);
WITH numbered AS ( SELECT rn=ROW_NUMBER() OVER (ORDER BY somenumber ASC), somenumber FROM #somenumbers ), odds_and_evens AS ( SELECT *, odds=CASE WHEN rn%2=0 THEN -1*somenumber ELSE somenumber END, evens=CASE WHEN rn%2=1 THEN -1*somenumber ELSE somenumber END FROM numbered )
SELECT somenumber ,cumulative_subtraction =CASE WHEN rn%2=0 THEN SUM(evens) OVER (ORDER BY rn ASC) ELSE SUM(odds) OVER (ORDER BY rn ASC) END FROM odds_and_evens ORDER BY rn ASC;
EDIT: Tidied up some word choices.
I used a similar approach, but it seems to require fewer scans and logical reads. I also added a sequence number (identity) to use for sorting.
; WITH Mults AS ( SELECT *, 2 * (ROW_NUMBER() OVER(ORDER BY Seq) % 2) - 1 AS mult FROM @t )
SELECT *, mult * SUM(Col1 * mult) OVER(ORDER BY Seq ROWS UNBOUNDED PRECEDING) FROM Mults ;
Here are the results:
/* Drew */ Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table '#B626E99C'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
/* Jacob */ Table 'Worktable'. Scan count 5, logical reads 25, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table '#somenumbers________________________________________________________________________________________________________00000001AA97'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

As you can see, my solution requires no scans or logical reads of the worktable (CTE), but Jacob's requires 5 scans and 25 logical reads.
Drew

After reviewing, I noticed that Jacob's solution did not specify a frame, so it used the default RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which is known to perform worse than specifying ROWS.... When I specified ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW to his windowed functions, the scans and logic reads went to 0.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Jacob Wilkins One Orange Chip Points: 27976 More actions · Answer 3

Indeed. Changing the frame to ROWS UNBOUNDED PRECEDING is responsible for that difference (instead of just letting it use the default RANGE like I did).

If my window functions are changed to use the same frame then the IO is the same between the two queries.

I still prefer yours for its brevity.

Cheers!

EDIT: Heh, my usual weakness strikes, starting a post and wandering off to do other things. Drew already caught all this. I really ,really need to get into that habit of checking the page again before I actually post 🙂

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 4

pity the OP @selpoivre seems to have left the building !

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

Luis Cazares SSC Guru Points: 183706 More actions · Answer 5

The solutions are really nice, but I just want to add something to the mix. This is a simple approach, but should be handled with care (with great power comes great responsibility). Read the article that describes it: Solving the Running Total and Ordinal Rank Problems (Rewritten) - SQLServerCentral

USE Test

CREATE TABLE #Sample( id int PRIMARY KEY, Col1 int) INSERT INTO #Sample VALUES(1,10), (2,20), (3,30), (4,40);

DECLARE @id int, @Col1 int = 0;

UPDATE #Sample WITH (TABLOCKX) SET @Col1 = Col1 = Col1 - @Col1, @ID = ID OPTION ( MAXDOP 1);

SELECT * FROM #Sample

GO DROP TABLE #Sample;

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2