How to improve this query

Question

Post reply

How to improve this query

Emil B

SSCertifiable

Points: 5568
More actions
September 4, 2015 at 3:11 am

#311398

Hi I have a temp table with 146525 records:
CREATE TABLE #temp
(
ID int,
amount bigint,
Type int,
)
and a query
SELECT T.ID, ISNULL(SUM(TT.amount),0)
FROM #temp T
LEFT JOIN #temp TT ON TT.ID < T.ID AND TT.Type = 1
WHERE T.Type=1
GROUP BY T.ID
And i need it to perform faster. I did try to rewrite it in different ways, tried indexing temp table but still not fast enough, It was running for 5 minutes and i stopped it. Any tips how this can be done better?

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

spaghettidba SSC Guru Points: 105732 More actions · Answer 1

IF your IDs are unique, this should do:

SELECT T.ID, SUM(T.amount) OVER (ORDER BY T.ID) - T.amount

FROM #temp AS T

-- Gianluca Sartori

spaghettidba SSC Guru Points: 105732 More actions · Answer 2

If your IDs are not unique, this could work:

SELECT T.ID, SUM(T.amount) OVER (ORDER BY T.ID) - T.amount

FROM (

SELECT ID, SUM(amount) AS amount

FROM #temp

WHERE Type = 1

GROUP BY ID

) AS T

Also, in case you have duplicates, I don't think that your original code does what you're expecting, as it could be summing rows multiple times.

Can you please explain what you're after in more detail?

See http://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/

-- Gianluca Sartori

Emil B SSCertifiable Points: 5568 More actions · Answer 3

Yes they are unique.

Anyways, THANKS ! Worked like a charm in 2 seconds.

I owe you a beer.

spaghettidba SSC Guru Points: 105732 More actions · Answer 4

Great! Just to clarify, the code takes advantage of the OVER clause to perform a running total.

You can read more about it here: http://www.sqlservercentral.com/articles/T-SQL/75466/

-- Gianluca Sartori

Emil B SSCertifiable Points: 5568 More actions · Answer 5

Thanks! I know how OVER works, used it loads of times before, but i don't know why I didn't think of it in this case. It must be Friday :hehe:

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 6

another idea

SELECT

t.ID

, SUM(t.AMOUNT) OVER(ORDER BY t.ID

ROWS BETWEEN UNBOUNDED PRECEDING

AND CURRENT ROW) - T.AMOUNT AS rsum

FROM #temp AS t

WHERE(t.TYPE = 1);

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

Emil B SSCertifiable Points: 5568 More actions · Answer 7

Great J! that made it even faster! Never used that in Over clause

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

Learned something new today. Do you have some nice reading about it?

This Friday is getting better and better!

Emil B SSCertifiable Points: 5568 More actions · Answer 8

Just to let you know the execution time of the stored procedure that was using this part of code reduced from over 13 minutes to 3 seconds

spaghettidba SSC Guru Points: 105732 More actions · Answer 9

Good to hear!

Please note that my form of the query and J Livingston's are the same: the default for the ORDER BY clause in OVER is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

-- Gianluca Sartori

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 10

a test for comparisons on 1M rows

see http://blog.waynesheffield.com/wayne/archive/2011/08/running-totals-in-denali-ctp3/

SELECT TOP 1000000

ID = IDENTITY(INT, 1, 1),

Amount = CAST(Rand(Checksum(Newid())) * 99 + 1 AS BIGINT),

Type = (Abs(Checksum(Newid())) % 3 + 1)

INTO #temp

FROM sys.all_columns ac1

CROSS JOIN sys.all_columns ac2

CROSS JOIN sys.all_columns ac3

SET STATISTICS TIME ON;

SELECT

T.ID

, SUM(T.amount) OVER (ORDER BY T.ID) - T.amount AS rsum

INTO

#jlswaste

FROM #temp AS T

WHERE(Type = 1);

SET STATISTICS TIME OFF;

DROP TABLE #jlswaste;

SET STATISTICS TIME ON;

SELECT

t.ID

, SUM(t.AMOUNT) OVER(ORDER BY t.ID

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

) - T.AMOUNT AS rsum

INTO

#jlswaste

FROM #temp AS t

WHERE(t.TYPE = 1);

SET STATISTICS TIME OFF;

DROP TABLE #jlswaste;

SET STATISTICS TIME ON;

SELECT

t.ID

, SUM(t.AMOUNT) OVER(ORDER BY t.ID

ROWS BETWEEN UNBOUNDED PRECEDING

AND 1 PRECEDING

) AS rsum

INTO

#jlswaste

FROM #temp AS t

WHERE(t.TYPE = 1);

SET STATISTICS TIME OFF;

DROP TABLE #jlswaste;

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

drew.allen SSC Guru Points: 76990 More actions · Answer 11

spaghettidba (9/4/2015)
Good to hear!
Please note that my form of the query and J Livingston's are the same: the default for the ORDER BY clause in OVER is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

Actually. the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Using ROWS performs better than the default RANGE. You can improve this even further, because you're including your current row in the sum and then subtracting the value. Just don't include it in your sum.

SELECT

t.ID

, SUM(t.AMOUNT) OVER(ORDER BY t.ID

ROWS BETWEEN UNBOUNDED PRECEDING

AND 1 PRECEDING) AS rsum

FROM #temp AS t

WHERE(t.TYPE = 1);

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

spaghettidba SSC Guru Points: 105732 More actions · Answer 12

drew.allen (9/4/2015)
spaghettidba (9/4/2015)
Good to hear!
Please note that my form of the query and J Livingston's are the same: the default for the ORDER BY clause in OVER is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
Actually. the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Using ROWS performs better than the default RANGE. You can improve this even further, because you're including your current row in the sum and then subtracting the value. Just don't include it in your sum.
SELECT
t.ID
, SUM(t.AMOUNT) OVER(ORDER BY t.ID
ROWS BETWEEN UNBOUNDED PRECEDING
AND 1 PRECEDING) AS rsum
FROM #temp AS t
WHERE(t.TYPE = 1);
Drew

Thanks for the correction Drew! I was completely overlooking that.

-- Gianluca Sartori

Emil B SSCertifiable Points: 5568 More actions · Answer 13

was this ROWS/RANGE "hint" available in 2008/R2?

also the default behaviour of over without ROWS BETWEEN UNBOUNDED PRECEDING caused a large number of reads on Worktable, after adding it it went to 0.

With ROWS BETWEEN UNBOUNDED PRECEDING

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table '#temp______________________________________________________________________________________________________________00000002A387'. Scan count 1, logical reads 673, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Without

Table 'Worktable'. Scan count 122381, logical reads 737031, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table '#temp______________________________________________________________________________________________________________00000002A387'. Scan count 1, logical reads 673, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Luis Cazares SSC Guru Points: 183706 More actions · Answer 14

Emil B (9/4/2015)
was this ROWS/RANGE "hint" available in 2008/R2?

No, neither the ORDER BY for window functions.

But the Quirky Update is supposed to be even faster. For more information: http://www.sqlservercentral.com/articles/T-SQL/68467/

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2