How to improve this query

  • Hi I have a temp table with 146525 records:

    CREATE TABLE #temp

    (

    ID int,

    amount bigint,

    Type int,

    )

    and a query

    SELECT T.ID, ISNULL(SUM(TT.amount),0)

    FROM #temp T

    LEFT JOIN #temp TT ON TT.ID < T.ID AND TT.Type = 1

    WHERE T.Type=1

    GROUP BY T.ID

    And i need it to perform faster. I did try to rewrite it in different ways, tried indexing temp table but still not fast enough, It was running for 5 minutes and i stopped it. Any tips how this can be done better?

  • IF your IDs are unique, this should do:

    SELECT T.ID, SUM(T.amount) OVER (ORDER BY T.ID) - T.amount

    FROM #temp AS T

    -- Gianluca Sartori

  • If your IDs are not unique, this could work:

    SELECT T.ID, SUM(T.amount) OVER (ORDER BY T.ID) - T.amount

    FROM (

    SELECT ID, SUM(amount) AS amount

    FROM #temp

    WHERE Type = 1

    GROUP BY ID

    ) AS T

    Also, in case you have duplicates, I don't think that your original code does what you're expecting, as it could be summing rows multiple times.

    Can you please explain what you're after in more detail?

    See http://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/

    -- Gianluca Sartori

  • Yes they are unique.

    Anyways, THANKS ! Worked like a charm in 2 seconds.

    I owe you a beer.

  • Great! Just to clarify, the code takes advantage of the OVER clause to perform a running total.

    You can read more about it here: http://www.sqlservercentral.com/articles/T-SQL/75466/

    -- Gianluca Sartori

  • Thanks! I know how OVER works, used it loads of times before, but i don't know why I didn't think of it in this case. It must be Friday :hehe:

  • another idea

    SELECT

    t.ID

    , SUM(t.AMOUNT) OVER(ORDER BY t.ID

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND CURRENT ROW) - T.AMOUNT AS rsum

    FROM #temp AS t

    WHERE(t.TYPE = 1);

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Great J! that made it even faster! Never used that in Over clause

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

    Learned something new today. Do you have some nice reading about it?

    This Friday is getting better and better!

  • Just to let you know the execution time of the stored procedure that was using this part of code reduced from over 13 minutes to 3 seconds

  • Good to hear!

    Please note that my form of the query and J Livingston's are the same: the default for the ORDER BY clause in OVER is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

    -- Gianluca Sartori

  • a test for comparisons on 1M rows

    see http://blog.waynesheffield.com/wayne/archive/2011/08/running-totals-in-denali-ctp3/

    SELECT TOP 1000000

    ID = IDENTITY(INT, 1, 1),

    Amount = CAST(Rand(Checksum(Newid())) * 99 + 1 AS BIGINT),

    Type = (Abs(Checksum(Newid())) % 3 + 1)

    INTO #temp

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    CROSS JOIN sys.all_columns ac3

    SET STATISTICS TIME ON;

    SELECT

    T.ID

    , SUM(T.amount) OVER (ORDER BY T.ID) - T.amount AS rsum

    INTO

    #jlswaste

    FROM #temp AS T

    WHERE(Type = 1);

    SET STATISTICS TIME OFF;

    DROP TABLE #jlswaste;

    SET STATISTICS TIME ON;

    SELECT

    t.ID

    , SUM(t.AMOUNT) OVER(ORDER BY t.ID

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

    ) - T.AMOUNT AS rsum

    INTO

    #jlswaste

    FROM #temp AS t

    WHERE(t.TYPE = 1);

    SET STATISTICS TIME OFF;

    DROP TABLE #jlswaste;

    SET STATISTICS TIME ON;

    SELECT

    t.ID

    , SUM(t.AMOUNT) OVER(ORDER BY t.ID

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND 1 PRECEDING

    ) AS rsum

    INTO

    #jlswaste

    FROM #temp AS t

    WHERE(t.TYPE = 1);

    SET STATISTICS TIME OFF;

    DROP TABLE #jlswaste;

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • spaghettidba (9/4/2015)


    Good to hear!

    Please note that my form of the query and J Livingston's are the same: the default for the ORDER BY clause in OVER is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

    Actually. the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Using ROWS performs better than the default RANGE. You can improve this even further, because you're including your current row in the sum and then subtracting the value. Just don't include it in your sum.

    SELECT

    t.ID

    , SUM(t.AMOUNT) OVER(ORDER BY t.ID

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND 1 PRECEDING) AS rsum

    FROM #temp AS t

    WHERE(t.TYPE = 1);

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (9/4/2015)


    spaghettidba (9/4/2015)


    Good to hear!

    Please note that my form of the query and J Livingston's are the same: the default for the ORDER BY clause in OVER is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

    Actually. the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Using ROWS performs better than the default RANGE. You can improve this even further, because you're including your current row in the sum and then subtracting the value. Just don't include it in your sum.

    SELECT

    t.ID

    , SUM(t.AMOUNT) OVER(ORDER BY t.ID

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND 1 PRECEDING) AS rsum

    FROM #temp AS t

    WHERE(t.TYPE = 1);

    Drew

    Thanks for the correction Drew! I was completely overlooking that.

    -- Gianluca Sartori

  • was this ROWS/RANGE "hint" available in 2008/R2?

    also the default behaviour of over without ROWS BETWEEN UNBOUNDED PRECEDING caused a large number of reads on Worktable, after adding it it went to 0.

    With ROWS BETWEEN UNBOUNDED PRECEDING

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#temp______________________________________________________________________________________________________________00000002A387'. Scan count 1, logical reads 673, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Without

    Table 'Worktable'. Scan count 122381, logical reads 737031, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#temp______________________________________________________________________________________________________________00000002A387'. Scan count 1, logical reads 673, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

  • Emil B (9/4/2015)


    was this ROWS/RANGE "hint" available in 2008/R2?

    No, neither the ORDER BY for window functions.

    But the Quirky Update is supposed to be even faster. For more information: http://www.sqlservercentral.com/articles/T-SQL/68467/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply