Performance issue with tally solution

  • Jeff Moden (4/21/2009)


    It's not the presence of TOP nor ORDER BY. It's the presence of UNION ALL. In the testing I did with Flo, I actually created a situation where the first SELECT in the UNION ALL returned 0 rows and it knocked an 800 column split on 1000 rows on a VARCHAR(MAX) down from 44 seconds to only 6. It somehow forces the optimizer to, as Flo says, use a working table that wouldn't normally be utilized to keep from building 799,000 rows using the Tally table. Instead, it only reads 800 rows from the Tally table and spawns the rest of the rows...

    Jeff,

    My apologies. If you wondered what the heck I was on about - let me explain: Both parts of the script are headed 'Traditional Tally Solution'. In my excitement to try it out, I assumed that the second one was the new and improved version. That was the one I was referring to - oops!

    The giveaway was the fact that there is no UNION ALL in it - so I though "what is Jeff talking about UNION ALL for?" when I saw the response. So sorry, serves me right for rushing it. Throw meat portions at me or lash me with noodles - whatever the currently applicable response is.

    With some trepidation then, I ran the correct version - the one with the UNION ALL.

    It's all cool (really!), but it still suffers with the Lazy Spool which generated 801,000 intermediate rows on the run I just did (admittedly with a 100K tally table). The QO seems to assume that the Lazy Spool will never rewind - in fact it does for every row in JBMTest. I wonder if it would produce a different plan if it could know that?

    It does however

    The Holy Grail - which I have singularly failed to find - is a plan which (in order of desirability):

    • doesn't generate a spool (table or index)
    • eliminates the inequality predicate
    • calls charindex & substring once per found delimeter

    Removing the inequality (if it's even possible) would allow a hash or merge join, or even some kind of streaming aggregate operation rather than a loop join with table spool.

    Calling charindex/substring once per delimeter would mean I could never use the "brute force" tag again 🙂

    I attach the actual execution plan (sqlplan and text form) in case it differs from yours.

    Really sorry about my false start there...:blush:

    Paul

    P.S. Couldn't upload the .sqlplan file directly - so renamed it as Jeff-UNION-ALL.txt

  • Flo and Jeff, I just had to try.

    This beast features double-barelled carburetors.... errrr tally tables.... one with two columns, and nested CROSS APPLYs.

    Can you help me tune her up a bit?

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    GO

    USE tempdb

    GO

    IF (OBJECT_ID('dbo.Tally') IS NOT NULL) drop table dbo.tally

    IF (OBJECT_ID('dbo.Tally2') IS NOT NULL) drop table dbo.tally2

    IF (OBJECT_ID('dbo.Result1') IS NOT NULL) drop table dbo.result1

    CREATE TABLE dbo.Tally (N INT NOT NULL, PRIMARY KEY CLUSTERED (N))

    INSERT INTO dbo.Tally

    SELECT TOP 11000

    ROW_NUMBER() OVER (ORDER BY c1.column_id)

    FROM master.sys.all_columns c1

    CROSS JOIN master.sys.all_columns c2

    CREATE TABLE dbo.Tally2 (N INT NOT NULL, O INT NOT NULL, PRIMARY KEY CLUSTERED (N))

    INSERT INTO dbo.Tally2

    SELECT TOP 11000

    ROW_NUMBER() OVER (ORDER BY c1.column_id),ROW_NUMBER() OVER (ORDER BY c1.column_id)+1

    FROM master.sys.all_columns c1

    CROSS JOIN master.sys.all_columns c2

    --=====================================================================================

    set statistics time on;

    set statistics io on;

    select rowNum, item

    into result1

    from jbmTest

    cross apply(select substring(csv,t2.O,Z) as item

    from tally2 t2

    cross apply (select min(t1.N)- t2.O as Z

    from Tally t1

    where t1.N > t2.N

    and substring(csv,t1.N,1) = ',') ca2

    where t2.N < len(csv) and substring(csv,t2.N,1) = ','

    ) ca1

    set statistics time off;

    set statistics io off;

    I got a 69 Chevy with a 396

    Fuelie heads and a Hurst on the floor....

    - Bruce Springsteen

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Bob Hovious (4/21/2009)


    Flo and Jeff, I just had to try.

    This beast features double-barelled carburetors.... errrr tally tables.... one with two columns, and nested CROSS APPLYs.

    Can you help me tune her up a bit?

    How well does it run with no tuneup?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Bob Hovious (4/21/2009)


    Can you help me tune her up a bit?

    Hey Bob,

    I *was* going to suggest rewriting it in CLR...?

    :laugh:

    Paul

  • Jeff: Just a hair slower than you guys... on my laptop at least.

    Hard to isolate because of other processes, but it seems to be about 5-8% slower on CPU. It's all about the I/O. This scans the tally table 800 times.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Hey Paul, this thread may be the motivator for me to finally learn that awful note (pun) of a language. But it also convinces me that CLR isn't a magic bullet. A mediocre CLR routine might not be the fastest technique either. 🙂

    As always, it depends.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Something to ponder.

    This thread has also made me broaden my horizons about tally tables. Why should there be just one flavor? If we are going to parse 8k fields, why not make one that uses smallint for N and only enough rows to do the parser? If we always want to start with the column right after the separator appears (and we know the separator will be only 1 position long), why not precalculate that?

    I even thought about a table with all possible combinations of starting (X) and ending (Y) postions, and the length between them. This would be used to set up a substring from X to min(Y) where both X and Y = ',' but it would take tens of millions of rows. It's almost 1:30, so I'm not going to explore that tonight, I'm just throwing out exhausted ramblings in hopes of turning the light bulb on for someone else.

    As always, it's been a pleasure and an education.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Bob Hovious (4/22/2009)


    But it also convinces me that CLR isn't a magic bullet. A mediocre CLR routine might not be the fastest technique either. 🙂

    Absolutely. I have been spending most of my time playing with T-SQL to find a better solution. CLR is rarely my first choice; and then only when T-SQL doesn't have the the required functionality (or if it is obviously a non-starter).

    Customised tally tables are an interesting thought - though I might consider creating one within the query (using the normal tricks) in some siutations. You guys should rename this site to SS:ID - SQL Server: It Depends.

    Paul

  • Sorry to have missed this topic, I will spend some time the next week to take a good look at all data and knowledge displayed in this thread. Right now I am quite busy and were it not for a link that Jeff made in another thread I would not even found this one!

    Thanks for that Jeff, and stay tuned I am going to like next weeks exercise for sure 🙂

  • Paul, I am drawing on the experience that oftentimes a precalculated date or calendar table produces faster solutions than doing DATEADD,DATEDIFF on the fly. It would seem the same principle should apply. We'll see.

    Jeff,

    the presence of UNION ALL

    Maybe this is the tune-up I needed. Back to the garage.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Bob Hovious (4/21/2009)


    Flo and Jeff, I just had to try.

    This beast features double-barelled carburetors.... errrr tally tables.... one with two columns, and nested CROSS APPLYs.

    Can you help me tune her up a bit?

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    GO

    USE tempdb

    GO

    IF (OBJECT_ID('dbo.Tally') IS NOT NULL) drop table dbo.tally

    IF (OBJECT_ID('dbo.Tally2') IS NOT NULL) drop table dbo.tally2

    IF (OBJECT_ID('dbo.Result1') IS NOT NULL) drop table dbo.result1

    CREATE TABLE dbo.Tally (N INT NOT NULL, PRIMARY KEY CLUSTERED (N))

    INSERT INTO dbo.Tally

    SELECT TOP 11000

    ROW_NUMBER() OVER (ORDER BY c1.column_id)

    FROM master.sys.all_columns c1

    CROSS JOIN master.sys.all_columns c2

    CREATE TABLE dbo.Tally2 (N INT NOT NULL, O INT NOT NULL, PRIMARY KEY CLUSTERED (N))

    INSERT INTO dbo.Tally2

    SELECT TOP 11000

    ROW_NUMBER() OVER (ORDER BY c1.column_id),ROW_NUMBER() OVER (ORDER BY c1.column_id)+1

    FROM master.sys.all_columns c1

    CROSS JOIN master.sys.all_columns c2

    --=====================================================================================

    set statistics time on;

    set statistics io on;

    select rowNum, item

    into result1

    from jbmTest

    cross apply(select substring(csv,t2.O,Z) as item

    from tally2 t2

    cross apply (select min(t1.N)- t2.O as Z

    from Tally t1

    where t1.N > t2.N

    and substring(csv,t1.N,1) = ',') ca2

    where t2.N < len(csv) and substring(csv,t2.N,1) = ','

    ) ca1

    set statistics time off;

    set statistics io off;

    I got a 69 Chevy with a 396

    Fuelie heads and a Hurst on the floor....

    - Bruce Springsteen

    Hi Bob!

    Tune this great solution?? You are kidding, aren't you? 🙂

    Your double-barelled Tally even beats the UNION ALL function! I tried to replace the Tally2 with a second join to the traditional Tally but the performance decreases for ten percent. I tried to add an unique index on N including O to maybe avoid one step; with and without index hint. I tried to include a UNION... Every character I changed in your awesome function degrades its performance.

    I just have to congratulate to the next version of a spooky tally. Why the hell is a CROSS APPLY with a tally scan :alien: faster than CHARINDEX??

    Here my performance results:

    ----==========================================================

    ---- UNION ALL solution

    SQL Server Execution Times:

    CPU time = 748 ms, elapsed time = 741 ms.

    (801000 row(s) affected)

    ----==========================================================

    ---- Traditional Tally solution

    SQL Server Execution Times:

    CPU time = 2294 ms, elapsed time = 2350 ms.

    (799000 row(s) affected)

    ----==========================================================

    ---- Bob Hovious double-barelled Tally

    SQL Server Execution Times:

    CPU time = 624 ms, elapsed time = 665 ms.

    (799000 row(s) affected)

    Thanks again!

    Flo

  • Paul White (4/21/2009)


    Jeff Moden (4/21/2009)


    It's not the presence of TOP nor ORDER BY. It's the presence of UNION ALL. In the testing I did with Flo, I actually created a situation where the first SELECT in the UNION ALL returned 0 rows and it knocked an 800 column split on 1000 rows on a VARCHAR(MAX) down from 44 seconds to only 6. It somehow forces the optimizer to, as Flo says, use a working table that wouldn't normally be utilized to keep from building 799,000 rows using the Tally table. Instead, it only reads 800 rows from the Tally table and spawns the rest of the rows...

    Jeff,

    My apologies. If you wondered what the heck I was on about - let me explain: Both parts of the script are headed 'Traditional Tally Solution'. In my excitement to try it out, I assumed that the second one was the new and improved version. That was the one I was referring to - oops!

    The giveaway was the fact that there is no UNION ALL in it - so I though "what is Jeff talking about UNION ALL for?" when I saw the response. So sorry, serves me right for rushing it. Throw meat portions at me or lash me with noodles - whatever the currently applicable response is.

    All blame on me...! My first error was twice the same PRINT for both functions. The second, greater, error was that I forgot to note the UNION ALL which Jeff already told me. So all the meat and noodles on me...

    I don't try to join the execution plan discussion because I know my limits.

    Thanks to you, both!

    Flo

  • A very minor point for Number/Tally tables is to create the clustered index after population and to specify 100% fill factor. This will ensure minimal IO for subsequent use. This is negligible for 11000 row table but for larger ones becomes increasingly important I think.

    That double cross-apply is wicked. I am still trying to wrap my mind around how it actually works. :w00t:

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Hey Guru: I borrowed Flo's code to create the tally table instead of doing it the way Jeff taught me, because I wanted to keep an apples-to-apples comparison against the results he was reporting when I tested on my laptop. Thanks for pointing out something that is important to any tally table application. 🙂

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Florian Reischl (4/22/2009)


    Bob Hovious (4/21/2009)


    Flo and Jeff, I just had to try.

    This beast features double-barelled carburetors.... errrr tally tables.... one with two columns, and nested CROSS APPLYs.

    Can you help me tune her up a bit?

    ...

    Hi Bob!

    Tune this great solution?? You are kidding, aren't you? 🙂

    Your double-barelled Tally even beats the UNION ALL function! I tried to replace the Tally2 with a second join to the traditional Tally but the performance decreases for ten percent. I tried to add an unique index on N including O to maybe avoid one step; with and without index hint. I tried to include a UNION... Every character I changed in your awesome function degrades its performance.

    I just have to congratulate to the next version of a spooky tally. Why the hell is a CROSS APPLY with a tally scan :alien: faster than CHARINDEX??

    Here my performance results:

    ----==========================================================

    ---- UNION ALL solution

    SQL Server Execution Times:

    CPU time = 748 ms, elapsed time = 741 ms.

    (801000 row(s) affected)

    ----==========================================================

    ---- Traditional Tally solution

    SQL Server Execution Times:

    CPU time = 2294 ms, elapsed time = 2350 ms.

    (799000 row(s) affected)

    ----==========================================================

    ---- Bob Hovious double-barelled Tally

    SQL Server Execution Times:

    CPU time = 624 ms, elapsed time = 665 ms.

    (799000 row(s) affected)

    Thanks again!

    Flo

    !!!

    Those are some freakin' incredible numbers!

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

Viewing 15 posts - 166 through 180 (of 522 total)

You must be logged in to reply to this topic. Login to reply