The "Numbers" or "Tally" Table: What it is and how it replaces a loop

  • Jeff Moden (5/8/2008)


    You wrote the following in your blog...

    SQL Server will happily update the same row over and over again if it matches more than one row in the joined table, with only the result of the last of those updates sticking.

    Hi Jeff,

    I now realize that this sentence can be misinterpreted. I did not mean that, for instance, the same row will be in the inserted and deleted pseudo-tables of a trigger multiple times. This is not the case.

    A better description would have been that SQL Server first constructs the results of the complete join, thus including multiple copies of the same row in the target table. It then selects one of these multiple target rows to use for the update operation and discards the rest - and which one is selected depends on the execution plan.

    ... Do you have any code that proves that?

    Yep. It's not easy to reproduce execution plan changes because of service packs, hotfixes, or change in current load of the server - but changing indexes is a good way to prove this.

    CREATE TABLE T1

    (ID1 int NOT NULL PRIMARY KEY,

    Col int NOT NULL);

    CREATE TABLE T2

    (ID2 int NOT NULL PRIMARY KEY,

    ID1 int NOT NULL FOREIGN KEY REFERENCES T1(ID1),

    Col int NOT NULL,

    Filler char(8000) NOT NULL DEFAULT ('Just a filler'));

    go

    INSERT INTO T1 (ID1, Col)

    SELECT 1, 0 UNION ALL

    SELECT 2, 0;

    INSERT INTO T2 (ID2, ID1, Col)

    SELECT 1, 1, 1 UNION ALL

    SELECT 2, 1, 2 UNION ALL

    SELECT 3, 1, 3 UNION ALL

    SELECT 4, 1, 4 UNION ALL

    SELECT 5, 2, 4 UNION ALL

    SELECT 6, 2, 3 UNION ALL

    SELECT 7, 2, 2 UNION ALL

    SELECT 8, 2, 1;

    go

    CREATE INDEX ix1 ON T2(ID1);

    --CREATE INDEX ix1 ON T2(ID1, Col);

    go

    SET STATISTICS IO ON;

    go

    UPDATE T1

    SET Col = T2.Col

    FROM T1

    INNER JOIN T2

    ON T2.ID1 = T1.ID1;

    go

    SET STATISTICS IO OFF;

    go

    SELECT * FROM T1;

    go

    DROP TABLE T2, T1;

    go

    If you change the index definition (uncomment one line, comment the other), results will change.

    Also, both the statistics IO and the graphical execution plan (if you turn it on) will show that the update does in fact first construct the complete result of the join before using a TOP operator to reduce it to one result per target row.

    Shifting gears a bit... the "bin packing by weight" problem sounds like a very interesting challenge for set based processing. I'll give it a whirl.

    I already have a setbased solution (slower than a turtle in drying cement), and a solution that blends setbased and iterative elements (beats the living hell out of all pure iterative solutions I found). I just need to find the time to polish them up, weed out some details, and write up the explanation - you don't happpen to have one of Hermione's Time-Turners I can borrow, have you? πŸ˜€

    I'm looking forward to seing your solution and comparing it to mine! Please drop me an email if you publlish it anywhere, otherwise I might miss it.

    Edit - forgot to write the last paragraph before sending.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Jeff Moden (5/8/2008)


    Anirban Paul (5/8/2008)


    Jeff

    We had a discussion regarding this article couple of days back in another forum. I was waiting for this article. I have no words to praise this one. I just wants to thank you for the article. You are rally taking this forum to a new height.

    πŸ™‚

    Anirban

    Awesome compliment, Anirban... thanks! Yep, I remember the "conversation"... I think it was on Karthik's "Tally Table" post.

    You are welcome :).

    Yes it was indeed Karthik's "Tally Table" post.

  • :humble:

    I'm not worthy to even read this article, but I've been a bad boy and did just that. πŸ˜€ Mea culpa.

    Clear, to the point, no rocket science involved, practical, very recognizable situations, ...... 😎

    Love it all :w00t:

    Everything comes with a cost.... With this article, the cost is very negative ! :blink: ..... :ermm: ...... :w00t: ....... 😎

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution πŸ˜€

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • What can I say, excellent article, again! Prompted me to put a link to here[/url] on my site, just to make sure I don't lose these...

    Well done Jeff. πŸ™‚

    Dave J


    http://glossopian.co.uk/
    "I don't know what I don't know."

  • Jeff,

    As always an excellent, excellent article!:smooooth: I have always seen you use the tally table in your posts and you usually explain it, but this article provided me the best explanation. I enjoyed how you included the comparisons to the loops.

    There was a lot of discussion of the name in the post to your tally table script that appeared last week (and a little in this thread). I like tally. Besides, if work is going really good (or you are in a dull meeting), you could always cry out β€œTally Ho”!:w00t:

    Ian.

    "If you are going through hell, keep going."
    -- Winston Churchill

  • Thanks for the article. I've been using something similar for awhile. I created a function that returns a table of entries. It doesn't run quite as quick as the examples, but it can be run within a JOIN.

    Here's the code:

    --------------------------------------------------------------

    create function dbo.FN_Count (@Recs int)

    returns @RetTable table (RecNo int not null)

    as

    begin

    declare @Iteration int

    set @Iteration = 1

    while @Iteration <= @Recs

    begin

    insert into @RetTable

    select @Iteration

    set @Iteration = @Iteration + 1

    end

    return

    end

    -----------------------------------------------

    This way, if you need 5000 records for whatever you're doing.

    SELECT * FROM dbo.Fn_Count(5000) MyRecs

    Hope ya'll enjoy this.

    Thanks,

    CReeds

  • Jeff Moden (5/7/2008)


    GSquared (5/7/2008)


    I ran some speed and load tests on this code.

    Summary: While the CTE works, and is fast by any normal standard, the Numbers version is even faster, and requires less IO.

    Note: All tests run 5 or more times on an isolated box running no concurrent queries.

    Awesome as usual, Gus. Do me a favor... run the following on that same box, please... let us know how it turns out... Thanks.

    --===== Gus' original test parameter

    DECLARE @params varchar(8000), @Res varchar(10)

    --SET @params = '1,2,3,4,5,6,7,8,9,10';

    select @params = coalesce(@params + ',' + cast(number as varchar(10)),

    cast(number as varchar(10)))

    from dbo.numbers

    where number between 1 and 1820

    DECLARE @Top INT

    SET @Top = LEN(@Params)-1

    ;WITH

    cteTally AS

    (--==== "Modenized" CTE Tally table

    SELECT TOP(@Top) ROW_NUMBER() OVER(ORDER BY sc1.Object_ID) AS N

    FROM Master.sys.All_Columns sc1,

    Master.sys.All_Columns sc2

    )

    --Gus' orignal code with a tweek

    SELECT @res =

    SUBSTRING(@params+',', N,

    CHARINDEX(',', @params+',', N) - N) --as Parsed

    FROM cteTally

    WHERE SUBSTRING(',' + @params, N, 1) = ','

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table 'syscolrdb'. Scan count 2, logical reads 97, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'syscolpars'. Scan count 2, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 15 ms, elapsed time = 10 ms.

    Got as low as 0 CPU one time out of 5, but was 15 or 16 the other 4 times. Not quite as fast as a physical Numbers table (cached, of course), but definitely quite fast.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Jeff Moden (5/7/2008)


    Carl Federl (5/7/2008)


    Another great article ! Thanks

    The first reference for a tally table that I can recall is in "Guide to Sybase and SQL Server" by C. J. Date and D.McGoveran published in June of 1992.

    Does anyone know of an earlier reference ?

    Thanks for the tip, Carl... I've gotta get me a copy of that... anyone who came up with a tool that useful has gotta have other "goodies" in there, as well.

    1992... wasn't that before they made cursors available?

    Believe it or not, we have cursors back then. We didn't use them, at least in my shop, but it was in the code base. The ISBN is 0-201-55710-X and you can get it on Half.com for $0.75! (http://product.half.ebay.com/_W0QQcpidZ2437565QQprZ1144253)

    It was an excellent book for its time, needless to say it's rather dated. Yet I still have a copy on my bookshelf at work.

    -----
    [font="Arial"]Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson[/font]

  • This article fails to explicitly qualify the very essence of the foundation of this programming from the get go . . .

    That the author is way more knowledgeable than most the rest of us.

    Great article. I already implemented it into an existing solution that was doing 250000+ loops. My boss asked me if I had notice some reports were a bit faster. "No, not recently. I'll keep a tally on it, though.":cool:;):D

  • Hi Hugo,

    Thanks for the feedback on both the "update" problem and the "bin stacking" problem. I'll be sure to post a solution if I come up with one.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ALZDBA (5/8/2008)


    :humble:

    I'm not worthy to even read this article, but I've been a bad boy and did just that. πŸ˜€ Mea culpa.

    Clear, to the point, no rocket science involved, practical, very recognizable situations, ...... 😎

    Love it all :w00t:

    Everything comes with a cost.... With this article, the cost is very negative ! :blink: ..... :ermm: ...... :w00t: ....... 😎

    Awesome... thanks, Johan.

    Hey! What kind of bike is in your avatar?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • David Jackson (5/8/2008)


    What can I say, excellent article, again! Prompted me to put a link to here[/url] on my site, just to make sure I don't lose these...

    Well done Jeff. πŸ™‚

    Dave J

    Wow! Pretty nice site you're building there, David! Thanks for letting me be a part of it! Being put into a "must read" category is one heck of an honor. Thanks! :blush:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ian Crandell (5/8/2008)


    Jeff,

    As always an excellent, excellent article!:smooooth: I have always seen you use the tally table in your posts and you usually explain it, but this article provided me the best explanation. I enjoyed how you included the comparisons to the loops.

    There was a lot of discussion of the name in the post to your tally table script that appeared last week (and a little in this thread). I like tally. Besides, if work is going really good (or you are in a dull meeting), you could always cry out β€œTally Ho”!:w00t:

    Yeah... I like the name "Tally" quite a bit, as well. And it's not likely to become a reserved word, either. πŸ˜€ If you lookup the definition of "tally", some of the definitions, like "to count", fit.

    I appreciate your comments on the comparisons to loops. I just had to include loop code... I wanted to make it really easy for people to test to see what a difference the Tally table makes even in simple code as well as understanding that, deep down, a Tally table makes a "loop", as well...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • creeds (5/8/2008)


    Thanks for the article. I've been using something similar for awhile. I created a function that returns a table of entries. It doesn't run quite as quick as the examples, but it can be run within a JOIN.

    Here's the code:

    Thanks for the feedback and the code you posted. What version of SQL Server are you using? I might be able to turn a treat for you...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GSquared (5/8/2008)


    Got as low as 0 CPU one time out of 5, but was 15 or 16 the other 4 times. Not quite as fast as a physical Numbers table (cached, of course), but definitely quite fast.

    Outstanding. Thanks for the tests, Gus!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 91 through 105 (of 511 total)

You must be logged in to reply to this topic. Login to reply