The "Numbers" or "Tally" Table: What it is and how it replaces a loop

  • Jeff Moden (4/7/2013)


    With the understanding that I've not seen nor have I been able to produce such a failure except when an indeterminate formula was used as the target of the ORDER BY, I guess the next question would be, is there a code example somewhere that demonstrates failure of the TOP (MaxIntValue)/ORDER BY? The one that MS published never seems to fail.

    As long as you are not querying tables with more than maxint rows, this method currently always works. Just as the version with TOP 100 PERCENT always worked until it stopped working in SQL Server 2005. But as far as I know, it's not documented, nor guaranteed.

    Also, is there a work around that will produce the desired behavior other than the obvious method of shifting to FOR XML PATH? Or, is that also subject to ORDER BY failing (which would really screw up some real XML)?

    There are two issues here, and we should take care not to confuse them.

    1: ORDER BY in a subquery (or view) - except for some specific exceptions, this is not allowed. In the situations where it is allowed (off the top of my head only TOP and FOR XML), the ORDER BY is guaranteed to be used to specify what rows to qualify/disqualify for the TOP and is also guaranteed to process the rows in the specified order when forming the XML. But not guaranteed to do anything else. If you have TOP 5 ... ORDER BY FirstName in a subquery or view, you will only get names like Aaron and Abby, but unless there is an ORDER BY in the outer query as well, you can still get them in any order. They usually stay in alphabetical order, but this is not guaranteed. For the FOR XML construct, the result is just a single row, so ordering does not apply at all anymore.

    2: Concatenation by variable assignment (SELECT @x = @x + Column FROM ...), either with or without ORDER BY. I have once seen a website with a number of examples that were reproducible and that showed this technique to be unreliable. Unfortunately, I lost the URL and have never been able to find it again. And frankly, I have no idea if any of those examples would have reproduced on newer versions (like SQL Server 2005, or up - yes, it is THAT long ago since I found that site).

    There is a very weird article in the Knowledge Base: http://support.microsoft.com/kb/287515. It first says that "the correct behaviour for an aggregate concatenation query is undefined" (which as far as I know is still true). However, it then further down does provide a workaround "to achieve the expected results from an aggregate concatenation query", without bothering to specify what the "expected results" for this undefined behaviour should be.

    Also, is there a work around that will produce the desired behavior other than the obvious method of shifting to FOR XML PATH? Or, is that also subject to ORDER BY failing (which would really screw up some real XML)?

    The only guaranteed methods I know to correctly concatenate strings are:

    1: FOR XML

    2: Procedural code (i.e. looping over the result set)

    3: Creating your own CLR user-defined aggregate function. When this feature was new, there was no option to preserve order in CLR user-defined aggregates (there was syntax for it, but it had no effect). I have no idea if this has changed since; I think it has not in SQL Server 2008, but I stopped paying attention after that.

    If you use ORDER BY and FOR XML at the same level of the query, you can rely on the results being added to the XML result in the desired order. If you use ORDER BY at a different subquery-level, it's anyone's guess.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • I have a quick question, in the original post the final output is row by row. How do you get it into columns using id? Thanks.

    Disregard. I figured it out I needed to key off the tabby value! What an amazingly quick solution! Thanks again.

  • rayoftennessee (6/13/2013)


    I have a quick question, in the original post the final output is row by row. How do you get it into columns using id? Thanks.

    Disregard. I figured it out I needed to key off the tabby value! What an amazingly quick solution! Thanks again.

    Hi Ray,

    Thanks for stopping by and the thoughtful comment.

    Do you have a coded solution that you'd like to share for the problem you mentioned above? It might help someone else out with a similar problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • *uuh*

    Hi, Jeff,

    my thanks to you for your great articles here on SSC as well as your comments outside.

    And it was good of you to answer the idea regarding the use of recursion in detail: I learned a lot today!

    Best regards

    JP

    ________________________________________________________
    If you set out to do something, something else must be done first.

  • Hi Jeff,

    I apologize if you've addressed this elsewhere, but I was sort of curious if you'd done any testing of tally/inline tally tables vs. sequences in 2012?

    http://technet.microsoft.com/en-us/library/ff878091.aspx

    I'd be interested to see them thrown into the mix.

    Thanks,

    Erik

  • I'm not sure that sequences work the same way as a Tally table. It seems that their purposes are different.

    Could you explain how would you use a sequence instead of a tally table?

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Hi Luis,

    I recently saw a PostgreSQL example where a series of dates was generated that reminded me a bit of tally table code. That's what got me curious about sequences in MS SQL. I'm not entirely sure how I'd use it here, but wanted to ask in case one of you vastly smarter people had already thought of something 🙂

    http://pgexercises.com/questions/aggregates/rollingavg.html

    select dategen.date,

    (

    -- correlated subquery that, for each day fed into it,

    -- finds the average revenue for the last 15 days

    select sum(case

    when memid = 0 then slots * facs.guestcost

    else slots * membercost

    end) as rev

    from cd.bookings bks

    inner join cd.facilities facs

    on bks.facid = facs.facid

    where bks.starttime > dategen.date - interval '14 days'

    and bks.starttime < dategen.date + interval '1 day'

    )/15 as revenue

    from

    (

    -- generates a list of days in august

    select cast(generate_series(timestamp '2012-08-01',

    '2012-08-31','1 day') as date) as date

    ) as dategen

    order by dategen.date;

  • Thanks for stopping by, Erik.

    From what I understand, the SEQUENCEs of 2012 aren't the same as the "Generate_Series" functionality of PostGre. SEQUENCEs still rely on rows being present where "Generate_Series" generates a row set. Not quite the same thing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Generate_Series sounds like a great function, but it's not the same as Sequences in SQL Server (or Oracle FWIW).

    Sequences are used to fetch values but they will keep track of the last value used to return the following value the next time the sequence is used. Sequences are similar to identity column values but are used different and are independent objects.

    For something similar to Generate_Series, some people have constructed iTVFunctions for tally or calendar tables built on the fly.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Yeah, I saw the limitations last night after I had a chance to mess with it. I was hoping there would be a quick way to use the sequence values as positions in data or something. Ah well.

    Thanks for the replies

  • Disney uses a AS/400 based PMS which uses sequences on every record. I never started a conversation as to it purpose, but I'm assuming that it provides a means of (almost) guaranteeing that each transaction record (regardless of table) is (almost) uniquely identified. The sequence reset at 10,000,000,000. Depending on the retention level, the same sequence number could appear more than once at the same time in the db regardless of table. The probability of the same sequence number occurring more than once in the same table being miniscule. If memory serves the primary key was composed of the transaction date plus the sequence number. I can't remember if the sequence was stored separately or combined with the trans date. It's only been 13 years.

  • Thank you for this article.

    I'm very new in SQL-Server field and it was very useful for me, but I have one question which not clear enough for me:

    Suppose I want to use "Tally table" in my procedure, and my co-worker planning to use "Tally table" in his procedure.

    Do we have to use different Tally table for each procedure.

    If it so, I think it will be so many different "Tally tables" in our database (one Tally table on each procedure).

    If we could use the only Tally table for each procedure then how to fill this table sequential data? I need 1000 rows, somebody else can decide that it's too much and decrease the amount...

    Please clear this moment for me...

  • You don't need a different tally table for each procedure. You only need one and you'll just determine the range when you query it.

    You could as well create a view instead of a physical tally table

    CREATE VIEW vTally

    WITH SCHEMABINDING

    AS

    WITH E1(N) AS (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ), --10E+1 or 10 rows

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b) --10E+4 or 10,000 rows max

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) n

    FROM E4

    or a table valued function

    CREATE FUNCTION fTally(@From int, @To int)

    RETURNS TABLE

    WITH SCHEMABINDING

    AS

    RETURN

    WITH E1(N) AS (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ), --10E+1 or 10 rows

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b) --10E+4 or 10,000 rows max

    SELECT TOP( @To - @From + 1) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) + @From - 1 n

    FROM E4

    Which run with 0 reads as they're calculated on the fly.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Personally, I'd much prefer the iTVF (inline Table Valued Function) over the view because it runs as fast as the view and is much easier to control the range of numbers returned.

    Shifting gears a bit, I don't try to calculate ranges that start with other than 0 or 1 in my Tally function because it adds a bit (small but still present) of unnecessary overhead. Most of the work I do starts at 0 or 1 and when dealing with large ranges and large amount of data, every bit helps even at the speed of today's machines. If I need a range to be offset, I do that calculation externally. I also do non-whole number and non-sequential sequences to having just a base 0 or 1 to whatever function works the best for me.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • my2ndbox (7/4/2014)


    Thank you for this article.

    I'm very new in SQL-Server field and it was very useful for me, but I have one question which not clear enough for me:

    Suppose I want to use "Tally table" in my procedure, and my co-worker planning to use "Tally table" in his procedure.

    Do we have to use different Tally table for each procedure.

    If it so, I think it will be so many different "Tally tables" in our database (one Tally table on each procedure).

    If we could use the only Tally table for each procedure then how to fill this table sequential data? I need 1000 rows, somebody else can decide that it's too much and decrease the amount...

    Please clear this moment for me...

    You only need one Tally Table or one Tally function such as that which Luis posted. I use both.

    My Tally Table is setup for 11,000 rows because I do a fair bit of work with mortgages. 11,000 rows is a number just a little larger than 30 years worth of days. The article explains how to build one. Make sure that you don't forget the unique clustered index in the form of a PK.

    The Tally function that I use looks like this. It will allow a starting value of either 0 or 1 with virtually no overhead for making that decision.

    /****** Object: UserDefinedFunction [dbo].[fnTally] Script Date: 07/04/2014 15:02:49 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE FUNCTION [dbo].[fnTally]

    /**********************************************************************************************************************

    Purpose:

    Return a column of BIGINTs from @ZeroOrOne up to and including @MaxN with a max value of 1 Billion.

    As a performance note, it takes about 00:02:10 (hh:mm:ss) to generate 1 Billion numbers to a throw-away variable.

    Usage:

    --===== Syntax example (Returns BIGINT)

    SELECT t.N

    FROM dbo.fnTally(@ZeroOrOne, @MaxN) t

    ;

    Notes:

    1. Based on Itzik Ben-Gan's cascading CTE (cCTE) method for creating a "readless" Tally Table source of BIGINTs.

    Refer to the following URL for how it works. http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers

    2. To start a sequence at 0, @ZeroOrOne must be 0 or NULL. Any other value that's convertable to the BIT data-type

    will cause the sequence to start at 1.

    3. If @ZeroOrOne = 1 and @MaxN = 0, no rows will be returned.

    5. If @MaxN is negative or NULL, a "TOP" error will be returned.

    6. @MaxN must be a positive number from >= the value of @ZeroOrOne up to and including 1 Billion. If a larger

    number is used, the function will silently truncate after 1 Billion. If you actually need a sequence with

    that many values, you should consider using a different tool. ;-)

    7. There will be a substantial reduction in performance if "N" is sorted in descending order. If a descending

    sort is required, use code similar to the following. Performance will decrease by about 27% but it's still

    very fast especially compared with just doing a simple descending sort on "N", which is about 20 times slower.

    If @ZeroOrOne is a 0, in this case, remove the "+1" from the code.

    DECLARE @MaxN BIGINT;

    SELECT @MaxN = 1000;

    SELECT DescendingN = @MaxN-N+1

    FROM dbo.fnTally(1,@MaxN);

    8. There is no performance penalty for sorting "N" in ascending order because the output is explicity sorted by

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    Revision History:

    Rev 00 - Unknown - Jeff Moden

    - Initial creation with error handling for @MaxN.

    Rev 01 - 09 Feb 2013 - Jeff Moden

    - Modified to start at 0 or 1.

    Rev 02 - 16 May 2013 - Jeff Moden

    - Removed error handling for @MaxN because of exceptional cases.

    **********************************************************************************************************************/

    (@ZeroOrOne BIT, @MaxN BIGINT)

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN WITH

    E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) --10E1 or 10 rows

    , E3(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c) --10E3 or 1 Thousand rows

    , E9(N) AS (SELECT 1 FROM E3 a, E3 b, E3 c) --10E9 or 1 Billion rows

    SELECT N = 0 WHERE ISNULL(@ZeroOrOne,0)= 0 --Conditionally start at 0.

    UNION ALL

    SELECT TOP(@MaxN) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E9 -- Values from 1 to @MaxN

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 391 through 405 (of 511 total)

You must be logged in to reply to this topic. Login to reply