The "Numbers" or "Tally" Table: What it is and how it replaces a loop

  • How about using recursion :hehe: to replace loops? (One of the great new functions in SQL2005)

    No need for a tally or temporary tables.

    Heres a params example:

    DECLARE @params varchar(8000)

    SET @params = '1,2,3,4,5,6,7,8,9,10';

    WITH Params AS (

    -- start with last item

    SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params)))) as [value], len(@Params) as start

    UNION ALL

    -- base case

    SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value], charindex(',', @params) + 1 as start

    UNION ALL

    -- recursive case

    SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value], charindex(',', @params, start) + 1 as start

    FROM Params

    WHERE charindex(',', @params, start) > 0

    )

    SELECT *

    FROM Params

    OPTION (MAXRECURSION 999); -- must be greater than max number of items

  • Another great article ! Thanks

    The first reference for a tally table that I can recall is in "Guide to Sybase and SQL Server" by C. J. Date and D.McGoveran published in June of 1992.

    Does anyone know of an earlier reference ?

    SQL = Scarcely Qualifies as a Language

  • This is a super cool article. Let me repeat, this is a super cool article. What a refreshing read!

    I look back on all the TSQL I have written using loops or cursors - arrrgh! Oh, and all the arguments/discussions on cursors, loops, variables of type table....

    I am hooked on Tally tables. It is an extremely obvious in-your-face concept and I can't imagine why I haven't used this technique before.

    Thanks again.

    -Mike

  • :smooooth: This is slick...... Great work Jeff!

    This is a great example of implementing set-based programming, three thumbs up from me.

    Now to apply this to problems going forward!

    Thanks for the excellent article and insight into this subject.

  • From the article, you wrote " Tally table .. starting at 0 or 1 (mine start at 1)"

    There are advantages for having the Tally table include 0 and the "Freight by Date" problem becomes easier if 0 is included.

    The original SQL solution contains:

    SELECT t.N-1+@DateStart AS ShippedDate

    FROM dbo.Tally t

    WHERE t.N-1+@DateStart <= @DateEnd

    This SQL snippet has "N - 1", which derives the tally to start with 0. Wouldl it not be simplier to have the tally table already contain 0?

    The original SQL solution has a problem that if the tally table does have a row for zero, the results are different (there is an extra day before the oldest shipment).

    The original SQL solution has been modified to eliminate all variables so that the solution can be used in a view, does not assume that the tally table has 0 but allows the tally table to contain 0:

    SELECT dates.ShippedDate

    , COALESCE( SUM(o.Freight) , 0) AS TotalFreight

    FROM dbo.Orders o

    RIGHT OUTER JOIN

    (SELECT t.N - 1 + Shipments.DateStart AS ShippedDate

    FROM dbo.Tally t

    JOIN(SELECTMIN(ShippedDate) AS DateStart

    ,MAX(ShippedDate) AS DateEnd

    FROMdbo.Orders

    ) AS Shipments

    ON t.N BETWEEN 1 AND ( Shipments.DateEnd - Shipments.DateStart + 1 )

    ) AS dates

    ON o.ShippedDate = dates.ShippedDate

    GROUP BY dates.ShippedDate

    ORDER BY dates.ShippedDate

    Here is revised SQL that assumes that the tally table has 0. The solution is slight simplier with the original SQL as comments and bold.

    SELECT dates.ShippedDate

    , COALESCE( SUM(o.Freight) , 0) AS TotalFreight

    FROM dbo.Orders o

    RIGHT OUTER JOIN

    -- (SELECT t.N - 1 + Shipments.DateStart AS ShippedDate

    (SELECT t.N + Shipments.DateStart AS ShippedDate

    FROMdbo.Tally t

    JOIN(SELECT MIN(ShippedDate) AS DateStart

    ,MAX(ShippedDate) AS DateEnd

    FROMdbo.Orders

    ) AS Shipments

    -- ON t.N BETWEEN 1 AND ( Shipments.DateEnd - Shipments.DateStart + 1 )

    ON t.N BETWEEN 0 AND (Shipments.DateEnd - Shipments.DateStart )

    ) AS dates

    ON o.ShippedDate = dates.ShippedDate

    GROUP BY dates.ShippedDate

    ORDER BY dates.ShippedDate

    SQL = Scarcely Qualifies as a Language

  • Fantastic article, Jeff.

    Many thanks.

  • Bravo - Very good article. It is bookmarked and printed.

    -- Cory

  • Jeff you have done it again! Faaaantastic!:P

    :-PManie Verster
    Developer
    Johannesburg
    South Africa

    I can do all things through Christ who strengthens me. - Holy Bible
    I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)

  • The CTE solution, while elegent, does suffer from a higher resource utilization as the number of element increases. Here is an example with 887 elements.

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    DECLARE @params varchar(8000)

    SET @params =

    '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643,644,645,646,647,648,649,650,651,652,653,654,655,656,657,658,659,660,661,662,663,664,665,666,667,668,669,670,671,672,673,674,675,676,677,678,679,680,681,682,683,684,685,686,687,688,689,690,691,692,693,694,695,696,697,698,699,700,701,702,703,704,705,706,707,708,709,710,711,712,713,714,715,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,746,747,748,749,750,751,752,753,754,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,771,772,773,774,775,776,777,778,779,780,781,782,783,784,785,786,787,788,789,790,791,792,793,794,795,796,797,798,799,800,801,802,803,804,805,806,807,808,809,810,811,812,813,814,815,816,817,818,819,820,821,822,823,824,825,826,827,828,829,830,831,832,833,834,835,836,837,838,839,840,841,842,843,844,845,846,847,848,849,850,851,852,853,854,855,856,857,858,859,860,861,862,863,864,865,866,867,868,869,870,871,872,873,874,875,876,877,878,879,880,881,882,883,884,885,886,887';

    WITH Params AS (

    SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params)))) as [value], len(@Params) as start

    UNION ALL

    SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value], charindex(',', @params) + 1 as start

    UNION ALL

    SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value], charindex(',', @params, start) + 1 as start

    FROM Params

    WHERE charindex(',', @params, start) > 0

    )

    select value FROM Params ORDER BY value OPTION (MAXRECURSION 999);

    PRINT 'CTE Done'

    SELECT substring(',' + @params + ',', Tally.N + 1

    ,charindex(',', ',' + @params + ',', Tally.N + 1) - Tally.N - 1)

    AS Value

    FROM tempdb.dbo.Tally AS TAlly

    WHERE Tally.N <= len(',' + @params + ',') - 1

    AND substring(',' + @params + ',', Tally.N, 1) = ','

    ORDER BY value

    PRINT 'Tally Done'

    CPU and I/O statistics are:

    CTE Solution:

    Table 'Worktable'. Scan count 2, logical reads 5321, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times: CPU time = 32 ms, elapsed time = 240 ms.

    Tally Solution:

    Table 'Tally'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times: CPU time = 0 ms, elapsed time = 23 ms.

    SQL = Scarcely Qualifies as a Language

  • Hugo:

    Just SP is fine (though I personally dislike the habit - as if anything other than a stored procedure might follow an EXEC keyword). It's SP_ (with an underscore directly following the SP) that turns it into the special prefix reserved for system stored procedures.

    Actually, you can execute scalar UDFs.

    create function ExecTest()

    returns int

    as

    begin

    return (1)

    end

    go

    declare @a int

    exec @a = dbo.exectest

    select @a

    Will select 1

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Awesome article. Alot of useful examples even though it's just the tip of the iceberg

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • 🙂 Great work Jeff! Five stars!

  • chris (5/7/2008)


    How about using recursion :hehe: to replace loops? (One of the great new functions in SQL2005)

    No need for a tally or temporary tables.

    Heres a params example:

    DECLARE @params varchar(8000)

    SET @params = '1,2,3,4,5,6,7,8,9,10';

    WITH Params AS (

    -- start with last item

    SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params)))) as [value],

    len(@Params) as start

    UNION ALL

    -- base case

    SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value],

    charindex(',', @params) + 1 as start

    UNION ALL

    -- recursive case

    SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value],

    charindex(',', @params, start) + 1 as start

    FROM Params

    WHERE charindex(',', @params, start) > 0

    )

    SELECT *

    FROM Params

    OPTION (MAXRECURSION 999); -- must be greater than max number of items

    I ran some speed and load tests on this code.

    DECLARE @params varchar(8000)

    --SET @params = '1,2,3,4,5,6,7,8,9,10';

    select @params = coalesce(@params + ',' + cast(number as varchar(10)), cast(number as varchar(10)))

    from dbo.numbers

    where number between 1 and 1820 -- 1820 was most I could fit in varchar(8000) with this pattern

    ;WITH Params AS (

    -- start with last item

    SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params))))

    as [value], len(@Params) as start

    UNION ALL

    -- base case

    SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value],

    charindex(',', @params) + 1 as start

    UNION ALL

    -- recursive case

    SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value],

    charindex(',', @params, start) + 1 as start

    FROM Params

    WHERE charindex(',', @params, start) > 0

    )

    SELECT *

    FROM Params

    OPTION (MAXRECURSION 2000);

    Average runtime = 90 ms, CPU time = 47 ms, 2 scans & 10920 reads for 'Worktable'.

    Numbers version 1 (uses UDF):

    select parsed

    from common.dbo.stringparser(@params, ',')

    /*

    /****** Object: UserDefinedFunction [dbo].[StringParser] Script Date: 05/07/2008 09:10:53 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE function [dbo].[StringParser]

    (@String_in varchar(max),

    @Delimiter_in char(1))

    returns table

    as

    return(

    SELECT top 100 percent

    SUBSTRING(@String_in+@Delimiter_in, number,

    CHARINDEX(@Delimiter_in, @String_in+@Delimiter_in, number) - number)

    as Parsed,

    row_number() over (order by number) as Row

    FROM numbers

    WHERE number <= LEN(@String_in)

    AND SUBSTRING(@Delimiter_in + @String_in, number, 1) = @Delimiter_in

    ORDER BY number

    );

    */

    @params defined and populated with the same code.

    Avg runtime = 64 ms, 16 ms CPU time, 1 scan & 15 reads from dbo.Numbers.

    Removing the UDF overhead and querying against Numbers directly in the code (this also eliminates the Order By and RowNumber columns from the UDF):

    SELECT

    SUBSTRING(@params+',', number,

    CHARINDEX(',', @params+',', number) - number) as Parsed

    FROM dbo.numbers

    WHERE number <= LEN(@params)

    AND SUBSTRING(',' + @params, number, 1) = ','

    Same method of populating @params.

    Avg runtime = 54 ms, 0 ms CPU time (too short to measure), 1 scan & 15 reads from dbo.Numbers.

    The recursive CTE is definitely fast, but both versions of the Numbers table parser were faster, and required less IO. On the Numbers version that has the code directly in the query (instead of a UDF), the only real time spent on the thing is displaying the results.

    If I change it to:

    DECLARE @params varchar(8000), @Res varchar(10)

    --SET @params = '1,2,3,4,5,6,7,8,9,10';

    select @params = coalesce(@params + ',' + cast(number as varchar(10)),

    cast(number as varchar(10)))

    from dbo.numbers

    where number between 1 and 1820

    SELECT @res =

    SUBSTRING(@params+',', number,

    CHARINDEX(',', @params+',', number) - number) --as Parsed

    FROM dbo.numbers

    WHERE number <= LEN(@params)

    AND SUBSTRING(',' + @params, number, 1) = ','

    Which eliminates returning the results to the client (as if it were an internal query instead of a user query), the total run-time goes down to 4 ms.

    If I change the CTE as follows:

    DECLARE @params varchar(8000), @Res varchar(10)

    --SET @params = '1,2,3,4,5,6,7,8,9,10';

    select @params = coalesce(@params + ',' + cast(number as varchar(10)),

    cast(number as varchar(10)))

    from dbo.numbers

    where number between 1 and 1820

    ;WITH Params AS (

    -- start with last item

    SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params)))) as [value],

    len(@Params) as start

    UNION ALL

    -- base case

    SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value],

    charindex(',', @params) + 1 as start

    UNION ALL

    -- recursive case

    SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value],

    charindex(',', @params, start) + 1 as start

    FROM Params

    WHERE charindex(',', @params, start) > 0

    )

    SELECT @res = [value]

    FROM Params

    OPTION (MAXRECURSION 2000);

    CPU time is 47 ms and run-time is reduced to 54 ms.

    Summary: While the CTE works, and is fast by any normal standard, the Numbers version is even faster, and requires less IO.

    Note: All tests run 5 or more times on an isolated box running no concurrent queries.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • As always, a great article from Jeff.

    As for performance testing of various techniques by anyone, I am continually amazed as to the incorrect methods for generating sample data.

    Example (taken from this thread)

    SET @params =

    '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19

    For the use of a tally table or other mechanism, populating a variable with sample data like the above is completely wrong! Those delimited values would never be passed by an application. The proper approach for generating sample data is to use RANDOM values.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Excellent article, Jeff. Good introduction.

    I've been using Numbers tables for a couple of years now, and I just keep finding more uses for them all the time.

    Examples include: String and Column parsing, Dates/Times lists, generating test data, cleaning up strings, random number generation, finding missing rows in ID columns

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 16 through 30 (of 511 total)

You must be logged in to reply to this topic. Login to reply