April 7, 2013 at 2:06 pm
Jeff Moden (4/7/2013)
With the understanding that I've not seen nor have I been able to produce such a failure except when an indeterminate formula was used as the target of the ORDER BY, I guess the next question would be, is there a code example somewhere that demonstrates failure of the TOP (MaxIntValue)/ORDER BY? The one that MS published never seems to fail.
As long as you are not querying tables with more than maxint rows, this method currently always works. Just as the version with TOP 100 PERCENT always worked until it stopped working in SQL Server 2005. But as far as I know, it's not documented, nor guaranteed.
Also, is there a work around that will produce the desired behavior other than the obvious method of shifting to FOR XML PATH? Or, is that also subject to ORDER BY failing (which would really screw up some real XML)?
There are two issues here, and we should take care not to confuse them.
1: ORDER BY in a subquery (or view) - except for some specific exceptions, this is not allowed. In the situations where it is allowed (off the top of my head, only TOP and FOR XML), the ORDER BY is guaranteed to determine which rows qualify or are disqualified for the TOP, and for FOR XML it is guaranteed that the rows are processed in the specified order when the XML is formed. But it is not guaranteed to do anything else. If you have TOP 5 ... ORDER BY FirstName in a subquery or view, you will only get names like Aaron and Abby, but unless there is an ORDER BY in the outer query as well, you can still get them in any order. They usually stay in alphabetical order, but this is not guaranteed. For the FOR XML construct, the result is just a single row, so ordering no longer applies at all.
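To illustrate with a minimal sketch (the names dbo.People and FirstName are hypothetical):

-- The inner ORDER BY guarantees WHICH five rows qualify for the TOP...
SELECT p.FirstName
FROM (SELECT TOP (5) FirstName
      FROM dbo.People
      ORDER BY FirstName) AS p;
-- ...but not the order in which the outer query returns them. To guarantee
-- the output order, repeat the sort in the outer query: ORDER BY p.FirstName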
2: Concatenation by variable assignment (SELECT @x = @x + Column FROM ...), either with or without ORDER BY. I once saw a website with a number of examples that were reproducible and that showed this technique to be unreliable. Unfortunately, I lost the URL and have never been able to find it again. And frankly, I have no idea whether any of those examples would still reproduce on newer versions (SQL Server 2005 or later - yes, it is THAT long ago that I found that site).
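For reference, the pattern in question looks like this (a sketch with hypothetical names dbo.SomeTable and SomeColumn; as described above, the behaviour is undefined):

DECLARE @x VARCHAR(MAX);
SET @x = '';

SELECT @x = @x + st.SomeColumn + ','
FROM dbo.SomeTable st
ORDER BY st.SomeColumn; -- NOT guaranteed to control the concatenation order

SELECT @x;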
There is a very weird article in the Knowledge Base: http://support.microsoft.com/kb/287515. It first says that "the correct behaviour for an aggregate concatenation query is undefined" (which, as far as I know, is still true). However, further down it then provides a workaround "to achieve the expected results from an aggregate concatenation query", without bothering to specify what the "expected results" of this undefined behaviour should be.
The only guaranteed methods I know to correctly concatenate strings are:
1: FOR XML
2: Procedural code (i.e. looping over the result set)
3: Creating your own CLR user-defined aggregate function. When this feature was new, there was no option to preserve order in CLR user-defined aggregates (there was syntax for it, but it had no effect). I have no idea if this has changed since; I think it has not in SQL Server 2008, but I stopped paying attention after that.
If you use ORDER BY and FOR XML at the same level of the query, you can rely on the results being added to the XML result in the desired order. If you use ORDER BY at a different subquery-level, it's anyone's guess.
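As a minimal sketch of the FOR XML method (again with the hypothetical names dbo.SomeTable and SomeColumn), with the ORDER BY at the same level as the FOR XML:

SELECT STUFF((SELECT ',' + st.SomeColumn
              FROM dbo.SomeTable st
              ORDER BY st.SomeColumn -- same level as the FOR XML, so the order is honored
              FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)')
            , 1, 1, '') AS ConcatenatedList;

The TYPE directive plus the .value() call also undoes the XML entitization of characters such as & and <.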
June 13, 2013 at 10:23 am
I have a quick question, in the original post the final output is row by row. How do you get it into columns using id? Thanks.
Disregard. I figured it out; I needed to key off the tally value! What an amazingly quick solution! Thanks again.
June 13, 2013 at 10:21 pm
rayoftennessee (6/13/2013)
I have a quick question, in the original post the final output is row by row. How do you get it into columns using id? Thanks. Disregard. I figured it out; I needed to key off the tally value! What an amazingly quick solution! Thanks again.
Hi Ray,
Thanks for stopping by and the thoughtful comment.
Do you have a coded solution that you'd like to share for the problem you mentioned above? It might help someone else out with a similar problem.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 17, 2013 at 3:39 am
*uuh*
Hi, Jeff,
my thanks to you for your great articles here on SSC as well as your comments outside.
And it was good of you to address the question about the use of recursion in detail: I learned a lot today!
Best regards
JP
________________________________________________________
If you set out to do something, something else must be done first.
January 2, 2014 at 5:37 pm
Hi Jeff,
I apologize if you've addressed this elsewhere, but I was sort of curious whether you'd done any testing of tally/inline tally tables vs. sequences in SQL Server 2012?
http://technet.microsoft.com/en-us/library/ff878091.aspx
I'd be interested to see them thrown into the mix.
Thanks,
Erik
January 2, 2014 at 5:43 pm
I'm not sure that sequences work the same way as a Tally table. It seems that their purposes are different.
Could you explain how you would use a sequence instead of a tally table?
January 2, 2014 at 6:18 pm
Hi Luis,
I recently saw a PostgreSQL example where a series of dates was generated that reminded me a bit of tally table code. That's what got me curious about sequences in MS SQL. I'm not entirely sure how I'd use it here, but wanted to ask in case one of you vastly smarter people had already thought of something 🙂
http://pgexercises.com/questions/aggregates/rollingavg.html
select dategen.date,
(
-- correlated subquery that, for each day fed into it,
-- finds the average revenue for the last 15 days
select sum(case
when memid = 0 then slots * facs.guestcost
else slots * membercost
end) as rev
from cd.bookings bks
inner join cd.facilities facs
on bks.facid = facs.facid
where bks.starttime > dategen.date - interval '14 days'
and bks.starttime < dategen.date + interval '1 day'
)/15 as revenue
from
(
-- generates a list of days in august
select cast(generate_series(timestamp '2012-08-01',
'2012-08-31','1 day') as date) as date
) as dategen
order by dategen.date;
January 3, 2014 at 8:18 am
Thanks for stopping by, Erik.
From what I understand, the SEQUENCEs of 2012 aren't the same as the "generate_series" functionality of PostgreSQL. SEQUENCEs still rely on rows being present, whereas "generate_series" generates a row set itself. Not quite the same thing.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2014 at 8:36 am
Generate_Series sounds like a great function, but it's not the same as Sequences in SQL Server (or Oracle FWIW).
Sequences are used to fetch values, but they keep track of the last value used so that the next use of the sequence returns the following value. Sequences are similar to identity column values, but they're used differently and are independent objects.
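A minimal sketch (the sequence name dbo.DemoSeq is hypothetical):

CREATE SEQUENCE dbo.DemoSeq AS int START WITH 1 INCREMENT BY 1;

SELECT NEXT VALUE FOR dbo.DemoSeq; -- returns 1
SELECT NEXT VALUE FOR dbo.DemoSeq; -- returns 2; the sequence remembered the last value used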
For something similar to generate_series, some people have constructed iTVFs (inline table-valued functions) that build tally or calendar tables on the fly.
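For example, here's a self-contained sketch that generates the days of August 2012 (similar to the generate_series example above) from a small cascading-CTE tally, with no permanent objects assumed:

WITH E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(N)) -- 10 rows
    ,E2(N) AS (SELECT 1 FROM E1 a, E1 b) -- 100 rows, plenty for one month
SELECT TOP (31)
       TheDate = DATEADD(dd, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, '20120801')
FROM E2
ORDER BY TheDate;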
January 3, 2014 at 9:00 am
Yeah, I saw the limitations last night after I had a chance to mess with it. I was hoping there would be a quick way to use the sequence values as positions in data or something. Ah well.
Thanks for the replies
January 3, 2014 at 9:40 am
Disney uses an AS/400-based PMS which uses sequences on every record. I never started a conversation as to its purpose, but I'm assuming that it provides a means of (almost) guaranteeing that each transaction record (regardless of table) is (almost) uniquely identified. The sequence reset at 10,000,000,000. Depending on the retention level, the same sequence number could appear more than once at the same time in the db regardless of table, but the probability of the same sequence number occurring more than once in the same table was minuscule. If memory serves, the primary key was composed of the transaction date plus the sequence number. I can't remember if the sequence was stored separately or combined with the trans date. It's only been 13 years.
July 4, 2014 at 5:59 am
Thank you for this article.
I'm very new to the SQL Server field and it was very useful for me, but I have one question which isn't clear enough for me:
Suppose I want to use a "Tally table" in my procedure, and my co-worker is planning to use a "Tally table" in his procedure.
Do we have to use a different Tally table for each procedure?
If so, I think there will be many different "Tally tables" in our database (one Tally table for each procedure).
If we can use only one Tally table for all procedures, then how do we fill this table with sequential data? I need 1,000 rows; somebody else could decide that's too much and decrease the amount...
Please clear this up for me...
July 4, 2014 at 8:03 am
You don't need a different tally table for each procedure. You only need one and you'll just determine the range when you query it.
You could also create a view instead of a physical tally table:
CREATE VIEW vTally
WITH SCHEMABINDING
AS
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b) --10E+4 or 10,000 rows max
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) n
FROM E4
Or a table-valued function:
CREATE FUNCTION fTally(@From int, @To int)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b) --10E+4 or 10,000 rows max
SELECT TOP( @To - @From + 1) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) + @From - 1 n
FROM E4
Both run with 0 reads because the rows are calculated on the fly.
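Usage sketches for both (assuming they were created in the default dbo schema):

SELECT n FROM dbo.vTally WHERE n <= 100; -- the first 100 numbers from the view
SELECT n FROM dbo.fTally(5, 15);         -- the numbers 5 through 15 from the function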
July 4, 2014 at 9:35 am
Personally, I'd much prefer the iTVF (inline Table Valued Function) over the view because it runs as fast as the view and makes it much easier to control the range of numbers returned.
Shifting gears a bit, I don't try to calculate ranges that start with anything other than 0 or 1 in my Tally function because it adds a bit (small but still present) of unnecessary overhead. Most of the work I do starts at 0 or 1 and, when dealing with large ranges and large amounts of data, every bit helps even at the speed of today's machines. If I need a range to be offset, I do that calculation externally. I also build non-whole-number and non-sequential sequences externally from just a base 0-or-1 sequence, using whatever function works best for me.
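For example, a sketch of doing the offset externally (using the fnTally function posted in the next reply):

SELECT OffsetN = t.N + 99 -- shifts 1 through 100 to 100 through 199 outside the function
FROM dbo.fnTally(1, 100) t;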
--Jeff Moden
Change is inevitable... Change for the better is not.
July 4, 2014 at 1:07 pm
my2ndbox (7/4/2014)
Thank you for this article. I'm very new to the SQL Server field and it was very useful for me, but I have one question which isn't clear enough for me:
Suppose I want to use a "Tally table" in my procedure, and my co-worker is planning to use a "Tally table" in his procedure.
Do we have to use a different Tally table for each procedure?
If so, I think there will be many different "Tally tables" in our database (one Tally table for each procedure).
If we can use only one Tally table for all procedures, then how do we fill this table with sequential data? I need 1,000 rows; somebody else could decide that's too much and decrease the amount...
Please clear this up for me...
You only need one Tally Table or one Tally function such as that which Luis posted. I use both.
My Tally Table is set up for 11,000 rows because I do a fair bit of work with mortgages; 11,000 rows is a number just a little larger than 30 years' worth of days. The article explains how to build one. Make sure that you don't forget the unique clustered index in the form of a PK.
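A sketch along the lines of the article's approach (the object names here are conventional rather than quoted from the article):

--===== Build 11,000 sequential numbers from a cross join of a large system view.
SELECT TOP 11000
        IDENTITY(INT,1,1) AS N
   INTO dbo.Tally
   FROM master.sys.all_columns ac1
  CROSS JOIN master.sys.all_columns ac2;

--===== Add the unique clustered index in the form of a PK.
ALTER TABLE dbo.Tally
    ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N);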
The Tally function that I use looks like this. It will allow a starting value of either 0 or 1 with virtually no overhead for making that decision.
/****** Object: UserDefinedFunction [dbo].[fnTally] Script Date: 07/04/2014 15:02:49 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[fnTally]
/**********************************************************************************************************************
Purpose:
Return a column of BIGINTs from @ZeroOrOne up to and including @MaxN with a max value of 1 Billion.
As a performance note, it takes about 00:02:10 (hh:mm:ss) to generate 1 Billion numbers to a throw-away variable.
Usage:
--===== Syntax example (Returns BIGINT)
SELECT t.N
FROM dbo.fnTally(@ZeroOrOne, @MaxN) t
;
Notes:
1. Based on Itzik Ben-Gan's cascading CTE (cCTE) method for creating a "readless" Tally Table source of BIGINTs.
Refer to the following URL for how it works. http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers
2. To start a sequence at 0, @ZeroOrOne must be 0 or NULL. Any other value that's convertible to the BIT data-type
will cause the sequence to start at 1.
3. If @ZeroOrOne = 1 and @MaxN = 0, no rows will be returned.
4. If @MaxN is negative or NULL, a "TOP" error will be returned.
5. @MaxN must be a positive number >= the value of @ZeroOrOne, up to and including 1 Billion. If a larger
number is used, the function will silently truncate after 1 Billion. If you actually need a sequence with
that many values, you should consider using a different tool. ;-)
6. There will be a substantial reduction in performance if "N" is sorted in descending order. If a descending
sort is required, use code similar to the following. Performance will decrease by about 27% but it's still
very fast, especially compared with just doing a simple descending sort on "N", which is about 20 times slower.
If @ZeroOrOne is 0 in this case, remove the "+1" from the code.
DECLARE @MaxN BIGINT;
SELECT @MaxN = 1000;
SELECT DescendingN = @MaxN-N+1
FROM dbo.fnTally(1,@MaxN);
7. There is no performance penalty for sorting "N" in ascending order because the output is explicitly sorted by
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
Revision History:
Rev 00 - Unknown - Jeff Moden
- Initial creation with error handling for @MaxN.
Rev 01 - 09 Feb 2013 - Jeff Moden
- Modified to start at 0 or 1.
Rev 02 - 16 May 2013 - Jeff Moden
- Removed error handling for @MaxN because of exceptional cases.
**********************************************************************************************************************/
(@ZeroOrOne BIT, @MaxN BIGINT)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN WITH
E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) --10E1 or 10 rows
, E3(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c) --10E3 or 1 Thousand rows
, E9(N) AS (SELECT 1 FROM E3 a, E3 b, E3 c) --10E9 or 1 Billion rows
SELECT N = 0 WHERE ISNULL(@ZeroOrOne,0)= 0 --Conditionally start at 0.
UNION ALL
SELECT TOP(@MaxN) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E9 -- Values from 1 to @MaxN
;
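A couple of quick sanity checks for the function above (assuming it was created in the dbo schema as shown):

SELECT N FROM dbo.fnTally(1, 10); -- returns 1 through 10
SELECT N FROM dbo.fnTally(0, 10); -- returns 0 through 10 (11 rows)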
--Jeff Moden
Change is inevitable... Change for the better is not.