May 7, 2008 at 9:55 pm
Carl Federl (5/7/2008)
From the article, you wrote " Tally table .. starting at 0 or 1 (mine start at 1)". There are advantages to having the Tally table include 0, and the "Freight by Date" problem becomes easier if 0 is included.
Yep... I absolutely agree. It depends on what most of your work with the Tally table will be. If most of that work is with dates, then the zero-based Tally table is the better way to go. Most of my work doesn't involve dates, so I use a unit-based (starting at 1) Tally table... keeps me from having to write AND t.N BETWEEN 1 AND...
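For example (just a sketch, assuming a Tally table dbo.Tally(N) that starts at 0 and made-up dates), a zero-based table lets you list every date in a range directly, so the start date itself comes from N = 0:
--===== Sketch: every date in a range from a zero-based Tally table
DECLARE @StartDate DATETIME,
        @EndDate   DATETIME
 SELECT @StartDate = '2008-01-01',
        @EndDate   = '2008-01-31'

 SELECT DATEADD(dd, t.N, @StartDate) AS TheDate
   FROM dbo.Tally t
  WHERE t.N <= DATEDIFF(dd, @StartDate, @EndDate)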
Thanks for posting all the code, Carl. Lots of folks talk about alternatives... when folks like you, GSquared, and all the others offer up code, it really adds to the discussion. Even when some folks offer up code that doesn't quite sit right, it shows that they're thinking and others jump in and think with them... and it helps everyone.
Thanks again! 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
May 7, 2008 at 9:59 pm
GSquared (5/7/2008)
Hugo: Just SP is fine (though I personally dislike the habit - as if anything other than a stored procedure might follow an EXEC keyword). It's SP_ (with an underscore directly following the SP) that turns it into the special prefix reserved for system stored procedures.
Actually, you can execute scalar UDFs.
create function ExecTest()
returns int
as
begin
return (1)
end
go
declare @a int
exec @a = dbo.exectest
select @a
Will select 1
Interesting sidebar, there... you beat me to it, too! Thanks Gus!
--Jeff Moden
Change is inevitable... Change for the better is not.
May 7, 2008 at 10:15 pm
GSquared (5/7/2008)
I ran some speed and load tests on this code. Summary: While the CTE works, and is fast by any normal standard, the Numbers version is even faster, and requires less IO.
Note: All tests run 5 or more times on an isolated box running no concurrent queries.
Awesome as usual, Gus. Do me a favor... run the following on that same box, please... let us know how it turns out... Thanks.
--===== Gus' original test parameter
DECLARE @params varchar(8000), @Res varchar(10)
--SET @params = '1,2,3,4,5,6,7,8,9,10';
select @params = coalesce(@params + ',' + cast(number as varchar(10)),
cast(number as varchar(10)))
from dbo.numbers
where number between 1 and 1820
DECLARE @Top INT
SET @Top = LEN(@Params)-1
;WITH
cteTally AS
(--==== "Modenized" CTE Tally table
SELECT TOP(@Top) ROW_NUMBER() OVER(ORDER BY sc1.Object_ID) AS N
FROM Master.sys.All_Columns sc1,
Master.sys.All_Columns sc2
)
--Gus' original code with a tweak
SELECT @res =
SUBSTRING(@params+',', N,
CHARINDEX(',', @params+',', N) - N) --as Parsed
FROM cteTally
WHERE SUBSTRING(',' + @params, N, 1) = ','
--Jeff Moden
Change is inevitable... Change for the better is not.
May 7, 2008 at 10:26 pm
JohnG (5/7/2008)
As always, a great article from Jeff. As for performance testing of various techniques by anyone, I am continually amazed as to the incorrect methods for generating sample data.
Example (taken from this thread)
SET @params =
'1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
For the use of a tally table or other mechanism, populating a variable with sample data like the above is completely wrong! Those delimited values would never be passed by an application. The proper approach for generating sample data is to use RANDOM values.
Thanks for the compliment John...
Setting up the @params the way they did is good for "discovery" testing because each parameter element IS numbered sequentially and is very easy to troubleshoot. But, I agree, you have to do some random testing and it should be a stress test, as well. Here's some code that'll do both... ;)
DECLARE @params VARCHAR(MAX)
SELECT TOP 10000
@params = COALESCE(@Params+',','') + CAST(NEWID() AS VARCHAR(MAX))
FROM Master.sys.All_Columns sc1,
Master.sys.All_Columns sc2
SELECT LEN(@Params), @params
Heh... 369,999 characters across 10,000 totally random 36 character parameters should be just what the doctor ordered. :hehe:
--Jeff Moden
Change is inevitable... Change for the better is not.
May 7, 2008 at 10:30 pm
GSquared (5/7/2008)
Excellent article, Jeff. Good introduction. I've been using Numbers tables for a couple of years now, and I just keep finding more uses for them all the time.
Examples include: String and Column parsing, Dates/Times lists, generating test data, cleaning up strings, random number generation, finding missing rows in ID columns
Thanks for the compliment, Gus. I gotta say it again, I thank folks like you and Carl for taking the time to get involved and post code in these discussions. Really gets people thinking.
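For instance, the "finding missing rows in ID columns" item on that list might look something like this (just a sketch... dbo.SomeTable and its ID column are placeholders, and dbo.Numbers is assumed to cover the range):
--===== Sketch: find the gaps in an ID column with a Numbers table
 SELECT n.Number AS MissingID
   FROM dbo.Numbers n
  WHERE n.Number <= (SELECT MAX(ID) FROM dbo.SomeTable)
    AND NOT EXISTS (SELECT 1 FROM dbo.SomeTable s WHERE s.ID = n.Number)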
--Jeff Moden
Change is inevitable... Change for the better is not.
May 7, 2008 at 10:33 pm
Hugo Kornelis (5/7/2008)
buktseat (5/7/2008)
This is probably a ridiculous question, but how can any of your "using the tally table" examples work if you don't declare N? I guess I just don't see how N is getting created/assigned? Thanks!
N is a column (or rather: the column) in the Tally table.
Hugo... thanks for answering so many questions some folks had on this discussion. I really appreciate it.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 7, 2008 at 10:54 pm
RyanRandall (5/7/2008)
Mark (5/7/2008)
Jeff Marcus (5/7/2008)
If you are using SQL Server 2005 you can use ROW_NUMBER instead.
If you're not using SQL 2005, I guess you could simulate ROW_NUMBER like this...
declare @param varchar(8000)
set @param = '101,2008.04.02,1.1,102,2008.04.03,2.2,103,2008.04.04,3.3'
select
max(case when RowNumber % 3 = 0 then v end) as val,
max(case when RowNumber % 3 = 1 then v end) as ord,
max(case when RowNumber % 3 = 2 then v end) as fval
from (
select
len(left(@param, N)) - len(replace(left(@param, N), ',', '')) as RowNumber,
substring(@param, N, charindex(',',','+@param+',',N+1)-N-1) v
from dbo.Tally where N <= len(@param) and substring(','+@param+',',N,1)=',') a
group by RowNumber / 3
/* Results
val ord fval
--------------- ----------------- -----------------
101 2008.04.02 1.1
102 2008.04.03 2.2
103 2008.04.04 3.3
*/
... but at a terrible cost in performance... the use of a Temp table makes it more than 100 times faster...
SET NOCOUNT ON
--===== Make an expanded parameter based on Ryan's
declare @param varchar(8000)
SELECT @param = COALESCE(@param+',','')+'101,2008.04.02,1.1,102,2008.04.03,2.2,103,2008.04.04,3.3'
FROM Tally WHERE N <= 140 --Makes 140 copies of the above in a single parameter
--===== Run Ryan's "tableless" code with a timer
SET STATISTICS TIME ON
select
max(case when RowNumber % 3 = 0 then v end) as val,
max(case when RowNumber % 3 = 1 then v end) as ord,
max(case when RowNumber % 3 = 2 then v end) as fval
from (
select
len(left(@param, N)) - len(replace(left(@param, N), ',', '')) as RowNumber,
substring(@param, N, charindex(',',','+@param+',',N+1)-N-1) v
from dbo.Tally where N <= len(@param) and substring(','+@param+',',N,1)=',') a
group by RowNumber / 3
SET STATISTICS TIME OFF
PRINT REPLICATE('=',100)
--===== Run the temp table code with a timer (uses lots of Ryan's code)
SET STATISTICS TIME ON
SELECT RowNumber = IDENTITY(INT,0,1),
substring(@param, N, charindex(',',','+@param+',',N+1)-N-1) v
INTO #MyHead
FROM dbo.Tally WHERE N <= len(@param) AND substring(','+@param+',',N,1)=','
SELECT max(case when RowNumber % 3 = 0 then v end) as val,
max(case when RowNumber % 3 = 1 then v end) as ord,
max(case when RowNumber % 3 = 2 then v end) as fval
FROM #MyHead
GROUP BY RowNumber / 3
SET STATISTICS TIME OFF
DROP TABLE #MyHead
No... not picking on Ryan... he was just showing a way around using the Temp table... I'm not sure why folks try to avoid them so much. Look at all the Microsoft procs... they use Temp tables just like this...
--Jeff Moden
Change is inevitable... Change for the better is not.
May 7, 2008 at 11:06 pm
Steve Rosenbach (5/7/2008)
Jeff, once again, you are my hero!! Beautifully-written article about such a useful topic.
I've used a "dates" table for quite a while, and actually have a tally table in the current database that I'm working with, but really didn't understand how to use it until your article.
The "dates" table has been a great help. It's amazing how such seemingly simple things, like these auxilliary tables, can be so useful.
Best regards,
SteveR
Steve, thanks for the great compliment. I really had some hesitation in writing the article because I thought some of the "old dogs" would get after me for writing about something they already knew. Thought they'd hit me with the "why did you state the obvious" bat. 😛 Now that I see the posts in this discussion, I'm tickled to have been able to help those that didn't know. And, the "old dogs" jumped in and helped a lot with fielding questions!!! Lordy, I love this forum!
Glad to have helped and thanks again, Steve. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
May 7, 2008 at 11:14 pm
Wayne West (5/7/2008)
Excellent article, Jeff! I came across an interesting date table article on Tech Republic that was linked here yesterday after it re-surfaced at BuilderAU.Com.au. I updated the loader to run from a tally table and the speed difference was just amazing. I have not yet gotten into string parsing with a tally table, but I'll get there eventually. I did have an amusing little time re-writing the Bizz/Buzz test using it:
[font="Courier New"]select case when (Number % 15 = 0) then 'BizzBuzz'
when (Number % 3 = 0) then 'Bizz'
when (Number % 5 = 0) then 'Buzz'
else cast(Number as char)
end
from zNumbers
where Number < 100[/font]
Keep up the good work, Jeff!
Heh! Oh No! Not the "SQL Test", again! 😛 I started a heck of a fight when Grant first posted the test... someone had written a loop to solve it and they didn't have any comments, etc, and I started ranting about how all code, even the simple stuff, should be documented and always written with performance and scalability in mind. I even wrote an example using the Tally table similar to the one you posted above. It turned out to be one huge thread! It's still going, too! I get a new email about someone posting to it every once in a while.
Anyway... yeah, Tally tables make a huge difference in performance compared to most any form of loop (I did recently build one exception, for Proper Casing, but it's a fluke) and will blow the doors off of most forms of recursive CTEs.
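For comparison, the kind of recursive CTE counter that a Tally table usually beats looks roughly like this (just a sketch, not anyone's posted code):
--===== Sketch: counting with a recursive CTE (one row per level of recursion)
;WITH cteCounter AS
(
 SELECT 1 AS N
  UNION ALL
 SELECT N + 1 FROM cteCounter WHERE N < 10000
)
 SELECT N FROM cteCounter
 OPTION (MAXRECURSION 0) --The default limit is 100 levels of recursion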
Thanks for the feedback and the great compliment! 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
May 8, 2008 at 12:05 am
Eric Wilson (5/7/2008)
Great article. I used an "Integers" table for years, but starting with 2005 I found that CLR functions that stream out integers are better (faster and take virtually no space in the DB).
Or here's a pure T-SQL solution instead of having a table:
http://www.sqlservercentral.com/articles/Datetime+Manipulation/61822/
Cheers.
Thanks for the compliment and the feedback, Eric!
In regards to the article, check this out...
--===== Common presets
SET NOCOUNT ON
DECLARE @Bitbucket INT
DECLARE @MyCount INT
SET @MyCount = 1000000
SET STATISTICS TIME ON
--===== The "other" method
;WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows
num AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS N FROM L5)
SELECT @Bitbucket = N FROM NUM WHERE N <= @MyCount
--===== Simple available cross-join method
;WITH
cteTally AS
(
SELECT TOP (@MyCount)
N = ROW_NUMBER() OVER(ORDER BY sc1.Object_ID)
FROM sys.all_columns sc1,
sys.all_columns sc2
)
SELECT @Bitbucket = N FROM cteTally
SET STATISTICS TIME OFF
The other thing is, most things only need up to 11K numbers or so, and an 11K-row Tally table only takes about 200K bytes including the clustered index.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 8, 2008 at 12:18 am
Kevin Kennedy (5/7/2008)
Really enjoyed the article. However... on the parameter split, if the parameter list happens to be exactly 8000 characters, the SUBSTRING in the SELECT will fail. For example:
DECLARE @Parameter VARCHAR(8000)
SET @Parameter = REPLICATE('1,2,3,4,5,6,7,8,9,10',1000)
will produce @Parameter with a length of 8000 and cause the error. I got around this by checking the length and, if it was 8000, using a SELECT with a CASE statement looking for zero to be returned by CHARINDEX.
Correct... I didn't include any error checking code in my examples... didn't want to detract from what I was trying to show and I'd prefer it to "blow up" rather than returning something like a null. In fact, it'll blow up at 7999 characters because 2 commas are added (you can actually write a split to only add 1 comma or even to handle all 8000 characters, but again, that would have detracted from what I was trying to show).
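For example, a split that only adds the 1 leading comma and handles the last element with a CASE might look something like this (just a sketch, not the article's code):
--===== Sketch: split that adds only 1 leading comma, so an exactly-8000-character
     -- parameter doesn't overflow VARCHAR(8000)
DECLARE @Parameter VARCHAR(8000)
    SET @Parameter = REPLICATE('1,2,3,4,5,6,7,8,9,10',1000) --Exactly 8000 characters

 SELECT SUBSTRING(@Parameter, t.N,
            CASE CHARINDEX(',', @Parameter, t.N)
                 WHEN 0 THEN LEN(@Parameter) - t.N + 1       --Last element: take the rest
                 ELSE CHARINDEX(',', @Parameter, t.N) - t.N  --Otherwise, stop at the next comma
            END) AS Parsed
   FROM dbo.Tally t
  WHERE t.N <= LEN(@Parameter)
    AND SUBSTRING(',' + @Parameter, t.N, 1) = ',' --Only one comma is added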
That notwithstanding, thanks for the compliment and the feedback, both! 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
May 8, 2008 at 12:23 am
Matt Miller (5/7/2008)
What - only 12 useful tips? In a single article? :hehe: Stop it now - you're setting the bar too high for the rest of us....
(superlative as usual).....
Ah, my ol' testing buddy! Glad you could make it! Yeah, I know... I'm slippin' in my old age 😛
Thanks for the compliment, Matt.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 8, 2008 at 12:27 am
Well, folks... I guess that's just about it for tonight. I'd also like to say thanks to...
Christopher Ford
David McKinney
humbleDBA
Cory Ellingson
Manie Verster
curbina
tbeadle
JJ B
... for the wonderful comments. All of you have made it an absolute joy to write and be a member of this great forum! Thanks again! :):):):)
--Jeff Moden
Change is inevitable... Change for the better is not.
May 8, 2008 at 12:37 am
Hi Jeff,
My preference is actually the CLR function I coded up a while back. Its range is the (-) to (+) limit of the BIGINT type and it takes no room.
I have used the Integer Domain table/function (which is what it really is) for many purposes, including ones that span hundreds of thousands of numbers. And the CLR's wide range means I never have to do funky offset math, which keeps the code a lot simpler.
(My function has one other tiny advantage: pass it (10,5) and it gives a descending list instead of an ascending one. Sometimes helpful. Although in a truly relational system order shouldn't matter, sometimes knowing it helps.)
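In T-SQL terms, the interface looks roughly like this (just a sketch with a made-up name, dbo.IntegerRange; it's not the CLR code itself, and a real Tally table caps the range where the CLR doesn't):
--===== Sketch: same (From, To) interface in T-SQL, ascending or descending
     -- Assumes dbo.Tally(N) starting at 1; an ORDER BY is still needed in the
     -- calling query to guarantee the order of the result
CREATE FUNCTION dbo.IntegerRange (@From BIGINT, @To BIGINT)
RETURNS TABLE
AS
RETURN
(
 SELECT CASE WHEN @From <= @To
             THEN @From + t.N - 1 --Ascending
             ELSE @From - t.N + 1 --Descending
        END AS Number
   FROM dbo.Tally t
  WHERE t.N <= ABS(@To - @From) + 1
)
GO
--Example: SELECT Number FROM dbo.IntegerRange(10, 5) returns 10 through 5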
Cheers,
--Eric
May 8, 2008 at 12:41 am
Cool... thanks for the feedback, Eric. Sounds like a great CLR! Any chance of twisting your arm into posting the source code for it?
Also... what did you use the hundreds of thousands of integers for?
--Jeff Moden
Change is inevitable... Change for the better is not.