May 23, 2007 at 7:57 am
My old laptop at work with 500 MB of RAM that will be replaced this week took 9 seconds. My newer laptop at home took less than 1 second. Actually, the time is just for displaying the numbers.
Aunt Kathi Data Platform MVP
Author of Expert T-SQL Window Functions
Simple-Talk Editor
May 23, 2007 at 8:17 am
If you have a subscription for SQL Server Magazine there is an article by Ben-Gan:
http://www.sqlmag.com/Article/ArticleID/94376/sql_server_94376.html
where he compares several different methods of creating a sequence of integers placed in a table and the "multiplicative" method mentioned above seems to be the fastest.
May 23, 2007 at 2:32 pm
Nice method by Ben-Gan! He "wins", 'cause I just bought his book.
You could do this as a recursive CTE; that would be cool.
http://msdn2.microsoft.com/en-us/library/ms186243.aspx
Signature is NULL
May 23, 2007 at 6:47 pm
Than is impressive. Thanks, Paul.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 17, 2007 at 6:33 pm
Not sure why I came back to this article, but I have to take back some of what I said about the code that Kathi found in Itzik Ben-Gan's book...
WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows
num AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS N FROM L5)
SELECT N FROM NUM WHERE N <= 1000000;
From what I can see, the general concensus is that the code above takes about 7-9 seconds to run on most machines... it's also a compartively lot of code to "remember"...
The basis of the code I use to generate Tally tables and large volumes of test data is much shorter and only takes a little over 2 seconds to do the same thing that Ben-Gan's code does (1,000,000 rows of sequential numbers)... I suppose something similar could be done in a CTE with Row_Number over without all the individual CTE's in the above code, but the Tally table method is almost 4 times faster...
--===== Create and populate the test table on the fly
-- (Takes just over 2 seconds to populate on my slow 6 year old computer)
SELECT TOP 1000000
IDENTITY(INT,1,1) AS Number
INTO dbo.TestTable
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
Matt Miller and I have been doing some extreme testing of some of the new functionality available in 2k5 and above... although it's highly convenient, it's not always better or faster (in fact, it's usually neither). Choose your "weapons" wisely 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
March 31, 2008 at 7:22 am
Daniel Wilson (5/22/2007)
thanks for the intro to CTE's! I know I've written the same SELECT statement into 2 parts of a query before. No more ... ah ... once we get all our customers to switch to 2005!
Temp tables make for an effective 2000 workaround. I don't yet have 2005 (looking forward to October), so I haven't gotten to use CTEs yet.
Advantages of temp tables (relative to CTE):
Persists until the end of the calling context, useful if you need to use the same information for more than one query
Can be indexed separately from the source table
Disadvantages of temp tables:
Does not inherit indices from source table(s)
Persists until the end of the calling context, so you may need to remember to drop it when you're done with it. Memory resident longer than strictly necessary in some circumstances.
Edited for clarity
March 31, 2008 at 7:39 pm
srienstr (3/31/2008)
Daniel Wilson (5/22/2007)
thanks for the intro to CTE's! I know I've written the same SELECT statement into 2 parts of a query before. No more ... ah ... once we get all our customers to switch to 2005!Temp tables make for an effective 2000 workaround. I don't yet have 2005 (looking forward to October), so I haven't gotten to use CTEs yet.
Advantages:
Persists, useful if you need to use the same information for more than one query
Can be indexed separately from the source table
Disadvantages:
Does not inherit indices from source table(s)
Persists, so you need to remember to drop it when you're done with it. Memory resident longer than strictly necessary in some circumstances.
Hard to tell whether your Advantages/Disadvantages are about Temp Tables or CTE's. I think you're talking about temp tables. If so, I absolutely agree with the advantages and the first disadvantage.
On the "Persists" disadvantage, I kinda of agree if the sproc is big and the Temp table is used only at the beginning. I just want everyone to be clear that Temp tables will automatically drop when the proc that created it ends or the session that created it disonnects.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 1, 2008 at 6:57 am
Jeff Moden (3/31/2008)
On the "Persists" disadvantage, I kinda of agree if the sproc is big and the Temp table is used only at the beginning. I just want everyone to be clear that Temp tables will automatically drop when the proc that created it ends or the session that created it disonnects.
True, but it could build up quite a bit if someone builds the table in QA and uses the same QA window throughout the day. I agree it's not nearly so much of an issue once the process is made into a sproc.
April 1, 2008 at 7:41 am
srienstr (4/1/2008)
Jeff Moden (3/31/2008)
On the "Persists" disadvantage, I kinda of agree if the sproc is big and the Temp table is used only at the beginning. I just want everyone to be clear that Temp tables will automatically drop when the proc that created it ends or the session that created it disonnects.True, but it could build up quite a bit if someone builds the table in QA and uses the same QA window throughout the day. I agree it's not nearly so much of an issue once the process is made into a sproc.
Well - garbage collection has always sucked, in pretty much every setting and every language I've ever run into. I personally think the SQL Server does a reasonable job at it, but still - handle your own trash, an automated system can never be as aggressive as you or know precisely when it's okay to get rid of it. I personally try to never leave anything up to the automated system to clean up.
As to the advantages/disadvantages: it's actually rather easy to build CTE's that essentially "hide" temp tables (meaning - they materialize as worktables, etc...). don't fall for the CTE = no temp table. If the scenario calls for it, sometimes it's best to just bite the bullet and control the object you need to have: create the temp table, and have the ability to index it, alter it and drop it.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 12, 2008 at 4:49 am
srienstr (4/1/2008)
Jeff Moden (3/31/2008)
On the "Persists" disadvantage, I kinda of agree if the sproc is big and the Temp table is used only at the beginning. I just want everyone to be clear that Temp tables will automatically drop when the proc that created it ends or the session that created it disonnects.True, but it could build up quite a bit if someone builds the table in QA and uses the same QA window throughout the day. I agree it's not nearly so much of an issue once the process is made into a sproc.
That's a horse of a differnet color... humans have to remember to cleanup their own mess. Easy to do, to... just start a new window and close the other one.
Having data persist in a temp table during troubleshooting can be a huge advantage because you may not need to run ALL the code like you might with a CTE, Derived Table, or Table variable.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 21, 2009 at 10:11 am
Dear Jeff: Please pardon me if this seems to be a naive question - however I wanted to understand why you have chosen to use the Newid() function here when generating the random numbers? Couldn't we have just used Rand() without any parameters as such?
Saurabh Dwivedy
___________________________________________________________
My Blog: http://tinyurl.com/dwivedys
For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537
November 21, 2009 at 10:53 am
Saurabh Dwivedy (11/21/2009)
Dear Jeff: Please pardon me if this seems to be a naive question - however I wanted to understand why you have chosen to use the Newid() function here when generating the random numbers? Couldn't we have just used Rand() without any parameters as such?
Excellent question...
The problem with RAND() is that it's like GETDATE() in a query... it's only calculated once per query. Oh, it'll change every time you run the query, but it will return all the same number for all rows in the query.
The irony of it all is that the RAND() function requires a "random seed" in order to return different values in the same query. The only "random seed" available is NEWID().
Run this and see...
SELECT TOP 100
RAND() AS UnseededRAND,
RAND(CHECKSUM(NEWID())) AS SeededRAND
FROM Master.sys.All_Columns ac1
CROSS JOIN Master.sys.All_Columns ac2
RAND can't take NEWID() as a seed... it has to be converted first. I used to convert it to VARBINARY when I first started using it. Some folks smarter than me observed that CHECKSUM will convert to INT quite a bit faster than a converstion to BINARY or VARBINARY.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 21, 2009 at 11:01 am
Dear Jeff:
You are truly awesome. Heartfelt thank you for explaining it so succinctly and beautifully. I was able to understand every syllable of it.
Regards,
Saurabh.
Saurabh Dwivedy
___________________________________________________________
My Blog: http://tinyurl.com/dwivedys
For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537
November 21, 2009 at 11:33 am
Understanding is the key... thanks, Saurabh. :blush:
--Jeff Moden
Change is inevitable... Change for the better is not.
November 21, 2009 at 8:14 pm
Jeff Moden (11/21/2009)
Understanding is the key... thanks, Saurabh. :blush:
Actually, Jeff, in a query such as this:
select getdate(), getdate();
the getdate() function is actually called twice. Found this out during the duscussion regarding bitbuckets QotD a few days ago.
Viewing 15 posts - 16 through 30 (of 32 total)
You must be logged in to reply to this topic. Login to reply