January 28, 2009 at 8:01 am
smunson (1/28/2009)
Jeff,Can you supply the code necessary to generate the test data you used? I have a quad-core 2.83GHz machine with SQL 2005 DE SP2 that I'd like to test this scenario against, so I can start to learn how to generate test data, as well as start to see how to test these kinds of things. Thanks!
Steve
(aka smunson)
:):):)
You bet... it's my ol' standard million row test table generator... you've probably seen me use it dozens of times...
DROP TABLE JBMTest
GO
--===== Create and populate a 1,000,000 row test table.
-- Column "RowNum" has a range of 1 to 1,000,000 unique numbers
-- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
-- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
-- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers
-- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
-- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
-- for all rows.
-- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
-- Jeff Moden
SELECT TOP 1000000
RowNum = IDENTITY(INT,1,1),
SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,
SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),
SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
SomeHex12 = RIGHT(NEWID(),12)
INTO dbo.JBMTest
FROM Master.dbo.SysColumns t1
CROSS JOIN Master.dbo.SysColumns t2
--===== A table is not properly formed unless a Primary Key has been assigned
ALTER TABLE dbo.JBMTest
ADD PRIMARY KEY CLUSTERED (RowNum)
--Jeff Moden
Change is inevitable... Change for the better is not.
January 28, 2009 at 8:15 am
Jean-François Bergeron (1/28/2009)
Well, I got a simple 3.0Ghz Dev machine with 4GB of Ram. It's not that quick, trust me. And I ran the query on 200k rows that has around 30 characters in it.
That explains it... less than half the characters I used, almost twice the speed... explains why "it" ran 5 times faster on your machine.
I've been following your posts, articles, and everything made sense to actually remove those damn cursors or while loops, can you tell me why the loop beats the tally in this case? Do I have to test the cursor, and the tally everytime I develop a new solution, to see which one can be quicker?
First, did the code I provide actually get beat? On some machines, it'll win the foot race compared to the memory only solution. Well, except on Tuesdays. π
Also, let me try another thing before I try to answer that question.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 28, 2009 at 8:56 am
Jeff Moden (1/28/2009)
First, did the code I provide actually get beat? On some machines, it'll win the foot race compared to the memory only solution. Well, except on Tuesdays. π
Yes it did. It runs about as fast as the solution Garadin provided. Hope someone can explain why it behaves that way, because the query plan is not that explicit, it only says it does a constant scan, and I have no clue what it is.
Cheers,
J-F
January 28, 2009 at 9:50 am
Ok... the reason why the Tally table looses is because of the extra calculation I had to put into the STUFF function to make the Tally table appear to count down instead of up so it gets rid of bad characters at the end of the string first. This is important because it's set based and if characters disappear at the beginning first, things start mismatching because the join at the character level has already been done before the STUFF starts processing. And doing a DESCending Order By is a killer because it's the opposite of what the physical order of the Tally table has been put in by the clustered index.
The WHILE loop doesn't care about the order because it doesn't care exactly where a bad character is, order wise, because the next bad character position is recalculated after every STUFF.
Like I said, it's one of those very wierd situations where the order of the process for the Tally table needs to be the exact reverse of the clustered index order. Oddly enough, having a Tally table in reverse order according to the clustered index just doesn't help.
Now, the only way I'd put the faster (in this case) WHILE loop into production is if I documented the hell out of it explaining why it's better to use a Tally table in almost every other case.... I wouldn't want folks to get the impression that all memory only WHILE loops beat the Tally table (they usually don't).
Generally speaking, if the order doesn't matter or the order of the process must be ascending, the Tally table will smoke most other methods. It's only when the process MUST be in the reverse order of the Tally table that you'll to test both methods (Tally vs While). Even on something like the Luhn Mod-10 checksum which also requires processing from the right, a bit of SQL prestidigitation can overcome any performance hits...
CREATE FUNCTION dbo.fnCheckLuhn10
/**********************************************************************************************************************
The function accepts a credit card or other number either as a VARCHAR or an INT and returns a 1 if the numbers match
the LUHN 10 checksum specification and 0 if not.
The input number does NOT need to be digits only. Numbers like 1234-5678-9012-3456 or 1234 5678 9012 3456 are
acceptable.
Revision history:
Rev 00 - 03/08/2005 - Jeff Moden - Initial creation and test.
Rev 01 - 12/28/2006 - Jeff Moden - Performance enhancement using @Multiple thanks to Peter Larson
**********************************************************************************************************************/
--===== Declare I/O parameters
(
@pLuhn VARCHAR(8000)
)
RETURNS INT
AS
BEGIN
--=====================================================================================================================
--===== Declare local variables
DECLARE @CleanedLuhn VARCHAR(8000), --The Luhn number stripped of non-numeric characters
@DigitProduct INT, --The result of multiplying the digit times the multiplier
@Multiplier INT, --1 for odd position digits, 2 for even position digits
@Sum INT, --The Luhn 10 sum
@WorkLuhn VARCHAR(8000) --The clean Luhn number
--===== If present, remove all non-digit characters
IF PATINDEX('%[^0-9]%',@pLuhn) > 0 --If any non-digit characters exist, then...
SELECT @CleanedLuhn = ISNULL(@CleanedLuhn,'')
+ SUBSTRING(@pLuhn,t.N,1)
FROM dbo.Tally t --Contains a list of whole numbers
WHERE t.N <= LEN(@pLuhn) --Limits the join/set based "loop" to the length of the Luhn
AND SUBSTRING(@pLuhn,t.N,1) LIKE '[0-9]' --Non-digits are ignored, only 0-9 are included
--===== Presets
-- Note: Use the cleaned Luhn if it needed cleaning or the original Luhn if not
SELECT @Sum = 0,
@Multiplier = 1,
@WorkLuhn = ISNULL(@CleanedLuhn,@pLuhn)
--===== Calculate the Luhn 10 sum
SELECT @DigitProduct = @Multiplier --1 for odd numbers, 2 for even numbers
* SUBSTRING(@WorkLuhn, t.N, 1), --A given digit in the Luhn
@Sum = @Sum --Luhn 10 sum starts at 0
+ @DigitProduct / 10 --The 1st digit for products > 9, 0 for product < 10
+ @DigitProduct % 10, --The 2nd digit for products > 9 or only digit for product < 10
@Multiplier = 3 - @Multiplier --3-1=2, then 3-2=1, repeats
FROM dbo.Tally t WITH (NOLOCK) --Contains a list of whole numbers
WHERE t.N <= LEN(@WorkLuhn) --Limits the join/set based "loop" to the length of the cleaned Luhn
ORDER BY t.N DESC
--===== If the sum is evenly divisible by 10, then check is ok... return 1.
-- Otherwise, return 0 as "Failed" check
RETURN 1-SIGN(@SUM%10)
--=====================================================================================================================
END
--Jeff Moden
Change is inevitable... Change for the better is not.
January 28, 2009 at 9:50 am
Jeff Moden (1/27/2009)
noeld (1/27/2009)
I have found that the best solution for this is to use persisted calculated columns!Haven't tried those yet, Noel... can you give us a quick "how to" on something like this? I'd seriously like to know.
This is a quick example:
--- Jeff Table: Note that because initially you will take a "hit" while creating the column I just used 100,000.
--- Sorry Not a lot of time available today π
IF OBJECT_ID('JBMTest') IS NOT NULL
DROP TABLE JBMTest
GO
--===== Create and populate a 1,000,000 row test table.
-- Column "RowNum" has a range of 1 to 1,000,000 unique numbers
-- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
-- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
-- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers
-- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
-- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
-- for all rows.
-- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
-- Jeff Moden
SELECT TOP 100000
RowNum = IDENTITY(INT,1,1),
SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,
SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),
SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
SomeHex12 = RIGHT(NEWID(),12)
INTO dbo.JBMTest
FROM Master.dbo.SysColumns t1
CROSS JOIN Master.dbo.SysColumns t2
--===== A table is not properly formed unless a Primary Key has been assigned
ALTER TABLE dbo.JBMTest
ADD PRIMARY KEY CLUSTERED (RowNum)
--- The Function:
--- Note: you have to make it DETERMINISTIC!!
--- Which in this case it is!
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[fnGICSPFParseGICCatalog]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))
DROP FUNCTION [fnGICSPFParseGICCatalog]
GO
CREATE FUNCTION dbo.fnGICSPFParseGICCatalog
(@string VARCHAR(8000),
@TrimAll BIT)
RETURNS VARCHAR(8000)
WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT
AS
BEGIN
DECLARE @IncorrectCharLoc SMALLINT
IF @TrimAll = 1
BEGIN
SET @IncorrectCharLoc = PATINDEX('%[^0-9A-Za-z]%',@string)
WHILE @IncorrectCharLoc > 0
BEGIN
SET @string = STUFF(@string,@IncorrectCharLoc,1,'')
SET @IncorrectCharLoc = PATINDEX('%[^0-9A-Za-z]%',@string)
END
SET @string = @string
END
ELSE
BEGIN
SET @IncorrectCharLoc = PATINDEX('%[^-0-9A-Za-z&.$/]%',@string)
WHILE @IncorrectCharLoc > 0
BEGIN
SET @string = STUFF(@string,@IncorrectCharLoc,1,'')
SET @IncorrectCharLoc = PATINDEX('%[^-0-9A-Za-z&.$/]%',@string)
END
SET @string = @string
END
RETURN @string
END
GO
/SELECT OBJECTPROPERTY( OBJECT_ID('fnGICSPFParseGICCatalog'), 'IsDeterministic')
--- Yeah = 1
--Add the column:
ALTER TABLE JBMTest
ADD MyAlreadyPersistedCol_1 AS dbo.fnGICSPFParseGICCatalog (SomeCSV,1) PERSISTED;
Nowyou will be able to run your queries with "persisted" (already calculated values)
IF you change the source columns the persisted values will change too!
And what's more if you ADD an index on the calculated columns you can even search for them without taking the "hit"
Hope it helps!
* Noel
January 28, 2009 at 9:54 am
Your last couple of sentences put that all together very nicely! I was aware of being able to put indexes on calculated columns, but didn't know you could make them "persist" (almost like an indexed view). I learn something new every day. Thanks Noel.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 28, 2009 at 10:46 am
Jeff,
I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:
.
.
.
FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2
.
.
.
doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!
Steve
(aka smunson)
:):):)
Steve (aka sgmunson) π π π
Rent Servers for Income (picks and shovels strategy)
January 28, 2009 at 10:51 am
smunson (1/28/2009)
Jeff,I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:
.
.
.
FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2
.
.
.
doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!
Steve
(aka smunson)
:):):)
You can try master.sys.syscolumns (but that may go away in future version of SQL Server) or master.sys.columns (you may find it has fewer rows, my own experience at home).
January 28, 2009 at 11:12 am
Lynn Pettis (1/28/2009)
smunson (1/28/2009)
Jeff,I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:
.
.
.
FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2
.
.
.
doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!
Steve
(aka smunson)
:):):)
You can try master.sys.syscolumns (but that may go away in future version of SQL Server) or master.sys.columns (you may find it has fewer rows, my own experience at home).
Actually, it does work in 2k5... Master.dbo.SysColumns is a "legacy" view and it works just fine in my Development Edition of 2k5 at sp2. I do login with SA privs.
If not, then just use Master.sys.sysColumns like Lynn suggested. Do be careful not to use the underwhelming rowcount of Master.sys.Columns, though... it's an easy "mistrake" to make.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 28, 2009 at 11:17 am
Jeff Moden (1/28/2009)
Lynn Pettis (1/28/2009)
smunson (1/28/2009)
Jeff,I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:
.
.
.
FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2
.
.
.
doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!
Steve
(aka smunson)
:):):)
You can try master.sys.syscolumns (but that may go away in future version of SQL Server) or master.sys.columns (you may find it has fewer rows, my own experience at home).
Actually, it does work in 2k5... Master.dbo.SysColumns is a "legacy" view and it works just fine in my Development Edition of 2k5 at sp2. I do login with SA privs.
If not, then just use Master.sys.sysColumns like Lynn suggested. Do be careful not to use the underwhelming rowcount of Master.sys.Columns, though... it's an easy "mistrake" to make.
One other thing, Jeff's code doesn't work as is if you have a case-sensitive installation. Found that out as that is how I set mine up. Sorry, I like it that way (I know: sick, weird, any other terms you would like to add).
January 28, 2009 at 11:29 am
Ah... sorry about that... just add the lower case a-z back into the pattern in both spots.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 28, 2009 at 11:35 am
No apologies needed Jeff. I just know I have to edit your code before I run it. Happens all the time with code from OP's as well.
January 28, 2009 at 12:31 pm
Jeff Moden (1/28/2009)
Your last couple of sentences put that all together very nicely! I was aware of being able to put indexes on calculated columns, but didn't know you could make them "persist" (almost like an indexed view). I learn something new every day. Thanks Noel.
You are welcome!
... and I guess we all learn something new every day π
* Noel
January 28, 2009 at 1:00 pm
Thanks Lynn, the case sensitivity was indeed my problem. I forget all too often, but I too much prefer the case-sensitive setup, and set mine up that way as well. By the way, here's my final script for testing the functions that have been created in response to the OP's original request. I had to edit out of the Messages tab those times created by setting stats io off, as I had both IO as well as TIME both turned on.
DROP TABLE dbo.SGMTest
GO
--===== Create and populate a 200,000 row test table.
-- Column "RowNum" has a range of 1 to 200,000 unique numbers
-- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
-- Column "SomeText" has a range of "AA..." to "ZZ..." non-unique 69 character strings
-- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers
-- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
-- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' for all rows.
-- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
-- Original Test Table script by Jeff Moden, mods by Steve Munson
CREATE TABLE dbo.SGMTest (
RowNumint IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
SomeIntint,
SomeText69varchar(69),
SomeCSVvarchar(80),
SomeMoneymoney,
SomeDateDateTime,
SomeHex12varchar(12)
)
DECLARE @VC AS char(72)
SET @VC = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$.&*-/ 0123456789abcdefghijklmnopqrstuvwxyz'
PRINT '===================================================================================='
PRINT 'CREATE 200,000 ROW TEST TABLE'
PRINT '===================================================================================='
SET STATISTICS IO ON
SET STATISTICS TIME ON
INSERT INTO dbo.SGMTest (SomeInt, SomeText69, SomeCSV, SomeMoney, SomeDate, SomeHex12)
SELECT TOP 200000
SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,
SomeText69 = SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)
+ SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1),
SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
SomeHex12 = RIGHT(NEWID(),12)
FROM master.dbo.spt_values t1 CROSS JOIN master.dbo.spt_values t2
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
PRINT '===================================================================================='
PRINT 'TEST: dbo.fnGICSPFParseGICCatalog(SomeText69,1)'
PRINT '===================================================================================='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT *, dbo.fnGICSPFParseGICCatalog(SomeText69,1)
FROM dbo.SGMTest
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
PRINT '===================================================================================='
PRINT 'TEST: dbo.fnGICSPFParseGICCatalog_Test(SomeText69,1)'
PRINT '===================================================================================='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT *, dbo.fnGICSPFParseGICCatalog_Test(SomeText69,1)
FROM dbo.SGMTest
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
PRINT '===================================================================================='
PRINT 'BASELINE - SELECT * FROM dbo.SGMTest'
PRINT '===================================================================================='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT *
FROM dbo.SGMTest
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
PRINT '===================================================================================='
PRINT 'ADDING PERSISTED COLUMN'
PRINT '===================================================================================='
SET STATISTICS IO ON
SET STATISTICS TIME ON
--Add the column:
ALTER TABLE dbo.SGMTest
ADD TRIMMED AS dbo.fnGICSPFParseGICCatalog (SomeText69,1) PERSISTED;
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
PRINT '===================================================================================='
PRINT 'SELECT * FROM dbo.SGMTest'
PRINT '===================================================================================='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT *
FROM dbo.SGMTest
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
The machine these tests ran on has an Intel Q9550 2.83 GHz Quad-Core cpu, with 8GB of RAM, and SQL Server 2005 Developer Edition 64-bit, set to use a max RAM of 4096MB (I like to do video stuff (blu-ray playback and record, and video capture via FireWire)). The operating system is Vista Ultimate 64-bit, with SP1. Here's the SQL @@VERSION:
SELECT @@VERSION
Microsoft SQL Server 2005 - 9.00.3073.00 (X64) Aug 5 2008 14:31:47 Copyright (c) 1988-2005 Microsoft Corporation Developer Edition (64-bit) on Windows NT 6.0 (Build 6001: Service Pack 1)
Steve
(aka smunson)
:):):)
Lynn Pettis (1/28/2009)
Jeff Moden (1/28/2009)
Lynn Pettis (1/28/2009)
smunson (1/28/2009)
Jeff,I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:
.
.
.
FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2
.
.
.
doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!
Steve
(aka smunson)
:):):)
You can try master.sys.syscolumns (but that may go away in future version of SQL Server) or master.sys.columns (you may find it has fewer rows, my own experience at home).
Actually, it does work in 2k5... Master.dbo.SysColumns is a "legacy" view and it works just fine in my Development Edition of 2k5 at sp2. I do login with SA privs.
If not, then just use Master.sys.sysColumns like Lynn suggested. Do be careful not to use the underwhelming rowcount of Master.sys.Columns, though... it's an easy "mistrake" to make.
One other thing, Jeff's code doesn't work as is if you have a case-sensitive installation. Found that out as that is how I set mine up. Sorry, I like it that way (I know: sick, weird, any other terms you would like to add).
Steve (aka sgmunson) π π π
Rent Servers for Income (picks and shovels strategy)
January 28, 2009 at 1:10 pm
Here are the test timings I referred to:
====================================================================================
CREATE 200,000 ROW TEST TABLE
====================================================================================
Table 'spt_values'. Scan count 9, logical reads 39, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SGMTest'. Scan count 0, logical reads 611388, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5834 ms, elapsed time = 11747 ms.
(200000 row(s) affected)
====================================================================================
TEST: dbo.fnGICSPFParseGICCatalog(SomeText69,1)
====================================================================================
(200000 row(s) affected)
Table 'SGMTest'. Scan count 1, logical reads 4781, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 83117 ms, elapsed time = 85873 ms.
====================================================================================
TEST: dbo.fnGICSPFParseGICCatalog_Test(SomeText69,1)
====================================================================================
(200000 row(s) affected)
Table 'SGMTest'. Scan count 1, logical reads 4781, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 38860 ms, elapsed time = 39423 ms.
====================================================================================
BASELINE - SELECT * FROM dbo.SGMTest
====================================================================================
(200000 row(s) affected)
Table 'SGMTest'. Scan count 1, logical reads 4781, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 156 ms, elapsed time = 2447 ms.
====================================================================================
ADDING PERSISTED COLUMN
====================================================================================
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 3 ms.
Table 'SGMTest'. Scan count 1, logical reads 52405, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 86019 ms, elapsed time = 88964 ms.
====================================================================================
SELECT * FROM dbo.SGMTest
====================================================================================
(200000 row(s) affected)
Table 'SGMTest'. Scan count 1, logical reads 9545, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 234 ms, elapsed time = 2504 ms.
By the way, I found that master.dbo.spt_values only has 2346 records, so letting it create a cartesian product only generates some 4,000,000 records, as opposed to master.dbo.syscolumns, which has 11,000+ records, for a cartesian product of 121 million records. Not sure if that will affect performance at all or not, as it would depend on how SQL Server implements TOP under the covers.
Steve
(aka smunson)
:):):)
Steve (aka sgmunson) π π π
Rent Servers for Income (picks and shovels strategy)
Viewing 15 posts - 16 through 30 (of 36 total)
You must be logged in to reply to this topic. Login to reply