Trying to rewrite RBAR Function

  • smunson (1/28/2009)


    Jeff,

    Can you supply the code necessary to generate the test data you used? I have a quad-core 2.83GHz machine with SQL 2005 DE SP2 that I'd like to test this scenario against, so I can start to learn how to generate test data, as well as start to see how to test these kinds of things. Thanks!

    Steve

    (aka smunson)

    :):):)

    You bet... it's my ol' standard million row test table generator... you've probably seen me use it dozens of times...

    DROP TABLE JBMTest

    GO

    --===== Create and populate a 1,000,000 row test table.

    -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers

    -- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers

    -- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings

    -- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers

    -- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times

    -- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'

    -- for all rows.

    -- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)

    -- Jeff Moden

    SELECT TOP 1000000

    RowNum = IDENTITY(INT,1,1),

    SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,

    SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65),

    SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),

    SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),

    SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),

    SomeHex12 = RIGHT(NEWID(),12)

    INTO dbo.JBMTest

    FROM Master.dbo.SysColumns t1

    CROSS JOIN Master.dbo.SysColumns t2

    --===== A table is not properly formed unless a Primary Key has been assigned

    ALTER TABLE dbo.JBMTest

    ADD PRIMARY KEY CLUSTERED (RowNum)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jean-FranΓ§ois Bergeron (1/28/2009)


    Well, I got a simple 3.0Ghz Dev machine with 4GB of Ram. It's not that quick, trust me. And I ran the query on 200k rows that has around 30 characters in it.

    That explains it... less than half the characters I used, almost twice the speed... explains why "it" ran 5 times faster on your machine.

    I've been following your posts, articles, and everything made sense to actually remove those damn cursors or while loops, can you tell me why the loop beats the tally in this case? Do I have to test the cursor, and the tally everytime I develop a new solution, to see which one can be quicker?

    First, did the code I provide actually get beat? On some machines, it'll win the foot race compared to the memory only solution. Well, except on Tuesdays. πŸ˜›

    Also, let me try another thing before I try to answer that question.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/28/2009)

    First, did the code I provide actually get beat? On some machines, it'll win the foot race compared to the memory only solution. Well, except on Tuesdays. πŸ˜›

    Yes it did. It runs about as fast as the solution Garadin provided. Hope someone can explain why it behaves that way, because the query plan is not that explicit, it only says it does a constant scan, and I have no clue what it is.

    Cheers,

    J-F

  • Ok... the reason why the Tally table looses is because of the extra calculation I had to put into the STUFF function to make the Tally table appear to count down instead of up so it gets rid of bad characters at the end of the string first. This is important because it's set based and if characters disappear at the beginning first, things start mismatching because the join at the character level has already been done before the STUFF starts processing. And doing a DESCending Order By is a killer because it's the opposite of what the physical order of the Tally table has been put in by the clustered index.

    The WHILE loop doesn't care about the order because it doesn't care exactly where a bad character is, order wise, because the next bad character position is recalculated after every STUFF.

    Like I said, it's one of those very wierd situations where the order of the process for the Tally table needs to be the exact reverse of the clustered index order. Oddly enough, having a Tally table in reverse order according to the clustered index just doesn't help.

    Now, the only way I'd put the faster (in this case) WHILE loop into production is if I documented the hell out of it explaining why it's better to use a Tally table in almost every other case.... I wouldn't want folks to get the impression that all memory only WHILE loops beat the Tally table (they usually don't).

    Generally speaking, if the order doesn't matter or the order of the process must be ascending, the Tally table will smoke most other methods. It's only when the process MUST be in the reverse order of the Tally table that you'll to test both methods (Tally vs While). Even on something like the Luhn Mod-10 checksum which also requires processing from the right, a bit of SQL prestidigitation can overcome any performance hits...

    CREATE FUNCTION dbo.fnCheckLuhn10

    /**********************************************************************************************************************

    The function accepts a credit card or other number either as a VARCHAR or an INT and returns a 1 if the numbers match

    the LUHN 10 checksum specification and 0 if not.

    The input number does NOT need to be digits only. Numbers like 1234-5678-9012-3456 or 1234 5678 9012 3456 are

    acceptable.

    Revision history:

    Rev 00 - 03/08/2005 - Jeff Moden - Initial creation and test.

    Rev 01 - 12/28/2006 - Jeff Moden - Performance enhancement using @Multiple thanks to Peter Larson

    **********************************************************************************************************************/

    --===== Declare I/O parameters

    (

    @pLuhn VARCHAR(8000)

    )

    RETURNS INT

    AS

    BEGIN

    --=====================================================================================================================

    --===== Declare local variables

    DECLARE @CleanedLuhn VARCHAR(8000), --The Luhn number stripped of non-numeric characters

    @DigitProduct INT, --The result of multiplying the digit times the multiplier

    @Multiplier INT, --1 for odd position digits, 2 for even position digits

    @Sum INT, --The Luhn 10 sum

    @WorkLuhn VARCHAR(8000) --The clean Luhn number

    --===== If present, remove all non-digit characters

    IF PATINDEX('%[^0-9]%',@pLuhn) > 0 --If any non-digit characters exist, then...

    SELECT @CleanedLuhn = ISNULL(@CleanedLuhn,'')

    + SUBSTRING(@pLuhn,t.N,1)

    FROM dbo.Tally t --Contains a list of whole numbers

    WHERE t.N <= LEN(@pLuhn) --Limits the join/set based "loop" to the length of the Luhn

    AND SUBSTRING(@pLuhn,t.N,1) LIKE '[0-9]' --Non-digits are ignored, only 0-9 are included

    --===== Presets

    -- Note: Use the cleaned Luhn if it needed cleaning or the original Luhn if not

    SELECT @Sum = 0,

    @Multiplier = 1,

    @WorkLuhn = ISNULL(@CleanedLuhn,@pLuhn)

    --===== Calculate the Luhn 10 sum

    SELECT @DigitProduct = @Multiplier --1 for odd numbers, 2 for even numbers

    * SUBSTRING(@WorkLuhn, t.N, 1), --A given digit in the Luhn

    @Sum = @Sum --Luhn 10 sum starts at 0

    + @DigitProduct / 10 --The 1st digit for products > 9, 0 for product < 10

    + @DigitProduct % 10, --The 2nd digit for products > 9 or only digit for product < 10

    @Multiplier = 3 - @Multiplier --3-1=2, then 3-2=1, repeats

    FROM dbo.Tally t WITH (NOLOCK) --Contains a list of whole numbers

    WHERE t.N <= LEN(@WorkLuhn) --Limits the join/set based "loop" to the length of the cleaned Luhn

    ORDER BY t.N DESC

    --===== If the sum is evenly divisible by 10, then check is ok... return 1.

    -- Otherwise, return 0 as "Failed" check

    RETURN 1-SIGN(@SUM%10)

    --=====================================================================================================================

    END

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/27/2009)


    noeld (1/27/2009)


    I have found that the best solution for this is to use persisted calculated columns!

    Haven't tried those yet, Noel... can you give us a quick "how to" on something like this? I'd seriously like to know.

    This is a quick example:

    --- Jeff Table: Note that because initially you will take a "hit" while creating the column I just used 100,000.

    --- Sorry Not a lot of time available today πŸ˜‰

    IF OBJECT_ID('JBMTest') IS NOT NULL

    DROP TABLE JBMTest

    GO

    --===== Create and populate a 1,000,000 row test table.

    -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers

    -- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers

    -- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings

    -- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers

    -- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times

    -- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'

    -- for all rows.

    -- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)

    -- Jeff Moden

    SELECT TOP 100000

    RowNum = IDENTITY(INT,1,1),

    SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,

    SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65),

    SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),

    SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),

    SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),

    SomeHex12 = RIGHT(NEWID(),12)

    INTO dbo.JBMTest

    FROM Master.dbo.SysColumns t1

    CROSS JOIN Master.dbo.SysColumns t2

    --===== A table is not properly formed unless a Primary Key has been assigned

    ALTER TABLE dbo.JBMTest

    ADD PRIMARY KEY CLUSTERED (RowNum)

    --- The Function:

    --- Note: you have to make it DETERMINISTIC!!

    --- Which in this case it is!

    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[fnGICSPFParseGICCatalog]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))

    DROP FUNCTION [fnGICSPFParseGICCatalog]

    GO

    CREATE FUNCTION dbo.fnGICSPFParseGICCatalog

    (@string VARCHAR(8000),

    @TrimAll BIT)

    RETURNS VARCHAR(8000)

    WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT

    AS

    BEGIN

    DECLARE @IncorrectCharLoc SMALLINT

    IF @TrimAll = 1

    BEGIN

    SET @IncorrectCharLoc = PATINDEX('%[^0-9A-Za-z]%',@string)

    WHILE @IncorrectCharLoc > 0

    BEGIN

    SET @string = STUFF(@string,@IncorrectCharLoc,1,'')

    SET @IncorrectCharLoc = PATINDEX('%[^0-9A-Za-z]%',@string)

    END

    SET @string = @string

    END

    ELSE

    BEGIN

    SET @IncorrectCharLoc = PATINDEX('%[^-0-9A-Za-z&.$/]%',@string)

    WHILE @IncorrectCharLoc > 0

    BEGIN

    SET @string = STUFF(@string,@IncorrectCharLoc,1,'')

    SET @IncorrectCharLoc = PATINDEX('%[^-0-9A-Za-z&.$/]%',@string)

    END

    SET @string = @string

    END

    RETURN @string

    END

    GO

    /SELECT OBJECTPROPERTY( OBJECT_ID('fnGICSPFParseGICCatalog'), 'IsDeterministic')

    --- Yeah = 1

    --Add the column:

    ALTER TABLE JBMTest

    ADD MyAlreadyPersistedCol_1 AS dbo.fnGICSPFParseGICCatalog (SomeCSV,1) PERSISTED;

    Nowyou will be able to run your queries with "persisted" (already calculated values)

    IF you change the source columns the persisted values will change too!

    And what's more if you ADD an index on the calculated columns you can even search for them without taking the "hit"

    Hope it helps!


    * Noel

  • Your last couple of sentences put that all together very nicely! I was aware of being able to put indexes on calculated columns, but didn't know you could make them "persist" (almost like an indexed view). I learn something new every day. Thanks Noel.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:

    .

    .

    .

    FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2

    .

    .

    .

    doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!

    Steve

    (aka smunson)

    :):):)

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • smunson (1/28/2009)


    Jeff,

    I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:

    .

    .

    .

    FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2

    .

    .

    .

    doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!

    Steve

    (aka smunson)

    :):):)

    You can try master.sys.syscolumns (but that may go away in future version of SQL Server) or master.sys.columns (you may find it has fewer rows, my own experience at home).

  • Lynn Pettis (1/28/2009)


    smunson (1/28/2009)


    Jeff,

    I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:

    .

    .

    .

    FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2

    .

    .

    .

    doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!

    Steve

    (aka smunson)

    :):):)

    You can try master.sys.syscolumns (but that may go away in future version of SQL Server) or master.sys.columns (you may find it has fewer rows, my own experience at home).

    Actually, it does work in 2k5... Master.dbo.SysColumns is a "legacy" view and it works just fine in my Development Edition of 2k5 at sp2. I do login with SA privs.

    If not, then just use Master.sys.sysColumns like Lynn suggested. Do be careful not to use the underwhelming rowcount of Master.sys.Columns, though... it's an easy "mistrake" to make.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/28/2009)


    Lynn Pettis (1/28/2009)


    smunson (1/28/2009)


    Jeff,

    I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:

    .

    .

    .

    FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2

    .

    .

    .

    doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!

    Steve

    (aka smunson)

    :):):)

    You can try master.sys.syscolumns (but that may go away in future version of SQL Server) or master.sys.columns (you may find it has fewer rows, my own experience at home).

    Actually, it does work in 2k5... Master.dbo.SysColumns is a "legacy" view and it works just fine in my Development Edition of 2k5 at sp2. I do login with SA privs.

    If not, then just use Master.sys.sysColumns like Lynn suggested. Do be careful not to use the underwhelming rowcount of Master.sys.Columns, though... it's an easy "mistrake" to make.

    One other thing, Jeff's code doesn't work as is if you have a case-sensitive installation. Found that out as that is how I set mine up. Sorry, I like it that way (I know: sick, weird, any other terms you would like to add).

  • Ah... sorry about that... just add the lower case a-z back into the pattern in both spots.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • No apologies needed Jeff. I just know I have to edit your code before I run it. Happens all the time with code from OP's as well.

  • Jeff Moden (1/28/2009)


    Your last couple of sentences put that all together very nicely! I was aware of being able to put indexes on calculated columns, but didn't know you could make them "persist" (almost like an indexed view). I learn something new every day. Thanks Noel.

    You are welcome!

    ... and I guess we all learn something new every day πŸ˜€


    * Noel

  • Thanks Lynn, the case sensitivity was indeed my problem. I forget all too often, but I too much prefer the case-sensitive setup, and set mine up that way as well. By the way, here's my final script for testing the functions that have been created in response to the OP's original request. I had to edit out of the Messages tab those times created by setting stats io off, as I had both IO as well as TIME both turned on.

    DROP TABLE dbo.SGMTest

    GO

    --===== Create and populate a 200,000 row test table.

    -- Column "RowNum" has a range of 1 to 200,000 unique numbers

    -- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers

    -- Column "SomeText" has a range of "AA..." to "ZZ..." non-unique 69 character strings

    -- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers

    -- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times

    -- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' for all rows.

    -- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)

    -- Original Test Table script by Jeff Moden, mods by Steve Munson

    CREATE TABLE dbo.SGMTest (

    RowNumint IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,

    SomeIntint,

    SomeText69varchar(69),

    SomeCSVvarchar(80),

    SomeMoneymoney,

    SomeDateDateTime,

    SomeHex12varchar(12)

    )

    DECLARE @VC AS char(72)

    SET @VC = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$.&*-/ 0123456789abcdefghijklmnopqrstuvwxyz'

    PRINT '===================================================================================='

    PRINT 'CREATE 200,000 ROW TEST TABLE'

    PRINT '===================================================================================='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    INSERT INTO dbo.SGMTest (SomeInt, SomeText69, SomeCSV, SomeMoney, SomeDate, SomeHex12)

    SELECT TOP 200000

    SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,

    SomeText69 = SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1)

    + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1) + SUBSTRING(@VC,ABS(CHECKSUM(NEWID()))%73,1),

    SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),

    SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),

    SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),

    SomeHex12 = RIGHT(NEWID(),12)

    FROM master.dbo.spt_values t1 CROSS JOIN master.dbo.spt_values t2

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    PRINT '===================================================================================='

    PRINT 'TEST: dbo.fnGICSPFParseGICCatalog(SomeText69,1)'

    PRINT '===================================================================================='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT *, dbo.fnGICSPFParseGICCatalog(SomeText69,1)

    FROM dbo.SGMTest

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    PRINT '===================================================================================='

    PRINT 'TEST: dbo.fnGICSPFParseGICCatalog_Test(SomeText69,1)'

    PRINT '===================================================================================='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT *, dbo.fnGICSPFParseGICCatalog_Test(SomeText69,1)

    FROM dbo.SGMTest

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    PRINT '===================================================================================='

    PRINT 'BASELINE - SELECT * FROM dbo.SGMTest'

    PRINT '===================================================================================='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT *

    FROM dbo.SGMTest

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    PRINT '===================================================================================='

    PRINT 'ADDING PERSISTED COLUMN'

    PRINT '===================================================================================='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    --Add the column:

    ALTER TABLE dbo.SGMTest

    ADD TRIMMED AS dbo.fnGICSPFParseGICCatalog (SomeText69,1) PERSISTED;

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    PRINT '===================================================================================='

    PRINT 'SELECT * FROM dbo.SGMTest'

    PRINT '===================================================================================='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT *

    FROM dbo.SGMTest

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    The machine these tests ran on has an Intel Q9550 2.83 GHz Quad-Core cpu, with 8GB of RAM, and SQL Server 2005 Developer Edition 64-bit, set to use a max RAM of 4096MB (I like to do video stuff (blu-ray playback and record, and video capture via FireWire)). The operating system is Vista Ultimate 64-bit, with SP1. Here's the SQL @@VERSION:

    SELECT @@VERSION

    Microsoft SQL Server 2005 - 9.00.3073.00 (X64) Aug 5 2008 14:31:47 Copyright (c) 1988-2005 Microsoft Corporation Developer Edition (64-bit) on Windows NT 6.0 (Build 6001: Service Pack 1)

    Steve

    (aka smunson)

    :):):)

    Lynn Pettis (1/28/2009)


    Jeff Moden (1/28/2009)


    Lynn Pettis (1/28/2009)


    smunson (1/28/2009)


    Jeff,

    I have a relatively fresh installation of SQL 2005 Developer Edition, but your object in the following:

    .

    .

    .

    FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2

    .

    .

    .

    doesn't exist on my server. Is this an SQL 2000 object? Is there an equivalent for SQL 2005 if so? Let me know. Thanks!

    Steve

    (aka smunson)

    :):):)

    You can try master.sys.syscolumns (but that may go away in future version of SQL Server) or master.sys.columns (you may find it has fewer rows, my own experience at home).

    Actually, it does work in 2k5... Master.dbo.SysColumns is a "legacy" view and it works just fine in my Development Edition of 2k5 at sp2. I do login with SA privs.

    If not, then just use Master.sys.sysColumns like Lynn suggested. Do be careful not to use the underwhelming rowcount of Master.sys.Columns, though... it's an easy "mistrake" to make.

    One other thing, Jeff's code doesn't work as is if you have a case-sensitive installation. Found that out as that is how I set mine up. Sorry, I like it that way (I know: sick, weird, any other terms you would like to add).

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • Here are the test timings I referred to:

    ====================================================================================

    CREATE 200,000 ROW TEST TABLE

    ====================================================================================

    Table 'spt_values'. Scan count 9, logical reads 39, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'SGMTest'. Scan count 0, logical reads 611388, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 5834 ms, elapsed time = 11747 ms.

    (200000 row(s) affected)

    ====================================================================================

    TEST: dbo.fnGICSPFParseGICCatalog(SomeText69,1)

    ====================================================================================

    (200000 row(s) affected)

    Table 'SGMTest'. Scan count 1, logical reads 4781, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 83117 ms, elapsed time = 85873 ms.

    ====================================================================================

    TEST: dbo.fnGICSPFParseGICCatalog_Test(SomeText69,1)

    ====================================================================================

    (200000 row(s) affected)

    Table 'SGMTest'. Scan count 1, logical reads 4781, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 38860 ms, elapsed time = 39423 ms.

    ====================================================================================

    BASELINE - SELECT * FROM dbo.SGMTest

    ====================================================================================

    (200000 row(s) affected)

    Table 'SGMTest'. Scan count 1, logical reads 4781, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 156 ms, elapsed time = 2447 ms.

    ====================================================================================

    ADDING PERSISTED COLUMN

    ====================================================================================

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 3 ms.

    Table 'SGMTest'. Scan count 1, logical reads 52405, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 86019 ms, elapsed time = 88964 ms.

    ====================================================================================

    SELECT * FROM dbo.SGMTest

    ====================================================================================

    (200000 row(s) affected)

    Table 'SGMTest'. Scan count 1, logical reads 9545, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 234 ms, elapsed time = 2504 ms.

    By the way, I found that master.dbo.spt_values only has 2346 records, so letting it create a cartesian product only generates some 4,000,000 records, as opposed to master.dbo.syscolumns, which has 11,000+ records, for a cartesian product of 121 million records. Not sure if that will affect performance at all or not, as it would depend on how SQL Server implements TOP under the covers.

    Steve

    (aka smunson)

    :):):)

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

Viewing 15 posts - 16 through 30 (of 36 total)

You must be logged in to reply to this topic. Login to reply