I need some help with a test, please.

  • Jeff Moden (9/18/2010)


    ColdCoffee (9/18/2010)


    Jeff Moden (9/18/2010)


    ColdCoffee (9/18/2010)


    Jeff, PFA the results from my DESKTOP.

    Mine runs on

    OS : Windows 7 Ultimate,

    SQL : SQL Server 2005 Developer Edition RTM (9.0.1399) ,

    Processor: Intel(R) Core(TM)2 Duo CPU E7400 @ 2.80GHz, 2800 Mhz, 2 Core(s), 2 Logical Processor(s)

    Total RAM : 2 GB

    Please tell me if you need further info from my machine. I will run the code on my office machine, which is a higher-powered machine, and publish the results.

    CC... could you do a rerun but in the TEXT mode, please? I'm all set up to read one big text output. Thanks.

    Anything for you, Jeff. But can I have some 3 hrs? I've got to meet my friend now, who is in hospital, so can your consolidation hold for another 3 hrs? Thanks in advance.

    Anytime is a good time. Absolutely no rush, CC. I'm happy to have this much help, so far. I didn't think I'd have this much help until Monday or so. Thanks for your help and I hope your friend is OK.

    My friend is doing OK, Jeff. Thanks for your words 🙂

    And I have attached the "Text Mode" results. I promise at least 5 distinct environments' results tomorrow morning once I reach the office. Please tell me if I have to provide any further information.

  • Darn time zones. Jeff, the script has a comment in it to say that Profiler should be running:

    --=====================================================================================================================

    -- Run the functions (Profiler turned on for this given SPID)

    --=====================================================================================================================

    I'm going to assume that isn't in fact required, but if it is, can you provide a server-side trace definition so we're all running the same thing? As you know, traces can have a huge impact on scalar and multi-statement TVFs (e.g. dbo.Split8KXML1) so that would tend to favour the in-line TVFs unfairly.

    Paul

  • Hi Jeff

    Happy to help.

    Desktop

    SQL Server 2008 R2 Dev ed. 10.50.1600.1

    OS Name: Microsoft® Windows Vista Business

    Version: 6.0.6002 Service Pack 2 Build 6002

    System Manufacturer: Dell Inc.

    System Model: Precision WorkStation 390

    System Type: x64-based PC

    Processor: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz, 1862 Mhz, 2 Core(s), 2 Logical Processor(s)

    Installed Physical Memory (RAM): 2.00 GB

    Good luck

    Regards Graham

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Hi Jeff,

    Results attached for my creaky Fujitsu Amilo laptop running Vista Home Premium, 2GB RAM, Intel Core 2 Duo, with SQL Server 2008 Express R2.

    Query took 27 mins 49 secs to complete.

    Cheers

    Mark

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • My results attached. Total run time 5 min 04 sec.

    (Pentium 4m Processor @2GHz - SQL Server 2008 x86 Dev)

    Sorry I've done this slightly backwards: this run includes one of Brad's optimizations missing from the original rig:

    CREATE FUNCTION dbo.Split8KXML3
        (@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))
    RETURNS TABLE
    WITH SCHEMABINDING AS
    RETURN
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,
               R.Item.value('text()[1]', 'varchar(8000)') AS ItemValue
        FROM (SELECT CAST('<r>'+REPLACE(@Parameter, @Delimiter, '</r><r>')+'</r>' AS XML).query('.')) X(N)
        CROSS APPLY N.nodes('//r') R(Item)
    ;

    The difference is the ".query('.')" after the CAST...AS XML.
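
    As a quick sanity check, calls along these lines should show the shape of the function's output. This is only a sketch: it assumes dbo.Split8KXML3 has been created as above, the literal '10,20,30' is made-up sample data, and the TOP (5) is arbitrary. Note that ROW_NUMBER() with ORDER BY (SELECT NULL) does not formally guarantee that ItemNumber follows the order of the items in the string, and input containing XML-special characters such as & or < will make the CAST to XML fail.

    ```sql
    -- Hypothetical sample input; not part of the test rig.
    SELECT s.ItemNumber, s.ItemValue
    FROM dbo.Split8KXML3('10,20,30', ',') AS s;

    -- The same call applied per row to the test table used in this thread.
    SELECT TOP (5) csv.RowNum, s.ItemNumber, s.ItemValue
    FROM dbo.CsvTest AS csv
    CROSS APPLY dbo.Split8KXML3(csv.CsvParameter, ',') AS s;
    ```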

    I'll post the results from the original rig in a bit.

    Paul

  • Ok, these are the results for the unmodified script. Total execution time 19 min 11 sec.

    Paul

  • For anyone wanting to see how a SQLCLR implementation compares to the other tested methods on this dataset, here's my slightly-tweaked version of Adam Machanic's string splitter: (source code attached)

    CREATE ASSEMBLY Utility
    AUTHORIZATION dbo
    FROM
    WITH PERMISSION_SET = SAFE;
    GO
    CREATE FUNCTION dbo.SplitString_Multi
    (
        @Input NVARCHAR(MAX),
        @Delimiter NVARCHAR(255)
    )
    RETURNS TABLE
    (
        sequence INTEGER NULL,
        item NVARCHAR(4000) NULL
    )
    WITH EXECUTE AS CALLER
    AS
    EXTERNAL NAME
        Utility.UserDefinedFunctions.SplitString_Multi;
    GO

    This function handles input strings up to 2GB and multi-character delimiters (notice also that the test dataset is not Unicode, so this function has to convert to and from Unicode on every row.) The test script is:

    DECLARE @RowNum INTEGER,
            @ItemNumber INTEGER,
            @ItemValue INTEGER;

    SET STATISTICS IO, TIME ON;

    SELECT @RowNum = CSV.RowNum,
           @ItemNumber = iTVF.sequence,
           @ItemValue = iTVF.item
    FROM dbo.CsvTest CSV
    CROSS APPLY dbo.SplitString_Multi(CSV.CsvParameter, N',') iTVF

    SET STATISTICS IO, TIME OFF;

    On my machine, these are the results:

    Tally:

    Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 9578 ms, elapsed time = 10013 ms.

    SQLCLR:

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 1953 ms, elapsed time = 2039 ms.

    That's a factor of five :smooooth:

    Paul

  • Paul White NZ (9/19/2010)


    On my machine, these are the results:

    Tally:

    Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 9578 ms, elapsed time = 10013 ms.

    SQLCLR:

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 1953 ms, elapsed time = 2039 ms.

    That's a factor of five :smooooth:

    Paul

    Yes, but if you compare the two to shift data with either a SELECT INTO or an INSERT query, you won't see that sort of difference.

  • Paul White NZ (9/19/2010)


    For anyone wanting to see how a SQLCLR implementation compares to the other tested methods on this dataset, here's my slightly-tweaked version of Adam Machanic's string splitter: (source code attached)

    CREATE ASSEMBLY Utility
    AUTHORIZATION dbo
    FROM 0x
    WITH PERMISSION_SET = SAFE;
    GO
    CREATE FUNCTION dbo.SplitString_Multi
    (
        @Input NVARCHAR(MAX),
        @Delimiter NVARCHAR(255)
    )
    RETURNS TABLE
    (
        sequence INTEGER NULL,
        item NVARCHAR(4000) NULL
    )
    WITH EXECUTE AS CALLER
    AS
    EXTERNAL NAME
        Utility.UserDefinedFunctions.SplitString_Multi;
    GO

    This function handles input strings up to 2GB and multi-character delimiters (notice also that the test dataset is not Unicode, so this function has to convert to and from Unicode on every row.) The test script is:

    DECLARE @RowNum INTEGER,
            @ItemNumber INTEGER,
            @ItemValue INTEGER;

    SET STATISTICS IO, TIME ON;

    SELECT @RowNum = CSV.RowNum,
           @ItemNumber = iTVF.sequence,
           @ItemValue = iTVF.item
    FROM dbo.CsvTest CSV
    CROSS APPLY dbo.SplitString_Multi(CSV.CsvParameter, N',') iTVF

    SET STATISTICS IO, TIME OFF;

    On my machine, these are the results:

    Tally:

    Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 9578 ms, elapsed time = 10013 ms.

    SQLCLR:

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 1953 ms, elapsed time = 2039 ms.

    That's a factor of five :smooooth:

    Paul

    The difference on mine is far less pronounced

    Tally:

    Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 7719 ms, elapsed time = 7791 ms.

    SQLCLR:

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 5610 ms, elapsed time = 5815 ms.

  • Paul White NZ (9/19/2010)


    My results attached. Total run time 5 min 04 sec.

    (Pentium 4m Processor @2GHz - SQL Server 2008 x86 Dev)

    Sorry I've done this slightly backwards: this run includes one of Brad's optimizations missing from the original rig:

    CREATE FUNCTION dbo.Split8KXML3
        (@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))
    RETURNS TABLE
    WITH SCHEMABINDING AS
    RETURN
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,
               R.Item.value('text()[1]', 'varchar(8000)') AS ItemValue
        FROM (SELECT CAST('<r>'+REPLACE(@Parameter, @Delimiter, '</r><r>')+'</r>' AS XML).query('.')) X(N)
        CROSS APPLY N.nodes('//r') R(Item)
    ;

    The difference is the ".query('.')" after the CAST...AS XML.

    I'll post the results from the original rig in a bit.

    Paul

    Thanks Paul... Yes, I found Brad's and I'll add it to the test rig. Thanks for posting your results.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Paul White NZ (9/19/2010)


    For anyone wanting to see how a SQLCLR implementation compares to the other tested methods on this dataset, here's my slightly-tweaked version of Adam Machanic's string splitter: (source code attached)

    Heh... alright... guess it's finally time for me to give that bad boy a try. Thanks, Paul. 😉

    --Jeff Moden



  • Paul White NZ (9/19/2010)


    Darn time zones. Jeff, the script has a comment in it to say that Profiler should be running:

    --=====================================================================================================================

    -- Run the functions (Profiler turned on for this given SPID)

    --=====================================================================================================================

    I'm going to assume that isn't in fact required, but if it is, can you provide a server-side trace definition so we're all running the same thing? As you know, traces can have a huge impact on scalar and multi-statement TVFs (e.g. dbo.Split8KXML1) so that would tend to favour the in-line TVFs unfairly.

    Paul

    Nah... my apologies, Paul. I wouldn't make anyone read the whole script to figure out what they need to do to do ME a favor. That's an artifact from my previous test code and I'll remove it so it doesn't confuse anyone. Thanks for reading the code, though! I love a good peer review.

    Also, I'm going to add two of Brad's... the one you posted will be commented as "XML-Brad1 (Split8KXMLBrad1 iTVF)". The other one is what I believe Brad meant to be the fastest and will be commented as "XML-Brad (Split8KXMLBrad iTVF)". Almost done with that. Just running a sanity check before I update the code.

    And thanks for jumping in on this thread. I always appreciate your comments and your code.

    --Jeff Moden



  • Mark-101232 (9/19/2010)


    The difference on mine is far less pronounced

    Tally:

    Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 7719 ms, elapsed time = 7791 ms.

    SQLCLR:

    Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0

    CPU time = 5610 ms, elapsed time = 5815 ms.

    1. The logical reads difference seems about the same 😛

    2. CLR uses JIT (just-in-time) compilation, so the first few times you (ever) use the SQLCLR function it won't have been fully compiled to native machine code. Run the test several times to ensure the code is fully optimized.

    3. If running on a laptop, ensure that your CPU is running at full clock speed (not on battery; check power plan settings, etc.). If in doubt, run something like CPU-Z to check.
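
    One way to do the repeated runs from point 2, as a rough sketch: SSMS and sqlcmd accept a count after GO, which re-executes the preceding batch that many times. The count of 5 is arbitrary (an assumption, not something from the test rig), and the first run or two should be ignored when comparing timings.

    ```sql
    -- Turn timing output on once for the session.
    SET STATISTICS TIME ON;
    GO

    -- Same test query as before; variables are declared inside the
    -- batch because each GO starts a fresh batch scope.
    DECLARE @RowNum INTEGER,
            @ItemNumber INTEGER,
            @ItemValue INTEGER;

    SELECT @RowNum = CSV.RowNum,
           @ItemNumber = iTVF.sequence,
           @ItemValue = iTVF.item
    FROM dbo.CsvTest CSV
    CROSS APPLY dbo.SplitString_Multi(CSV.CsvParameter, N',') iTVF;
    GO 5  -- arbitrary repeat count; a client feature of SSMS/sqlcmd, not T-SQL

    SET STATISTICS TIME OFF;
    GO
    ```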

    Paul

    Alright... both of Brad's good XML examples have been added. I left the others in simply because they were there before and I'll continue to collect data on those points. 🙂

    Thanks again to everyone who participates/participated in this test. I don't know if you have the gumption to do so, but for those folks that already ran it, it would be an additional huge help to me if you reran the code and posted the results now that we have Brad's more performant XML functions in the code.

    --Jeff Moden



  • steve-893342 (9/19/2010)


    Yes, but if you compare the two to shift data with either a SELECT INTO or an INSERT query, you won't see that sort of difference.

    I think I see what you mean, and the answer is very much "It Depends" 😉

    For example, let's use a bcp export to a file (this removes a lot of variable factors):

    Tally:

    bcp "SELECT csv.RowNum, split.ItemNumber, item = CONVERT(INTEGER, split.ItemValue) FROM tempdb.dbo.CsvTest csv CROSS APPLY tempdb.dbo.Split8KTally(csv.CsvParameter,',') AS split" queryout tally.bcp -n -S .\SQL2008 -T

    Results:

    1000000 rows copied.

    Clock Time (ms.) Total : 11406 Average : (87673.15 rows per sec.)

    SQLCLR:

    bcp "SELECT CSV.RowNum, iTVF.sequence, item = CONVERT(INTEGER, iTVF.item) FROM tempdb.dbo.CsvTest CSV CROSS APPLY tempdb.dbo.SplitString_Multi(CSV.CsvParameter, N',') iTVF" queryout sqlclr.bcp -n -S .\SQL2008 -T

    Results:

    1000000 rows copied.

    Clock Time (ms.) Total : 3610 Average : (277008.31 rows per sec.)
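
    For anyone checking the arithmetic: bcp's rows-per-second figure is just rows copied divided by clock time, so the two averages above and the ratio between the runs can be reproduced with a quick query like this sketch (the column aliases are made up for illustration). On these numbers the SQLCLR export works out at a bit over three times faster, rather than five.

    ```sql
    -- Inputs are the bcp figures reported above: 1,000,000 rows,
    -- 11406 ms for the Tally splitter and 3610 ms for SQLCLR.
    SELECT CAST(1000000 * 1000.0 / 11406 AS DECIMAL(12,2)) AS TallyRowsPerSec,   -- 87673.15
           CAST(1000000 * 1000.0 / 3610  AS DECIMAL(12,2)) AS SqlClrRowsPerSec,  -- 277008.31
           CAST(11406.0 / 3610           AS DECIMAL(6,2))  AS ElapsedRatio;      -- 3.16
    ```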

    Paul
