September 19, 2010 at 12:54 am
Jeff Moden (9/18/2010)
ColdCoffee (9/18/2010)
Jeff Moden (9/18/2010)
ColdCoffee (9/18/2010)
Jeff, PFA the results from my desktop. It runs on:
OS: Windows 7 Ultimate
SQL: SQL Server 2005 Developer Edition RTM (9.0.1399)
Processor: Intel(R) Core(TM)2 Duo CPU E7400 @ 2.80GHz, 2800 MHz, 2 Core(s), 2 Logical Processor(s)
Total RAM: 2 GB
Please tell me if you need further info from my machine. I will run the code on my office machine, which is a higher-powered machine, and publish the results.
CC... could you do a rerun but in the TEXT mode, please? I'm all set up to read one big text output. Thanks.
Anything for you, Jeff. But can I have about 3 hrs? I've got to meet my friend who is in hospital now, so can your consolidation hold for another 3 hrs? Thanks in advance.
Anytime is a good time. Absolutely no rush, CC. I'm happy to have this much help, so far. I didn't think I'd have this much help until Monday or so. Thanks for your help and I hope your friend is OK.
My friend is doing OK, Jeff. Thanks for your words :-)
And I have attached the "Text Mode" results. I promise at least 5 distinct environments' results tomorrow morning once I reach the office. Please tell me if I have to provide any further information.
September 19, 2010 at 1:07 am
Darn time zones. Jeff, the script has a comment in it to say that Profiler should be running:
--=====================================================================================================================
-- Run the functions (Profiler turned on for this given SPID)
--=====================================================================================================================
I'm going to assume that isn't in fact required, but if it is, can you provide a server-side trace definition so we're all running the same thing? As you know, traces can have a huge impact on scalar and multi-statement TVFs (e.g. dbo.Split8KXML1) so that would tend to favour the in-line TVFs unfairly.
Paul
September 19, 2010 at 2:00 am
Hi Jeff
Happy to help.
Desktop
SQL Server 2008 R2 Dev ed. 10.50.1600.1
OS Name: Microsoft® Windows Vista Business
Version: 6.0.6002 Service Pack 2 Build 6002
System Manufacturer: Dell Inc.
System Model: Precision WorkStation 390
System Type: x64-based PC
Processor: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz, 1862 MHz, 2 Core(s), 2 Logical Processor(s)
Installed Physical Memory (RAM): 2.00 GB
Good luck
Regards Graham
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
September 19, 2010 at 2:11 am
Hi Jeff,
Results attached for my creaky Fujitsu Amilo laptop running Vista Home Premium, 2GB RAM, Intel Core 2 Duo
with SQL Server 2008 Express R2.
Query took 27 mins 49 secs to complete.
Cheers
Mark
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537
September 19, 2010 at 4:53 am
My results attached. Total run time 5 min 04 sec.
(Pentium 4m Processor @2GHz - SQL Server 2008 x86 Dev)
Sorry I've done this slightly backwards: this run includes one of Brad's optimizations missing from the original rig:
CREATE FUNCTION dbo.Split8KXML3
        (@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))
RETURNS TABLE
WITH SCHEMABINDING AS
RETURN
 SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,
        R.Item.value('text()[1]', 'varchar(8000)') AS ItemValue
   FROM (SELECT CAST('<r>'+REPLACE(@Parameter, @Delimiter, '</r><r>')+'</r>' AS XML).query('.')) X(N)
  CROSS APPLY N.nodes('//r') R(Item)
;
The difference is the ".query('.')" after the CAST...AS XML.
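To see what that expression does in isolation, here is a minimal sketch run against a literal value rather than the test table. The sample string '10,20,30' is an assumption for illustration only; the variable names simply mirror the function's parameters. Note that this approach assumes the input contains no characters that are special in XML (such as & or <), since the CAST would fail on them.

```sql
-- Illustrative only: the sample value is hypothetical, not part of the test rig.
DECLARE @Parameter VARCHAR(MAX),
        @Delimiter VARCHAR(1);

SELECT  @Parameter = '10,20,30',
        @Delimiter = ',';

-- Turn '10,20,30' into '<r>10</r><r>20</r><r>30</r>', cast it to XML,
-- then shred each <r> element into its own row.
SELECT  ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,
        R.Item.value('text()[1]', 'varchar(8000)')  AS ItemValue
FROM    (SELECT CAST('<r>' + REPLACE(@Parameter, @Delimiter, '</r><r>') + '</r>' AS XML).query('.')) AS X(N)
CROSS APPLY X.N.nodes('//r') AS R(Item);
```

This should return one row per delimited item. Removing the .query('.') call gives the variant without Brad's optimization, so the two can be compared side by side.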
I'll post the results from the original rig in a bit.
Paul
September 19, 2010 at 5:19 am
Ok, these are the results for the unmodified script. Total execution time 19 min 11 sec.
Paul
September 19, 2010 at 5:51 am
For anyone wanting to see how a SQLCLR implementation compares to the other tested methods on this dataset, here's my slightly-tweaked version of Adam Machanic's string splitter: (source code attached)
CREATE ASSEMBLY Utility
AUTHORIZATION dbo
FROM 
WITH PERMISSION_SET = SAFE;
GO
CREATE FUNCTION dbo.SplitString_Multi
(
    @Input     NVARCHAR(MAX),
    @Delimiter NVARCHAR(255)
)
RETURNS TABLE
(
    sequence INTEGER NULL,
    item     NVARCHAR(4000) NULL
)
WITH EXECUTE AS CALLER
AS
EXTERNAL NAME
    Utility.UserDefinedFunctions.SplitString_Multi;
GO
This function handles input strings up to 2GB and multi-character delimiters (notice also that the test dataset is not Unicode, so this function has to convert to and from Unicode on every row.) The test script is:
DECLARE @RowNum     INTEGER,
        @ItemNumber INTEGER,
        @ItemValue  INTEGER;
SET STATISTICS IO, TIME ON;
SELECT  @RowNum     = CSV.RowNum,
        @ItemNumber = iTVF.sequence,
        @ItemValue  = iTVF.item
FROM    dbo.CsvTest CSV
CROSS APPLY dbo.SplitString_Multi(CSV.CsvParameter, N',') iTVF;
SET STATISTICS IO, TIME OFF;
On my machine, these are the results:
Tally:
Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 9578 ms, elapsed time = 10013 ms.
SQLCLR:
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 1953 ms, elapsed time = 2039 ms.
That's a factor of five :smooooth:
Paul
September 19, 2010 at 7:20 am
Paul White NZ (9/19/2010)
On my machine, these are the results:
Tally:
Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 9578 ms, elapsed time = 10013 ms.
SQLCLR:
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 1953 ms, elapsed time = 2039 ms.
That's a factor of five :smooooth:
Paul
Yes, but if you compare the two when shifting data with either a SELECT INTO or an INSERT query, you won't see that sort of difference.
September 19, 2010 at 7:43 am
Paul White NZ (9/19/2010)
On my machine, these are the results:
Tally:
Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 9578 ms, elapsed time = 10013 ms.
SQLCLR:
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 1953 ms, elapsed time = 2039 ms.
That's a factor of five :smooooth:
Paul
The difference on mine is far less pronounced
Tally:
Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 7719 ms, elapsed time = 7791 ms.
SQLCLR:
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 5610 ms, elapsed time = 5815 ms.
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537
September 19, 2010 at 8:09 am
Paul White NZ (9/19/2010)
My results attached. Total run time 5 min 04 sec.
(Pentium 4m Processor @2GHz - SQL Server 2008 x86 Dev)
Sorry I've done this slightly backwards: this run includes one of Brad's optimizations missing from the original rig:
CREATE FUNCTION dbo.Split8KXML3
        (@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))
RETURNS TABLE
WITH SCHEMABINDING AS
RETURN
 SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,
        R.Item.value('text()[1]', 'varchar(8000)') AS ItemValue
   FROM (SELECT CAST('<r>'+REPLACE(@Parameter, @Delimiter, '</r><r>')+'</r>' AS XML).query('.')) X(N)
  CROSS APPLY N.nodes('//r') R(Item)
;
The difference is the ".query('.')" after the CAST...AS XML.
I'll post the results from the original rig in a bit.
Paul
Thanks Paul... Yes, I found Brad's and I'll add it to the test rig. Thanks for posting your results.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 19, 2010 at 8:13 am
Paul White NZ (9/19/2010)
For anyone wanting to see how a SQLCLR implementation compares to the other tested methods on this dataset, here's my slightly-tweaked version of Adam Machanic's string splitter: (source code attached)
Heh... alright... guess it's finally time for me to give that bad boy a try. Thanks, Paul. :-)
--Jeff Moden
September 19, 2010 at 8:30 am
Paul White NZ (9/19/2010)
Darn time zones. Jeff, the script has a comment in it to say that Profiler should be running:
--=====================================================================================================================
-- Run the functions (Profiler turned on for this given SPID)
--=====================================================================================================================
I'm going to assume that isn't in fact required, but if it is, can you provide a server-side trace definition so we're all running the same thing? As you know, traces can have a huge impact on scalar and multi-statement TVFs (e.g. dbo.Split8KXML1) so that would tend to favour the in-line TVFs unfairly.
Paul
Nah... my apologies, Paul. I wouldn't make anyone read the whole script to figure out what they need to do to do ME a favor. That's an artifact from my previous test code and I'll remove it so it doesn't confuse anyone. Thanks for reading the code, though! I love a good peer review.
Also, I'm going to add two of Brad's... the one you posted will be commented as "XML-Brad1 (Split8KXMLBrad1 iTVF)". The other one is what I believe Brad meant to be the fastest and will be commented as XML-Brad (Split8KXMLBrad iTVF). Almost done with that. Just running a sanity check before I update the code.
And thanks for jumping in on this thread. I always appreciate your comments and your code.
--Jeff Moden
September 19, 2010 at 8:36 am
Mark-101232 (9/19/2010)
The difference on mine is far less pronounced
Tally:
Table 'Tally'. Scan count 10000, logical reads 30000, physical reads 0
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 7719 ms, elapsed time = 7791 ms.
SQLCLR:
Table 'CsvTest'. Scan count 1, logical reads 774, physical reads 0
CPU time = 5610 ms, elapsed time = 5815 ms.
1. The logical reads difference seems about the same :-)
2. CLR uses JIT (just-in-time) compilation, so the first few times you (ever) use the SQLCLR function it won't have been fully compiled to native machine code. Run the test several times to ensure the code is fully optimized.
3. If running on a laptop, ensure that your CPU is running at full clock speed (not on battery; check power plan settings, etc.). If in doubt, run something like CPU-Z to check.
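One way to follow point 2 is to repeat the test in a loop and compare the timings of the first run with the later ones. The sketch below is only an illustration: the @Run and @Start variables and the choice of five runs are assumptions, and it reuses the dbo.CsvTest table and dbo.SplitString_Multi function from the test script posted earlier in the thread.

```sql
-- Hypothetical warm-up harness; the run count of 5 is an arbitrary choice.
DECLARE @Run        INTEGER,
        @Start      DATETIME,
        @RowNum     INTEGER,
        @ItemNumber INTEGER,
        @ItemValue  INTEGER;

SET @Run = 1;

WHILE @Run <= 5
BEGIN
    SET @Start = GETDATE();

    -- Assign to variables so the time to display results is not measured.
    SELECT  @RowNum     = CSV.RowNum,
            @ItemNumber = iTVF.sequence,
            @ItemValue  = iTVF.item
    FROM    dbo.CsvTest AS CSV
    CROSS APPLY dbo.SplitString_Multi(CSV.CsvParameter, N',') AS iTVF;

    -- Report elapsed time for this run (GETDATE() is only accurate to a few ms).
    PRINT 'Run ' + CONVERT(VARCHAR(10), @Run) + ': '
        + CONVERT(VARCHAR(10), DATEDIFF(ms, @Start, GETDATE())) + ' ms';

    SET @Run = @Run + 1;
END;
```

If the first run is noticeably slower than the rest, the later runs are probably the fairer figures to compare against the Tally results.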
Paul
September 19, 2010 at 8:43 am
Alright... both of Brad's good XML examples have been added. I left the others in simply because they were there before and I'll continue to collect data on those points. :-)
Thanks again to everyone who participates/participated in this test. I don't know if you have the gumption to do so but for those folks that already ran it, it would be an additional huge help to me if you reran the code and posted the results now that we have Brad's more performant XML functions in the code.
--Jeff Moden
September 19, 2010 at 8:51 am
steve-893342 (9/19/2010)
Yes, but if you compare the two to shift data either with a SELECT INTO or INSERT query you won't see that sort of difference
I think I see what you mean, and the answer is very much "It Depends" :-)
For example, let's use a bcp export to a file (this removes a lot of variable factors):
Tally:
bcp "SELECT csv.RowNum, split.ItemNumber, item = CONVERT(INTEGER, split.ItemValue) FROM tempdb.dbo.CsvTest csv CROSS APPLY tempdb.dbo.Split8KTally(csv.CsvParameter,',') AS split" queryout tally.bcp -n -S .\SQL2008 -T
Results:
1000000 rows copied.
Clock Time (ms.) Total : 11406 Average : (87673.15 rows per sec.)
SQLCLR:
bcp "SELECT CSV.RowNum, iTVF.sequence, item = CONVERT(INTEGER, iTVF.item) FROM tempdb.dbo.CsvTest CSV CROSS APPLY tempdb.dbo.SplitString_Multi(CSV.CsvParameter, N',') iTVF" queryout sqlclr.bcp -n -S .\SQL2008 -T
Results:
1000000 rows copied.
Clock Time (ms.) Total : 3610 Average : (277008.31 rows per sec.)
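As a quick check on how those figures relate, the rows-per-second averages are just the rows copied divided by the clock time in seconds. The sketch below redoes that arithmetic from the two bcp outputs above; the column aliases are illustrative names only.

```sql
-- Rows per second = rows copied / (clock time in ms / 1000).
SELECT  CONVERT(DECIMAL(12, 2), 1000000 / (11406 / 1000.0)) AS TallyRowsPerSec,
        CONVERT(DECIMAL(12, 2), 1000000 / (3610 / 1000.0))  AS SqlClrRowsPerSec,
        -- Ratio of the two elapsed times (Tally over SQLCLR).
        CONVERT(DECIMAL(6, 2), 11406 / 3610.0)              AS ElapsedRatio;
```

On these numbers the SQLCLR export is roughly three times faster, a smaller gap than the factor of five seen when assigning to variables.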
Paul