August 21, 2008 at 3:59 pm
Hey Wayne, I must be on drugs today. Or maybe I'm not and ought to be, given all the typos I'm making today. "Seconds" in my last post should be "ms". Both techniques are sub-second queries at 20,000 rows, but the "FOR XML" version is still 50 times faster. That gets my attention.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 22, 2008 at 12:47 am
For what it's worth, try this method.
--Create a Tally table: just a table that holds a big range of sequential integers (100000 in this case)
use AdventureWorks
DECLARE @StartTime DATETIME --Timer to measure total duration
SET @StartTime = GETDATE() --Start the timer
--=============================================================================
-- Create and populate a Tally table
--=============================================================================
--===== Conditionally drop
IF OBJECT_ID('dbo.Tally') IS NOT NULL
DROP TABLE dbo.Tally
--===== Create and populate the Tally table on the fly
SELECT TOP 100000
IDENTITY(int,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Let the public use it
GRANT SELECT, REFERENCES ON dbo.Tally TO PUBLIC
--===== Display the total duration
SELECT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Milliseconds duration'
Use the XML method and note the time (I got 1750 ms):
use AdventureWorks
GO
-- create comma-delimited string with the above XML method
declare @CSV varchar(max), @StartTime datetime, @EndTime datetime
select @CSV = (select ',' + AccountNumber from Sales.SalesOrderHeader FOR XML PATH('') )
set @CSV = substring(@CSV, 2, len(@CSV)-1)
-- convert the CSV string into a valid XML string
set @StartTime = CURRENT_TIMESTAMP
declare @MyXMLData XML
-- escape special XML characters so the string parses cleanly as XML
set @CSV = replace(replace(@CSV,'&','&amp;'),'<','&lt;')
set @MyXMLData = '<Rows><Row><AccountNumber>' +
replace(@CSV,',','</AccountNumber></Row><Row><AccountNumber>') +
'</AccountNumber></Row></Rows>'
select x.item.value('AccountNumber[1]','nvarchar(15)')
from @MyXMLData.nodes('/Rows/Row') AS x(item)
set @EndTime = CURRENT_TIMESTAMP
select @StartTime, @EndTime, DateDiff(ms, @StartTime, @EndTime)
GO
Now use the Tally table and check the time again (I got 390 ms):
DECLARE @Parameter VARCHAR(max), @StartTime datetime, @EndTime datetime
select @Parameter = (select ',' + AccountNumber from Sales.SalesOrderHeader FOR XML PATH('') ) + ','
set @StartTime = CURRENT_TIMESTAMP
--===== Join the Tally table to the string at the character level and,
--      when we find a comma, return what sits between that comma and
--      the next comma
SELECT SUBSTRING(@Parameter,N+1,CHARINDEX(',',@Parameter,N+1)-N-1)
FROM dbo.Tally
WHERE N < LEN(@Parameter)
AND SUBSTRING(@Parameter,N,1) = ',' --Notice how we find the comma
set @EndTime = CURRENT_TIMESTAMP
select @StartTime, @EndTime, DateDiff(ms, @StartTime, @EndTime)
I would like to think that both of these methods have a place in a code library and can be used where applicable.
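As a rough sketch of what such a library routine might look like (untested here; the function name is made up, and it assumes the dbo.Tally table created above), the Tally split can be wrapped in an inline table-valued function:
CREATE FUNCTION dbo.fn_SplitList (@List VARCHAR(MAX), @Delim CHAR(1))
RETURNS TABLE
AS
RETURN
    -- Pad the list with a leading and trailing delimiter, then cut out whatever
    -- sits between each pair of delimiters (same idea as the SELECT above)
    SELECT SUBSTRING(d.Padded, t.N + 1,
                     CHARINDEX(@Delim, d.Padded, t.N + 1) - t.N - 1) AS Element
    FROM (SELECT @Delim + @List + @Delim AS Padded) d
    CROSS JOIN dbo.Tally t
    WHERE t.N < LEN(d.Padded)
      AND SUBSTRING(d.Padded, t.N, 1) = @Delim
GO
-- Example call:
SELECT Element FROM dbo.fn_SplitList('100,101,110', ',')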
August 22, 2008 at 7:16 am
lucian (8/22/2008)
I would like to think that both these methods have a place in a code library and can be used where applicable
I agree. And that's a pretty interesting use of the tally table to get the results. It's easily modified for any delimiter (see the sketch below).
It looks to me like the string substitution that wraps the values in XML tags is taking about 40% of that time.
FYI, I'm consistently getting a longer time for the XML method, and a faster time for the tally table, than what you posted.
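As a quick, untested sketch of that delimiter change (it assumes the dbo.Tally table built earlier in the thread and uses a made-up pipe-delimited test string):
DECLARE @Parameter VARCHAR(MAX), @Delim CHAR(1)
SET @Delim = '|'
-- pad the test string with a leading and trailing delimiter, as in lucian's example
SET @Parameter = '|alpha|bravo|charlie|'
SELECT SUBSTRING(@Parameter, N + 1, CHARINDEX(@Delim, @Parameter, N + 1) - N - 1) AS Element
FROM dbo.Tally
WHERE N < LEN(@Parameter)
  AND SUBSTRING(@Parameter, N, 1) = @Delim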
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
August 22, 2008 at 7:35 am
WayneS (8/22/2008)
lucian (8/22/2008)
I would like to think that both of these methods have a place in a code library and can be used where applicable.
I agree. And that's a pretty interesting use of the tally table to get the results. It's easily modified for any delimiter.
It looks to me like the string substitution that wraps the values in XML tags is taking about 40% of that time.
FYI, I'm consistently getting a longer time for the XML method, and a faster time for the tally table, than what you posted.
Strange :unsure: I first thought that it could be a hardware difference. Just goes to show that testing is important.
August 22, 2008 at 1:23 pm
My existing parser function uses a tally table like that last example and runs pretty damn quick. The huge CPU performance difference I saw was in building the CSV string, not taking it apart. I don't see how a tally table could help out there.
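As an untested sketch of that comparison, here are the two CSV-building patterns from this thread (variable concatenation vs. FOR XML PATH) bracketed with the same DATEDIFF timing lucian used:
use AdventureWorks
GO
DECLARE @CSV VARCHAR(MAX), @t DATETIME
-- build the string by concatenating into a variable
SET @t = GETDATE()
SET @CSV = ''
SELECT @CSV = @CSV + ',' + AccountNumber FROM Sales.SalesOrderHeader
SELECT DATEDIFF(ms, @t, GETDATE()) AS ConcatMS
-- build the string with FOR XML PATH('')
SET @t = GETDATE()
SELECT @CSV = (SELECT ',' + AccountNumber FROM Sales.SalesOrderHeader FOR XML PATH(''))
SELECT DATEDIFF(ms, @t, GETDATE()) AS ForXmlMS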
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 25, 2008 at 12:23 am
Thanks for the reply, Wayne.
Thanks and Regards
Anil
August 25, 2008 at 12:23 am
bhovious (8/22/2008)
My existing parser function uses a tally table like that last example and runs pretty damn quick. The huge CPU performance difference I saw was in building the CSV string, not taking it apart. I don't see how a tally table could help out there.
😉
So how do you build the string? The SELECT with FOR XML is very fast.
August 25, 2008 at 12:43 pm
Hi, thanks for the article.
This and the other articles I've read that show how to use XML for string tasks all provide examples using variables. I'm looking to parse a comma-separated string, but from a table, not just one row. I'm working through the XQuery help files, but it's a little tricky for me. Has anyone done anything like this? For clarification, I'd like to turn this:
create table #myexample ( id int, liststring varchar(50) )
insert into #myexample ( id, liststring ) VALUES ( 1, '100,101' )
insert into #myexample ( id, liststring ) VALUES ( 2, '200,301,401' )
insert into #myexample ( id, liststring ) VALUES ( 3, '100,110' )
insert into #myexample ( id, liststring ) VALUES ( 4, '70' )
into this:
1 100
1 101
2 200
2 301
2 401
3 100
3 110
4 70
but as a result set that I could use in a view.
I do realize I could do this with a tally table (and currently am doing it that way), but I'd like to expand myself and use/learn about XML.
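For concreteness, one possible shape for this (a sketch only, against the #myexample temp table above; values containing XML-special characters such as & or < would need the REPLACE escaping shown earlier in the thread):
-- wrap each list in <r> elements, cast it to XML, then shred it with nodes()
SELECT  m.id,
        x.item.value('.', 'varchar(20)') AS listvalue
FROM    #myexample AS m
CROSS APPLY (SELECT CAST('<r>' + REPLACE(m.liststring, ',', '</r><r>') + '</r>' AS XML) AS listxml) AS c
CROSS APPLY c.listxml.nodes('/r') AS x(item)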
Thanks much.
August 25, 2008 at 3:44 pm
Hey lucian,
Here is some code for parsing using a tally table. We usually have a pre-generated one in each database, but this code will run on any SQL Server 2005 machine (it generates the tally table on the fly using CTEs).
I've often generated comma- or pipe-separated lists using a simple select like the one below, but I agree that the XML method is faster on CPU.
Regards,
Bob
------------------------------------------------------------------------------------------------------------------
-- parse string using a tally table
declare @inputString varchar(7900)
declare @sepChar varchar(50)
declare @element int
declare @input varchar(8000)
set @inputString = '1/2/3/a/b/c/delta/bravo/#/@'
set @sepChar = '/'
set @element = 7 --which element of the list to return
set @input = @sepChar + @inputString + @sepChar --pad with leading/trailing delimiters
select @input
;WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1),--2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows "L0 as A, L0 as B" is just shorthand way to code a cross join
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
Tally AS (SELECT TOP 1000 ROW_NUMBER() OVER(ORDER BY C) AS N FROM L4),
Array AS
(select Row_Number() over (order by N) as E,substring(@input,N+1,charindex(@sepChar,@input,N+1)-(N+1)) as element
from tally
where substring(@input,N,1) = @sepChar
and N < len(@input)
)
select element
from Array
where E = @element
------------------------------------------------------------------------------------------------------------------
-- generate CSV string
declare @csv as varchar(max)
set @csv = ''
;WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1),--2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows "L0 as A, L0 as B" is just shorthand way to code a cross join
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
Tally AS (SELECT TOP 1000 ROW_NUMBER() OVER(ORDER BY C) AS N FROM L4)
select @csv = @csv+cast(N as varchar(6))+','
from tally
set @csv = stuff(@csv,len(@csv),1,'')
print @csv
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 26, 2008 at 12:16 am
bhovious (8/25/2008)
Hey lucian,
Here is some code for parsing using a tally table. We usually have a pre-generated one in each database, but this code will run on any SQL Server 2005 machine (it generates the tally table on the fly using CTEs).
I've often generated comma- or pipe-separated lists using a simple select like the one below, but I agree that the XML method is faster on CPU.
Regards,
Bob
Thanks, Bob. I've never had a case where I needed to build a CSV string; the programmers do that in their code on the client side. I just process/parse the CSV string. But you never know what lies ahead, so I will hold onto this idea.