October 15, 2009 at 3:44 am
Jeff. Once again, greet work on the performance testing.
I did a wee compare last night but do not have the results due to a bit of a power cut.
Once character replacement was added in to non typed xml then it did run about the same, maybe very slightly slower and I am by no means certain I included all the possible replacements.
I'll certainly be going with typed xml for my concatenations.
Using a CLR is absolutely great for readability of script and not having to rememeber the STUF((())()()())XML()())() ness of everything but it lacks the ability to use a custom separator unless you then wrap it in an additional replace anyway.
Besides, I have auto replacements so I just type |st and it drops in a nice stuffed for xml templated.
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
October 15, 2009 at 6:18 am
Jon, when I looked at the initial article is said no cursors...
October 15, 2009 at 6:30 am
We can also concatenete rows using coalesce function, I have given example for that,
Declare @Name as varchar(Max);
with fruit as (
select 'Apple' as name, 101 as id union all
select 'Banana' as name, 102 as id union all
select 'Orange' as name, 103 as id union all
select 'Melon' as name, 104 as id union all
select 'Grape' as name, 105 as id
)
select @Name = coalesce(@Name + ',','') + name From Fruit
select @Name
The Out put Will be..
---------------------
Apple,Banana,Orange,Melon,Grape
October 15, 2009 at 7:30 am
Sigh, when will people learn to read before posting.
You are NOT using coalesce to concatenate.
in the example above, coalesce is used to remove NULL entries.
The below works just the same using that example:
DECLARE @Name AS varchar(Max);
SELECT @Name = ''
;WITH fruit AS (
SELECT 'Apple' AS name, 101 AS id UNION all
SELECT 'Banana' AS name, 102 AS id UNION all
SELECT 'Orange' AS name, 103 AS id UNION all
SELECT 'Melon' AS name, 104 AS id UNION all
SELECT 'Grape' AS name, 105 AS id
)
SELECT @Name = @Name + ',' + name FROM Fruit
SELECT @Name
Also, as mentioned many times consider this:
DECLARE @Name AS varchar(Max);
SELECT @Name = ''
;WITH fruit AS (
SELECT 'Apple' AS name, 101 AS id UNION all
SELECT 'Banana' AS name, 101 AS id UNION all
SELECT 'Orange' AS name, 102 AS id UNION all
SELECT 'Melon' AS name, 1042 AS id UNION all
SELECT 'Grape' AS name, 103 AS id
)
How will you get the results:
ID Names
101 Apple,Banana
102 Orange,melon
103 Grape
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
October 15, 2009 at 7:35 am
Andy DBA (10/14/2009)
Jeff Moden
Also, I noticed the varchar(5) type you used in your " XML TYPE solution" truncated the results. But, when I tested with varchar(max) or varchar(2000) I did not notice any change in performance.
Heh... yeah... I hit the ball out of the park and broke the mayor's window. 😛 I noticed the problem last night and changed it to a VARCHAR(8000). Thanks for the kudo and the catch, Andy.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2009 at 7:40 am
Rob Fisk (10/15/2009)
You are NOT using coalesce to concatenate.in the example above, coalesce is used to remove NULL entries.
The below works just the same using that example:
DECLARE @Name AS varchar(Max);
SELECT @Name = ''
;WITH fruit AS (
SELECT 'Apple' AS name, 101 AS id UNION all
SELECT 'Banana' AS name, 102 AS id UNION all
SELECT 'Orange' AS name, 103 AS id UNION all
SELECT 'Melon' AS name, 104 AS id UNION all
SELECT 'Grape' AS name, 105 AS id
)
SELECT @Name = @Name + ',' + name FROM Fruit
SELECT @Name
Oh, be careful...the code above will leave a trailing comma. The COALESCE (or ISNULL) not only removes null entries, its position in the concatenation formula prevents having a trailing comma.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2009 at 8:02 am
Rob Fisk (10/15/2009)
Jeff. Once again, greet work on the performance testing.I did a wee compare last night but do not have the results due to a bit of a power cut.
Once character replacement was added in to non typed xml then it did run about the same, maybe very slightly slower and I am by no means certain I included all the possible replacements.
I'll certainly be going with typed xml for my concatenations.
Using a CLR is absolutely great for readability of script and not having to rememeber the STUF((())()()())XML()())() ness of everything but it lacks the ability to use a custom separator unless you then wrap it in an additional replace anyway.
Besides, I have auto replacements so I just type |st and it drops in a nice stuffed for xml templated.
Thanks for the feedback. I did make a small error that caused some truncation in the XML method that uses TYPE and have repaired it in the code I previously posted.
As for CLR's... yeap... great for readability... I do have to say again that the SQL Ninjas on this forum can normally beat or come very close to meeting the performance of a CLR. Since most of them (and most SQL Developers in the real world) have a library of T-SQL code for tricks like concatenation, remembering things like STUFF isn't usually a problem.
I'm still looking for someone to do a CLR performance test on concatenation like this against the code I posted so we can have a complete set of tests on a single machine. It can't be me that does the tests because of three things... 1) it would be like the mouse guarding the cheese because I'm mostly against the use of CLRs except for some very special cases and 2) I can't write a lick of C and 3) I just can't bring myself to turn on CLR's. 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2009 at 8:14 am
For those interested in the performance impact of each concatenation method, I blogged about this some time ago. The XML method consistently does better for reads and IO. I did not use the TYPE directive in my XML test, so I would anticipate a bit more overhead, for the XML subquery solution, but it still should remain the best performing.
Link:
http://jahaines.blogspot.com/2009/07/concatenating-column-values-part-2.html
Note: I do not have a CLR concatenation solution. I am like Jeff and cannot bring myself to code a concatenation solution via CLR.
October 15, 2009 at 8:56 am
Oh, be careful...the code above will leave a trailing comma. The COALESCE (or ISNULL) not only removes null entries, its position in the concatenation formula prevents having a trailing comma.
Oops, but it does indicate that it is not coalesce doing the concatenation.
I just went to try performance testing your script Jeff and I ticked the "discard results after execution" option to eliminate the time taken to write the results since there was no, write to temp table element in the script.
I got nothing in messages.
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
October 15, 2009 at 9:01 am
When you discard query results you dont get any information back to the client (SSMS).
October 15, 2009 at 9:16 am
Below is simple sql that does that
declare @res varchar(Max)
select @res = coalesce(@res + ',', '') + [name]
from
(
select 'Apple' as name, 101 as id union all
select 'Banana' as name, 102 as id union all
select 'Orange' as name, 103 as id union all
select 'Melon' as name, 104 as id union all
select 'Grape' as name, 105 as id
) a
select @res
October 15, 2009 at 9:33 am
OK, so the code is about the same but I also inserted the data to a table which was cleared between statistics time toggling:
--===== Do the testing in a nice "safe" place
USE TempDB
GO
--===== Setup the display environment for the "Messages" Tab
SET NOCOUNT OFF --Make sure we can see the rowcounts
SET STATISTICS TIME OFF --Turned ON and OFF later
GO
--===========================================================================================================
--===== Create and populate a 1,000,000 row test table. This is my standard "million row test table".
--===========================================================================================================
-- Column "RowNum" has a range of 1 to 100,000 unique numbers
-- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
-- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
-- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers
-- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
-- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
-- for all rows.
-- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
-- Jeff Moden
PRINT '========== Building the test table and indexes =========='
GO
SELECT TOP 100000
SomeID = IDENTITY(INT,1,1),
SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,
SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),
SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
SomeHex12 = RIGHT(NEWID(),12)
INTO dbo.JBMTest
FROM Master.dbo.SysColumns t1
CROSS JOIN
Master.dbo.SysColumns t2
GO
--===== Add a Clustered PK
ALTER TABLE dbo.JBMTest
ADD PRIMARY KEY CLUSTERED (SomeID)
GO
--===== Add an index to support the concatenation
CREATE INDEX IX_JBMTest_SomeInt ON dbo.JBMTest (SomeInt,SomeLetters2)
GO
--===========================================================================================================
--===== Create the functions to be tested. Again, make sure you're in TempDB
--===========================================================================================================
PRINT '========== Building the functions to test =========='
--===== Ensure we're still using TempDB
USE TempDB
GO
----------------------------------------------------------------------------------
--===== Create a function that uses VARCHAR(8000)
CREATE FUNCTION dbo.Concat8KTest
(@SomeInt INT)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @ReturnString VARCHAR(8000)
SELECT @ReturnString = ISNULL(@ReturnString+',' ,'') + SomeLetters2
FROM dbo.JBMTest
WHERE SomeInt = @SomeInt
--ORDER BY SomeLetters2 --Uncomment to guarantee ordered output if required
RETURN @ReturnString
END
GO
----------------------------------------------------------------------------------
--===== Create a function that uses VARCHAR(8000) with Coalesce
CREATE FUNCTION dbo.Concat8KTestCoalesce
(@SomeInt INT)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @ReturnString VARCHAR(8000)
SELECT @ReturnString = COALESCE(@ReturnString+',' ,'') + SomeLetters2
FROM dbo.JBMTest
WHERE SomeInt = @SomeInt
--ORDER BY SomeLetters2 --Uncomment to guarantee ordered output if required
RETURN @ReturnString
END
GO
----------------------------------------------------------------------------------
--===== Create a function that uses VARCHAR(MAX)
CREATE FUNCTION dbo.ConcatMAXTest
(@SomeInt INT)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @ReturnString VARCHAR(MAX)
SELECT @ReturnString = ISNULL(@ReturnString+',' ,'') + SomeLetters2
FROM dbo.JBMTest
WHERE SomeInt = @SomeInt
--ORDER BY SomeLetters2 --Uncomment to guarantee ordered output if required
RETURN @ReturnString
END
GO
----------------------------------------------------------------------------------
--===== Create a function that uses VARCHAR(MAX) with Coalesce
CREATE FUNCTION dbo.ConcatMAXTestCoalesce
(@SomeInt INT)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @ReturnString VARCHAR(MAX)
SELECT @ReturnString = COALESCE(@ReturnString+',' ,'') + SomeLetters2
FROM dbo.JBMTest
WHERE SomeInt = @SomeInt
--ORDER BY SomeLetters2 --Uncomment to guarantee ordered output if required
RETURN @ReturnString
END
GO
--===== Create a table to drop the results in to eliminate resuls set lag
CREATE TABLE wibble(id INT, string NVARCHAR(MAX))
--===========================================================================================================
--===== Do the tests which includes each function we created and some "inline" methods using XML.
--===== Check the "Messages" table for CPU, Duration, and Row Counts.
--===========================================================================================================
SET NOCOUNT OFF
SET STATISTICS TIME OFF
PRINT '========== 8k ISNULL solution (Concat8KTest) =========='
DELETE wibble
SET STATISTICS TIME ON
INSERT wibble
SELECT SomeInt,dbo.Concat8KTest(SomeInt)
FROM dbo.JBMTest
GROUP BY SomeInt
ORDER BY SomeInt
SET STATISTICS TIME OFF
PRINT '========== MAX ISNULL solution (ConcatMAXTest) =========='
DELETE wibble
SET STATISTICS TIME ON
INSERT wibble
SELECT SomeInt,dbo.ConcatMAXTest(SomeInt)
FROM dbo.JBMTest
GROUP BY SomeInt
ORDER BY SomeInt
SET STATISTICS TIME OFF
PRINT '========== 8k Coalesce solution (Concat8KTestCoalesce) =========='
DELETE wibble
SET STATISTICS TIME ON
INSERT wibble
SELECT SomeInt,dbo.Concat8KTestCoalesce(SomeInt)
FROM dbo.JBMTest
GROUP BY SomeInt
ORDER BY SomeInt
SET STATISTICS TIME OFF
PRINT '========== MAX Coalesce solution (ConcatMAXTestCoalesce) =========='
DELETE wibble
SET STATISTICS TIME ON
INSERT wibble
SELECT SomeInt,dbo.ConcatMAXTestCoalesce(SomeInt)
FROM dbo.JBMTest
GROUP BY SomeInt
ORDER BY SomeInt
SET STATISTICS TIME OFF
PRINT '========== Inline XML solution =========='
DELETE wibble
SET STATISTICS TIME ON
INSERT wibble
SELECT t1.SomeInt,STUFF((SELECT ',' + t2.SomeLetters2
FROM dbo.JBMTest t2
WHERE t2.SomeInt = t1.SomeInt --Correlation here
--ORDER BY t2.SomeLetters2 --Uncomment to guarantee ordered output if required
FOR XML PATH(''))
,1,1,'') AS SomeLetters2
FROM dbo.JBMTest t1
GROUP BY t1.SomeInt --- without GROUP BY multiple rows are returned
ORDER BY t1.SomeInt
SET STATISTICS TIME OFF
PRINT '========== Inline XML solution with REPLACE =========='
DELETE wibble
SET STATISTICS TIME ON
INSERT wibble
SELECT t1.SomeInt,REPLACE(REPLACE(REPLACE(STUFF((SELECT ',' + t2.SomeLetters2
FROM dbo.JBMTest t2
WHERE t2.SomeInt = t1.SomeInt --Correlation here
--ORDER BY t2.SomeLetters2 --Uncomment to guarantee ordered output if required
FOR XML PATH(''))
,1,1,''),'&#gt','>'),'&#lt','<'),'�d','') AS SomeLetters2
FROM dbo.JBMTest t1
GROUP BY t1.SomeInt --- without GROUP BY multiple rows are returned
ORDER BY t1.SomeInt
SET STATISTICS TIME OFF
PRINT '========== Inline XML solution with TYPE=========='
DELETE wibble
SET STATISTICS TIME ON
INSERT wibble
SELECT t1.SomeInt,STUFF((SELECT ',' + t2.SomeLetters2
FROM dbo.JBMTest t2
WHERE t2.SomeInt = t1.SomeInt --Correlation here
--ORDER BY t2.SomeLetters2 --Uncomment to guarantee ordered output if required
FOR XML PATH(''),TYPE).value('.','VARCHAR(8000)')
,1,1,'') AS SomeLetters2
FROM dbo.JBMTest t1
GROUP BY t1.SomeInt --- without GROUP BY multiple rows are returned
ORDER BY t1.SomeInt
SET STATISTICS TIME OFF
PRINT '========== CLR Solution, left JOIN=========='
DELETE wibble
SET STATISTICS TIME ON
INSERT wibble
SELECTt1.SomeInt,dbo.Concatenate(t2.SomeLetters2) AS SomeLetters2
FROMdbo.JBMTest t1
LEFT JOIN dbo.JBMTest t2
ONt2.SomeInt = t1.SomeInt
GROUP BY t1.SomeInt --- without GROUP BY multiple rows are returned
ORDER BY t1.SomeInt
SET STATISTICS TIME OFF
GO
--===========================================================================================================
--===== Housekeeping code for when you are done testing.
--===========================================================================================================
USE TempDB; DROP TABLE dbo.JBMTest
go
USE TempDB; DROP FUNCTION dbo.Concat8KTest,dbo.Concat8KTestCoalesce,dbo.ConcatMAXTest,dbo.ConcatMAXTestCoalesce
go
USE TempDB; DROP TABLE wibble
In addition to the CLR I added Inline XML solution with REPLACE
The machine is a core2 DUO 2.2GH with 2GB RAM running SQL 2005 express locally and the results are as follows:
========== Building the test table and indexes ==========
(100000 row(s) affected)
========== Building the functions to test ==========
========== 8k ISNULL solution (Concat8KTest) ==========
(0 row(s) affected)
SQL Server Execution Times:
CPU time = 3438 ms, elapsed time = 5714 ms.
(43326 row(s) affected)
========== MAX ISNULL solution (ConcatMAXTest) ==========
(43326 row(s) affected)
SQL Server Execution Times:
CPU time = 4375 ms, elapsed time = 6887 ms.
(43326 row(s) affected)
========== 8k Coalesce solution (Concat8KTestCoalesce) ==========
(43326 row(s) affected)
SQL Server Execution Times:
CPU time = 3718 ms, elapsed time = 5752 ms.
(43326 row(s) affected)
========== MAX Coalesce solution (ConcatMAXTestCoalesce) ==========
(43326 row(s) affected)
SQL Server Execution Times:
CPU time = 4563 ms, elapsed time = 6953 ms.
(43326 row(s) affected)
========== Inline XML solution ==========
(43326 row(s) affected)
SQL Server Execution Times:
CPU time = 781 ms, elapsed time = 772 ms.
(43326 row(s) affected)
========== Inline XML solution with REPLACE ==========
(43326 row(s) affected)
SQL Server Execution Times:
CPU time = 1172 ms, elapsed time = 1172 ms.
(43326 row(s) affected)
========== Inline XML solution with TYPE==========
(43326 row(s) affected)
SQL Server Execution Times:
CPU time = 1984 ms, elapsed time = 1982 ms.
(43326 row(s) affected)
========== CLR Solution, left JOIN==========
(43326 row(s) affected)
SQL Server Execution Times:
CPU time = 3266 ms, elapsed time = 3288 ms.
(43326 row(s) affected)
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
October 15, 2009 at 4:09 pm
Outstanding, Rob! I love it when a plan comes together. According to your timings, the XML beat the tar out of the CLR and the simple 8k function almost caught the CLR.
As a bit of a side bar, the REPLACE method is fine but only if you catch all the possible characters. For example, you missed "&" and a couple of others which may make it worthwhile to use Adam's "TYPE" directive even though it slows things down a bit. With or without the TYPE, the XML still beats up on both the CLR and the 8k function.
Anyway, thank you again for taking the time to setup and test the CLR concatenation. It gives folks like me something to reference without having to actually instantiate CLR's.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 15, 2009 at 4:14 pm
msam77-503326 (10/15/2009)
Below is simple sql that does thatdeclare @res varchar(Max)
select @res = coalesce(@res + ',', '') + [name]
from
(
select 'Apple' as name, 101 as id union all
select 'Banana' as name, 102 as id union all
select 'Orange' as name, 103 as id union all
select 'Melon' as name, 104 as id union all
select 'Grape' as name, 105 as id
) a
select @res
Yep... agreed... lot's of folks have posted that same thing this thread.
I'd also recommend that, unless you really need it, that you use VARCHAR(8000) instead of VARCHAR(MAX) because it's usage does slow things down a bit.
I also see that's your first post... "Welcome aboard!"
--Jeff Moden
Change is inevitable... Change for the better is not.
October 16, 2009 at 3:07 am
As a bit of a side bar, the REPLACE method is fine but only if you catch all the possible characters. For example, you missed "&" and a couple of others which may make it worthwhile to use Adam's "TYPE" directive even though it slows things down a bit. With or without the TYPE, the XML still beats up on both the CLR and the 8k function.
Yeah, it was nearing 5 and I didn't have time to hunt down all the characters that should be escaped. It was more of an indication towards the performance bloat the more manual replaces you have to do, not to mention making readability a bitch.
In hindsight missing some and having you pick up on it highlights the possible unreliability of that method too.
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
Viewing 15 posts - 76 through 90 (of 159 total)
You must be logged in to reply to this topic. Login to reply