Creating a Comma-Separated List (SQL Spackle)

Question

Creating a Comma-Separated List (SQL Spackle)

Viewing 15 posts - 16 through 30 (of 84 total)

You must be logged in to reply to this topic. Login to reply

WayneS SSC Guru Points: 95460 More actions · Answer 1

Matt, Jason - thanks!

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Tom Bakerman Hall of Fame Points: 3367 More actions · Answer 2

Wayne,

Can you provide more explanation on this part of the query:

TYPE).value('.','varchar(max)')

I have always used:

WITH CTE AS

(

SELECT DISTINCT

AccountNumber

FROM #TestData

)

SELECT AccountNumber,

CommaList = STUFF((

SELECT ',' + Value

FROM #TestData

WHERE AccountNumber = CTE.AccountNumber

ORDER BY Value

FOR XML PATH('')),1,1,'') -- no TYPE.value

FROM CTE;

This provides the same results as your query.

WayneS SSC Guru Points: 95460 More actions · Answer 3

Tom Bakerman (1/12/2011)
Wayne,
Can you provide more explanation on this part of the query:
TYPE).value('.','varchar(max)')

Tom,

Put an Ampersand (&), Less than (<) or Greater than (>) in your data, and see what happens with or without it:

WITH CTE (Data) AS

(

SELECT 'Rolling Stones' UNION ALL

SELECT 'Hall & Oates' UNION ALL

SELECT 'One <> Two'

)

SELECT [WithType] = STUFF((

SELECT ',' + Data

FROM CTE

FOR XML PATH(''),TYPE).value('.','varchar(max)'),1,1,''),

[WithOutType] = STUFF((

SELECT ',' + Data

FROM CTE

FOR XML PATH('')),1,1,'');

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 4

WayneS (1/12/2011)
Mark-101232 (1/12/2011)
Changing
value('.','varchar(max)'),1,1,'')
to
value('(./text())[1]','varchar(max)'),1,1,'')
appears to give a better query plan
Thanks for the information... I'll have to test this out.
Edit: WOW! What a difference! Identical queries, with just that one change. The way in the article gets:
(1000 row(s) affected)
Table '#TestData'. Scan count 1001, logical reads 2411, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 234 ms, elapsed time = 3175 ms.
Your suggested change gets:
(1000 row(s) affected)
Table '#TestData'. Scan count 1001, logical reads 2411, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 172 ms, elapsed time = 561 ms.
Looking at the execution plans (both queries run together), the method in the article uses 99% and your solution 1%.
Thanks for the tip!

Very cool! Wayne writes an excellent article, Mark provides an excellent performance enhancement in the discussion, and Wayne tests it for everyone. It doesn't get much better than that. Well done to you both. 🙂

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

WayneS SSC Guru Points: 95460 More actions · Answer 5

Jeff Moden (1/12/2011)
Very cool! Wayne writes an excellent article, Mark provides an excellent performance enhancement in the discussion, and Wayne tests it for everyone. It doesn't get much better than that. Well done to you both. 🙂

Thanks Jeff. Now, I just have to get used to this "new" way of doing this!

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Tom Bakerman Hall of Fame Points: 3367 More actions · Answer 6

WayneS (1/12/2011)
Tom,
Put an Ampersand (&), Less than (<) or Greater than (>) in your data, and see what happens with or without it:

Got it. Thank you. For our application, having those characters encoded is actually a plus :-D, as all we do is ship the data off to a web app.

Tom

WayneS SSC Guru Points: 95460 More actions · Answer 7

Tom Bakerman (1/12/2011)
WayneS (1/12/2011)
Tom,
Put an Ampersand (&), Less than (<) or Greater than (>) in your data, and see what happens with or without it:
Got it. Thank you. For our application, having those characters encoded is actually a plus :-D, as all we do is ship the data off to a web app.
Tom

Makes sense... now you have two ways to handle things!

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 8

1) don't put the clustered index on the table and the order by and then run the two versions. Virtually identical timing and resource usage on my mid-grade laptop.

2) Does NOT having the clustered index on the table open up the potential for getting the wrong output?

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Tom Bakerman Hall of Fame Points: 3367 More actions · Answer 9

TheSQLGuru (1/12/2011)
1) don't put the clustered index on the table and the order by and then run the two versions. Virtually identical timing and resource usage on my mid-grade laptop.
2) Does NOT having the clustered index on the table open up the potential for getting the wrong output?

Wrong how? The ORDER BY is explicit in the query to generate the xml. If you leave the ORDER BY out of that part of the query. then, as with any other query, you run the risk of seemingly random ordering.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 10

WayneS (1/12/2011)
Jeff Moden (1/12/2011)
Very cool! Wayne writes an excellent article, Mark provides an excellent performance enhancement in the discussion, and Wayne tests it for everyone. It doesn't get much better than that. Well done to you both. 🙂
Thanks Jeff. Now, I just have to get used to this "new" way of doing this!

Perhaps an addition to the article to show Mark's method and the race-code you ran would be in order. I know Steve would go for it. It'll help us all (especially me 'cuz I forget 😉 ).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

WayneS SSC Guru Points: 95460 More actions · Answer 11

WayneS (1/12/2011)
Mark-101232 (1/12/2011)
Changing
value('.','varchar(max)'),1,1,'')
to
value('(./text())[1]','varchar(max)'),1,1,'')
appears to give a better query plan
Thanks for the information... I'll have to test this out.
Edit: WOW! What a difference! Identical queries, with just that one change. The way in the article gets:
(1000 row(s) affected)
Table '#TestData'. Scan count 1001, logical reads 2411, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 234 ms, elapsed time = 3175 ms.
Your suggested change gets:
(1000 row(s) affected)
Table '#TestData'. Scan count 1001, logical reads 2411, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 172 ms, elapsed time = 561 ms.
Looking at the execution plans (both queries run together), the method in the article uses 99% and your solution 1%.
Thanks for the tip!

FYI: I followed up the testing I did at work with testing at home. The difference at home is not so profound, but Mark's method is still routinely faster, sometimes by 33%. There is a difference in the test code between work and home: the home version is dumping the results into a temp table, to keep everything strictly within SQL and eliminate display (sending output to the display is "the great equalizer"!).

The test code:

DBCC FREEPROCCACHE WITH NO_INFOMSGS;

DBCC FREESYSTEMCACHE ('ALL') WITH NO_INFOMSGS;

PRINT '--XML TYPE with value(''.'')';

SET STATISTICS IO,TIME ON;

GO

IF OBJECT_ID('tempdb..#test') IS NOT NULL DROP TABLE #test;

WITH CTE AS

(

SELECT DISTINCT

AccountNumber

FROM #TestData

)

SELECT AccountNumber,

CommaList = STUFF((SELECT ',' + Value

FROM #TestData

WHERE AccountNumber = CTE.AccountNumber

ORDER BY Value

FOR XML PATH(''),

TYPE).value('.','varchar(max)'),1,1,'')

INTO #test

FROM CTE

ORDER BY AccountNumber;

GO

SET STATISTICS IO,TIME OFF;

PRINT '';

PRINT '--XML TYPE with value(''(./text())[1]'')';

SET STATISTICS IO,TIME ON;

GO

IF OBJECT_ID('tempdb..#test') IS NOT NULL DROP TABLE #test;

WITH CTE AS

(

SELECT DISTINCT

AccountNumber

FROM #TestData

)

SELECT AccountNumber,

CommaList = STUFF((SELECT ',' + Value

FROM #TestData

WHERE AccountNumber = CTE.AccountNumber

ORDER BY Value

FOR XML PATH(''),

TYPE).value('(./text())[1]','varchar(max)'),1,1,'')

INTO #test

FROM CTE

ORDER BY AccountNumber;

SET STATISTICS IO,TIME OFF;

The results:

--XML TYPE with value('.')

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 13 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 2 ms.

Table '#TestData'. Scan count 1001, logical reads 2411, ...

SQL Server Execution Times:

CPU time = 140 ms, elapsed time = 213 ms.

--XML TYPE with value('(./text())[1]')

SQL Server parse and compile time:

CPU time = 10 ms, elapsed time = 10 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 1 ms.

Table '#TestData'. Scan count 1001, logical reads 2411, ...

SQL Server Execution Times:

CPU time = 94 ms, elapsed time = 137 ms.

As I said, not so profound a difference, but yet always faster and it still has a better execution plan!

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Clay Bowen SSC Enthusiast Points: 106 More actions · Answer 12

Is there a way to take these results and post them to a .csv without using powershell? We have been creating out comma delimited files using VB but it would be nice to do it directly from SQL Server. This procedure creates the Comma-Delimited values in a table, it would be nice if I could output them directly from a procedure in SQL Server.

Tom Bakerman Hall of Fame Points: 3367 More actions · Answer 13

Clay Bowen (1/13/2011)
Is there a way to take these results and post them to a .csv without using powershell? We have been creating out comma delimited files using VB but it would be nice to do it directly from SQL Server. This procedure creates the Comma-Delimited values in a table, it would be nice if I could output them directly from a procedure in SQL Server.

Use Sqlcmd and redirect the output to a file:

sqlcmd -S yourserver-d yourdatabase -Q "your query here" > output.csv

Paul White SSC Guru Points: 150468 More actions · Answer 14

Hey Wayne,

The (./text())[1] modification (and variations) has been around for a very long time. I included in the CSV demo in part 2 of my APPLY article in April last year, and I certainly didn't invent it. I have a suspicion that I picked it up from Barry, but I can't be sure.

Finally, I have a feeling the /text() node has a length limitation...

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

janis.l.murphy SSC Veteran Points: 239 More actions · Answer 15

I found the alternate solution offered here to be most efficient.