split a comma delimited string into columns

Question

Post reply

split a comma delimited string into columns

mynkdby

SSC Eights!

Points: 846
More actions
October 3, 2013 at 11:21 pm

#278693

DECLARE @p NVARCHAR(MAX)
SELECT @P = 'MONDAY,SUN,FUN,D'
DECLARE @STRSQL NVARCHAR(MAX)
SET @sql = 'SELECT ''' + REPLACE(@P,',',''',''') + ''''
EXEC( SQL)
Above Code is splitting the comma delimited string into four columns but how to give columns names with in this code.??
I was trying this with creating new dynamic table with four column but i am not sure that if it really works.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Pulivarthi Sasidhar SSCertifiable Points: 6706 More actions · Answer 1

Use the following code...

SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''' as Col,''') + ''''+' as lst'

OR

SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''' ,''') + ''''

If you are sure that it contains 4 columns only... to generate dynamic...

then create a table (temp /table variable)

Declare @tbl table(col1 Varchar(10),col2 Varchar(10),col3 Varchar(10),col4 Varchar(10))

INSERT INTO @tbl

Execute(@STRSQL)

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 2

You could use Jeff Moden's string splitter [/url]and pivot the result. Alternatively, if the number of columns is known up front, a CROSS APPLY cascade [/url]would be quicker.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

mynkdby SSC Eights! Points: 846 More actions · Answer 3

Actually there can be any number of commas in between input parameter so in that only having 4 column thing is incorrect...

and the code you have provided that i am getting this result with Column Name Col for rest of the column except last one

Col Col Col lst

---------- ---- ---- ----

AFGSDFGSDF BSDF CSDF D

mynkdby SSC Eights! Points: 846 More actions · Answer 4

Pulivarthi Sasidhar (10/4/2013)
Use the following code...
SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''' as Col,''') + ''''+' as lst'
OR
SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''' ,''') + ''''
If you are sure that it contains 4 columns only... to generate dynamic...
then create a table (temp /table variable)
Declare @tbl table(col1 Varchar(10),col2 Varchar(10),col3 Varchar(10),col4 Varchar(10))
INSERT INTO @tbl
Execute(@STRSQL)

hi ,

Actually there can be any number of commas in between input parameter so in that only having 4 column thing is incorrect...

and the code you have provided that i am getting this result with Column Name Col for rest of the column except last one

Col Col Col lst

---------- ---- ---- ----

AFGSDFGSDF BSDF CSDF D

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 5

Hello

please see some code (still work in progress) below

the idea is to strip a csv column into separate columns:

the number of columns is indeterminate as is the length of the string parts.

each col will have a numerically incremented column header

nulls converted to empty strings for readability

maybe it will give you some ideas

thanks to Jeff Moden for random string creations in this post

http://www.sqlservercentral.com/Forums/FindPost1501816.aspx

and to the following article:

http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx

will appreciate feedback and improvements 😛

thanks

use tempdb

go

/*drop the test tables*/

IF OBJECT_ID('tempdb..Table1','U') IS NOT NULL DROP TABLE Table1;

IF OBJECT_ID('tempdb..Table2','U') IS NOT NULL DROP TABLE Table2;

IF OBJECT_ID('tempdb..Tally','U') IS NOT NULL DROP TABLE Tally;

GO

/*Create a Tally/Numbers table....NB other methods can be used*/

SELECT TOP 10000

IDENTITY(INT,1,1) AS N

INTO dbo.Tally

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

/* Primary Key */

ALTER TABLE dbo.Tally

ADD CONSTRAINT PK_Tally_N

PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

/*Create some sample data for the csv column split taken from this excellent post by Jeff Moden*/

/*Create Table1*/

;WITH

cteGenProduct AS

(

SELECT DISTINCT TOP 1000

ProductName = LTRIM(REPLACE(SUBSTRING(LTRIM(NEWID()),ABS(CHECKSUM(NEWID()))%10+1,ABS(CHECKSUM(NEWID()))%10+5),'-',' '))

FROM master.sys.all_columns ac1

CROSS JOIN master.sys.all_columns ac2

)

SELECT ID = IDENTITY(BIGINT,1,1)

,ProductName

INTO Table1

FROM cteGenProduct

;

--SELECT * FROM Table1

;

--===== Create Table 2.

-- The ProductNames from Table 1 will randomly appear as CSV data

-- with 4 to 10 "elements"

WITH

cteRandomData AS

(

SELECT TOP 25000

ID = ROW_NUMBER()OVER(ORDER BY (SELECT 1))

FROM master.sys.all_columns ac1

CROSS JOIN master.sys.all_columns ac2

)

SELECT ID

,ProductName =

(SELECT STUFF(CAST(

( --=== Concatenate N number of random t1.ProductName's

SELECT TOP (ABS(CHECKSUM(NEWID()))%7+4)

N','+t1.ProductName

FROM Table1 t1

WHERE t2.ID > 0 --This nonsense is to randomize the CSV

ORDER BY NEWID()

FOR XML PATH('')

)

AS VARCHAR(8000)),1,1,N'')

)

INTO Table2

FROM cteRandomData t2

;

CREATE NONCLUSTERED INDEX [NCI_1] ON [dbo].[Table2]

([ProductName] ASC) INCLUDE ( [ID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

GO

--SELECT * FROM Table2

;

/*SOLUTION HERE*/

/*split into columns with col headers and NULL as empty string*/

/*base idea from here http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx*/

DECLARE @pivot varchar(8000)

DECLARE @select varchar(8000)

DECLARE @output varchar(8000)

SELECT

@pivot=coalesce(@pivot+',','')+'[col'+cast(N as varchar(10))+']'

FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)

--print @pivot

SELECT

@output=coalesce(@output+',','')+'isnull([col'+cast(N as varchar(10))+'],'''')' + ' as col'+cast(N as varchar(10))+''

FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)

--print @output

SELECT

@select='

select p.id, '+@output+'

--into tempresults

from (

select

id,substring(productname, start+2, endPos-Start-2) as token,

''col''+cast(row_number() over(partition by id order by start) as varchar(10)) as n

from (

select

id, productname, n as start, charindex('','',productname,n+2) endPos

from (select N-1 as n from Tally ) num

cross join

(

select

id, '','' + productname +'','' as productname

from

table2

) m

where n < len(productname)-1

and substring(productname,n+1,1) = '','') as productname

) pvt

Pivot ( max(token)for n in ('+@pivot+'))p

order by p.id'

EXEC(@select)

/*add additional csvs for extra columns*/

--INSERT INTO [table2]([ID],[ProductName])

--VALUES(50000,'a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z')

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 6

Sorry... deleted bad post.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 7

J Livingston SQL (10/6/2013)
Hello
please see some code (still work in progress) below
the idea is to strip a csv column into separate columns:
the number of columns is indeterminate as is the length of the string parts.
each col will have a numerically incremented column header
nulls converted to empty strings for readability
maybe it will give you some ideas
thanks to Jeff Moden for random string creations in this post
http://www.sqlservercentral.com/Forums/FindPost1501816.aspx
and to the following article:
http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx
will appreciate feedback and improvements 😛
thanks
use tempdb
go
/*drop the test tables*/
IF OBJECT_ID('tempdb..Table1','U') IS NOT NULL DROP TABLE Table1;
IF OBJECT_ID('tempdb..Table2','U') IS NOT NULL DROP TABLE Table2;
IF OBJECT_ID('tempdb..Tally','U') IS NOT NULL DROP TABLE Tally;
GO
/*Create a Tally/Numbers table....NB other methods can be used*/
SELECT TOP 10000
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
/* Primary Key */
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
/*Create some sample data for the csv column split taken from this excellent post by Jeff Moden*/
/*Create Table1*/
;WITH
cteGenProduct AS
(
SELECT DISTINCT TOP 1000
ProductName = LTRIM(REPLACE(SUBSTRING(LTRIM(NEWID()),ABS(CHECKSUM(NEWID()))%10+1,ABS(CHECKSUM(NEWID()))%10+5),'-',' '))
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
)
SELECT ID = IDENTITY(BIGINT,1,1)
,ProductName
INTO Table1
FROM cteGenProduct
;
--SELECT * FROM Table1
;
--===== Create Table 2.
-- The ProductNames from Table 1 will randomly appear as CSV data
-- with 4 to 10 "elements"
WITH
cteRandomData AS
(
SELECT TOP 25000
ID = ROW_NUMBER()OVER(ORDER BY (SELECT 1))
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
)
SELECT ID
,ProductName =
(SELECT STUFF(CAST(
( --=== Concatenate N number of random t1.ProductName's
SELECT TOP (ABS(CHECKSUM(NEWID()))%7+4)
N','+t1.ProductName
FROM Table1 t1
WHERE t2.ID > 0 --This nonsense is to randomize the CSV
ORDER BY NEWID()
FOR XML PATH('')
)
AS VARCHAR(8000)),1,1,N'')
)
INTO Table2
FROM cteRandomData t2
;
CREATE NONCLUSTERED INDEX [NCI_1] ON [dbo].[Table2]
([ProductName] ASC) INCLUDE ( [ID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
--SELECT * FROM Table2
;
/*SOLUTION HERE*/
/*split into columns with col headers and NULL as empty string*/
/*base idea from here http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx*/
DECLARE @pivot varchar(8000)
DECLARE @select varchar(8000)
DECLARE @output varchar(8000)
SELECT
@pivot=coalesce(@pivot+',','')+'[col'+cast(N as varchar(10))+']'
FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)
--print @pivot
SELECT
@output=coalesce(@output+',','')+'isnull([col'+cast(N as varchar(10))+'],'''')' + ' as col'+cast(N as varchar(10))+''
FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)
--print @output
SELECT
@select='
select p.id, '+@output+'
--into tempresults
from (
select
id,substring(productname, start+2, endPos-Start-2) as token,
''col''+cast(row_number() over(partition by id order by start) as varchar(10)) as n
from (
select
id, productname, n as start, charindex('','',productname,n+2) endPos
from (select N-1 as n from Tally ) num
cross join
(
select
id, '','' + productname +'','' as productname
from
table2
) m
where n < len(productname)-1
and substring(productname,n+1,1) = '','') as productname
) pvt
Pivot ( max(token)for n in ('+@pivot+'))p
order by p.id'
EXEC(@select)
/*add additional csvs for extra columns*/
--INSERT INTO [table2]([ID],[ProductName])
--VALUES(50000,'a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z')

It works great, Graham. Well done. I could be mistaken but it does look like there's a chance for SQL Injection in the dynamic use of the @Pivot variable, though. Correct me if I'm wrong.

I guess the only thing I don't understand is why this has to be able to handle any number of elements. The data shows that each element should belong to a specifically named column. Pivoting the data with generic column names doesn't make sense to me according to the very limited information given. I have the bad feeling that someone is trying to make a generic crud solution instead of doing it the right way. Certainly this will all become a problem if anyone ever changes the order of what is being passed because there is simply no way of knowing which data goes to what columns according to what is passed in.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 8

mynkdby (10/4/2013)
Actually there can be any number of commas in between input parameter so in that only having 4 column thing is incorrect...
and the code you have provided that i am getting this result with Column Name Col for rest of the column except last one
Col Col Col lst
---------- ---- ---- ----
AFGSDFGSDF BSDF CSDF D

What's the rest of the story behind this CSV? I ask because you are 100% depending on blind postitional notation and that's a very bad idea. It's as bad using SELECT * in a view. What will you be using the output of the pivoted table for?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 9

mynkdby (10/3/2013)
DECLARE @p NVARCHAR(MAX)
SELECT @P = 'MONDAY,SUN,FUN,D'
DECLARE @STRSQL NVARCHAR(MAX)
SET @sql = 'SELECT ''' + REPLACE(@P,',',''',''') + ''''
EXEC( SQL)
Above Code is splitting the comma delimited string into four columns but how to give columns names with in this code.??
I was trying this with creating new dynamic table with four column but i am not sure that if it really works.

Pulivarthi Sasidhar (10/4/2013)
Use the following code...
SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''' as Col,''') + ''''+' as lst'
OR
SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''' ,''') + ''''
If you are sure that it contains 4 columns only... to generate dynamic...
then create a table (temp /table variable)
Declare @tbl table(col1 Varchar(10),col2 Varchar(10),col3 Varchar(10),col4 Varchar(10))
INSERT INTO @tbl
Execute(@STRSQL)

Actually, I had it right the first time. Don't use code like this because it is VERY prone to SQL Injection. For example...

DECLARE @p NVARCHAR(MAX)

SELECT @P = 'MONDAY,SUN,FUN,D'';SELECT ''BANG! The database is Dead!! This could have been an EXEC or DROP or...'';--'

DECLARE @STRSQL NVARCHAR(MAX)

SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''',''') + ''''

EXEC( @STRSQL)

Here are the results..

------ ---- ---- ----

MONDAY SUN FUN D

(1 row(s) affected)

-----------------------------------------------------------------------

BANG! The database is Dead!! This could have been an EXEC or DROP or...

(1 row(s) affected)

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 10

I could be mistaken but it does look like there's a chance for SQL Injection in the dynamic use of the @Pivot variable, though. Correct me if I'm wrong.
I guess the only thing I don't understand is why this has to be able to handle any number of elements. The data shows that each element should belong to a specifically named column. Pivoting the data with generic column names doesn't make sense to me according to the very limited information given. I have the bad feeling that someone is trying to make a generic crud solution instead of doing it the right way. Certainly this will all become a problem if anyone ever changes the order of what is being passed because there is simply no way of knowing which data goes to what columns according to what is passed in.

Jeff...the reason I did this was to take some poorly structured data (contact info) passed from a legacy system...wanted to tidy up for readability for users. added column headers more as an "exercise" and to possibly meet OP's request...but seems OP doesn't care to respond so far.

re SQL injection....this is not production code....one off process I run periodically...but take on board all posts of late re this issue.

as brief demo of what I need to do.....code below

use tempdb

go

/*drop the test table*/

IF OBJECT_ID('tempdb..Table2','U') IS NOT NULL DROP TABLE Table2;

GO

/*NEED A TALLY TABLE AS WELL !*/

CREATE TABLE [dbo].[Table2](

[ID] [int] NULL,

[csvdata] [varchar](8000) NULL

) ON [PRIMARY]

INSERT INTO [Table2]([ID],[csvdata])

VALUES(1,'Phone 04412 3355 44445, email b.baggins@abc.com, mobile: 0777 1234564444')

INSERT INTO [Table2]([ID],[csvdata])

VALUES(2,'Mob 04412 3355 44445 main no, facsimile 01288 6655447 888 in hours, mail bill.theboss@xyz.com, tel office 44 55 66 887777, tel out of hours 44 55 66 8812345, email bill@home.gmoon.com')

INSERT INTO [Table2]([ID],[csvdata])

VALUES(3,'Telephone only 55 66 1112333 (9to5)')

SELECT * FROM Table2

/*SOLUTION HERE*/

DECLARE @pivot varchar(8000)

DECLARE @select varchar(8000)

DECLARE @output varchar(8000)

SELECT

@pivot=coalesce(@pivot+',','')+'[col'+cast(N as varchar(10))+']'

FROM Tally where N <=(SELECT max(len(csvdata)-len(replace(csvdata,',',''))+1) FROM table2)

SELECT

@output=coalesce(@output+',','')+'isnull([col'+cast(N as varchar(10))+'],'''')' + ' as col'+cast(N as varchar(10))+''

FROM Tally where N <=(SELECT max(len(csvdata)-len(replace(csvdata,',',''))+1) FROM table2)

SELECT

@select='

select p.id, '+@output+'

--into tempresults

from (

select

id,substring(csvdata, start+2, endPos-Start-2) as token,

''col''+cast(row_number() over(partition by id order by start) as varchar(10)) as n

from (

select

id, csvdata, n as start, charindex('','',csvdata,n+2) endPos

from (select N-1 as n from Tally ) num

cross join

(

select

id, '','' + csvdata +'','' as csvdata

from

table2

) m

where n < len(csvdata)-1

and substring(csvdata,n+1,1) = '','') as csvdata

) pvt

Pivot ( max(token)for n in ('+@pivot+'))p

order by p.id'

EXEC(@select)

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

mynkdby SSC Eights! Points: 846 More actions · Answer 11

Jeff Moden (10/6/2013)
mynkdby (10/4/2013)
Actually there can be any number of commas in between input parameter so in that only having 4 column thing is incorrect...
and the code you have provided that i am getting this result with Column Name Col for rest of the column except last one
Col Col Col lst
---------- ---- ---- ----
AFGSDFGSDF BSDF CSDF D
What's the rest of the story behind this CSV? I ask because you are 100% depending on blind postitional notation and that's a very bad idea. It's as bad using SELECT * in a view. What will you be using the output of the pivoted table for?

Correct

mynkdby SSC Eights! Points: 846 More actions · Answer 12

J Livingston SQL (10/6/2013)
Hello
please see some code (still work in progress) below
the idea is to strip a csv column into separate columns:
the number of columns is indeterminate as is the length of the string parts.
each col will have a numerically incremented column header
nulls converted to empty strings for readability
maybe it will give you some ideas
thanks to Jeff Moden for random string creations in this post
http://www.sqlservercentral.com/Forums/FindPost1501816.aspx
and to the following article:
http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx
will appreciate feedback and improvements 😛
thanks
use tempdb
go
/*drop the test tables*/
IF OBJECT_ID('tempdb..Table1','U') IS NOT NULL DROP TABLE Table1;
IF OBJECT_ID('tempdb..Table2','U') IS NOT NULL DROP TABLE Table2;
IF OBJECT_ID('tempdb..Tally','U') IS NOT NULL DROP TABLE Tally;
GO
/*Create a Tally/Numbers table....NB other methods can be used*/
SELECT TOP 10000
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
/* Primary Key */
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
/*Create some sample data for the csv column split taken from this excellent post by Jeff Moden*/
/*Create Table1*/
;WITH
cteGenProduct AS
(
SELECT DISTINCT TOP 1000
ProductName = LTRIM(REPLACE(SUBSTRING(LTRIM(NEWID()),ABS(CHECKSUM(NEWID()))%10+1,ABS(CHECKSUM(NEWID()))%10+5),'-',' '))
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
)
SELECT ID = IDENTITY(BIGINT,1,1)
,ProductName
INTO Table1
FROM cteGenProduct
;
--SELECT * FROM Table1
;
--===== Create Table 2.
-- The ProductNames from Table 1 will randomly appear as CSV data
-- with 4 to 10 "elements"
WITH
cteRandomData AS
(
SELECT TOP 25000
ID = ROW_NUMBER()OVER(ORDER BY (SELECT 1))
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
)
SELECT ID
,ProductName =
(SELECT STUFF(CAST(
( --=== Concatenate N number of random t1.ProductName's
SELECT TOP (ABS(CHECKSUM(NEWID()))%7+4)
N','+t1.ProductName
FROM Table1 t1
WHERE t2.ID > 0 --This nonsense is to randomize the CSV
ORDER BY NEWID()
FOR XML PATH('')
)
AS VARCHAR(8000)),1,1,N'')
)
INTO Table2
FROM cteRandomData t2
;
CREATE NONCLUSTERED INDEX [NCI_1] ON [dbo].[Table2]
([ProductName] ASC) INCLUDE ( [ID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
--SELECT * FROM Table2
;
/*SOLUTION HERE*/
/*split into columns with col headers and NULL as empty string*/
/*base idea from here http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx*/
DECLARE @pivot varchar(8000)
DECLARE @select varchar(8000)
DECLARE @output varchar(8000)
SELECT
@pivot=coalesce(@pivot+',','')+'[col'+cast(N as varchar(10))+']'
FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)
--print @pivot
SELECT
@output=coalesce(@output+',','')+'isnull([col'+cast(N as varchar(10))+'],'''')' + ' as col'+cast(N as varchar(10))+''
FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)
--print @output
SELECT
@select='
select p.id, '+@output+'
--into tempresults
from (
select
id,substring(productname, start+2, endPos-Start-2) as token,
''col''+cast(row_number() over(partition by id order by start) as varchar(10)) as n
from (
select
id, productname, n as start, charindex('','',productname,n+2) endPos
from (select N-1 as n from Tally ) num
cross join
(
select
id, '','' + productname +'','' as productname
from
table2
) m
where n < len(productname)-1
and substring(productname,n+1,1) = '','') as productname
) pvt
Pivot ( max(token)for n in ('+@pivot+'))p
order by p.id'
EXEC(@select)
/*add additional csvs for extra columns*/
--INSERT INTO [table2]([ID],[ProductName])
--VALUES(50000,'a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z')

Got the work done 🙂

DECLARE @CREATETABLE NVARCHAR(MAX)

DECLARE @p NVARCHAR(MAX)

SELECT @CREATETABLE = ''

SELECT @P = 'SEE,IF,THIS,CODE,IS,SPLITTING'

DECLARE @STRSQL NVARCHAR(MAX)

--Create temporary table to store results

SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''',''') + ''''

SELECT @CREATETABLE = @CREATETABLE + ' ,COLUMN' + CONVERT(NVARCHAR(4),ROWNUMBER) + ' NVARCHAR(4000)'

FROM (

SELECT TOP (LEN(@P) - LEN(REPLACE(@P,',',''))) ROW_NUMBER() OVER(ORDER BY (SELECT 1)) ROWNUMBER

FROM SYS.COLUMNS

) A

SELECT @CREATETABLE = 'CREATE TABLE #TEMPSTORAGE ( COLUMN0 NVARCHAR(4000) ' + ISNULL(@CREATETABLE,'') + ')'

--Insert the result set into the temporary table

SELECT @STRSQL = @CREATETABLE + '

INSERT INTO #TEMPSTORAGE ' + @STRSQL + '

SELECT * FROM #TEMPSTORAGE '

EXEC( @STRSQL)

mynkdby SSC Eights! Points: 846 More actions · Answer 13

Jeff Moden (10/6/2013)
J Livingston SQL (10/6/2013)
Hello
please see some code (still work in progress) below
the idea is to strip a csv column into separate columns:
the number of columns is indeterminate as is the length of the string parts.
each col will have a numerically incremented column header
nulls converted to empty strings for readability
maybe it will give you some ideas
thanks to Jeff Moden for random string creations in this post
http://www.sqlservercentral.com/Forums/FindPost1501816.aspx
and to the following article:
http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx
will appreciate feedback and improvements 😛
thanks
use tempdb
go
/*drop the test tables*/
IF OBJECT_ID('tempdb..Table1','U') IS NOT NULL DROP TABLE Table1;
IF OBJECT_ID('tempdb..Table2','U') IS NOT NULL DROP TABLE Table2;
IF OBJECT_ID('tempdb..Tally','U') IS NOT NULL DROP TABLE Tally;
GO
/*Create a Tally/Numbers table....NB other methods can be used*/
SELECT TOP 10000
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
/* Primary Key */
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
/*Create some sample data for the csv column split taken from this excellent post by Jeff Moden*/
/*Create Table1*/
;WITH
cteGenProduct AS
(
SELECT DISTINCT TOP 1000
ProductName = LTRIM(REPLACE(SUBSTRING(LTRIM(NEWID()),ABS(CHECKSUM(NEWID()))%10+1,ABS(CHECKSUM(NEWID()))%10+5),'-',' '))
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
)
SELECT ID = IDENTITY(BIGINT,1,1)
,ProductName
INTO Table1
FROM cteGenProduct
;
--SELECT * FROM Table1
;
--===== Create Table 2.
-- The ProductNames from Table 1 will randomly appear as CSV data
-- with 4 to 10 "elements"
WITH
cteRandomData AS
(
SELECT TOP 25000
ID = ROW_NUMBER()OVER(ORDER BY (SELECT 1))
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
)
SELECT ID
,ProductName =
(SELECT STUFF(CAST(
( --=== Concatenate N number of random t1.ProductName's
SELECT TOP (ABS(CHECKSUM(NEWID()))%7+4)
N','+t1.ProductName
FROM Table1 t1
WHERE t2.ID > 0 --This nonsense is to randomize the CSV
ORDER BY NEWID()
FOR XML PATH('')
)
AS VARCHAR(8000)),1,1,N'')
)
INTO Table2
FROM cteRandomData t2
;
CREATE NONCLUSTERED INDEX [NCI_1] ON [dbo].[Table2]
([ProductName] ASC) INCLUDE ( [ID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
--SELECT * FROM Table2
;
/*SOLUTION HERE*/
/*split into columns with col headers and NULL as empty string*/
/*base idea from here http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx*/
DECLARE @pivot varchar(8000)
DECLARE @select varchar(8000)
DECLARE @output varchar(8000)
SELECT
@pivot=coalesce(@pivot+',','')+'[col'+cast(N as varchar(10))+']'
FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)
--print @pivot
SELECT
@output=coalesce(@output+',','')+'isnull([col'+cast(N as varchar(10))+'],'''')' + ' as col'+cast(N as varchar(10))+''
FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)
--print @output
SELECT
@select='
select p.id, '+@output+'
--into tempresults
from (
select
id,substring(productname, start+2, endPos-Start-2) as token,
''col''+cast(row_number() over(partition by id order by start) as varchar(10)) as n
from (
select
id, productname, n as start, charindex('','',productname,n+2) endPos
from (select N-1 as n from Tally ) num
cross join
(
select
id, '','' + productname +'','' as productname
from
table2
) m
where n < len(productname)-1
and substring(productname,n+1,1) = '','') as productname
) pvt
Pivot ( max(token)for n in ('+@pivot+'))p
order by p.id'
EXEC(@select)
/*add additional csvs for extra columns*/
--INSERT INTO [table2]([ID],[ProductName])
--VALUES(50000,'a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z')
It works great, Graham. Well done. I could be mistaken but it does look like there's a chance for SQL Injection in the dynamic use of the @Pivot variable, though. Correct me if I'm wrong.
I guess the only thing I don't understand is why this has to be able to handle any number of elements. The data shows that each element should belong to a specifically named column. Pivoting the data with generic column names doesn't make sense to me according to the very limited information given. I have the bad feeling that someone is trying to make a generic crud solution instead of doing it the right way. Certainly this will all become a problem if anyone ever changes the order of what is being passed because there is simply no way of knowing which data goes to what columns according to what is passed in.

Thanks that was a help, got this done with this

DECLARE @CREATETABLE NVARCHAR(MAX)

DECLARE @p NVARCHAR(MAX)

SELECT @CREATETABLE = ''

SELECT @P = 'SEE,IF,THIS,CODE,IS,SPLITTING'

DECLARE @STRSQL NVARCHAR(MAX)

--Create temporary table to store results

SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''',''') + ''''

SELECT @CREATETABLE = @CREATETABLE + ' ,COLUMN' + CONVERT(NVARCHAR(4),ROWNUMBER) + ' NVARCHAR(4000)'

FROM (

SELECT TOP (LEN(@P) - LEN(REPLACE(@P,',',''))) ROW_NUMBER() OVER(ORDER BY (SELECT 1)) ROWNUMBER

FROM SYS.COLUMNS

) A

SELECT @CREATETABLE = 'CREATE TABLE #TEMPSTORAGE ( COLUMN0 NVARCHAR(4000) ' + ISNULL(@CREATETABLE,'') + ')'

--Insert the result set into the temporary table

SELECT @STRSQL = @CREATETABLE + '

INSERT INTO #TEMPSTORAGE ' + @STRSQL + '

SELECT * FROM #TEMPSTORAGE '

EXEC( @STRSQL)

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 14

mynkdby (10/9/2013)
Jeff Moden (10/6/2013)
J Livingston SQL (10/6/2013)
Hello
please see some code (still work in progress) below
the idea is to strip a csv column into separate columns:
the number of columns is indeterminate as is the length of the string parts.
each col will have a numerically incremented column header
nulls converted to empty strings for readability
maybe it will give you some ideas
thanks to Jeff Moden for random string creations in this post
http://www.sqlservercentral.com/Forums/FindPost1501816.aspx
and to the following article:
http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx
will appreciate feedback and improvements 😛
thanks
use tempdb
go
/*drop the test tables*/
IF OBJECT_ID('tempdb..Table1','U') IS NOT NULL DROP TABLE Table1;
IF OBJECT_ID('tempdb..Table2','U') IS NOT NULL DROP TABLE Table2;
IF OBJECT_ID('tempdb..Tally','U') IS NOT NULL DROP TABLE Tally;
GO
/*Create a Tally/Numbers table....NB other methods can be used*/
SELECT TOP 10000
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
/* Primary Key */
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
/*Create some sample data for the csv column split taken from this excellent post by Jeff Moden*/
/*Create Table1*/
;WITH
cteGenProduct AS
(
SELECT DISTINCT TOP 1000
ProductName = LTRIM(REPLACE(SUBSTRING(LTRIM(NEWID()),ABS(CHECKSUM(NEWID()))%10+1,ABS(CHECKSUM(NEWID()))%10+5),'-',' '))
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
)
SELECT ID = IDENTITY(BIGINT,1,1)
,ProductName
INTO Table1
FROM cteGenProduct
;
--SELECT * FROM Table1
;
--===== Create Table 2.
-- The ProductNames from Table 1 will randomly appear as CSV data
-- with 4 to 10 "elements"
WITH
cteRandomData AS
(
SELECT TOP 25000
ID = ROW_NUMBER()OVER(ORDER BY (SELECT 1))
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
)
SELECT ID
,ProductName =
(SELECT STUFF(CAST(
( --=== Concatenate N number of random t1.ProductName's
SELECT TOP (ABS(CHECKSUM(NEWID()))%7+4)
N','+t1.ProductName
FROM Table1 t1
WHERE t2.ID > 0 --This nonsense is to randomize the CSV
ORDER BY NEWID()
FOR XML PATH('')
)
AS VARCHAR(8000)),1,1,N'')
)
INTO Table2
FROM cteRandomData t2
;
CREATE NONCLUSTERED INDEX [NCI_1] ON [dbo].[Table2]
([ProductName] ASC) INCLUDE ( [ID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
--SELECT * FROM Table2
;
/*SOLUTION HERE*/
/*split into columns with col headers and NULL as empty string*/
/*base idea from here http://beyondrelational.com/modules/2/blogs/70/posts/10844/splitting-delimited-data-to-columns-set-based-approach.aspx*/
DECLARE @pivot varchar(8000)
DECLARE @select varchar(8000)
DECLARE @output varchar(8000)
SELECT
@pivot=coalesce(@pivot+',','')+'[col'+cast(N as varchar(10))+']'
FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)
--print @pivot
SELECT
@output=coalesce(@output+',','')+'isnull([col'+cast(N as varchar(10))+'],'''')' + ' as col'+cast(N as varchar(10))+''
FROM Tally where N <=(SELECT max(len(productname)-len(replace(productname,',',''))+1) FROM table2)
--print @output
SELECT
@select='
select p.id, '+@output+'
--into tempresults
from (
select
id,substring(productname, start+2, endPos-Start-2) as token,
''col''+cast(row_number() over(partition by id order by start) as varchar(10)) as n
from (
select
id, productname, n as start, charindex('','',productname,n+2) endPos
from (select N-1 as n from Tally ) num
cross join
(
select
id, '','' + productname +'','' as productname
from
table2
) m
where n < len(productname)-1
and substring(productname,n+1,1) = '','') as productname
) pvt
Pivot ( max(token)for n in ('+@pivot+'))p
order by p.id'
EXEC(@select)
/*add additional csvs for extra columns*/
--INSERT INTO [table2]([ID],[ProductName])
--VALUES(50000,'a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z')
It works great, Graham. Well done. I could be mistaken but it does look like there's a chance for SQL Injection in the dynamic use of the @Pivot variable, though. Correct me if I'm wrong.
I guess the only thing I don't understand is why this has to be able to handle any number of elements. The data shows that each element should belong to a specifically named column. Pivoting the data with generic column names doesn't make sense to me according to the very limited information given. I have the bad feeling that someone is trying to make a generic crud solution instead of doing it the right way. Certainly this will all become a problem if anyone ever changes the order of what is being passed because there is simply no way of knowing which data goes to what columns according to what is passed in.
Thanks that was a help, got this done with this
DECLARE @CREATETABLE NVARCHAR(MAX)
DECLARE @p NVARCHAR(MAX)
SELECT @CREATETABLE = ''
SELECT @P = 'SEE,IF,THIS,CODE,IS,SPLITTING'
DECLARE @STRSQL NVARCHAR(MAX)
--Create temporary table to store results
SET @STRSQL = 'SELECT ''' + REPLACE(@P,',',''',''') + ''''
SELECT @CREATETABLE = @CREATETABLE + ' ,COLUMN' + CONVERT(NVARCHAR(4),ROWNUMBER) + ' NVARCHAR(4000)'
FROM (
SELECT TOP (LEN(@P) - LEN(REPLACE(@P,',',''))) ROW_NUMBER() OVER(ORDER BY (SELECT 1)) ROWNUMBER
FROM SYS.COLUMNS
) A
SELECT @CREATETABLE = 'CREATE TABLE #TEMPSTORAGE ( COLUMN0 NVARCHAR(4000) ' + ISNULL(@CREATETABLE,'') + ')'
--Insert the result set into the temporary table
SELECT @STRSQL = @CREATETABLE + '
INSERT INTO #TEMPSTORAGE ' + @STRSQL + '
SELECT * FROM #TEMPSTORAGE '
EXEC( @STRSQL)

You're not listening and it will cost you dearly in the future. The method for splitting the input parameter that you're using is dangerous because it can be used for SQL Injection someday in the future. Go back and look at the post above where I mention this.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)