March 18, 2009 at 6:30 pm
Hello all,
I have a table where one column contains comma delimited string
ID vlaue_string
1 23:45:355:68
1 4545:908:24
2 343:32
3 812:321:23434:34:45:68
There is maximum number of tokens in the string - 6
I need to split each string and put every token in its respective column for further manipulation:
ID vlaue_string Col1 Col2 Col3 Col4 Col5 Col6
1 23:45:35:68 23 45 35 68 null null
1 15:908:24 15 908 24 null null null
2 343:32 343 32 null null null null
3 812:321:434:34:45:68 812 321 434 34 45 68
Could someone please help me to do it?
Is it possible at all to do it in SSIS, or in T-SQL? Or I have to do all processing in CLR?
Thanks a lot in advance!
Pit.
March 19, 2009 at 8:34 am
It can be done in SSIS, but you have to use a Script Component to do it. Within the Script Component you would use the Split function which creates an array then you can add the columns to the Script Components output by looping through the array.
You can also do it in T-SQL like this:
DECLARE @table TABLE (ID INT, value VARCHAR(100))
INSERT INTO @table (
ID,
value
)
SELECT
1, '23:45:355:68'
Union All
Select
1, '4545:908:24'
Union All
Select
2, '343:32'
Union All
Select
3, '812:321:23434:34:45:68'
/*
this builds a tally or numbers cte to help do the split
see this article http://www.sqlservercentral.com/articles/TSQL/62867/
for an explanation of the tally table and the split function that follows
*/
;WITH cteTally AS
(
SELECT TOP 100
ROW_NUMBER() OVER(ORDER BY NAME) as n
FROM
sys.all_columns AS AC
),
cteData As
(
SELECT
/*
I need this to uniquely identitfy each row since your
test data includes 2 rows with id 1. If the id column
is unique then you can skip this step and replace value_id
with id in later steps
*/
ROW_NUMBER() OVER (ORDER BY T.id) AS value_id,
T.ID,
T.value
FROM
@table T
),
/*
this splits the values into multiple rows identifying each row
for each value_id for use to PIVOT the data
*/
cteValues AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY T.value_id ORDER BY T.id) AS row_id,
T.value_id,
T.ID,
SUBSTRING(':'+T.value+':',N+1,CHARINDEX(':',':'+T.value+':',N+1)-N-1) AS split_value,
T.value
FROM
cteData AS T CROSS JOIN
cteTally AS TL
WHERE
N < LEN(':'+T.value+':') AND
SUBSTRING(':'+T.value+':',N,1) = ':'
)
/*
Here is where we return the final data in the format you want,
I used the MIN function but you could use MAX as each row will only
have one value for each column. See this article for details on the
pivoting http://www.sqlservercentral.com/articles/T-SQL/63681/
The first example uses the PIVOT function the second (Commented out) does not.
*/
SELECT
id,
value,
[1] AS Col1,
[2] AS Col2,
[3] AS Col3,
[4] AS Col4,
[5] AS Col5,
[6] AS Col6
FROM
(SELECT value_id, row_id, id, value, split_value FROM cteValues) AS Src
PIVOT (MIN(split_value) FOR row_id IN ([1],[2],[3], [4], [5], [6]))AS pvt
ORDER BY
id,
value
/*
SELECT
id,
value,
Min(CASE
WHEN row_id = 1 THEN split_value
ELSE NULL
END) AS col1,
Min(CASE
WHEN row_id = 2 THEN split_value
ELSE NULL
END) AS col2,
Min(CASE
WHEN row_id = 3 THEN split_value
ELSE NULL
END) AS col3,
Min(CASE
WHEN row_id = 4 THEN split_value
ELSE NULL
END) AS col4,
Min(CASE
WHEN row_id = 5 THEN split_value
ELSE NULL
END) AS col5,
Min(CASE
WHEN row_id = 6 THEN split_value
ELSE NULL
END) AS col6
FROM
cteValues
GROUP BY
value_id,
id,
value
*/
Which you use (SSIS or T-SQL) depends on what you need to do next.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
March 23, 2009 at 1:32 pm
Thanks a lot, Jack!
It all worked perfect! And performance with Tally table is great!
Thanks a lot for your help!
Pit.
March 23, 2009 at 2:42 pm
Glad I could help. A tally table has a lot of uses, almost anytime you think you need a loop see if there is a way to use a tally table.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
October 28, 2009 at 3:05 pm
January 15, 2010 at 2:45 am
Jack Corbett (3/19/2009)
It can be done in SSIS, but you have to use a Script Component to do it. Within the Script Component you would use the Split function which creates an array then you can add the columns to the Script Components output by looping through the array.
Hi Jack - would please be able to provide an example on how to do this using the Scrip Component?
I'm trying to accomplish the same thing, but I so far I have never needed to use that Component before.
Thanks
/Matthew
January 15, 2010 at 7:10 am
Matthew,
Here's an example of what you could do. In the example I don't pivot the data as I just want to split the string and have multiple rows. You should be able to figure out how to Pivot it if that is your need:
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Dim RowId As String = Row.RowID
Dim DelimitedList As String = Row.DelimitedList
Dim delimiter As String = ","
If Not String.IsNullOrEmpty(RowId.Trim) Then
If Not (String.IsNullOrEmpty(DelimitedList)) Then
Dim DelimitedListArray() As String = DelimitedList.Split(New String() {delimiter}, StringSplitOptions.RemoveEmptyEntries)
For Each item As String In DelimitedListArray
With Output0Buffer
.AddRow()
.RowId = RowId
.Item = item.Replace(ControlChars.Cr, "").Replace(ControlChars.Lf, "").Replace(ControlChars.Tab, "").Trim()
End With
Next
End If
End If
End Sub
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply