Split comma delimited field?

  • Hi

    I have a field that is comma delimited (text1,text2,text3).
    I'd like to split the field into separate columns, not separate rows.
    So I'd like:  text1  text2  text3
    not:
    text1
    text2
    text3
    But the field sometimes has 3 comma-separated values, sometimes 6, sometimes 8, with a max of 13 (if that helps).

    Any ideas would be great

    Thanks

  • And what have you tried so far?

    This subject has been discussed over and over, and there is plenty of info both here on this forum and on the net. Surely you know how to use Google.

  • Look for the function (iTVF) dbo.DelimitedSplit8K.

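    For anyone who hasn't used it: DelimitedSplit8K is an inline table-valued function that returns one row per delimited item, with ItemNumber and Item columns. A minimal illustration of what it returns, assuming the function is already installed as dbo.DelimitedSplit8K:

    SELECT s.ItemNumber, s.Item
    FROM dbo.DelimitedSplit8K('text1,text2,text3', ',') s;
    -- ItemNumber  Item
    -- ----------  -----
    -- 1           text1
    -- 2           text2
    -- 3           text3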

  • ScottPletcher - Wednesday, October 24, 2018 1:51 PM

    Look for the function (iTVF) dbo.DelimitedSplit8K.

    And that will help make additional rows, but the OP needs to pivot those rows to columns, and the number of columns is not fixed. That means the OP needs more than a splitter.
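    To illustrate the idea (a sketch only, assuming a hypothetical dbo.YourTable(id, combined_column) and the DelimitedSplit8K splitter): split the field into rows, then fold the rows back into columns with conditional aggregates grouped by the row's key.

    SELECT yt.id,
           MAX(CASE WHEN s.ItemNumber = 1 THEN s.Item END) AS Col1,
           MAX(CASE WHEN s.ItemNumber = 2 THEN s.Item END) AS Col2,
           MAX(CASE WHEN s.ItemNumber = 3 THEN s.Item END) AS Col3  -- ...repeat up to the expected maximum
    FROM dbo.YourTable yt
    CROSS APPLY dbo.DelimitedSplit8K(yt.combined_column, ',') s
    GROUP BY yt.id;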

  • Thanks Scott

    That works great!!

  • Lynn Pettis - Wednesday, October 24, 2018 1:57 PM

    ScottPletcher - Wednesday, October 24, 2018 1:51 PM

    Look for the function (iTVF) dbo.DelimitedSplit8K.

    And that will help make additional rows, but the OP needs to pivot those rows to columns, and the number of columns is not fixed. That means the OP needs more than a splitter.

    Couldn't I just group by ID?

  • I thought I'd try to do it with CROSS APPLY. I was successful up to 12 columns, but the 13th broke SQL Server:
    Msg 8632, Level 17, State 2, Line 1
    Internal error: An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them.


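    -- Each CHARINDEX/LEFT/SUBSTRING pair below peels one value off the front of the remaining string (RHS)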
    ;WITH CTE AS (
    SELECT *
    FROM (VALUES ('qwe,ert,dfg,xcvb,fghj,tyu,gjk,tyu,dfg,ey,dfg,fgh,'),
      ('qwe,ert,dfkjhgkg,xcvb,fghj,tyu,gjk,tyu,dfg,ey,dfgfgh,213,dfg,yuk,dfg,sdf,'),
      ('qwe,er khjgt,dfg,xcvb,')) T(C)
           )
    select T.C,Col1,Col2,Col3,Col4,Col5,Col6,Col7,Col8,Col9,Col10,Col11,Col12 --,Col13
    from CTE T
    cross apply (values (CHARINDEX(',',T.C))) AS c1(CommaPos)
    cross apply (values (LEFT(T.C,ABS(c1.CommaPos-1)),SUBSTRING(T.C,c1.CommaPos+1,8000))) t1(Col1,RHS)
    cross apply (values (CHARINDEX(',',t1.RHS))) c2(CommaPos)
    cross apply (values (LEFT(t1.RHS,ABS(c2.CommaPos-1)),SUBSTRING(t1.RHS,c2.CommaPos+1,8000))) t2(Col2,RHS)
    cross apply (values (CHARINDEX(',',t2.RHS))) c3(CommaPos)
    cross apply (values (LEFT(t2.RHS,ABS(c3.CommaPos-1)),SUBSTRING(t2.RHS,c3.CommaPos+1,8000))) t3(Col3,RHS)
    cross apply (values (CHARINDEX(',',t3.RHS))) c4(CommaPos)
    cross apply (values (LEFT(t3.RHS,ABS(c4.CommaPos-1)),SUBSTRING(t3.RHS,c4.CommaPos+1,8000))) t4(Col4,RHS)
    cross apply (values (CHARINDEX(',',t4.RHS))) c5(CommaPos)
    cross apply (values (LEFT(t4.RHS,ABS(c5.CommaPos-1)),SUBSTRING(t4.RHS,c5.CommaPos+1,8000))) t5(Col5,RHS)
    cross apply (values (CHARINDEX(',',t5.RHS))) c6(CommaPos)
    cross apply (values (LEFT(t5.RHS,ABS(c6.CommaPos-1)),SUBSTRING(t5.RHS,c6.CommaPos+1,8000))) t6(Col6,RHS)
    cross apply (values (CHARINDEX(',',t6.RHS))) c7(CommaPos)
    cross apply (values (LEFT(t6.RHS,ABS(c7.CommaPos-1)),SUBSTRING(t6.RHS,c7.CommaPos+1,8000))) t7(Col7,RHS)
    cross apply (values (CHARINDEX(',',t7.RHS))) c8(CommaPos)
    cross apply (values (LEFT(t7.RHS,ABS(c8.CommaPos-1)),SUBSTRING(t7.RHS,c8.CommaPos+1,8000))) t8(Col8,RHS)
    cross apply (values (CHARINDEX(',',t8.RHS))) c9(CommaPos)
    cross apply (values (LEFT(t8.RHS,ABS(c9.CommaPos-1)),SUBSTRING(t8.RHS,c9.CommaPos+1,8000))) t9(Col9,RHS)
    cross apply (values (CHARINDEX(',',t9.RHS))) c10(CommaPos)
    cross apply (values (LEFT(t9.RHS,ABS(c10.CommaPos-1)),SUBSTRING(t9.RHS,c10.CommaPos+1,8000))) t10(Col10,RHS)
    cross apply (values (CHARINDEX(',',t10.RHS))) c11(CommaPos)
    cross apply (values (LEFT(t10.RHS,ABS(c11.CommaPos-1)),SUBSTRING(t10.RHS,c11.CommaPos+1,8000))) t11(Col11,RHS)
    cross apply (values (CHARINDEX(',',t11.RHS))) c12(CommaPos)
    cross apply (values (LEFT(t11.RHS,ABS(c12.CommaPos-1)),SUBSTRING(t11.RHS,c12.CommaPos+1,8000))) t12(Col12,RHS)
    --cross apply (values (CHARINDEX(',',t12.RHS))) c13(CommaPos)
    --cross apply (values (LEFT(t12.RHS,ABS(c13.CommaPos-1)),SUBSTRING(t12.RHS,c13.CommaPos+1,8000))) t13(Col13,RHS)

    Hmmm, how to simplify?

  • Think dynamic pivot of the delimited values.  You will need to determine the maximum number of values in the column being split.  It really looks like something for dynamic SQL, but it isn't necessarily something that could be created off the cuff.  It would take me a little time to hack something together; maybe someone else could do it quicker.
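    A rough sketch of that approach, assuming a hypothetical dbo.YourTable(id, combined_column) and the DelimitedSplit8K splitter (the WHILE loop only builds the column list; it never touches the data row by row):

    DECLARE @MaxCols int, @n int = 1, @Cols nvarchar(MAX) = N'', @Sql nvarchar(MAX);

    -- Determine the largest number of delimited values in any row.
    SELECT @MaxCols = MAX(s.ItemNumber)
    FROM dbo.YourTable yt
    CROSS APPLY dbo.DelimitedSplit8K(yt.combined_column, ',') s;

    -- Build one MAX(CASE ...) expression per column position.
    WHILE @n <= @MaxCols
    BEGIN
        SET @Cols = @Cols
            + CASE WHEN @n = 1 THEN N'' ELSE N', ' END
            + N'MAX(CASE WHEN s.ItemNumber = ' + CONVERT(nvarchar(10), @n)
            + N' THEN s.Item END) AS Col' + RIGHT(N'0' + CONVERT(nvarchar(10), @n), 2);
        SET @n += 1;
    END;

    -- Wrap the generated column list in the pivot query and run it.
    SET @Sql = N'
    SELECT yt.id, yt.combined_column, ' + @Cols + N'
    FROM dbo.YourTable yt
    CROSS APPLY dbo.DelimitedSplit8K(yt.combined_column, '','') s
    GROUP BY yt.id, yt.combined_column;';

    EXEC sys.sp_executesql @Sql;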

  • I'd use a cross tab within the CROSS APPLY to reduce overhead (I would think).


    IF OBJECT_ID('tempdb.dbo.#your_table') IS NOT NULL
        DROP TABLE #your_table;

    CREATE TABLE #your_table
    (
        id int NOT NULL,
        combined_column varchar(8000) NULL,
        name varchar(60) NOT NULL,
        other_column_1 decimal(9, 3) NULL,
        other_column_2 datetime NULL
    )
    INSERT INTO #your_table VALUES
    (1, 'qwe,ert,dfg,xcvb,fghj,tyu,gjk,tyu,dfg,ey,dfg,fgh', 'A', 1.1, GETDATE()-1),
    (2, 'qwe,ert,dfkjhgkg,xcvb,fghj,tyu,gjk,tyu,dfg,ey,dfgfgh,213,dfg,yuk,dfg,sdf', 'B', 2.2, GETDATE()-2),
    (3, 'qwe,ert,dfg,xcvb,fghj,tyu,gjk,tyu,dfg,ey,dfg,fgh', 'C', 3.3, GETDATE()-3)

    SELECT yt.id, yt.combined_column, ca1.*, yt.other_column_1, yt.other_column_2
    FROM #your_table yt
    CROSS APPLY (
         SELECT
             MAX(CASE WHEN ItemNumber = 1 THEN Item END) AS Col01,
             MAX(CASE WHEN ItemNumber = 2 THEN Item END) AS Col02,
             MAX(CASE WHEN ItemNumber = 3 THEN Item END) AS Col03,
             MAX(CASE WHEN ItemNumber = 4 THEN Item END) AS Col04,
             MAX(CASE WHEN ItemNumber = 5 THEN Item END) AS Col05,
             MAX(CASE WHEN ItemNumber = 6 THEN Item END) AS Col06,
             MAX(CASE WHEN ItemNumber = 7 THEN Item END) AS Col07,
             MAX(CASE WHEN ItemNumber = 8 THEN Item END) AS Col08,
             MAX(CASE WHEN ItemNumber = 9 THEN Item END) AS Col09,
             MAX(CASE WHEN ItemNumber = 10 THEN Item END) AS Col10,
             MAX(CASE WHEN ItemNumber = 11 THEN Item END) AS Col11,
             MAX(CASE WHEN ItemNumber = 12 THEN Item END) AS Col12,
             MAX(CASE WHEN ItemNumber = 13 THEN Item END) AS Col13 --,...
        FROM (
            SELECT ItemNumber, Item
            FROM dbo.DelimitedSplit8K(yt.combined_column, ',')
        ) AS derived

    ) AS ca1

    DROP TABLE #your_table;


  • ScottPletcher - Wednesday, October 24, 2018 2:48 PM

    I'd use a cross tab within the CROSS APPLY to reduce overhead (I would think).


    Still needs to be dynamic unless you plan to edit it every time you want to run it since the number of values in the comma delimited column is unknown.

  • Lynn Pettis - Wednesday, October 24, 2018 3:23 PM

    Still needs to be dynamic unless you plan to edit it every time you want to run it since the number of values in the comma delimited column is unknown.

    Wouldn't Scott's solution just put nulls in columns higher than the number of values in the csv? The OP said the maximum number of columns is 13.

  • Jonathan AC Roberts - Wednesday, October 24, 2018 3:51 PM

    Lynn Pettis - Wednesday, October 24, 2018 3:23 PM

    Still needs to be dynamic unless you plan to edit it every time you want to run it since the number of values in the comma delimited column is unknown.

    Wouldn't Scott's solution just put nulls in columns higher than the number of values in the csv? The OP said the maximum number of columns is 13.

    That would be true... until the OP runs into something with more than 13 columns.  If Scott were to convert his great CROSS TAB solution to dynamic SQL, it would become "self healing".


  • Jeff Moden - Wednesday, October 24, 2018 6:07 PM

    Jonathan AC Roberts - Wednesday, October 24, 2018 3:51 PM

    Lynn Pettis - Wednesday, October 24, 2018 3:23 PM

    Still needs to be dynamic unless you plan to edit it every time you want to run it since the number of values in the comma delimited column is unknown.

    Wouldn't Scott's solution just put nulls in columns higher than the number of values in the csv? The OP said the maximum number of columns is 13.

    That would be true... until the OP runs into something with more than 13 columns.  If Scott were to convert his great CROSS TAB solution to dynamic SQL, it would become "self healing".

    And far more complex.  If this is for data periods in a year, for example, it's unlikely to ever change.  A year has 12 months or 13 four-week periods, and that's it, ever.  If it's something truly fixed like that, there's no need to add complexity.

    That is, I don't think it's necessarily wrong to keep the code straight-forward if that's all that is reasonably likely to be needed.


  • Lynn Pettis - Wednesday, October 24, 2018 3:23 PM

    Still needs to be dynamic unless you plan to edit it every time you want to run it since the number of values in the comma delimited column is unknown.

    As is, that query just delivers NULL values for columns where the number of values in the existing data is less than 13, so no need for dynamic, as there's no variation in the number of columns actually delivered, and chances are, there shouldn't be. What tool could effectively consume a dataset that varies in the actual number of columns for a given row? SQL Server couldn't even deliver such a row-set. Rows are fixed things. That's not avoidable until you start delivering 1 row at a time, and slow yourself down to a crawl with overhead. Who needs more RBAR?


  • sgmunson - Thursday, October 25, 2018 12:45 PM

    Lynn Pettis - Wednesday, October 24, 2018 3:23 PM

    Still needs to be dynamic unless you plan to edit it every time you want to run it since the number of values in the comma delimited column is unknown.

    As is, that query just delivers NULL values for columns where the number of values in the existing data is less than 13, so no need for dynamic, as there's no variation in the number of columns actually delivered, and chances are, there shouldn't be. What tool could effectively consume a dataset that varies in the actual number of columns for a given row? SQL Server couldn't even deliver such a row-set. Rows are fixed things. That's not avoidable until you start delivering 1 row at a time, and slow yourself down to a crawl with overhead. Who needs more RBAR?

    My problem is that I have been burned too many times by that requirement that will "never ever change" actually changing.
