Split two delimited strings in table

  • Hi,

    I have a table that's in the following format:

    ID, DegreeStr, YearEarnedStr

    1, "BS,MS,PhD", "2001,2005,2011"

    2, "BS", "2003"

    3, "BS,MS", "2002,2008"

    I would like to make it look like the following, using a method that does not require a Tally table, CLR, or any functions.

    ID, Degree, YearEarned

    1, BS, 2001

    1, MS, 2005

    1, PhD, 2011

    2, BS, 2003

    3, BS, 2002

    3, MS, 2008

    The only other two ways that I believe I've seen are to use XML or a recursive CTE. Regarding the XML way, I don't have any characters that I need to escape--I've just never seen an example of how it would be done on multiple columns. Does anyone know how to do this, or is there even another way that I'm not thinking of?

    Here's some T-SQL to produce my inputs. Thanks in advance for your help.

    Mike

    DECLARE @testdata TABLE

    (

    IDINT NOT NULL,

    DegreeStr VARCHAR(100),

    YearEarnedStrVARCHAR(100)

    );

    INSERT INTO @testdata

    SELECT 1, 'BS,MS,PhD', '2001,2005,2011' UNION ALL

    SELECT 2, 'BS', '2003' UNION ALL

    SELECT 3, 'BS,MS', '2002,2008'

    SELECT *

    FROM @testdata;

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • mikes84 (6/1/2016)


    .....using a method that does not require a Tally table, CLR, or any functions.

    any reason why no Tally or functions?

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • J Livingston SQL (6/1/2016)


    mikes84 (6/1/2016)


    .....using a method that does not require a Tally table, CLR, or any functions.

    any reason why no Tally or functions?

    and does this exclusion apply to inline tally tables (tally tables created using chained CTEs)?

    Also, is there a maximum number of elements per string?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Elementary when using the DelimitedSplit8K function

    😎

    USE TEEST;

    GO

    DECLARE @testdata TABLE

    (

    IDINT NOT NULL,

    DegreeStr VARCHAR(100),

    YearEarnedStrVARCHAR(100)

    );

    INSERT INTO @testdata

    SELECT 1, 'BS,MS,PhD', '2001,2005,2011' UNION ALL

    SELECT 2, 'BS', '2003' UNION ALL

    SELECT 3, 'BS,MS', '2002,2008'

    SELECT

    TD.ID

    ,DX.Item

    ,YX.Item

    FROM @testdata TD

    CROSS APPLY dbo.DelimitedSplit8K(TD.DegreeStr,CHAR(44)) AS DX

    CROSS APPLY dbo.DelimitedSplit8K(TD.YearEarnedStr,CHAR(44)) AS YX

    WHERE DX.ItemNumber = YX.ItemNumber

    ;

    Output

    ID Item Item

    ----------- -------- ------

    1 BS 2001

    1 MS 2005

    1 PhD 2011

    2 BS 2003

    3 BS 2002

    3 MS 2008

  • First, why no Tally Table? Have there been complaints that your queries are too fast perhaps?

    OK, no other functions either, fine. Have a look at the article about APPLY referenced in ChrisM's signature line. There is a content in there about how to encapsulate function logic in a subquery or CTE.

    Provided that inline Tally tables are okay (we'll just refer to it as the "CROSS JOIN/Rownumber trick" ;-)) you can encapsulate the splitter logic for delimitesplit8K inside a subquery then use Eirikur's solution as a template for how to do what you need without using a function.

    If the query executes too quickly you could slow it down by doing the same thing but using XML logic for splitting the strings.

    P.S. I would provide an example but am not at a PC. I will later tonight unless someone beats me to it.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • I should have clarified. If it's part of a CTE, I can use a Tally table. I'm trying to avoid adding a physical table to the database as it's a vendor database that could get changed/overwritten with upgrades.

    These are some good suggestions. I'm also interested to see how I would incorporate some of the function logic into the CTEs. Alan, looking forward to seeing your example.

    Thanks again,

    Mike

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • If you can put a DBAUtilities database on the same server as the 3rd party database, your queries can execute the DelimitedSplit function from the utility database. No worry about upgrades affecting the DBAUtility database.

  • mikes84 (6/1/2016)


    I should have clarified. If it's part of a CTE, I can use a Tally table. I'm trying to avoid adding a physical table to the database as it's a vendor database that could get changed/overwritten with upgrades.

    These are some good suggestions. I'm also interested to see how I would incorporate some of the function logic into the CTEs. Alan, looking forward to seeing your example.

    Thanks again,

    Mike

    Silly vendors!

    Here's the solution I was talking about....

    -- Sample Data

    DECLARE @testdata TABLE

    (

    IDINT NOT NULL,

    DegreeStr VARCHAR(100),

    YearEarnedStrVARCHAR(100)

    );

    INSERT INTO @testdata

    SELECT 1, 'BS,MS,PhD', '2001,2005,2011' UNION ALL

    SELECT 2, 'BS', '2003' UNION ALL

    SELECT 3, 'BS,MS', '2002,2008'

    -- Functionless and dbo.Tally-less solution

    SELECT DS.ID, Degree = DS.item1, YearEarned = YE.item1

    FROM

    (

    SELECT

    itemnumber1 = ROW_NUMBER() OVER (ORDER BY t1.N),

    ID,

    item1 = SUBSTRING

    (

    DegreeStr,

    t1.N+1,

    ISNULL(NULLIF(CHARINDEX(',',DegreeStr,t1.N+1),0)-t1.N-1,8000)

    )

    FROM @testdata td

    CROSS JOIN

    (

    SELECT 0 UNION ALL

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),

    (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x),

    (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) c(x),

    (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) d(x)

    ) t1(N)

    WHERE

    t1.N <= (ISNULL(DATALENGTH(DegreeStr),0)) AND

    (SUBSTRING(DegreeStr,t1.N,1) = ',' OR t1.N = 0)

    ) DS

    JOIN

    (

    SELECT

    itemnumber1 = ROW_NUMBER() OVER (ORDER BY t1.N),

    ID,

    item1 = SUBSTRING

    (

    YearEarnedStr,

    t1.N+1,

    ISNULL(NULLIF(CHARINDEX(',',YearEarnedStr,t1.N+1),0)-t1.N-1,8000)

    )

    FROM @testdata td

    CROSS JOIN

    (

    SELECT 0 UNION ALL

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),

    (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x),

    (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) c(x),

    (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) d(x)

    ) t1(N)

    WHERE

    t1.N <= (ISNULL(DATALENGTH(YearEarnedStr),0)) AND

    (SUBSTRING(YearEarnedStr,t1.N,1) = ',' OR t1.N = 0)

    ) YE

    ON DS.itemnumber1 = YE.itemnumber1

    ORDER BY ID -- REMOVE THIS! Included for display only, not required and will slow you down

    Again - I understand and know how to do this because I read that Paul White article (as well as Jeff's article about delimitedsplit8k). I highly recommend both for anyone who has not read them.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Evolution at work: here's Alan's excellent solution tweaked to read the source table only once.

    ;WITH _Tally (n) AS (

    SELECT 0 UNION ALL

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),

    (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x),

    (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) c(x),

    (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) d(x)

    )

    SELECT td.*, ds.ItemNumber, ds.Item, ye.Item

    FROM @testdata td

    CROSS APPLY (

    SELECT

    ItemNumber = ROW_NUMBER() OVER (ORDER BY t1.N),

    Item = SUBSTRING(DegreeStr, t1.N+1, ISNULL(NULLIF(CHARINDEX(',',DegreeStr,t1.N+1),0)-t1.N-1,8000))

    FROM _Tally t1

    WHERE t1.N <= ISNULL(DATALENGTH(DegreeStr),0)

    AND (SUBSTRING(DegreeStr,t1.N,1) = ',' OR t1.N = 0)

    ) DS

    CROSS APPLY (

    SELECT

    ItemNumber = ROW_NUMBER() OVER (ORDER BY t2.N),

    Item = SUBSTRING(YearEarnedStr, t2.N+1, ISNULL(NULLIF(CHARINDEX(',',YearEarnedStr,t2.N+1),0)-t2.N-1,8000))

    FROM _Tally t2

    WHERE t2.N <= ISNULL(DATALENGTH(YearEarnedStr),0)

    AND (SUBSTRING(YearEarnedStr,t2.N,1) = ',' OR t2.N = 0)

    ) YE

    WHERE DS.itemnumber = YE.itemnumber

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • These are really great suggestions. I'm looking forward to trying them next week. Thanks so much!

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • mikes84 (6/3/2016)


    These are really great suggestions. I'm looking forward to trying them next week. Thanks so much!

    Here's the original question[/url].

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply