Parse delimited column into multiple records

  • What is the best way to take 30,000 records with a delimited Outcome field like the following….

    PatientId           Outcome

    180460             “Observation initiated/increased; Laboratory tests performed; Drug therapy initiated/changed”

    And insert multiple recs into another table, one rec for each of the semi-colon delimited values, such as:

    PatientId           Outcome

    180460             Observation initiated/increased

    180460              Laboratory tests performed

    180460              Drug therapy initiated/changed

     

  • Can I always split on the semicolon?  What is the maximum number of semicolons in a field?  Is this something that you want to do once and then not again or is this part of a query that you will be running multiple times and need it have reasonable performance?

    Russel Loski, MCSE Business Intelligence, Data Platform

  • Just posted the function here:

    http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=8&messageid=338231

    _____________
    Code for TallyGenerator

  • Stef,

    These types of problems are incredibly simple if you have the right kind of tools.  Fortunately, there is one tool that has tremendous utility as just such a tool.  It is called a "Tally" table, consists of a single well-indexed column of numbers, and has many, many uses including solving problems like these.  If you don't already have one, now is the time to make one... here's how...

    --===== Create and populate the Tally table on the fly

     SELECT TOP 11000 --equates to more than 30 years of dates when needed

            IDENTITY(INT,1,1) AS N

       INTO dbo.Tally

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

      ALTER TABLE dbo.Tally

            ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N)

    --===== Allow the general public to use it

      GRANT SELECT ON dbo.Tally TO PUBLIC

    You only need to make it once... it should be a permanent table in your arsenal of SQL tools.

    Because I always like to test these things, here's how I made 30,000 rows of test data.  Granted, the data is highly duplicated, but it will suffice for this test... you have real data so you won't need to create this table...

    --===== Conditionally drop and recreate a test table to simulate Stef's original table 
         -- (creates 30,000 rows)
    --      IF OBJECT_ID('OriginalTable') IS NOT NULL
    --         DROP TABLE OriginalTable

     SELECT TOP 30000
            ParentID = IDENTITY(INT,180000,1),
            Outcome = CAST('Observation initiated/increased; Laboratory tests performed; Drug therapy initiated/changed' AS VARCHAR(400))
       INTO OriginalTable
       FROM Master.dbo.SysColumns sc1,
            Master.dbo.SysColumns sc2

      ALTER TABLE OriginalTable
        ADD PRIMARY KEY (ParentID)

    Ok... here's the code that does the trick.  I do it only as a SELECT but you could very easily change it to an INSERT/SELECT to get the table you asked for...

    --===== Split the original table data into a new normalized table

     SELECT ParentID,

            LTRIM(SUBSTRING(+ot.Outcome,t.N+1,CHARINDEX(';',ot.Outcome,t.N+1)-t.N-1)) AS OutCome

       FROM (SELECT ParentID,';'+Outcome+';' AS OutCome FROM OriginalTable) ot,

            dbo.Tally t

      WHERE t.N < LEN(ot.Outcome)

        AND SUBSTRING(ot.Outcome,t.n,1)=';'

    The above code should take about 5 seconds to run.  Obviously, you need to change "OriginalTable" to the name of your actual table.  Write back if you have any questions...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply