Parse Data in SQL

  • I have data into field (SS)(DD)(GG) i want parse that field into 3 different column. Is Substring work or there is other way i can parse this field.
    Three field should be like this
    Field 1 Field 2 Field 3
    SS DD GG

  • Will there always be 3 columns?

    Catch-all queries done right [/url]
    Gail Shaw's Performance Blog[/url]

  • 4 column max. it can be like that way (DD)(FFF)(GG)(HH) Field can have any character in ()

  • It depends... If your data is always in that format, then yes, the SUBSTRING function is the way to go.
    If the number of characters between parens can vary or the number of sets can vary, then you may need to parsing things with a string splitting function.

    IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
    DROP TABLE #TestData;

    CREATE TABLE #TestData (
        String CHAR(12) NOT NULL
        );
    INSERT #TestData (String) VALUES ('(SS)(DD)(GG)')

    SELECT
        td.String,
        cs.CleanString,
        Col1 = SUBSTRING(cs.CleanString, 1, 2),
        Col2 = SUBSTRING(cs.CleanString, 3, 2),
        Col3 = SUBSTRING(cs.CleanString, 5, 2)
    FROM
        #TestData td
        CROSS APPLY ( VALUES (REPLACE(REPLACE(td.String, '(', ''), ')', '')) ) cs (CleanString)

  • sks_989 - Friday, June 9, 2017 9:47 AM

    4 column max. it can be like that way (DD)(FFF)(GG)(HH) Field can have any character in ()

    If 4 groups is the max, then you can use the PARSENAME function.


    IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
    DROP TABLE #TestData;

    CREATE TABLE #TestData (
        String VARCHAR(20) NOT NULL
        );
    INSERT #TestData (String) VALUES ('(SS)(DD)(GG)'), ('(DD)(FFF)(GG)(HH)');

    SELECT
        td.String,
        cs.CleanString,
        Col1 = PARSENAME(cs.CleanString, 1),
        Col2 = PARSENAME(cs.CleanString, 2),
        Col3 = PARSENAME(cs.CleanString, 3),
        Col4 = PARSENAME(cs.CleanString, 4)
    FROM
        #TestData td
        CROSS APPLY ( VALUES (REPLACE(REPLACE(REPLACE(td.String, ')(', '.'), '(', ''), ')', '')) ) cs (CleanString);

  • Thanks Jason It worked.

  • sks_989 - Friday, June 9, 2017 10:11 AM

    Thanks Jason It worked.

    Good deal. Glad to help. 🙂

  • Jason if data is like this way (AS)(RT) output is coming as RT first and AS second. I want AS as first column

  • sks_989 - Friday, June 9, 2017 11:31 AM

    Jason if data is like this way (AS)(RT) output is coming as RT first and AS second. I want AS as first column

    DOH!!! I completely forgot about the PARSENAME working from right to left. Sorry about that...

    Try it like this...

    IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
    DROP TABLE #TestData;

    CREATE TABLE #TestData (
      String VARCHAR(20) NOT NULL
      );
    INSERT #TestData (String) VALUES ('(SS)(DD)(GG)'), ('(DD)(FFF)(GG)(HH)'), ('(AS)(RT)');

    SELECT
      td.String,
      cs.CleanString,
        cc.ColumnCount,
        Col1 = PARSENAME(cs.CleanString, cc.ColumnCount),
        Col2 = PARSENAME(cs.CleanString, cc.ColumnCount -1),
        Col3 = PARSENAME(cs.CleanString, cc.ColumnCount -2),
        Col4 = PARSENAME(cs.CleanString, cc.ColumnCount -3)
    FROM
      #TestData td
      CROSS APPLY ( VALUES (REPLACE(REPLACE(REPLACE(td.String, ')(', '.'), '(', ''), ')', '')) ) cs (CleanString)
        CROSS APPLY ( VALUES (LEN(cs.CleanString) - LEN(REPLACE(cs.CleanString, '.', '')) + 1) ) cc (ColumnCount);

  • Thanks Jason. It worked now the way I want. Much Appreciated.

  • sks_989 - Friday, June 9, 2017 12:26 PM

    Thanks Jason. It worked now the way I want. Much Appreciated.

    No problem. Glad it worked for you.

  • Also be seriously aware that the return datatype for PARSENAME is NCHAR.  It you're saving the output in a table somewhere, it will save trailing spaces even if the target column is NVARCHAR or VARCHAR and can cause serious bloat of the table and any indexes that the columns may have on them.

    {EDIT}  Scratch that.  MS lists the return type as NCHAR and, if that were true, my warning above would be correct.  See the following post...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here's the link for the current definition of PARSENAME.
    https://docs.microsoft.com/en-us/sql/t-sql/functions/parsename-transact-sql

    But Books Online and the link above is incorrect.  PARSENAME actually returns an NVARCHAR(128), which is the same as the data type alias of "SYSNAME" (which also supposedly defaults to NOT NULL in tables and has a few other goodies associated with it which I'm not going to prove/disprove just now).  Here's the proof.  Don't forget that you have to divide the max data length by 2 to get the number of characters that can be used for Unicode datatypes like NVARCHAR.


    DECLARE @OtherLimit INT = 6
    ;
       WITH ctePreAgg AS
    (
     SELECT  product_name = CASE
                                WHEN DENSE_RANK() OVER (ORDER BY SUM(total_sale) DESC) <= @OtherLimit
                                THEN product_name
                                ELSE 'Others'
                            END
            ,total_sale = SUM(total_sale)
       FROM #values
      GROUP BY product_name
    )
     SELECT  product_name
            ,sum_total_sale = SUM(total_sale)
       FROM ctePreAgg
      GROUP BY product_name
      ORDER BY sum_total_sale DESC
    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It would appear that you are correct about it being an NVARCHAR(128).

    DECLARE @String VARCHAR(50) = 'this.is.a.test';

    IF OBJECT_ID('tempdb..#temp', 'U') IS NOT NULL
    DROP TABLE #temp;

    SELECT
        Col1 = PARSENAME(@String, 4),
        Col2 = PARSENAME(@String, 3),
        Col3 = PARSENAME(@String, 2),
        Col4 = PARSENAME(@String, 1)
    INTO #temp;

    EXEC tempdb..sp_help #temp;

    That said, I think the only time that actually needs to be of concern is if you're doing a "SELECT INTO" like above.
    Otherwise, it's able to insert into a predefined table w/o any issues (at least none that I've noticed)...

    DECLARE @String VARCHAR(50) = 'this.is.a.test';

    IF OBJECT_ID('tempdb..#temp', 'U') IS NOT NULL
    DROP TABLE #temp;
    GO
    CREATE TABLE #temp (
        Col1 varchar(20),
        Col2 varchar(20),
        Col3 varchar(20),
        Col4 varchar(20)
        );
    INSERT #temp (Col1,Col2,Col3,Col4)
    SELECT
        Col1 = PARSENAME(@String, 4),
        Col2 = PARSENAME(@String, 3),
        Col3 = PARSENAME(@String, 2),
        Col4 = PARSENAME(@String, 1);

    SELECT
        Col1 = '''' + t.Col1 + '''',
        Col2 = '''' + t.Col2 + '''',
        Col3 = '''' + t.Col3 + '''',
        Col4 = '''' + t.Col4 + ''''
    FROM
        #temp t;

    Since it's not actually a SYSNAME data type, it has no issues handling NULLs.

    DECLARE @String VARCHAR(50) = 'This...Test';

    SELECT
        Col1 = PARSENAME(@String, 4),
        Col2 = PARSENAME(@String, 3),
        Col3 = PARSENAME(@String, 2),
        Col4 = PARSENAME(@String, 1);

  • If it were an NCHAR, you would need to be seriously concerned no matter what.  I ran into this problem before with some system tables (can't remember which ones just now).  Someone was capturing some of the data and the table they were capturing the data in was growing explosively.  Here's an example of why it was growing out of control and why I was so concerned with PARSENAME returning NCHAR (which we've discovered is not true).

    Heh... perhaps this should be a "QOD".  Before you run the following code, how many bytes will the single letter "A" consume in the table?

     CREATE TABLE #MyHead
            (
             SomeString NVARCHAR(300)
            )
    ;
    DECLARE @SomeString NCHAR(128) = 'A';
     INSERT INTO #MyHead
            (SomeString)
     SELECT SomeString = @SomeString
    ;
     SELECT  SomeString
            ,LengthInBytes = DATALENGTH(SomeString)
       FROM #MyHead
    ;

    p.s. Don't forget about the "overhead".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply