Removing Brackets Along with Text from Column

  • Hello All,

    When i query a database I receive certain as follows:

    ID       Data
    1        Data0 [2], Data1  [2602], Data2 [130], Data3 [64], Data4 [550], Data5 [1070], Data6 [1117]
    2        Gen [201], Whatever [804]
    3.       Something [1900], Temp [1403], Temp2 [80]

    I would like to retrieve the data as follows:

    ID      Data
    1       Data0, Data1, Data2, Data3, Data4, Data5, Data6. Data7
    2       Gen, Whatever
    3.      Something, Temp, Temp2

    I was playing around with Stuff, PatIndex and Replace but was only able to get the first one to display and the rest was removed.

    Thank you,
    Tony

  • Something like this?:
    CREATE TABLE Test(BadData varchar(15));
    GO
    INSERT INTO Test(BadData) VALUES
    ('Data0 [2]'), ('Data1 [2602]'),('Data2 [130]');

    SELECT BadData
        , LTRIM(LEFT(BadData, CHARINDEX('[',BadData,1)-1)) AS Loc
    FROM Test;

  • Thank you for the query. This works only when there is one item in the column. In my example utilizing yours,

    Data0 [2], Data1 [2602], Data2 [130]

    will all belong in one column in one row, not three rows.

    Thanks again,
    Tony

  • This works, but it's not pretty.  I think that it can be better written.

    CREATE TABLE #Test(ID TINYINT, BadData varchar(100));
    GO
    INSERT INTO #Test(ID, BadData) VALUES
    (1, 'Data0 [2], Data1 [2602], Data2 [130], Data3 [64], Data4 [550], Data5 [1070], Data6 [1117]'),
    (2, 'Gen [201], Whatever [804]'),
    (3, 'Something [1900], Temp [1403], Temp2 [80]')
    ;
    WITH CTE AS
    (
        SELECT *
        FROM
        (
            VALUES(0), (0), (0), (0), (0)
        ) t(n)
    ),
    Tally AS
    (
        SELECT ROW_NUMBER() OVER(ORDER BY @@VERSION) AS n
        FROM CTE AS a
        CROSS JOIN CTE AS b
        CROSS JOIN CTE AS c
    )

    SELECT BadData
    , st.value('.', 'VARCHAR(100)') AS st
    FROM #Test t
    CROSS APPLY
    (
        SELECT c1 AS [*]
        FROM
        (
            SELECT c1, ta.n,
                SUM(
                    CASE
                        WHEN c1 = '[' THEN 1
                        WHEN c2 = ']' THEN -1
                        ELSE 0
                    END
                ) OVER(ORDER BY n ROWS UNBOUNDED PRECEDING) AS is_quoted
            FROM Tally ta
            CROSS APPLY ( VALUES(SUBSTRING(t.BadData, ta.n, 1), SUBSTRING(t.BadData, ta.n-1, 1))) AS c(c1, c2)
        ) AS c
        WHERE is_quoted = 0
        ORDER BY c.n
        FOR XML PATH(''), TYPE
    ) x(st)
        ORDER BY t.ID
    ;

    DROP TABLE #Test

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Thank you Drew for the code...It works great with the test data that was inserted.
    When I tried the code to the table that the data is in, it truncated some of it. I changed the
    value of ", st.value('.', 'VARCHAR(100)') AS st" to ", st.value('.', 'VARCHAR(300)') AS st"
    but it still truncated it even though the total characters of data = 241.

    It seems like it would be easier because using the following removes the brackets but leaves
    the data inside the brackets, but what the heck do I know...

    SELECT REPLACE(REPLACE(Field_Name, '[', ''), ']', '')

    Thank you again,
    Tony

  • You need to normalise the data.

    You can do it in 2 steps:
    1. Using Splitter8k function (see the article from Jeff Moden on this site) split the comma delimited values into separate rows.
    2. Separate differen values in the column return by Splitter : place values before ' [' to Col1 and the rest of it into Col2.

    Don't forget to keep row ID from the original table together with the separated values from each row.

    Normalization is done.

    Now, concatenate values from Col1 using common STUFF ... FOR XML method (you can find it in the query posted by Drew above on this page).

    You can do all this in one query, if you don't want to change the terrible database design right now.

    _____________
    Code for TallyGenerator

  • Hi Sergiy,

    I will look into this.

    Thank you,
    Tony

  • @eusanpe1 ,
    First, help us help you on future posts.  Please see the first link in my signature line below under "Helpful Links" for how to post readily consumable data.  Thanks.

    Borrowing heavily on the test code setup that @drew.allen  posted, here's the test data that I used.


    --DROP TABLE #Test
    GO
    CREATE TABLE #Test(ID TINYINT, BadData varchar(100));
    GO
     INSERT INTO #Test(ID, BadData) VALUES
             (1, 'Data0 [2], Data1 [2602], Data2 [130], Data3 [64], Data4 [550], Data5 [1070], Data6 [1117]')
            ,(2, 'Gen [201], Whatever [804]')
            ,(3, 'Something [1900], Temp [1403], Temp2 [80]')
            ,(4, 'No brackets here')
            ,(5, 'Data0, Data1 [2602], Data2, Data3 [64], Data4, Data5 [], Data6 ')
    ;

    Here's a solution that will work on all versions of SQL Server from 2005 and up.


       WITH
    cteSplit AS
    (--==== Split the data at the commas
     SELECT tt.ID, split.ItemNumber, Item = LTRIM(split.Item)
       FROM #Test tt
      CROSS APPLY dbo.DelimitedSplit8K(tt.BadData,',') split
    )
    ,cteClean AS
    (--==== Select only up to and not including the "[" character
         -- and leave the space before it.
         -- Personally, I'd clean up any trailing space and stop here.
     SELECT ID, ItemNumber, Item = RTRIM(LEFT(Item,ISNULL(NULLIF(CHARINDEX('[',Item),0)-1,100)))
       FROM cteSplit
    )
    --===== Re-pivot the split-out, cleaned-up items.
     SELECT  ID
            ,GoodData = STUFF((
                       SELECT ', ' + c2.Item
                         FROM cteClean c2
                        WHERE c2.ID = c1.ID
                        ORDER BY c2.ItemNumber
                          FOR XML PATH(''),
                                  TYPE).value('(./text())[1]','VARCHAR(100)'),1,2,'')

       FROM cteClean c1
      GROUP BY ID
    ;

    Hat's off to Wayne Sheffield for documenting the XML Concatenation method in the following article:
    Creating a comma-separated list (SQL Spackle)

    You can get the DelimitedSplit8K function from the "Resources" section at the bottom of the following article:
    Tally OH! An Improved SQL 8K “CSV Splitter” Function

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff, I'd prefer to use ' [' instead of '[' as a split point.

    It's more definitive, and leaves no trailing spaces.

    But it requires some degree of consistency in string formatting.

    _____________
    Code for TallyGenerator

  • Sergiy - Saturday, May 13, 2017 7:38 PM

    Jeff, I'd prefer to use ' [' instead of '[' as a split point.It's more definitive, and leaves no trailing spaces.But it requires some degree of consistency in string formatting.

    That was my initial inclination, as well.  But, like you stated, it does require consistency in the formatting of the string and I've been seriously burned in the past by making such an assumption and so went with the more forgiving method.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you all for the help.

    Jeff, I read the Helpful Links that is posted and will follow the directions with future posts.

    Thank you all again,
    Tony

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply