Removing Brackets Along with Text from Column

Question

Removing Brackets Along with Text from Column

eusanpe1

SSC Enthusiast

Points: 145
More actions
May 12, 2017 at 6:11 pm

#321406

Hello All,
When i query a database I receive certain as follows:
ID Data
1 Data0 [2], Data1 [2602], Data2 [130], Data3 [64], Data4 [550], Data5 [1070], Data6 [1117]
2 Gen [201], Whatever [804]
3. Something [1900], Temp [1403], Temp2 [80]
I would like to retrieve the data as follows:
ID Data
1 Data0, Data1, Data2, Data3, Data4, Data5, Data6. Data7
2 Gen, Whatever
3. Something, Temp, Temp2
I was playing around with Stuff, PatIndex and Replace but was only able to get the first one to display and the rest was removed.
Thank you,
Tony

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

pietlinden SSC Guru Points: 63503 More actions · Answer 1

Something like this?:
CREATE TABLE Test(BadData varchar(15)); GO INSERT INTO Test(BadData) VALUES ('Data0 [2]'), ('Data1 [2602]'),('Data2 [130]');

SELECT BadData , LTRIM(LEFT(BadData, CHARINDEX('[',BadData,1)-1)) AS Loc FROM Test;

eusanpe1 SSC Enthusiast Points: 145 More actions · Answer 2

Thank you for the query. This works only when there is one item in the column. In my example utilizing yours,

Data0 [2], Data1 [2602], Data2 [130]

will all belong in one column in one row, not three rows.

Thanks again,
Tony

drew.allen SSC Guru Points: 76989 More actions · Answer 3

This works, but it's not pretty. I think that it can be better written.

CREATE TABLE #Test(ID TINYINT, BadData varchar(100)); GO INSERT INTO #Test(ID, BadData) VALUES (1, 'Data0 [2], Data1 [2602], Data2 [130], Data3 [64], Data4 [550], Data5 [1070], Data6 [1117]'), (2, 'Gen [201], Whatever [804]'), (3, 'Something [1900], Temp [1403], Temp2 [80]') ; WITH CTE AS ( SELECT * FROM ( VALUES(0), (0), (0), (0), (0) ) t(n) ), Tally AS ( SELECT ROW_NUMBER() OVER(ORDER BY @@VERSION) AS n FROM CTE AS a CROSS JOIN CTE AS b CROSS JOIN CTE AS c )

SELECT BadData , st.value('.', 'VARCHAR(100)') AS st FROM #Test t CROSS APPLY ( SELECT c1 AS [*] FROM ( SELECT c1, ta.n, SUM( CASE WHEN c1 = '[' THEN 1 WHEN c2 = ']' THEN -1 ELSE 0 END ) OVER(ORDER BY n ROWS UNBOUNDED PRECEDING) AS is_quoted FROM Tally ta CROSS APPLY ( VALUES(SUBSTRING(t.BadData, ta.n, 1), SUBSTRING(t.BadData, ta.n-1, 1))) AS c(c1, c2) ) AS c WHERE is_quoted = 0 ORDER BY c.n FOR XML PATH(''), TYPE ) x(st) ORDER BY t.ID ;

DROP TABLE #Test

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

eusanpe1 SSC Enthusiast Points: 145 More actions · Answer 4

Thank you Drew for the code...It works great with the test data that was inserted.
When I tried the code to the table that the data is in, it truncated some of it. I changed the
value of ", st.value('.', 'VARCHAR(100)') AS st" to ", st.value('.', 'VARCHAR(300)') AS st"
but it still truncated it even though the total characters of data = 241.

It seems like it would be easier because using the following removes the brackets but leaves
the data inside the brackets, but what the heck do I know...

SELECT REPLACE(REPLACE(Field_Name, '[', ''), ']', '')

Thank you again,
Tony

Sergiy SSC Guru Points: 110208 More actions · Answer 5

You need to normalise the data.

You can do it in 2 steps:
1. Using Splitter8k function (see the article from Jeff Moden on this site) split the comma delimited values into separate rows.
2. Separate differen values in the column return by Splitter : place values before ' [' to Col1 and the rest of it into Col2.

Don't forget to keep row ID from the original table together with the separated values from each row.

Normalization is done.

Now, concatenate values from Col1 using common STUFF ... FOR XML method (you can find it in the query posted by Drew above on this page).

You can do all this in one query, if you don't want to change the terrible database design right now.

_____________
Code for TallyGenerator

eusanpe1 SSC Enthusiast Points: 145 More actions · Answer 6

eusanpe1

SSC Enthusiast

Points: 145

May 13, 2017 at 5:00 pm

#1942446

Hi Sergiy,

I will look into this.

Thank you,
Tony

Jeff Moden SSC Guru Points: 1004365 More actions · Answer 7

@eusanpe1 ,
First, help us help you on future posts. Please see the first link in my signature line below under "Helpful Links" for how to post readily consumable data. Thanks.

Borrowing heavily on the test code setup that @drew.allen posted, here's the test data that I used.

--DROP TABLE #Test GO CREATE TABLE #Test(ID TINYINT, BadData varchar(100)); GO INSERT INTO #Test(ID, BadData) VALUES (1, 'Data0 [2], Data1 [2602], Data2 [130], Data3 [64], Data4 [550], Data5 [1070], Data6 [1117]') ,(2, 'Gen [201], Whatever [804]') ,(3, 'Something [1900], Temp [1403], Temp2 [80]') ,(4, 'No brackets here') ,(5, 'Data0, Data1 [2602], Data2, Data3 [64], Data4, Data5 [], Data6 ') ;

Here's a solution that will work on all versions of SQL Server from 2005 and up.

WITH cteSplit AS (--==== Split the data at the commas SELECT tt.ID, split.ItemNumber, Item = LTRIM(split.Item) FROM #Test tt CROSS APPLY dbo.DelimitedSplit8K(tt.BadData,',') split ) ,cteClean AS (--==== Select only up to and not including the "[" character -- and leave the space before it. -- Personally, I'd clean up any trailing space and stop here. SELECT ID, ItemNumber, Item = RTRIM(LEFT(Item,ISNULL(NULLIF(CHARINDEX('[',Item),0)-1,100))) FROM cteSplit ) --===== Re-pivot the split-out, cleaned-up items. SELECT ID ,GoodData = STUFF(( SELECT ', ' + c2.Item FROM cteClean c2 WHERE c2.ID = c1.ID ORDER BY c2.ItemNumber FOR XML PATH(''), TYPE).value('(./text())[1]','VARCHAR(100)'),1,2,'')

FROM cteClean c1
GROUP BY ID
;

Hat's off to Wayne Sheffield for documenting the XML Concatenation method in the following article:
Creating a comma-separated list (SQL Spackle)

You can get the DelimitedSplit8K function from the "Resources" section at the bottom of the following article:
Tally OH! An Improved SQL 8K “CSV Splitter” Function

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sergiy SSC Guru Points: 110208 More actions · Answer 8

Jeff, I'd prefer to use ' [' instead of '[' as a split point.

It's more definitive, and leaves no trailing spaces.

But it requires some degree of consistency in string formatting.

_____________
Code for TallyGenerator

Jeff Moden SSC Guru Points: 1004365 More actions · Answer 9

Sergiy - Saturday, May 13, 2017 7:38 PM
Jeff, I'd prefer to use ' [' instead of '[' as a split point.It's more definitive, and leaves no trailing spaces.But it requires some degree of consistency in string formatting.

That was my initial inclination, as well. But, like you stated, it does require consistency in the formatting of the string and I've been seriously burned in the past by making such an assumption and so went with the more forgiving method.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

eusanpe1 SSC Enthusiast Points: 145 More actions · Answer 10

Thank you all for the help.

Jeff, I read the Helpful Links that is posted and will follow the directions with future posts.

Thank you all again,
Tony