Tricky REPLACE problem

Question

Post reply

Tricky REPLACE problem

ronmoses@gmail.com

SSCarpal Tunnel

Points: 4480
More actions
September 7, 2012 at 1:25 pm

#281240

I've been poring over how to handle this for a few hours without using some kind of horrifying loop, and having failed to find a solution, I turn to the masters.
I have a column containing strings that look like some variation on this:
A#+AB#+BA#+B#
The letters before the # can be any combination of one or two letters. I need to replace all instances of 'A#' with 'C#'. Simple enough, except for the fact that 'BA#' must remain untouched. Obviously a simple REPLACE(@String, 'A#', 'C#') will return this:
C#+AB#+BC#+B#
That's wrong. I need:
C#+AB#+BA#+B#
It would be perfect if there was a function like:
REPLACE(@String, '%[^A-Z]A#%', 'C#')
But as far as I know there isn't. (Even if there was, that would also exclude the A# at the beginning of the string, as there is no character before it.) Can anyone offer a suggestion?
Bonus points if that solution does not involve a CLR. 😉
Thanks!
Ron
-----
a haiku...
NULL is not zero
NULL is not an empty string
NULL is the unknown

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply

Tom257 SSChampion Points: 11543 More actions · Answer 1

replace the untouchable 'BA#' first, then do the main replace, then put the 'BA#' back

SELECT REPLACE(REPLACE(REPLACE(@STRING, 'BA#', '***'), 'A#', 'C#'), '***', 'BA#')

Lowell SSC Guru Points: 323518 More actions · Answer 2

the problem with abstracting out the problem is it hides the actual issue.

i suspect that this would involve splitting ont he # sign, cleaning up any elements that meet the criteria i could not figure out, and then using FOR XML to concatenate them back together again.

search for "DelimitedSplit8K" here on SSC to get teh function and the great article on how to use it.

try this, and see how it turns the values into a column of data? i think that's the key here:

with MyTable(ID,SomeString)

AS

(

SELECT 1,'A#+AB#+BA#+B#' UNION ALL

SELECT 2,'C#+AB#+BC#+B#'

)

select * from MyTable

cross apply dbo.delimitedsplit8k(SomeString,'#')

ID SomeString ItemNumber Item

1 A#+AB#+BA#+B# 1 A

1 A#+AB#+BA#+B# 2 +AB

1 A#+AB#+BA#+B# 3 +BA

1 A#+AB#+BA#+B# 4 +B

1 A#+AB#+BA#+B# 5

2 C#+AB#+BC#+B# 1 C

2 C#+AB#+BC#+B# 2 +AB

2 C#+AB#+BC#+B# 3 +BC

2 C#+AB#+BC#+B# 4 +B

2 C#+AB#+BC#+B# 5

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

ronmoses@gmail.com SSCarpal Tunnel Points: 4480 More actions · Answer 3

Tom Brown (9/7/2012)
replace the untouchable 'BA#' first, then do the main replace, then put the 'BA#' back
SELECT REPLACE(REPLACE(REPLACE(@STRING, 'BA#', '***'), 'A#', 'C#'), '***', 'BA#')

Well, the problem with that is that I've only given one very limited example. That string could contain anything from AA# through ZA#. I'm not sure this approach could be made to work in that regard. Please correct me if I'm wrong.

Ron

-----
a haiku...

NULL is not zero
NULL is not an empty string
NULL is the unknown

Luis Cazares SSC Guru Points: 183706 More actions · Answer 4

Another suggestion.

SELECT STUFF( REPLACE( '+' + string, '+A#', '+C#'), 1, 1, '')

FROM(SELECT 'A#+AB#+BA#+B#' AS string) A

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Luis Cazares SSC Guru Points: 183706 More actions · Answer 5

@lowell

Shouldn't the splitting be done on the plus sign(+)? These look as adding formulas.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

ronmoses@gmail.com SSCarpal Tunnel Points: 4480 More actions · Answer 6

Lowell (9/7/2012)
the problem with abstracting out the problem is it hides the actual issue.

You're correct, I apologize for confusing the issue by trying to oversimplify it.

This column contains formulas that are parsed out and calculated. These formulas can be written and edited by the end user. A through ZZ represent variables. The plain-language description of these variables is also user-definable (R may represent "Length in feet," for example). After each variable there is a @, &, or #. The function of these symbols is not relevant to this parsing process, but there will always be one of them. The string itself can be almost any mathematical expression, and can include constants. Here are a few real-world examples:

L@*K@*H@/324

(AF#-AR#)*(Y#+BB#)

TT#*(L#+D@)*C@

((AE#-AR#)*(Y#+A#))/Y#

The reason I need to replace certain variables is that, when importing these formula strings from one database to another, their variables may not represent the same thing. I have a means of establishing the replacement variable; in other words, I'm able to determine that 'L' needs to be replaced with 'AT'. It's the actual replacement that's an issue.

I hope that clarifies the problem.

Thanks,

Ron

-----
a haiku...

NULL is not zero
NULL is not an empty string
NULL is the unknown

Lowell SSC Guru Points: 323518 More actions · Answer 7

ok here's a quick prototype i *think* does what you are asking;

it's basically a power-replace that cross joins your funciton symbols with a know list of repalces;

does this work the way you expect?

with MyTable(ID,SomeString)

AS

(

SELECT 1,'L@*K@*H@/324' UNION ALL

SELECT 2,'(AF#-AR#)*(Y#+BB#)' UNION ALL

SELECT 3,'TT#*(L#+D@)*C@' UNION ALL

SELECT 4,'((AE#-AR#)*(Y#+A#))/Y#/'

),

MyFn (fnval) AS

(

SELECT '@' UNION ALL

SELECT '&' UNION ALL

SELECT '#'

),

MyReplaceMents (oldval,newval)

AS

(

SELECT 'L','AT'

)

select

MyTable.* ,

REPLACE(SomeString,oldval + MyFn.fnval ,newval + MyFn.fnval ) AS NEWVAL

FROM MyTable

CROSS JOIN MyFn

CROSS JOIN MyReplaceMents

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

ronmoses@gmail.com SSCarpal Tunnel Points: 4480 More actions · Answer 8

Lowell (9/7/2012)
does this work the way you expect?

Well... it certainly could be a step in the right direction, but I'm not quite there yet. Here's the result set I'm getting:

ID SomeString NEWVAL

--------------------------------------------------

1 L@*K@*H@/324 AT@*K@*H@/324

2 (AF#-AR#)*(Y#+BB#) (AF#-AR#)*(Y#+BB#)

3 TT#*(L#+D@)*C@ TT#*(L#+D@)*C@

4 ((AE#-AR#)*(Y#+A#))/Y# ((AE#-AR#)*(Y#+A#))/Y#

1 L@*K@*H@/324 L@*K@*H@/324

2 (AF#-AR#)*(Y#+BB#) (AF#-AR#)*(Y#+BB#)

3 TT#*(L#+D@)*C@ TT#*(L#+D@)*C@

4 ((AE#-AR#)*(Y#+A#))/Y# ((AE#-AR#)*(Y#+A#))/Y#

1 L@*K@*H@/324 L@*K@*H@/324

2 (AF#-AR#)*(Y#+BB#) (AF#-AR#)*(Y#+BB#)

3 TT#*(L#+D@)*C@ TT#*(AT#+D@)*C@

4 ((AE#-AR#)*(Y#+A#))/Y# ((AE#-AR#)*(Y#+A#))/Y#

The bad news is that there are twelve rows instead of four. The good news is that one of those three copies contains the replaced variable. So the bad news is easily eliminated by altering the SELECT and making it into an additional CTE:

,cte_Replace AS

(select

MyTable.* ,

REPLACE(SomeString,oldval + MyFn.fnval ,newval + MyFn.fnval ) AS NEWVAL

FROM MyTable

CROSS JOIN MyFn

CROSS JOIN MyReplaceMents)

SELECT * FROM cte_Replace

WHERE SomeString != NEWVAL

This excludes records where no variables were replaced, but that's fine. They can be discarded and it won't affect my overall task.

The first problem is that if I try to do more than one replacement in the same pass:

MyReplaceMents (oldval,newval)

AS

(

SELECT 'L','AT'

UNION

SELECT 'AR','ZZ'

),

...any row that contains two replacements appears twice in the result set, once for each replacement, and each row containing only one of the replacements. But I suppose that could be resolved with multiple iterations. That would certainly be less painful than iterating through each formula string character-by-character.

The larger problem is that when I change the second formula to '(AL#-AR#)*(Y#+BB#)' to test the "do not replace" condition, the AL# becomes AAT#. So it's not discriminating based on a preceding character, which is the primary requirement. Without it, we basically have the same results as if we'd used a simple REPLACE() function.

The more I consider this, the less convinced I am that there's a solution to this that involves REPLACE(). But I haven't given up hope yet.

Thank you very much for your efforts, Lowell. I'll dig more into your suggestion and see if there's a tweak that can be made to overcome the "do not replace" problem.

Thanks!

Ron

-----
a haiku...

NULL is not zero
NULL is not an empty string
NULL is the unknown

Luis Cazares SSC Guru Points: 183706 More actions · Answer 9

Am I overthinking?

EDIT: No, I'm undertesting.

DECLARE@Searchedchar(1),

@newchar(1)

SET @Searched = 'L'

SET @new = 'M'

DECLARE @test-2TABLE(

stringvarchar(1000))

INSERT @test-2

SELECT 'L@*K@*H@/324' AS string

UNION ALL SELECT '(AF#-AR#)*(Y#+BB#)'

UNION ALL SELECT 'TT#*(L#+D@)*L@'

UNION ALL SELECT '((AE#-AR#)*(Y#+A#))/Y#';

WITH cteTally(n) AS ( SELECT TOP 1000 ROW_NUMBER() OVER( Order BY (SELECT NULL)) FROM master.sys.all_columns),

Data AS(

SELECT '+' + string string, n, SUBSTRING( '+' + string, n, 1) ch

FROM @test-2, cteTally)

UPDATE t SET string = STUFF( REPLACE( d.string, dp.ch+d.ch+dn.ch, dp.ch+@New+dn.ch),1,1,'')

FROM @test-2 t

JOIN Data d ON '+' + t.string = d.string

JOIN Data dp ON d.string = dp.string

AND d.n-1 = dp.n

AND dp.ch LIKE '[^A-Z]'

JOIN Data dn ON d.string = dn.string

AND d.n+1 = dn.n

AND dn.ch IN ( '#', '@', '&')

WHERE d.ch = @Searched

SELECT * FROM @test-2

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Dwain Camps SSC Guru Points: 86978 More actions · Answer 10

Nasty piece of work this, but you could try tokenizing the string with a rCTE, replacing the token and then putting it all back together. First the setup data:

DECLARE @t TABLE (ID INT IDENTITY, Formula VARCHAR(100))

DECLARE @Operators VARCHAR(15) = '%[*/()+-]%'

INSERT INTO @t

SELECT 'L@*K@*H@/324'

UNION ALL SELECT '(AY#-AR#)*(Y#+BB#)'

UNION ALL SELECT 'TD@*(L#+D@)*C@'

UNION ALL SELECT '((AE#-AR#)*(Y#+A#))/Y#'

Now my (ugly) solution:

;WITH

Replacements (Search, Replace) AS (

SELECT 'Y#', 'Z#' UNION ALL SELECT 'D@', 'B@'

),

T AS (

SELECT ID, Formula=Formula + '*'

FROM @t),

Tokenize AS (

SELECT ID, n=1

,a.Token

,a.Operator

,Formula=SUBSTRING(Formula, PATINDEX(@Operators, Formula) + 1

,LEN(Formula) - LEN(a.Token + a.Operator))

FROM T

CROSS APPLY (

SELECT Token=SUBSTRING(Formula, 1, PATINDEX(@Operators, Formula)-1)

,Operator=SUBSTRING(Formula, PATINDEX(@Operators, Formula), 1)) a

UNION ALL

SELECT ID, n+1

,a.Token

,a.Operator

,Formula=SUBSTRING(T.Formula, PATINDEX(@Operators, T.Formula) + 1

,LEN(T.Formula) - LEN(a.Token + a.Operator))

FROM Tokenize T

CROSS APPLY (

SELECT Token=SUBSTRING(T.Formula, 1, PATINDEX(@Operators, T.Formula)-1)

,Operator=SUBSTRING(T.Formula, PATINDEX(@Operators, T.Formula), 1)) a

WHERE LEN(Formula) > 1

)

SELECT ID, Formula=STUFF(b.Formula, LEN(b.Formula), 1, '')

FROM @t a

CROSS APPLY (

SELECT (

SELECT Token=

CASE WHEN Token IN (SELECT Search FROM Replacements)

THEN (SELECT Replace FROM Replacements WHERE Token = Search)

ELSE Token END + Operator

FROM Tokenize b

WHERE a.ID = b.ID

ORDER BY n

FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)')

) b(Formula)

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

umair 4720 Newbie Points: 3 More actions · Answer 11

Regex searching and replace ?

http://msdn.microsoft.com/en-us/magazine/cc163473.aspx

ronmoses@gmail.com SSCarpal Tunnel Points: 4480 More actions · Answer 12

dwain.c (9/9/2012)
Nasty piece of work this, but you could try tokenizing the string with a rCTE, replacing the token and then putting it all back together.

Wow, this is very impressive. The results are exactly correct in terms of the replacements. The only problem I see is that the closing parenthesis on formula #2 is being truncated, but I'm sure I can figure out why that is. I'm definitely going to use this as my jumping off point, thank you so much. I'll be sure to give you credit in the completed script. 😀

Thanks!

ron

-----
a haiku...

NULL is not zero
NULL is not an empty string
NULL is the unknown

ronmoses@gmail.com SSCarpal Tunnel Points: 4480 More actions · Answer 13

umair 4720 (9/10/2012)
Regex searching and replace ?
http://msdn.microsoft.com/en-us/magazine/cc163473.aspx

I considered going with Regex or another CLR-based solution. If I was doing this on my own data server, that's absolutely how I'd go with it. But we're a software vendor who distributes our schema and updates via script, to thousands customers who are typically on the lower end of the technical expertise scale. We have enough difficulty walking them through the process of installing SQL Server, and I haven't had the time or resources available to investigate what would be involved in distributing CLR to these customers. It's on the list, but for a very small company like ours, things have to be prioritized.

Thanks!

Ron

-----
a haiku...

NULL is not zero
NULL is not an empty string
NULL is the unknown

Mark Cowne One Orange Chip Points: 26968 More actions · Answer 14

Yet another to try...

DECLARE @t TABLE(ID INT IDENTITY, Formula VARCHAR(100))

DECLARE @Old CHAR(1)

DECLARE @new CHAR(1)

INSERT @t(Formula)

SELECT 'L@*K@*H@/324'

UNION ALL SELECT '(AY#-AR#)*(Y#+BB#)'

UNION ALL SELECT 'TD@*(L#+D@)*C@'

UNION ALL SELECT '((AE#-AR#)*(Y#+A#))/Y#'

UNION ALL SELECT 'A#+AB#+BA#+B#'

SET @Old = 'A';

SET @new = 'C';

WITH CTE AS (

SELECT ID,

Formula,

number,

CASE WHEN SUBSTRING(Formula,number-1,1) NOT BETWEEN 'A' AND 'Z'

AND SUBSTRING(Formula,number,1) = @Old

AND SUBSTRING(Formula,number+1,1) NOT BETWEEN 'A' AND 'Z'

THEN @new

ELSE SUBSTRING(Formula,number,1)

END AS ch

FROM @t

INNER JOIN master.dbo.spt_values ON type='P'

AND number BETWEEN 1 AND LEN(Formula))

SELECT t1.Formula,

(SELECT (SELECT t2.ch AS "text()"

FROM CTE t2

WHERE t2.ID = t1.ID

ORDER BY t2.number

FOR XML PATH(''),TYPE).value('.','VARCHAR(100)')) AS NewFormula

FROM @t t1;

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537