August 13, 2012 at 10:30 am
Hi everyone,
I am actually fairly new to using the REPLACE function and don't know if I can use regex with it. The string that I want to replace is always between 2 % characters and includes the % characters.
Sample data:
SELECT 'somestring+%sometext%' AS string
INTO #temp
UNION ALL
SELECT '%sometext%'
UNION ALL
SELECT 'somestring+%sometext%+somesecondstring'
So I would like the results when selected from the tale to look like this (no update needed, this is just a query):
SELECT 'somestring+replaced' AS newString
UNION ALL
SELECT 'replaced'
UNION ALL
SELECT 'somestring+replaced+somesecondstring'
So, can I use the REPLACE function to find the '%whatever%' string?
Jared
CE - Microsoft
August 13, 2012 at 10:39 am
I could be wrong, but I'm pretty sure Replace() won't take regex or other patterns. I'm pretty sure it only deals with strings or string-type variables with pre-defined values.
However, solving this with Stuff() would be quite easy.
SELECT string,
CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1),
STUFF(string, CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1, 'replaced')
FROM (SELECT 'somestring+%sometext%' AS string
UNION ALL
SELECT '%sometext%'
UNION ALL
SELECT 'somestring+%sometext%+somesecondstring') AS Example;
The first three columns are just the internal pieces of the logic. The final column is what you need for this kind of thing.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 13, 2012 at 10:41 am
GSquared (8/13/2012)
I could be wrong, but I'm pretty sure Replace() won't take regex or other patterns. I'm pretty sure it only deals with strings or string-type variables with pre-defined values.However, solving this with Stuff() would be quite easy.
SELECT string,
CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1),
STUFF(string, CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1, 'replaced')
FROM (SELECT 'somestring+%sometext%' AS string
UNION ALL
SELECT '%sometext%'
UNION ALL
SELECT 'somestring+%sometext%+somesecondstring') AS Example;
The first three columns are just the internal pieces of the logic. The final column is what you need for this kind of thing.
Interesting... Thanks G. I think you are right that REPLACE() does not accept patterns. That's why my RegEx wasn't working 🙂
Jared
CE - Microsoft
August 13, 2012 at 10:42 am
Does this not work for you?
select REPLACE(yourstring '%sometext%', 'replaced')
from #temp
August 13, 2012 at 10:43 am
Alternatively, if you really need to use Replace instead of Stuff for this, try using Substring to get the piece that you want to replace. Build the Substring parameters the same way I did the Stuff position and length. Then use that as the first argument for the Replace function.
SELECT string,
SUBSTRING(string, CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1),
REPLACE(string,
SUBSTRING(string, CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1),
'replaced')
FROM (SELECT 'somestring+%sometext%' AS string
UNION ALL
SELECT '%sometext%'
UNION ALL
SELECT 'somestring+%sometext%+somesecondstring') AS Example;
Either will work. Just thought of the nested Replace(Substring()) version second is all.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 13, 2012 at 10:52 am
@rory: Sorry I was not clear that sometext could be anything; i.e. %sometext%, %hogwash%, '%palinka%'... So your proposed solution will not work.
@g: No need to use replace, I think your STUFF solution will work just fine. Thanks again!
Jared
CE - Microsoft
August 13, 2012 at 10:52 am
roryp 96873 (8/13/2012)
Does this not work for you?
select REPLACE(yourstring '%sometext%', 'replaced')
from #temp
I think the point is to replace any text between %.
--Vadim R.
August 13, 2012 at 10:54 am
But based on the desired output he provided it looked like he wanted the % signs removed as well. Or did I misread that?
Edit: Nevermind, I get it now. You won't know what the string is in advance, you are just looking for % and want to replace what is in there. Sorry for the confusion.
August 13, 2012 at 11:14 am
roryp 96873 (8/13/2012)
But based on the desired output he provided it looked like he wanted the % signs removed as well. Or did I misread that?Edit: Nevermind, I get it now. You won't know what the string is in advance, you are just looking for % and want to replace what is in there. Sorry for the confusion.
Yes, but your were right first with the %'s themselves; i.e. they should also be replaced.
Jared
CE - Microsoft
August 14, 2012 at 4:12 am
Playing around with PATINDEX, which is the closest thing you'll find to REGEX (except the CLR REGEX functions of course), I come up with something like this:
;WITH SampleData (string) AS (
SELECT 'somestring+%sometext%'
UNION ALL
SELECT '%sometext%'
UNION ALL
SELECT CAST('somestring+%sometext%+somesecondstring' AS VARCHAR(100))
)
SELECT string, StringReplaced=STUFF(string, [First], 1+[Second], 'replaced')
FROM SampleData
CROSS APPLY (SELECT [First]=PATINDEX('%[%]%', string)) a
CROSS APPLY (SELECT [Second]=PATINDEX('%[%]%', SUBSTRING(string, 1+[First], LEN(string)))) b
Not sure if this helps though.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
August 14, 2012 at 6:10 am
dwain.c (8/14/2012)
Playing around with PATINDEX, which is the closest thing you'll find to REGEX (except the CLR REGEX functions of course), I come up with something like this:
;WITH SampleData (string) AS (
SELECT 'somestring+%sometext%'
UNION ALL
SELECT '%sometext%'
UNION ALL
SELECT CAST('somestring+%sometext%+somesecondstring' AS VARCHAR(100))
)
SELECT string, StringReplaced=STUFF(string, [First], 1+[Second], 'replaced')
FROM SampleData
CROSS APPLY (SELECT [First]=PATINDEX('%[%]%', string)) a
CROSS APPLY (SELECT [Second]=PATINDEX('%[%]%', SUBSTRING(string, 1+[First], LEN(string)))) b
Not sure if this helps though.
Nice twist Dwain! I'll take a look at performance and execution plans between the both of these when I into work today.
Jared
CE - Microsoft
August 14, 2012 at 7:46 am
So I set up a quick test using this setup:
USE test
GO
CREATE TABLE testStringReplace (string VARCHAR(250))
INSERT INTO testStringReplace
SELECT TOP 500000 c1.name + '%' + c1.name + '%'
FROM sys.columns c1
CROSS JOIN sys.columns c2
GO
--SET STATISTICS IO ON
--SET STATISTICS TIME ON
SELECT string, StringReplaced=STUFF(string, [First], 1+[Second], 'replaced')
FROM testStringReplace
CROSS APPLY (SELECT [First]=PATINDEX('%[%]%', string)) a
CROSS APPLY (SELECT [Second]=PATINDEX('%[%]%', SUBSTRING(string, 1+[First], LEN(string)))) b
SELECT string,
REPLACE(string,
SUBSTRING(string, CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1),
'replaced')
FROM testStringReplace
SELECT string,
STUFF(string, CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1, 'replaced')
FROM testStringReplace
GO
The actual execution plans are exactly the same for each and IO is the same. Interestingly enough... Every time I run this, the third one performs slightly better on time. I saw the same result when I did a second CROSS JOIN for the data load and took TOP 1000000. Now, we are only talking a couple hundred ms, but it was consistent. Thoughts?
Jared
CE - Microsoft
August 14, 2012 at 8:27 am
If you are looking for performance, then "well-done" CLR using RegEx will outperform T-SQL for large and/or complex string replacements.
August 14, 2012 at 6:30 pm
SQLKnowItAll (8/14/2012)
So I set up a quick test using this setup:
USE test
GO
CREATE TABLE testStringReplace (string VARCHAR(250))
INSERT INTO testStringReplace
SELECT TOP 500000 c1.name + '%' + c1.name + '%'
FROM sys.columns c1
CROSS JOIN sys.columns c2
GO
--SET STATISTICS IO ON
--SET STATISTICS TIME ON
SELECT string, StringReplaced=STUFF(string, [First], 1+[Second], 'replaced')
FROM testStringReplace
CROSS APPLY (SELECT [First]=PATINDEX('%[%]%', string)) a
CROSS APPLY (SELECT [Second]=PATINDEX('%[%]%', SUBSTRING(string, 1+[First], LEN(string)))) b
SELECT string,
REPLACE(string,
SUBSTRING(string, CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1),
'replaced')
FROM testStringReplace
SELECT string,
STUFF(string, CHARINDEX('%', string, 1),
CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1, 'replaced')
FROM testStringReplace
GO
The actual execution plans are exactly the same for each and IO is the same. Interestingly enough... Every time I run this, the third one performs slightly better on time. I saw the same result when I did a second CROSS JOIN for the data load and took TOP 1000000. Now, we are only talking a couple hundred ms, but it was consistent. Thoughts?
I confess, I didn't really expect the PATINDEX version to outperform the others. Just thought I'd throw it out there in case it ended up more applicable to what you were doing. You might be able to squeak a little bit extra performance out of it by eliminating the cascading CROSS APPLYs. Meaning, embed the intermediate result calculations directly where they're needed. I wrote it that way for readability.
Eugene's suggestion of using a CLR REGEX function may be faster (haven't tried) but here's a link: http://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
August 14, 2012 at 9:19 pm
Eugene Elutin (8/14/2012)
If you are looking for performance, then "well-done" CLR using RegEx will outperform T-SQL for large and/or complex string replacements.
I've found that's frequently not true even when "well-done".
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 21 total)
You must be logged in to reply to this topic. Login to reply