Nested replaces ?

  • Hello all,

    At this moment I need some nested REPLACE actions.

    Some time ago I saw a set of routines for string 'manipulation', I think there were two sets, one for free, one which could be bought.

    Now I can not find that set anymore, have been searching my own set of data, been searching with google.

    Is there a set of string manipulation routines which do extend the functionality of the build in routines ?

    (I did find the stringworkshop script).

    Thanks,

    Ben Brugman.

  • i havent got what you actually needed and asked ? can you post your requirement with sample data along with desired output. see "we cant see what you see"

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Bhuvnesh (1/17/2013)


    i havent got what you actually needed and asked ? can you post your requirement with sample data along with desired output. see "we cant see what you see"

    An example what I am doing at the moment, but there are many variants on this:

    I would like to have an interface like:

    special_replace(@string, 'abcdefgh','12345678')

    where all the 'a' get replaced by a 1,

    all be 'b' get replaced by a 2, etc.

    or

    special_replace2(@string, 'aaa|bb|c|d|e|f|g|h','11|2|3|4444|5|6|7|8')

    where the 'aaa' get's replaced by a 11

    or

    special_replace3(@string, '1111|111|11|1|2222|222|22|2','41|31|21|11|42|32|22|12')

    where the '1111' get's replaced by a 41

    where the '111' get's replaced by a 31

    -- To find 'exotic characters' in a table

    select '--' [--], COUNT(*) as tel from (

    select distinct

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(H,'a','')

    ,'b','')

    ,'c','')

    ,'d','')

    ,'e','')

    ,'f','')

    ,'g','')

    ,'h','')

    ,'i','')

    ,'j','')

    ,'k','')

    ,'l','')

    ,'m','')

    ,'n','')

    ,'0','')

    ,'p','')

    ,'q','')

    ,'r','')

    ,'s','')

    ,'t','')

    ,'u','')

    ,'v','')

    ,'w','')

    ,'x','')

    ,'y','')

    ,'z','')

    ,'0','')

    ,'1','')

    ,'2','')

    ,'3','')

    ,'4','')

    ,'5','')

    ,'6','')

    ,'7','')

    ,'8','')

    ,'9','')

    ,'8','')

    AS H

    FROM A_table

    ) as xxx

    -- To find how significant the exotic characters are.

    select distinct

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(H,' ','')

    ,',','')

    ,'&','')

    ,'/','')

    ,'\','')

    ,'|','')

    AS H

    FROM A_table

    ) as xxx

    -- to do some manipulation on the string.

    select distinct

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(H,' ','\1 ')

    ,',','\2')

    ,'&','\3')

    ,'/','\4')

    ,'\','\5')

    ,'|','\6')

    AS H

    FROM A_table

    ) as xxx

    Thanks for your time and attention,

    ben brugman

  • Wow! That's quite a set of REPLACEs. If I'm reading what you want correctly, I see two different approaches you can take here.

    The first is using the STUFF statement. http://msdn.microsoft.com/en-us/magazine/cc163473.aspx. I find it works pretty well for small numbers of replacements with varying character lengths. It's a cool function to have in your "toolbox", but please continue reading.

    It looks to me like regular expressions are what you really want. I suggest writing a .NET CLR, create an assembly for it and then write database function wrappers. I've seen this used with some ridiculously complicated pattern matches.

    Library and How-To: http://www.codeproject.com/Articles/42764/Regular-Expressions-in-MS-SQL-Server-2005-2008

    MSDN Article: http://msdn.microsoft.com/en-us/magazine/cc163473.aspx

    I don't do the .NET library exposed to SQL Server very much because I find that most things can be accomplished with normal T-SQL (I'm not looking for an argument here) but this is one of those cases where it just makes sense. Performance is pretty good, but it of course depends on how you write your .NET library. Having used regular expressions natively in Oracle 8 and 9i, I can say that the performance of this approach wins, hands down. It'll take some work to write the first time, but then you'll find yourself using it as you need it.

  • I would like to have an interface like:

    special_replace(@string, 'abcdefgh','12345678')

    where all the 'a' get replaced by a 1,

    all be 'b' get replaced by a 2, etc.

    or

    special_replace2(@string, 'aaa|bb|c|d|e|f|g|h','11|2|3|4444|5|6|7|8')

    where the 'aaa' get's replaced by a 11

    or

    special_replace3(@string, '1111|111|11|1|2222|222|22|2','41|31|21|11|42|32|22|12')

    where the '1111' get's replaced by a 41

    where the '111' get's replaced by a 31

    What a bizarre requirement!!! I can't come up with a real world scenario where any of these make sense. The regex suggestion won't really help here because you need to modify the values. All the regex is going to do is tell you if it matches some pattern, which you don't have.

    I do however think that CLR would be the way to go here. You are going to have to loop through these strings character by character and t-sql just isn't the best way to do that.

    I understand the requirements for the first and third versions but the second one I don't get why 'aaa' becomes 11 and why does 'd' become 4444????

    For the first one you could do a pretty straight forward ascii calculation. Just force the string to upper first and then for each character take the ascii value and subtract 16.

    The third one would be a modification to that logic, just need to track which character you are working with and a counter. When the character changes you output the counter and the same ascii calculation.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks for your input,

    Suggestions how to proceed are very welcome.

    My goal at the moment is finding similar strings.

    For a long reply see under the signature,

    Ben Brugman

    Sean Lange (1/17/2013)


    I would like to have an interface like:

    special_replace(@string, 'abcdefgh','12345678')

    where all the 'a' get replaced by a 1,

    all be 'b' get replaced by a 2, etc.

    or

    special_replace2(@string, 'aaa|bb|c|d|e|f|g|h','11|2|3|4444|5|6|7|8')

    where the 'aaa' get's replaced by a 11

    This was a fantasie example. (Correctly spotted)

    or

    special_replace3(@string, '1111|111|11|1|2222|222|22|2','41|31|21|11|42|32|22|12')

    where the '1111' get's replaced by a 41

    where the '111' get's replaced by a 31

    This is a part of an actual script I made.

    What a bizarre requirement!!! I can't come up with a real world scenario where any of these make sense. The regex suggestion won't really help here because you need to modify the values. All the regex is going to do is tell you if it matches some pattern, which you don't have.

    The example code was actual code I used.

    First to study what characters were used.

    (Alphabetic, Numeric, +-.,; etc)

    This was for me the fastest way to find out which characters where used in a column. First getting rid of all 'normal' characters then eliminating the others.

    Afterwards only the others were removed so that the columns could be compared more easy.

    I do however think that CLR would be the way to go here. You are going to have to loop through these strings character by character and t-sql just isn't the best way to do that.

    Processing speed was no issue.

    The speed I can (or could) write the code and rerun and change the code made the script as it was.

    And I do not master CLR code.

    I understand the requirements for the first and third versions but the second one I don't get why 'aaa' becomes 11 and why does 'd' become 4444????

    Correctly spotted this was an artificial example.

    I didn't want to use a predictive structure to avoid not fitting solutions. (sorry).

    For the first one you could do a pretty straight forward ascii calculation. Just force the string to upper first and then for each character take the ascii value and subtract 16.

    Again you spotted the correct function. Indeed this is the count of a sequence of same characters. This actually is the anwser to a puzzle, I think I will throw this puzzle in this forum. The above was part of the solution. (So don't tell anyone yet:-))

    (My solution for the puzzle was rather cumbersom with loads of replaces, I am wondering if other come up with other solutions to learn from).

    This would probably require a function and then a loop within the function for each position. Now I did a copy/past 26 times for the alphabeth, which for me works faster than writing a function with a loop.

    But sometime ago I saw that somebody had allready solved this problem. (With functions ?) And I was hoping to find that solution again and re-using that solution.

    The number of replaces I use in largest script is over 50 which is nog very elegant.

    In stringworkshop there was a far more elegant solution but that was for a single string parameter, not for a column in a select. I could adapt that, but that would take more time.

    The third one would be a modification to that logic, just need to track which character you are working with and a counter. When the character changes you output the counter and the same ascii calculation.

    Now it is within a single select statement, doing this for a column would make the code more complex again.

    So as you can read I am building these examples on the fly running them and then modifying them to my next requirements. Changes are almost continues. Running speed is not a problem, because the building takes some time.

    I use the technique with nested replaces fairly often, because I can do it inline and build the code fast. Building this code inline although not elegant makes the code easy portable. And if speed is no issue, then it works fast enough.

    A problem with this code is overview (to much replaces) and it is difficult to keep count of the correct number of replaces and the matching number of parameters for the replace. (A miscount is easely made).

    The actual use this time was.

    Excel with over 25 000 rows.

    Containing strings and the translations of the strings.

    But some strings (or equivalent strings) appear more than once, sometimes with small differences like a comma, or another special character.

    So I wanted to eliminate the special characters.

    (But wanted first to know which special characters where used, I found two different spaces for example, still have to study that).

    Also in the next step I am thinking of eliminating repeating characters, to exclude some writing mistakes and perhaps something like replacing 'qu' with 'q'

    and replacing all 'c' with an 'k' ('disc' and 'disk' for example is the same word for me).

    And maybe substituting all numeric strings for the same numeric string. For example replacing any nummeric string with 123.

    (I am thinking of breaking up the strings without special characters in triads (sets of three characters) and test how many of them match up, and give this a score if allmost all triads are the same, the difference might be a small spelling error or a typing mistake).

    Example.

    I am a fish.

    Would become.

    iamafish

    And then:

    iam

    ama

    maf

    afi

    fis

    ish

    A sentence like

    I am the fish

    or

    I am an fish

    Would give a large number of the same triads in the same order, so a large likelyhood of the 'same' sentence.

    Thanks for your input,

    Suggestions how to proceed are very welcome.

    My goal at the moment is finding similar strings.

    Ben Brugman

  • For that, I would use a function with PATINDEX to retain only the desired chars.

    If performance is not an issue, you can use a scalar function.

    I can give you a sample function if you would like. I wrote somewhat similar functions to do "remove all non-alpha, non-numeric" chars and duplicate Oracle's "InitCap" function (first letter of each word is upper case, remaining letters of same word are all lower case).

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Special_replace2 ain't so hard:

    DECLARE @MyString VARCHAR(100) = 'aaabbcdefgh'

    ;WITH Replacements (n,a,b) AS (

    SELECT 1, 'aaa|bb|c|d|e|f|g|h','11|2|3|4444|5|6|7|8'),

    Transform (ItemNumber, a, b) AS (

    SELECT ItemNumber, MAX(a), MAX(b)

    FROM (

    SELECT n, ItemNumber, a=a.Item, b=NULL

    FROM Replacements

    CROSS APPLY DelimitedSplit8K(a, '|') a

    UNION ALL

    SELECT n, ItemNumber, NULL, Item

    FROM Replacements

    CROSS APPLY DelimitedSplit8K(b, '|')) a

    GROUP BY n, ItemNumber),

    rCTEReplace (n, s, r) AS (

    SELECT n=1, MyString, REPLACE(MyString, a, b)

    FROM (SELECT @MyString) a(MyString)

    JOIN Transform ON ItemNumber = 1

    UNION ALL

    SELECT n+1, s, REPLACE(r, a, b)

    FROM rCTEReplace

    JOIN Transform ON ItemNumber = n+1

    )

    SELECT *

    FROM rCTEReplace

    WHERE n = (SELECT COUNT(*) FROM Transform)

    Ain't gonna win no performance contests though! 😛


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Probably will run a little faster with dynamic SQL though.

    DECLARE @sql NVARCHAR(MAX) = ''

    DECLARE @MyString VARCHAR(100) = 'aaabbcdefgh'

    ;WITH Replacements (n,a,b) AS (

    SELECT 1, 'aaa|bb|c|d|e|f|g|h','11|2|3|4444|5|6|7|8'),

    Transform (ItemNumber, a, b) AS (

    SELECT ItemNumber, MAX(a), MAX(b)

    FROM (

    SELECT n, ItemNumber, a=a.Item, b=NULL

    FROM Replacements

    CROSS APPLY DelimitedSplit8K(a, '|') a

    UNION ALL

    SELECT n, ItemNumber, NULL, Item

    FROM Replacements

    CROSS APPLY DelimitedSplit8K(b, '|')) a

    GROUP BY n, ItemNumber)

    SELECT @sql='SELECT r=' + s + '''' + @MyString + ''',' + (

    SELECT '''' + a + ''',''' + b + '''),'

    FROM Transform

    ORDER BY ItemNumber

    FOR XML PATH('')) + 's=''' + @MyString + ''''

    FROM (

    SELECT 'REPLACE('

    FROM Transform

    FOR XML PATH('')) a(s)

    --PRINT @sql

    EXEC (@SQL)


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • ScottPletcher (1/17/2013)


    For that, I would use a function with PATINDEX to retain only the desired chars.

    First I used the replace to find which were the undesired chars.

    Next I used the replace to get rid of the undesired chars.

    (Maybe a next step would be to use replace to replace all 'c' with a 'k', to make some text like 'disc' and 'disk' the same. And maybe I'll remove all double characters to get even more matches.).

    I do not see how I would do that with the PATINDEX function.

    I can only see complex solutions when using the PATINDEX for this problem. But maybe I am missing something:(. (Sorry for that).

    Thanks for your response,

    ben brugman

  • dwain.c (1/17/2013)


    Special_replace2 ain't so hard:

    ::::Code::::

    Ain't gonna win no performance contests though! 😛

    Thanks for your solution, tried to study the solution,

    'DelimitedSplit8K' is missing.

    So I could not try the solution.

    Also I have no clue were to put the queried table and the column which has to be read or updated.

    For me this is not a solution I could make up in a jiffy, where my big replacement solution is fairly simple from construction.

    And Yes maybe I should go for a dynamic solution, or generating the code and cut/paste it, or make an actual function for that.

    (25 000 rows is at the moment no problem for the nested replace construction. I do not know how dynamic code running for 25 000 times in a function would impact my waiting time.)

    Again thanks for the code, I have to study it further to get an understanding of the mechanismn behind that.

    ben brugman

  • Nice problem; I guess i need to read the long detail provided ..

    Well, about the copy paste solution for the 2nd and third, I think dynamic sql could be of a help.

    I tried this :

    DECLARE @mystring VARCHAR(100) = 'aaa|bb|c|d|e|f|g|h'

    DECLARE @mystringnew VARCHAR(100) = '11|2|3|4444|5|6|7|8'

    -- create table #test (b int identity,a varchar(10))

    -- create table #testNew (b int identity,a varchar(10))

    DECLARE @test-2 VARCHAR(max)='insert into #test values(''' + (SELECT REPLACE(@mystring, '|', '''),(''')) + ''')'

    SELECT @test-2

    DECLARE @test_new VARCHAR(max)='insert into #testnew values(''' + (SELECT REPLACE(@mystringnew, '|', '''),(''')) + ''')'

    SELECT @test_new

    EXEC (@test)

    EXEC (@test_new)

    -- Drop Table #test

    -- Drop Table #testnew

    It creates the record in temp table for furthur use; then I got confused :w00t: why do i have to replace it ...I guess i need to read the full discussion ..

    ~ demonfox
    ___________________________________________________________________
    Wondering what I would do next , when I am done with this one :ermm:

  • ben.brugman (1/18/2013)


    Thanks for your solution, tried to study the solution,

    'DelimitedSplit8K' is missing.

    So I could not try the solution.

    Sorry about that. I thought most folks were familiar with this FUNCTION, which can be found here: http://www.sqlservercentral.com/articles/Tally+Table/72993/. To say its "just" a delimited string splitter would be an insult to Jeff Moden, so I'll say it is designed to address the problem at warp speed.

    ben.brugman (1/18/2013)


    Also I have no clue were to put the queried table and the column which has to be read or updated.

    For me this is not a solution I could make up in a jiffy, where my big replacement solution is fairly simple from construction.

    And Yes maybe I should go for a dynamic solution, or generating the code and cut/paste it, or make an actual function for that.

    (25 000 rows is at the moment no problem for the nested replace construction. I do not know how dynamic code running for 25 000 times in a function would impact my waiting time.)

    Again thanks for the code, I have to study it further to get an understanding of the mechanismn behind that.

    ben brugman

    If you would like to provide some DDL and sample data for your table and replacement strings, I'd be happy to adapt the code I provided to it.

    The dynamic SQL will probably require more work than the former but it would be a lot faster and wouldn't be suitable for a TVF (assuming that was your intent). I had fun with both!

    Edit: Oh yes, and I believe the Dynamic code could be constructed in a way that it only needs to run once and process all 25,000 rows in one go.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (1/17/2013)


    Special_replace2 ain't so hard:

    see above

    Ain't gonna win no performance contests though! 😛

    46 seconds for the 25 000 rows.

    About 2 hours for me to get this working.

    The Nested repeat was few seconds for 25 000 rows.

    About 15 minutes to do build 50 (alphabet/numeric/ect) replaces.

    No loops! No CURSORs! No RBAR! Hoo-uh!

    Using the with recursion for me is a loop, so the No loops quote here is not totaly justified.

    Thanks, for the support,

    I implemented both functions (8K and the Replace2).

    But in Retrospect this was not the right choice.

    The nested replace was a winner on three fronts.

    1. Easy to understand.

    2. Easy to build.

    3. Fast enough on execution.

    ( 4. No loops, no visible loops that is.).

    Thanx for your assistence,

    ben brugman

    The function:

    -- =============================================

    -- Author:ben brugman / dwain.c (sqlservercentral)

    -- Create date: 20130118

    -- Description:Does replace @in a str@ing

    -- =============================================

    -- drop function replace2

    CREATE FUNCTION Replace2

    (

    -- Add the parameters for the function here

    @in varchar(8000),

    @ff varchar(8000),

    @RR varchar(8000)

    )

    RETURNS varchar(8000)

    AS

    BEGin

    -- Declare the return variable here

    DECLARE @out varchar(8000)

    -- Add the T-SQL statements to compute the return value here

    -- SELECT @out = @in

    -- Special_replace2 ain't so hard:

    DECLARE @MyString VARCHAR(100) = @in

    ;WITH Replacements (n,a,b) AS (

    SELECT 1, @ff,@rr),

    Transform (ItemNumber, a, b) AS (

    SELECT ItemNumber, MAX(a), MAX(b)

    FROM (

    SELECT n, ItemNumber, a=a.Item, b=NULL

    FROM Replacements

    CROSS APPLY DelimitedSplit8K(a, '|') a

    UNION ALL

    SELECT n, ItemNumber, NULL, Item

    FROM Replacements

    CROSS APPLY DelimitedSplit8K(b, '|')) a

    GROUP BY n, ItemNumber),

    rCTEReplace (n, s, r) AS (

    SELECT n=1, MyString, REPLACE(MyString, a, b)

    FROM (SELECT @MyString) a(MyString)

    JOIN Transform ON ItemNumber = 1

    UNION ALL

    SELECT n+1, s, REPLACE(r, a, b)

    FROM rCTEReplace

    JOIN Transform ON ItemNumber = n+1

    )

    SELECT @out = r

    FROM rCTEReplace

    WHERE n = (SELECT COUNT(*) FROM Transform)

    -- Return the result of the function

    RETURN @out

    /*

    declare @uu varchar(8000) = 'ONVERANDERD aaaaaaa bb hello the quick brown fox jumps'

    SET @UU = dbo.Replace2(@uu, 'aaa|bb|c|d|e|f|g|h','11|2|3|4444|5|6|7|8')

    -- SET @UU = dbo.Replace2( '11|2|3|4444|5|6|7|8', 'aaa|bb|c|d|e|f|g|h')

    PRINT @UU

    select dbo.replace2(G,'7777|77|7|a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z|0|1|2|3|4|5|6|7|8|9|7777|7777',

    '7|7|7|||||||||||||||||||||||||||7|7|7|7|7|7|7|7|7|7|7|7') from a_table_with_a_column_G

    -- test if a number can be collapsed to a single digit.

    select dbo.replace2('894785423456','||0|1|2|3|4|5|6|7|8|9|77777777|7777|77|7|||||','||7|7|7|7|7|7|7|7|7|7|7|7|7|4|||||')

    */

    END

    GO

  • Writing the function as a iTVF might improve performance a notch.

    -- =============================================

    -- Author:ben brugman / dwain.c (sqlservercentral)

    -- Create date: 20130118

    -- Description:Does replace @in a str@ing

    -- =============================================

    -- drop function replace2

    CREATE FUNCTION Replace2

    (

    -- Add the parameters for the function here

    @in varchar(8000),

    @ff varchar(8000),

    @RR varchar(8000)

    )

    RETURNS TABLE AS

    RETURN

    WITH Replacements (n,a,b) AS (

    SELECT 1, @ff,@rr),

    Transform (ItemNumber, a, b) AS (

    SELECT ItemNumber, MAX(a), MAX(b)

    FROM (

    SELECT n, ItemNumber, a=a.Item, b=NULL

    FROM Replacements

    CROSS APPLY DelimitedSplit8K(a, '|') a

    UNION ALL

    SELECT n, ItemNumber, NULL, Item

    FROM Replacements

    CROSS APPLY DelimitedSplit8K(b, '|')) a

    GROUP BY n, ItemNumber),

    rCTEReplace (n, s, r) AS (

    SELECT n=1, MyString, REPLACE(MyString, a, b)

    FROM (SELECT @in) a(MyString)

    JOIN Transform ON ItemNumber = 1

    UNION ALL

    SELECT n+1, s, REPLACE(r, a, b)

    FROM rCTEReplace

    JOIN Transform ON ItemNumber = n+1

    )

    SELECT r

    FROM rCTEReplace

    WHERE n = (SELECT COUNT(*) FROM Transform)

    /*

    declare @uu varchar(8000) = 'ONVERANDERD aaaaaaa bb hello the quick brown fox jumps'

    SET @UU = dbo.Replace2(@uu, 'aaa|bb|c|d|e|f|g|h','11|2|3|4444|5|6|7|8')

    -- SET @UU = dbo.Replace2( '11|2|3|4444|5|6|7|8', 'aaa|bb|c|d|e|f|g|h')

    PRINT @UU

    select dbo.replace2(G,'7777|77|7|a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z|0|1|2|3|4|5|6|7|8|9|7777|7777',

    '7|7|7|||||||||||||||||||||||||||7|7|7|7|7|7|7|7|7|7|7|7') from a_table_with_a_column_G

    -- test if a number can be collapsed to a single digit.

    select dbo.replace2('894785423456','||0|1|2|3|4|5|6|7|8|9|77777777|7777|77|7|||||','||7|7|7|7|7|7|7|7|7|7|7|7|7|4|||||')

    */

    Above should compile above and thanks for the author credit.

    I'm actually a bit surprised that it processed 25K rows in 45 seconds.

    There are some other improvements that can be made also. For example, the Replacements CTE is not required.

    WITH Transform (ItemNumber, a, b) AS (

    SELECT ItemNumber, MAX(a), MAX(b)

    FROM (

    SELECT n=1, ItemNumber, a=a.Item, b=NULL

    FROM DelimitedSplit8K(@ff, '|') a

    UNION ALL

    SELECT 1 ItemNumber, NULL, Item

    FROM DelimitedSplit8K(@rr, '|')) a

    GROUP BY n, ItemNumber)

    ,

    You see, I did that so as to facilitate applying different replacement strings to the same string. Which you're function isn't capable of doing.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 1 through 15 (of 40 total)

You must be logged in to reply to this topic. Login to reply