Replacing multiple occurrences of a string in a text column

  • I have a forum database and numerous threads contain links to other threads. The URL of the server was changed so the links are no longer valid.

    Here's my problem:

    I need to update the links to reflect the new URL. Easy? sure, just use UPDATETEXT, right?

    Well, that works--for the first occurrence of the string in a particular row, but a row may have many occurrences of the string.

    I have wrapped the code in a cursor but try as I might, I cannot quite get it to find more than the first occurrence of the string.

    Anyone got any ideas?

  • You could use PATINDEX in a WHERE clause to identify rows which require editing. SUBSTRING also works with text datatype: passing the return value of PATINDEX to the start value of SUBSTRING offers you a route to multiple updates - or at least, to a count of the number of occurrences of your string within the column.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • i have this in my snippets as Find and Replae for TXT/NTEXT;

    see if this works for you:

    [font="Courier New"]

    -- the table needs to have an ID column. my example has reviewID

    --example here is replacing href="reviews

    --with a full url like 'href="http://www.whatever.com/reviews

    DECLARE @reviewid INT, @ptr BINARY(16), @idx INT

    SET @reviewid = 0

    SELECT TOP 1 @reviewid = reviewid, @idx = PATINDEX('%href="reviews%',review_body)-1

    FROM reviews

    WHERE PATINDEX('%href="reviews%',review_body) > 0

    WHILE @reviewid > 0

    BEGIN

    SELECT @ptr = TEXTPTR(review_body)

    FROM reviews

    WHERE reviewid = @reviewid

    UPDATETEXT reviews.review_body @ptr @idx 13 'href="http://www.whatever.com/reviews'

    SET @reviewid = 0

    SELECT TOP 1 @reviewid = reviewid, @idx = PATINDEX('%href="reviews%',review_body)-1

    FROM reviews

    WHERE reviewid > @reviewid

    AND PATINDEX('%href="reviews%',review_body) > 0

    END

    [/font]

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • That is sort of what I did...here is my code. The only problem is that it will get only the first occurrence in a row and there are many occurences in some rows.

    declare @offset int

    declare @ptrval binary(16)

    declare upd_cursor cursor for

    select TEXTPTR(post_text), patindex('%old.server.com/forums%', post_text)-1

    from posts

    where post_text like '%old.server.com/forums%'

    open upd_cursor

    fetch next from upd_cursor into @ptrval, @offset

    while @@fetch_status = 0

    begin

    updatetext posts.post_text @ptrval @offset 21 'new.server.com/forums'

    fetch next from upd_cursor into @ptrval, @offset

    end

    close upd_cursor

    deallocate upd_cursor

  • so couldn't you just repeat say, 5 or 6 times, assuming the most times the link showed up int he article was 6 times max? or releat until nor rows affected? this'd be a one time process ayway, right?

    edit:

    ok as i read the code i posted, and pretty much what you have, the WHILE Loop does the work, for any single row/ + textfield in the database, i think it replaces ALL the items witht eh new text.

    so just sticking that in a cursor so all rows in the table is what I think you need to do;

    do you disagree?

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • I might do it 10 times or so, just to be safe. Allow some pad in case things change.

    You could probably calculate out the max in any column and use that as well.

  • i hate it when i second guess myself.

    the code i posted DOES replace EVERY instance in ALL rows in the text filed in a single pass.

    here is code as an example, and the same code i posted before.

    [font="Courier New"]

    CREATE TABLE reviews(reviewid INT IDENTITY(1,1) PRIMARY KEY,review_body TEXT)

    INSERT INTO reviews(review_body) SELECT REPLICATE('       <a href="reviews.html"> loved it</a><a href="reviews2.html"> liked it</a><a href="reviews32.html"> hated it</a>',4000)

    INSERT INTO reviews(review_body) SELECT REPLICATE('               <a href="reviews.html"> loved it</a><a href="reviews3.html"> liked it</a><a href="reviews33.html"> hated it</a>',4000)

    INSERT INTO reviews(review_body) SELECT REPLICATE('                       <a href="reviews.html"> loved it</a><a href="reviews4.html"> liked it</a><a href="reviews34.html"> hated it</a>',4000)

    INSERT INTO reviews(review_body) SELECT REPLICATE('                               <a href="reviews.html"> loved it</a><a href="reviews5.html"> liked it</a><a href="reviews35.html"> hated it</a>',4000)

    INSERT INTO reviews(review_body) SELECT REPLICATE('                                       <a href="reviews.html"> loved it</a><a href="reviews6.html"> liked it</a><a href="reviews36.html"> hated it</a>',4000)

    INSERT INTO reviews(review_body) SELECT REPLICATE('                                             <a href="reviews.html"> loved it</a><a href="reviews7.html"> liked it</a><a href="reviews37.html"> hated it</a>',4000)

    INSERT INTO reviews(review_body) SELECT REPLICATE('                                                    <a href="reviews.html"> loved it</a><a href="reviews8.html"> liked it</a><a href="reviews38.html"> hated it</a>',4000)

    INSERT INTO reviews(review_body) SELECT REPLICATE('                                                          <a href="reviews.html"> loved it</a><a href="reviews9.html"> liked it</a><a href="reviews39.html"> hated it</a>',4000)

    SELECT * FROM reviews

    -- the table needs to have an ID column. my example has reviewID

    --example here is replacing href="reviews

    --with a full url like 'href="http://www.whatever.com/reviews

    DECLARE @reviewid INT, @ptr BINARY(16), @idx INT

    SET @reviewid = 0

    SELECT TOP 1 @reviewid = reviewid, @idx = PATINDEX('%href="reviews%',review_body)-1

    FROM reviews

    WHERE PATINDEX('%href="reviews%',review_body) > 0

    WHILE @reviewid > 0

    BEGIN

    SELECT @ptr = TEXTPTR(review_body)

    FROM reviews

    WHERE reviewid = @reviewid

    UPDATETEXT reviews.review_body @ptr @idx 13 'href="http://www.whatever.com/reviews'

    SET @reviewid = 0

    SELECT TOP 1 @reviewid = reviewid, @idx = PATINDEX('%href="reviews%',review_body)-1

    FROM reviews

    WHERE reviewid > @reviewid

    AND PATINDEX('%href="reviews%',review_body) > 0

    END

    SELECT * FROM reviews[/font]

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • I overlooked the while loop in your code...funny, thats the first thime something like that has ever happened to me...:w00t:

    I will give that a try and see what happens. After looking more closely at it though, I think it will work perfectfully.

    Thanks for everyone's quick help.

  • Can't the TEXT column be changed to a VARCHAR(MAX) column?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/6/2008)


    Can't the TEXT column be changed to a VARCHAR(MAX) column?

    Not without tons of mods the the forum software. We're only talking about a half-million or so lines of PHP code.

  • Correct me if I'm wrong (I never worked with the text datatype), but I think this is what Jeff means :

    SELECT id, CONVERT(NVARCHAR(MAX), ColName) AS ColName INTO #temp FROM dbo.Base WHERE ColName LIKE '%Whatever%'

    UPDATE #temp SETColName = REPLACE(ColName, 'Search', 'Replace')

    UPDATE B SET B.ColName = CONVERT(NTEXT, Tmp.ColName) FROM Base B inner join #temp ON B.id = tmp.id

    OR EVEN a double convert with in place update :

    UPDATE Base SET ColName = CONVERT(NTEXT, REPLACE(CONVERT(NVARCHAR(MAX), ColName), 'Search', 'Replace')) WHERE PatIndex(Whatever) > 0

  • I see what you are saying, but the code posted earlier worked so it is no longer an issue.

  • Dan Epps (12/7/2008)


    Jeff Moden (12/6/2008)


    Can't the TEXT column be changed to a VARCHAR(MAX) column?

    Not without tons of mods the the forum software. We're only talking about a half-million or so lines of PHP code.

    Heh... better get started... the TEXT datatype isn't going to be around much longer.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Dan Epps (12/7/2008)


    Jeff Moden (12/6/2008)


    Can't the TEXT column be changed to a VARCHAR(MAX) column?

    Not without tons of mods the the forum software. We're only talking about a half-million or so lines of PHP code.

    Heh. Well as long as you are already replacing multiple occurrences of a string, do yourself a huge favor, and start executing those search & replace's on your PHP code then. Because the longer that you keep your data in TEXT columns, the more it will cost you in terms of stuff like this.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • I hear you, but its not my code so I can't just go about changing it...gotta wait for the vendor to put out an update.

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply