REPLACE Multiple Spaces with One

  • Yes I noticed too how that final 2-space call gets invoked very,very often by your 2^N variant.

    I think the max for N=6 is 751678.

    But I haven't found a formula yet either.

  • smartin108 (3/26/2011)


    I say P-O-2 is more efficient because the number of maximum string size handled by each level of REPLACE is usually greater than in the 65, 33... series:

    N=1 (REPLACE x 2) 4

    N=2 (REPLACE x 3) 10

    N=3 (REPLACE x 4) 38

    N=4 (REPLACE x 5) 398

    N=5 (REPLACE x 6) 11806

    N=6 (REPLACE x 7) 754702

    It is indeed more efficient. But maximum is a bit of a misnomer, thisenumbers are not the length of the longest string of spaces the algorithm will reduce to a single space but instead the number 1 less than the smallest number for which it doesn't work. For example N=2 works for 12, 13, and 16 but fails for 11 (and for 14 and for 15). Of course the maximum isn't really interesting, which is a pity, because it's easy to get a formula for it - the maximum string for which N=k works is 2 to the power 2k. The interesting number is the number you called the maximum (the algorithm is guaranteed to work for blocks of spaces of this and all smaller lengths), for which I haven't (yet) found a clean formula. I'll contimue to try.

    Tom

  • Hi All,

    I really like the REPLACE method and have used it successfully as follows. The REPLACE functions can either be nested (harder to maintain) or executed serially (my preference) as shown. This is definitely a CBE (crude but effective) method. I like it's simplicity, versatility, scalability and maintainability.

    -- Remove special characters (<h-tab>, <lf>, <cr>, ") and compress multiple spaces to one space character.

    --09 = Horizontal Tab

    update Definitions set ObjectText = replace(ObjectText, CHAR(09), ' ')

    --10 = Line Feed

    update Definitions set ObjectText = replace(ObjectText, CHAR(10), ' ')

    --13 = Carriage Return

    update Definitions set ObjectText = replace(ObjectText, CHAR(13), ' ')

    --34 = " Double quote.

    update Definitions set ObjectText = replace(ObjectText, CHAR(34), ' ')

    -- Convert each occurrence of ' ' (two spaces) to ' ' (one space) 8 times.

    --This will compress space blocks of up to 256 (2^8) characters in length guaranteeing only one space between non-blank characters.

    update Definitions set ObjectText = replace(ObjectText, ' ', ' ')

    update Definitions set ObjectText = replace(ObjectText, ' ', ' ')

    update Definitions set ObjectText = replace(ObjectText, ' ', ' ')

    update Definitions set ObjectText = replace(ObjectText, ' ', ' ')

    update Definitions set ObjectText = replace(ObjectText, ' ', ' ')

    update Definitions set ObjectText = replace(ObjectText, ' ', ' ')

    update Definitions set ObjectText = replace(ObjectText, ' ', ' ')

    update Definitions set ObjectText = replace(ObjectText, ' ', ' ')

    Because SQL is so liberal regarding spaces between clauses and multi-part entity names, I also found myself using the following constructs after removing all special characters and converting every double-space to a single-space. Recompress spaces a couple of times after this because changing '] space' or 'space [' to space will leave two spaces in a row.

    --'] .' Right square bracket followed by a space and a period changed to '.'.

    update Definitions set ObjectText = replace(ObjectText, '] .', '.')

    --'. [' Left square bracket preceded by a period and a space changed to '.'.

    update Definitions set ObjectText = replace(ObjectText, '. [', '.')

    --' .' Space and a period changed to '.'.

    update Definitions set ObjectText = replace(ObjectText, ' .', '.')

    --'. ' Period and a space changed to '.'.

    update Definitions set ObjectText = replace(ObjectText, '. ', '.')

    --'].' Right square bracket followed by a period changed to '.'.

    update Definitions set ObjectText = replace(ObjectText, '].', '.')

    --'.[' Period followed by a left square bracket changed to '.'.

    update Definitions set ObjectText = replace(ObjectText, '.[', '.')

    --91 = [ Left square bracket changed to a space.

    update Definitions set ObjectText = replace(ObjectText, '[', ' ')

    --93 = ] Right square bracket changed to a space.

    update Definitions set ObjectText = replace(ObjectText, ']', ' ')

    Thanks, Bob.

  • DECLARE @myString varchar(500)='select * from gdgdfnkdngjrwrwrrwkg;select tuyututuut;'

    SELECT REPLACE(

    REPLACE(

    REPLACE( LTRIM(RTRIM(@myString)),' ',' '+CHAR(7))

    ,CHAR(7)+' ','')

    ,CHAR(7),'')

    SQL can do it with out the help of C#.. 😎 Happy Coding.. 🙂

  • I use Jeff's method all the time in SSIS, except I use a vertical tab as the "unlikely" character.

    REPLACE(REPLACE(REPLACE(LTRIM(«Column»)," "," " + "\v"),"\v" + " ",""),"\v","")

  • Hi Folks,

    How about adding double byte spaces to a Chinese character string?

    Say i have a string of length 20. I want it to be of length 30. So i add SPACE(10)

    This string is full of Chinese characters (big5). But the SPACE(10) is not. Now when i read the output through a chinese system/app, i get garbled characters/output as the space is not a double byte space.

    How would i go about solving this one? single byte space character to double byte space character.

    Thanks.

    BV

  • Can't this be done in a much more simple way than trying to pick a single character that could never appear? By choosing two characters that can't appear next to each other (and sure, use 'bell' and vertical tab, or whatever), the chance of breaking it is far less.

    SELECT REPLACE(REPLACE(REPLACE(OriginalString, ' ', '~|'), '|~', ''), '~|', ' ')

    FROM @demo;

    Rob Farley
    LobsterPot Solutions & Adelaide SQL Server User Group
    Company: http://www.lobsterpot.com.au
    Blog: http://blogs.lobsterpot.com.au

  • Actually Rob, you will see stuff like that in HL7 data all the time.

    I've used the vertical tab for billions of rows over the years; it has never failed.

  • Ok, cool. I've tended to use two strange characters (ones that can't be typed easily), to avoid the chance of hitting a match. But I agree that in normal text (which is where double-spaces might matter), a vertical space is probably never going to appear.

    Rob Farley
    LobsterPot Solutions & Adelaide SQL Server User Group
    Company: http://www.lobsterpot.com.au
    Blog: http://blogs.lobsterpot.com.au

  • Dave Pendleton (8/17/2012)


    I use Jeff's method all the time in SSIS, except I use a vertical tab as the "unlikely" character.

    REPLACE(REPLACE(REPLACE(LTRIM(«Column»)," "," " + "\v"),"\v" + " ",""),"\v","")

    Thank you for the support but the method is an order of magnitude slower than the method found in the discussion. Please see the beginning of the article for an update and which thread to see for an even faster method than mine.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Perhaps so, but I want to do this in an SSIS expression, not SQL.

    And which method, specifically? I didn't read the entire thread. Are you talking about the CLR option? My data sources aren't always SQL Server.

  • I'm worried by your comment "not so common collation Latin1_general_ci_ai". Whilst new to SQL some years ago I installed a third party system that recommended this collation. I have been using it as the default ever since (15 years or so). If this is the not so common, what is the common and should I change my default?

  • what rubbish, you could use Stuff() instead of using complicated methods.

    simply...

  • res.manish (8/18/2012)


    what rubbish, you could use Stuff() instead of using complicated methods.

    simply...

    What's absolute rubbish is when someone makes such a comment without any proof. Let's see your code and we'll find out.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • balaji.v (8/17/2012)


    Hi Folks,

    How about adding double byte spaces to a Chinese character string?

    Say i have a string of length 20. I want it to be of length 30. So i add SPACE(10)

    This string is full of Chinese characters (big5). But the SPACE(10) is not. Now when i read the output through a chinese system/app, i get garbled characters/output as the space is not a double byte space.

    How would i go about solving this one? single byte space character to double byte space character.

    Thanks.

    BV

    I haven't tried it but I suspect it's just a collation problem between systems. Try using COLLATE with the particular collation you're using on the Chinese system. Also make sure that the language settings are correct for Chinese characters when you do the concatenation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 361 through 375 (of 425 total)

You must be logged in to reply to this topic. Login to reply