REPLACE Multiple Spaces with One

Question

Post reply

REPLACE Multiple Spaces with One

Viewing 15 posts - 361 through 375 (of 426 total)

You must be logged in to reply to this topic. Login to reply

Michael Meierruth SSChampion Points: 10051 More actions · Answer 1

Yes I noticed too how that final 2-space call gets invoked very,very often by your 2^N variant.

I think the max for N=6 is 751678.

But I haven't found a formula yet either.

TomThomson SSC Guru Points: 104768 More actions · Answer 2

smartin108 (3/26/2011)
I say P-O-2 is more efficient because the number of maximum string size handled by each level of REPLACE is usually greater than in the 65, 33... series:
N=1 (REPLACE x 2) 4
N=2 (REPLACE x 3) 10
N=3 (REPLACE x 4) 38
N=4 (REPLACE x 5) 398
N=5 (REPLACE x 6) 11806
N=6 (REPLACE x 7) 754702

It is indeed more efficient. But maximum is a bit of a misnomer, thisenumbers are not the length of the longest string of spaces the algorithm will reduce to a single space but instead the number 1 less than the smallest number for which it doesn't work. For example N=2 works for 12, 13, and 16 but fails for 11 (and for 14 and for 15). Of course the maximum isn't really interesting, which is a pity, because it's easy to get a formula for it - the maximum string for which N=k works is 2 to the power 2k. The interesting number is the number you called the maximum (the algorithm is guaranteed to work for blocks of spaces of this and all smaller lengths), for which I haven't (yet) found a clean formula. I'll contimue to try.

Tom

Bob-599816 Newbie Points: 9 More actions · Answer 3

Hi All,

I really like the REPLACE method and have used it successfully as follows. The REPLACE functions can either be nested (harder to maintain) or executed serially (my preference) as shown. This is definitely a CBE (crude but effective) method. I like it's simplicity, versatility, scalability and maintainability.

-- Remove special characters (<h-tab>, <lf>, <cr>, ") and compress multiple spaces to one space character.

--09 = Horizontal Tab

update Definitions set ObjectText = replace(ObjectText, CHAR(09), ' ')

--10 = Line Feed

update Definitions set ObjectText = replace(ObjectText, CHAR(10), ' ')

--13 = Carriage Return

update Definitions set ObjectText = replace(ObjectText, CHAR(13), ' ')

--34 = " Double quote.

update Definitions set ObjectText = replace(ObjectText, CHAR(34), ' ')

-- Convert each occurrence of ' ' (two spaces) to ' ' (one space) 8 times.

--This will compress space blocks of up to 256 (2^8) characters in length guaranteeing only one space between non-blank characters.

update Definitions set ObjectText = replace(ObjectText, ' ', ' ')

Because SQL is so liberal regarding spaces between clauses and multi-part entity names, I also found myself using the following constructs after removing all special characters and converting every double-space to a single-space. Recompress spaces a couple of times after this because changing '] space' or 'space [' to space will leave two spaces in a row.

--'] .' Right square bracket followed by a space and a period changed to '.'.

update Definitions set ObjectText = replace(ObjectText, '] .', '.')

--'. [' Left square bracket preceded by a period and a space changed to '.'.

update Definitions set ObjectText = replace(ObjectText, '. [', '.')

--' .' Space and a period changed to '.'.

update Definitions set ObjectText = replace(ObjectText, ' .', '.')

--'. ' Period and a space changed to '.'.

update Definitions set ObjectText = replace(ObjectText, '. ', '.')

--'].' Right square bracket followed by a period changed to '.'.

update Definitions set ObjectText = replace(ObjectText, '].', '.')

--'.[' Period followed by a left square bracket changed to '.'.

update Definitions set ObjectText = replace(ObjectText, '.[', '.')

--91 = [ Left square bracket changed to a space.

update Definitions set ObjectText = replace(ObjectText, '[', ' ')

--93 = ] Right square bracket changed to a space.

update Definitions set ObjectText = replace(ObjectText, ']', ' ')

Thanks, Bob.

mailatunni Valued Member Points: 73 More actions · Answer 4

DECLARE @myString varchar(500)='select * from gdgdfnkdngjrwrwrrwkg;select tuyututuut;'

SELECT REPLACE(

REPLACE(

REPLACE( LTRIM(RTRIM(@myString)),' ',' '+CHAR(7))

,CHAR(7)+' ','')

,CHAR(7),'')

SQL can do it with out the help of C#.. 😎 Happy Coding.. 🙂

David Pendleton SSC-Addicted Points: 410 More actions · Answer 5

I use Jeff's method all the time in SSIS, except I use a vertical tab as the "unlikely" character.

REPLACE(REPLACE(REPLACE(LTRIM(«Column»)," "," " + "\v"),"\v" + " ",""),"\v","")

balaji.v SSC Rookie Points: 26 More actions · Answer 6

Hi Folks,

How about adding double byte spaces to a Chinese character string?

Say i have a string of length 20. I want it to be of length 30. So i add SPACE(10)

This string is full of Chinese characters (big5). But the SPACE(10) is not. Now when i read the output through a chinese system/app, i get garbled characters/output as the space is not a double byte space.

How would i go about solving this one? single byte space character to double byte space character.

Thanks.

BV

Rob Farley SSCrazy Points: 2739 More actions · Answer 7

Can't this be done in a much more simple way than trying to pick a single character that could never appear? By choosing two characters that can't appear next to each other (and sure, use 'bell' and vertical tab, or whatever), the chance of breaking it is far less.

SELECT REPLACE(REPLACE(REPLACE(OriginalString, ' ', '~|'), '|~', ''), '~|', ' ')

FROM @demo;

Rob Farley
LobsterPot Solutions & Adelaide SQL Server User Group
Company: http://www.lobsterpot.com.au
Blog: http://blogs.lobsterpot.com.au

David Pendleton SSC-Addicted Points: 410 More actions · Answer 8

Actually Rob, you will see stuff like that in HL7 data all the time.

I've used the vertical tab for billions of rows over the years; it has never failed.

Rob Farley SSCrazy Points: 2739 More actions · Answer 9

Ok, cool. I've tended to use two strange characters (ones that can't be typed easily), to avoid the chance of hitting a match. But I agree that in normal text (which is where double-spaces might matter), a vertical space is probably never going to appear.

Rob Farley
LobsterPot Solutions & Adelaide SQL Server User Group
Company: http://www.lobsterpot.com.au
Blog: http://blogs.lobsterpot.com.au

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 10

Dave Pendleton (8/17/2012)
I use Jeff's method all the time in SSIS, except I use a vertical tab as the "unlikely" character.
REPLACE(REPLACE(REPLACE(LTRIM(«Column»)," "," " + "\v"),"\v" + " ",""),"\v","")

Thank you for the support but the method is an order of magnitude slower than the method found in the discussion. Please see the beginning of the article for an update and which thread to see for an even faster method than mine.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

David Pendleton SSC-Addicted Points: 410 More actions · Answer 11

Perhaps so, but I want to do this in an SSIS expression, not SQL.

And which method, specifically? I didn't read the entire thread. Are you talking about the CLR option? My data sources aren't always SQL Server.

TheSpyder SSC Enthusiast Points: 163 More actions · Answer 12

I'm worried by your comment "not so common collation Latin1_general_ci_ai". Whilst new to SQL some years ago I installed a third party system that recommended this collation. I have been using it as the default ever since (15 years or so). If this is the not so common, what is the common and should I change my default?

res.manish Newbie Points: 7 More actions · Answer 13

what rubbish, you could use Stuff() instead of using complicated methods.

simply...

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 14

res.manish (8/18/2012)
what rubbish, you could use Stuff() instead of using complicated methods.
simply...

What's absolute rubbish is when someone makes such a comment without any proof. Let's see your code and we'll find out.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 15

balaji.v (8/17/2012)
Hi Folks,
How about adding double byte spaces to a Chinese character string?
Say i have a string of length 20. I want it to be of length 30. So i add SPACE(10)
This string is full of Chinese characters (big5). But the SPACE(10) is not. Now when i read the output through a chinese system/app, i get garbled characters/output as the space is not a double byte space.
How would i go about solving this one? single byte space character to double byte space character.
Thanks.
BV

I haven't tried it but I suspect it's just a collation problem between systems. Try using COLLATE with the particular collation you're using on the Chinese system. Also make sure that the language settings are correct for Chinese characters when you do the concatenation.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)