February 9, 2005 at 11:07 am
I have a table with a column called displayname.
Values like 'John' or 'John Doe' are fine; values like 'JohnDoe' need to be converted to 'John Doe'.
The rule is simple: if an uppercase letter appears in the middle of a string, insert a single space in front of it.
Is there a way to do this other than parsing each string letter by letter and looking for an ascii value in the uppercase range? Can this be done using pattern matching on a database with a case insensitive collation?
Gordon Pollokoff
"Wile E. is my reality, Bugs Bunny is my goal" - Chuck Jones
February 9, 2005 at 11:55 am
This is not the prettiest code I've ever written but it works...
CREATE TABLE [Numbers] (
[PkNumber] [int] IDENTITY (1, 1) NOT NULL ,
CONSTRAINT [Pk_Number] PRIMARY KEY CLUSTERED
(
[PkNumber]
) ON [PRIMARY]
) ON [PRIMARY]
GO
Declare @i as int
set @i = 0
while @i < 8000
begin
Insert into dbo.Numbers Default values
set @i = @i + 1
end
GO
Create table #test (DisplayName varchar(100))
Insert into #test (DisplayName)
Select 'John Doe'
union all
select 'JaneDoe'
union all
select 'John'
union all
select 'AFreakNameWithLotsOfNames'
Select * from #test
--Select DisplayName, PkNumber from #test cross join dbo.Numbers where PkNumber between 2 and LEN(DisplayName) AND ASCII(SUBSTRING(DisplayName, PkNumber, 1)) BETWEEN 65 and 90 AND SUBSTRING(DisplayName, PkNumber - 1, 1) ' '
while @@rowcount > 0
begin
Update T set T.DisplayName = LEFT(T.DisplayName, PkNumber - 1) + ' ' + RIGHT(T.DisplayName, LEN(T.DisplayName) - PkNumber + 1) FROM #test T inner join (Select DisplayName, PkNumber from #test cross join dbo.Numbers where PkNumber between 2 and LEN(DisplayName) AND ASCII(SUBSTRING(DisplayName, PkNumber, 1)) BETWEEN 65 and 90 AND SUBSTRING(DisplayName, PkNumber - 1, 1) ' ') dtReplaces on T.DisplayName = dtReplaces.DisplayName
end
Select * from #test
drop table #test
I had to put the Update in a while because I couldn't think of a way to update the name in one shot when more than 1 replace had to occur (besides making a UDF that would accept the word, scan it letter by letter and make the whole replace there... which would be kind of dumb since it's exactly what I'm trying to avoid with the Numbers table).
I think either way there's not gonna be any super fast solution to this problem. But hopefully it's only a one time deal.
February 9, 2005 at 12:19 pm
A quick google search turned up this solution:
select Ltrim (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (
Replace (displayName
,'A', ' A')
,'B', ' B')
,'C', ' C')
,'D', ' D')
,'E', ' E')
,'F', ' F')
,'G', ' G')
,'H', ' H')
,'I', ' I')
,'J', ' J')
,'K', ' K')
,'L', ' L')
,'M', ' M')
,'N', ' N')
,'O', ' O')
,'P', ' P')
,'Q', ' Q')
,'R', ' R')
,'S', ' S')
,'T', ' T')
,'U', ' U')
,'V', ' V')
,'W', ' W')
,'X', ' X')
,'Y', ' Y')
,'Z', ' Z')
)
from mytable
where displayname not like '% %'
This query replaces every uppercase letter with ' ' and the uppercase letter and excludes records that already have the space (data specific since I know that all records are either one or two names). The LTRIM removes the space added to the beginning.
Gordon Pollokoff
"Wile E. is my reality, Bugs Bunny is my goal" - Chuck Jones
February 9, 2005 at 12:51 pm
I've ran this on my test data and it returns incorrect results :
John Doe
JaneDoe
John
AFreakNameWithLotsOfNames
becomes
J A N E D O E
J O H N
A F R E A K N A M E W I T H L O T S O F N A M E S
Is it possible that the paste you made to post this message might have changed something in the output or is it just a collation issue?
February 9, 2005 at 1:34 pm
I don't think this will be possible with a case-insensitive collation, even if you do use ASCII char values.
Example - uppercase D is ASCII 68. Try this, to place a space before the uppercase D:
Select replace('JodyDoe', Char(68), ' ' + char(68))
On my case-insensitive server, this matches both the 'd' and 'D' despite specifying an explicit ASCII value.
February 9, 2005 at 1:45 pm
Exactly my point ... my solution doesn't have this limit.
February 9, 2005 at 3:13 pm
I agree that Gordon's Google-based solution have problems when the collation is case insensitive, but why not force the collation to be case-sensitive just for the statement?
Replace displayName with (displayName collate Latin1_General_CS_AI), and this should do the trick.
February 10, 2005 at 1:09 am
Yup and it would be faster than my version since only 1 update is ran.
February 10, 2005 at 1:57 am
and I did forget to paste the collation bit!
Gordon Pollokoff
"Wile E. is my reality, Bugs Bunny is my goal" - Chuck Jones
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply