May 14, 2008 at 6:21 am
OK, so I've nabbed this proper case function from somewhere a little while ago:
USE [Preqin]
GO
/****** Object: UserDefinedFunction [dbo].[f_ProperCaseENHANCED2] Script Date: 05/14/2008 12:59:45 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[f_ProperCaseENHANCED2] (@String VARCHAR(8000)) RETURNS VARCHAR(8000) AS BEGIN DECLARE @TempString VARCHAR(8000) DECLARE @ps VARCHAR(8000)
SET @ps = ''
-- lower case entire string
SET @TempString = lower(@String)
WHILE patindex('%[-( '']%',@TempString) > 0 BEGIN
-- Check to see if first character of @TempString is whitespace
IF (patindex('%[-( '']%',SUBSTRING(@TempString,1,1)) > 0)
BEGIN
SET @ps = @ps + SUBSTRING(@TempString,1,1)
END
ELSE -- @TempString starts with a Name
BEGIN
IF SUBSTRING(@TempString,1,2) = 'mc'
BEGIN
SET @TempString = SUBSTRING(@Tempstring,3,LEN(@TempString))
END
IF SUBSTRING(@TempString,1,3) = 'mac'
BEGIN
SET @TempString = SUBSTRING(@Tempstring,4,LEN(@TempString))
END
-- upper case first character and return string up to the next space
SET @ps = @ps + UPPER(SUBSTRING(@TempString,1,1)) +
SUBSTRING(@TempString,2,patindex('%[-( '']%',@TempString)-1)
END
-- truncation string that we have already processed
SET @TempString = SUBSTRING(@TempString,
patindex('%[-( '']%',@TempString)+1,LEN(@TempString))
-- Trim off spaces
SET @TempString = RTRIM(LTRIM(@TempString))
END
IF SUBSTRING(@TempString,1,2) = 'mc'
BEGIN
SET @TempString = SUBSTRING(@Tempstring,3,LEN(@TempString))
END
IF SUBSTRING(@TempString,1,3) = 'mac'
BEGIN
SET @TempString = SUBSTRING(@Tempstring,4,LEN(@TempString))
END
-- proper case last word/name
SET @ps = @ps + UPPER(SUBSTRING(@TempString,1,1)) +
SUBSTRING(@TempString,2,LEN(@TempString))
-- check for spaces in front of special characters
SET @ps = Replace(@PS,' -','-')
SET @ps = Replace(@PS,' ''','''')
RETURN (@PS)
END
Now, in the main this seems to work well, however, there are some modifications I would like to make to it, so I guess I need to actually understand what it's doing, which I kinda do, but not fully.
An 'initials' field in one of my tables contains a value 'a.b.' the propercase function converts it to 'A.b.' I would like it to become 'A.B.'
I understand the capitalization of the first letter of the word is achieved by this part:
UPPER(SUBSTRING(@TempString,1,1))
i.e. the substring in this case is 'a.b' and its taking 1 character and starting at the first character, the UPPER is then converting it to uppercase - so 'a.b.' becomes 'A.b'
However, I get a bit lost trying to work out how I tell it to capitalize the first letter after a full stop (period for the US guys :p )
I've been playing trying to sort some strings out for other problems, and used charindex, not sure if I could use it somehow like this:
charindex('.',@TempString,1)
Making the above something like:
UPPER(SUBSTRING(@TempString,charindex('.',@TempString,1),1))
But, A. I don't know if that will work, and B. I don't know how I could integrate it in to the above function.
Any help guys?
May 14, 2008 at 6:28 am
I think it will work if you add a dot to the separator list - i.e. change all instances of '%[-( '']%' to '%[-( ''.]%' 🙂
Ryan Randall
Solutions are easy. Understanding the problem, now, that's the hard part.
May 14, 2008 at 6:35 am
Thanks Ryan, that worked a treat!
One thing which I didn't think would matter is the position of the '.' - I put it in the same place in each of the strings I replaced, however, it didn't work, THEN I put it in the exact position you specified in your post and it DID work.
What is the significance of the characters in the separator list? - I understand it needs to be '% %' in format, but I thought that the contents between the % signs could be in any order as long as it was consistently used throughout the function?
(I've never used, or even came across PATINDEX until today, and this is based on this function and a little reading on books online)
May 14, 2008 at 6:45 am
Where did you put it?
Ryan Randall
Solutions are easy. Understanding the problem, now, that's the hard part.
May 14, 2008 at 8:00 am
The first time the original string...
'%[-( '']%'
Was replaced by
'%[-( ''].%'
And it did not work, it didn't capitalize anything!
So, I changed it to
'%[-( ''.]%'
As per your instructions and it worked
However, on further inspection, I changed it to this
'%[-.( '']%'
And it worked.
I thought it just had to be within the %% but by the looks if it it has to be within the [] too - I can only assume that was the reason it didn't work, and that the characters inside [] can be in any order, as long as its consistent in the function.
May 14, 2008 at 8:13 am
Yep - you're spot on! 🙂
The square brackets help you look for "any single character within the specified range ([a-f]) or set ([abcdef])"
http://msdn.microsoft.com/en-us/library/ms179859.aspx
Ryan Randall
Solutions are easy. Understanding the problem, now, that's the hard part.
May 15, 2008 at 5:06 am
The purpose of this propercase function is basically to tidy names in our CRM system, on the whole it works quite well, however, I've tracked down a couple of issues:
Original String Becomes..
DePonte --> Deponte
van Tuyll --> Van Tuyll
de La Ferriere --> De La Ferriere
DeMartino --> Demartino
de Vaulx --> De Vaulx
RosenZweig --> Rosenzweig
DiCioccio --> Dicioccio
So... I need to make sure that 'van %' is always 'van %' and never Van VaN vAN etc
the same with de - i.e. 'de %' must always be 'de %'
(Note the space between the de / van and the % sign!)
The other cases, I am not sure what I can really do - I mean the 'RosenZweig' one - how on earth can you plan for that or something similar! - the DeMartino and DiCioccio are easier, but still a risk in my opinion - someone with a name like 'Devonshire' you would not want corrected to DeVonshire!!
Unless anyone has any cunning plans I think those ones are best left alone - unless I could do something clever to pick up a pattern of upper,lower,upper,lower.... and tell it to keep the string like this!
So I guess firstly the van and de need to be dealt with
The mc is dealt with like this:
IF SUBSTRING(@TempString,1,2) = 'mc'
BEGIN
SET @TempString = SUBSTRING(@Tempstring,3,LEN(@TempString))
As the string is converted to lower case, I dont need to worry about cases... so I tried the following
IF SUBSTRING(@TempString,1,3) = 'de '
BEGIN
SET @TempString = SUBSTRING(@Tempstring,4,LEN(@TempString))
END
IF SUBSTRING(@TempString,1,4) = 'van '
BEGIN
SET @TempString = SUBSTRING(@Tempstring,5,LEN(@TempString))
END
And received this error:
Msg 536, Level 16, State 5, Line 2
Invalid length parameter passed to the SUBSTRING function.
Any ideas?!
May 15, 2008 at 8:42 am
I'd discovered that if I remove the space from the patindex string it works.
However, I'm not fully aware of the consequences of removing this space character from that string - can anyone explain what it would change?!
May 16, 2008 at 9:00 pm
As you've found, it's nearly impossible to specify rules for every name. It may be better to use the rules to spot names that don't seem to fit the rules and then list them for a human being to examine.
May 19, 2008 at 4:45 am
I think that may have to be the case, there are many inconsistencies with the naming conventions - van for example - sometimes its lower case, sometimes its capitalized, usually depending on the persons geographic origin! - We have quite a lot of data on lots of these people, but there are limits lol.
I think catching most people and avoiding names entered in all caps will be a good start to tidying the DB, however, as you've pointed out, it's never going to be perfect!
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply