July 27, 2004 at 10:06 am
I'm quite new to SQL and have found myself into a bit of a conundrum..
I have a table where the names look like this:
Mr E G Ford
Mr Trevor Peacock
Mr Jones
Mr B Racoon.
I'm trying to get them into title,firstname and surname fields. Best thing I could come up with is using parsename and replacing the " " with a . Of course that would cause havoc on the firstname/initials.
Could anyone help?
July 27, 2004 at 11:13 am
Try something with this: http://www.sqlservercentral.com/columnists/sjones/tamestrings2.asp
July 28, 2004 at 2:28 am
Won't this still give me the same problems in the initials/forename field. So I end up with something like:
MR
E
G
Ford
Mr
Trevor
Peacock.
July 28, 2004 at 3:18 am
You need to decide what your business rules dictate should happen. For instance, with Mr E G Ford, what do you want to end up with in firstname? 'E', and throw the 'G' away? 'E G' ? What about Mr Jones, what does he get in firstname, ''? NULL?
Either alone or in consultation with whoever will be consuming the data, put together a definitive set of rules about what happens in every possible input case. Then you can code.
The tools you will be using for the coding are CHARINDEX (to find one string within another), SUBSTRING (to get part of a string), LEFT and RIGHT (which aren't necessary since SUBSTRING can do what they do, but they make things clearer), LEN (to get the length of a string), REVERSE (to reverse a string).
For example, to get the first 'word' of your input string (for your title field, say), you could use the expression
LEFT(inputname, CHARINDEX(' ', inputname)-1)
To extract the second 'word' of the input string, you could say
SUBSTRING(inputname, CHARINDEX(' ', inputname)+1, CHARINDEX(' ', inputname, CHARINDEX(' ', inputname)+1) - CHARINDEX(' ', indexname) - 1)
Here we want the substring that starts at (location of first space, plus one) and has length (location of second space, less location of first space, less one). The location of the second space is the location of the first space AFTER the first space (hope that makes sense). For example, for
Mr E G Jones 123456789012
we have
CHARINDEX(' ', inputname) = 3
CHARINDEX(' ', inputname, 3+1) = 5
so we would extract
SUBSTRING(inputname, 3+1, 5-3-1) = SUBSTRING(inputname, 4, 1) = 'E'
To extract the *last* 'word' of inputname (for the surname) we have to be a little sneaky since T-SQL lacks a 'reverse search' function (like VB's InStrRev, say). We use the fact that the last space in inputname is the first space in REVERSE(inputname), so we end up with
RIGHT(inputname, CHARINDEX(' ', REVERSE(inputname))-1)
Notice how using RIGHT rather then SUBSTRING saves us having to work out the length of inputname.
We could use the same technique if we decided that 'firstname' was to be filled with everything between the first and last spaces of inputname:
SUBSTRING( inputname, CHARINDEX(' ', inputname), LEN(inputname)-CHARINDEX(' ', REVERSE(inputname)+1-CHARINDEX(' ', inputname))
This has been just a taster of the wonderful world of string manipulation; I have neglected to include any kind of error checking, which you would have to do with real world data, probably in a preprocessing query. But hopefully you will be able to pick up some ideas.
July 28, 2004 at 5:03 am
That works a treat. I've put a bit of error checking around the fullname and then what comes out the other end and it's all hunky dorey.
Thanks a lot.
July 28, 2004 at 1:56 pm
Just a comment on the subject of name parsing. The last word in the string is not necessarily the surname. For example, the surname "Saint Raymond" is 2 words. Also, the first word is not necessarily the entire first name. My friends "first name" is "Mary Dee". As someone who constantly gets sorted into the wrong column based on my 2 word last name, I am sensitive to this problem. This is an issue of data collection and business rules, not a programming issue, but I wanted you to be aware of it.
Terri
To speak algebraically, Mr. M. is execrable, but Mr. C. is
(x+1)-ecrable.
Edgar Allan Poe
[Discussing fellow writers Cornelius Mathews and William Ellery Channing.]
July 29, 2004 at 4:14 am
I completley agree with you, it is a data collection issue. I've always prefered to keep my names into separate fields for this very reason. Hence the fact I was doing the above.
August 2, 2004 at 11:38 am
try this...
CREATE procedure dbo.xxx
@vchName varchar (50),
@vchPrefix varchar(50) output,
@vchFirstName varchar(50) output,
@vchMiddleName varchar(50) output,
@vchLastName varchar(50) output,
@vchSuffix varchar(50) output
as
set nocount on
declare @tiStringLength tinyint,
@tiSpaceLocator tinyint,
@tiPointer tinyint,
@tiCount tinyint,
@tiRecCount tinyint,
@vchStrHold varchar(50),
@vchOne varchar(50),
@vchTwo varchar(50),
@vchThree varchar(50),
@vchFour varchar(50),
@vchFive varchar(50)
select @tiRecCount = 0
select @vchName = @vchName + ' '
select @vchName = ltrim(@vchName)
select @tiCount = 1
WHILE @tiCount < 6
BEGIN
select @tiStringLength = datalength(@vchName)
select @tiSpaceLocator = patindex("% %",@vchName)
IF @tiSpaceLocator > 0
BEGIN
select @tiRecCount = @tiRecCount + 1
select @vchStrHold = LEFT(@vchName, @tiSpaceLocator)
IF @tiCount = 1
select @vchOne = @vchStrHold
IF @tiCount = 2
select @vchTwo = @vchStrHold
IF @tiCount = 3
select @vchThree = @vchStrHold
IF @tiCount = 4
select @vchFour = @vchStrHold
IF @tiCount = 5
select @vchFive = @vchStrHold
END
ELSE
BEGIN
select @vchStrHold = null
IF @tiCount = 1
select @vchOne = @vchStrHold
IF @tiCount = 2
select @vchTwo = @vchStrHold
IF @tiCount = 3
select @vchThree = @vchStrHold
IF @tiCount = 4
select @vchFour = @vchStrHold
IF @tiCount = 5
select @vchFive = @vchStrHold
END
select @tiPointer = @tiStringLength - @tiSpaceLocator
select @vchName = right(@vchName,@tiPointer)
select @vchName = ltrim(@vchName)
if @vchStrHold = ''
break
select @tiCount = @tiCount + 1
END
IF @tiRecCount = 1 --put value in last name
select @vchLastName = @vchOne
IF @tiRecCount = 2
IF @vchOne like ('Mr%') or @vchOne like ('Dr%') or @vchOne like ('Ms%')
BEGIN
select @vchPrefix = @vchOne
select @vchLastName = @vchTwo
END
ELSE
BEGIN
select @vchFirstName = @vchOne
select @vchLastName = @vchTwo
END
IF @tiRecCount = 3
BEGIN
IF @vchOne like ('Mr%') or @vchOne like ('Dr%') or @vchOne like ('Ms%')
BEGIN
select @vchPrefix = @vchOne
select @vchFirstName = @vchTwo
select @vchLastName = @vchThree
END
ELSE
IF @vchThree not like ('Jr%') and @vchThree not like ('Sr%')
and @vchThree not like ('III') and @vchThree not like ('3')
and @vchThree not like ('IV') and @vchThree not like ('4')
and @vchThree not like ('V') and @vchThree not like ('5')
BEGIN
select @vchFirstName = @vchOne
select @vchMiddleName = @vchTwo
select @vchLastName = @vchThree
END
IF @vchThree like ('Jr%') or @vchThree like ('Sr%')
or @vchThree like ('III') or @vchThree like ('3')
or @vchThree like ('IV') or @vchThree like ('4')
or @vchThree like ('V') or @vchThree like ('5')
BEGIN
select @vchFirstName = @vchOne
select @vchLastName = @vchTwo
select @vchSuffix = @vchThree
END
END
IF @tiRecCount = 4
BEGIN
IF @vchOne like ('Mr%') or @vchOne like ('Dr%') or @vchOne like ('Ms%')
IF @vchFour not like ('Jr%') and @vchFour not like ('Sr%')
and @vchFour not like ('III') and @vchFour not like ('3')
and @vchFour not like ('IV') and @vchFour not like ('4')
and @vchFour not like ('V') and @vchFour not like ('5')
BEGIN
select @vchPrefix = @vchOne
select @vchFirstName = @vchTwo
select @vchMiddleName = @vchThree
select @vchLastName = @vchFour
END
ELSE
BEGIN
select @vchPrefix = @vchOne
select @vchFirstName = @vchTwo
select @vchLastName = @vchThree
select @vchSuffix = @vchFour
END
ELSE
BEGIN
select @vchFirstName = @vchOne
select @vchMiddleName = @vchTwo
select @vchLastName = @vchThree
select @vchSuffix = @vchFour
END
END
IF @tiRecCount = 5
BEGIN
select @vchPrefix = @vchOne
select @vchFirstName = @vchTwo
select @vchMiddleName = @vchThree
select @vchLastName = @vchFour
select @vchSuffix = @vchFive
END
RETURN
GO
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply