January 13, 2010 at 10:50 am
I have name data to import into a table. This data is in the following
format:
DOE/JOHN F
JONES/MARY ALICE
SMITH/HAROLD
I need to parse this data into lastname, firstname, and middle
initial. Is there any TSQL I can quickly employ?
January 13, 2010 at 11:02 am
You can split on the slash pretty easily, but the rest is trickier. For example, will you have any names that don't have a middle name? How about ones with multiple "middle" names ("Billy Bob Sam Jones")? How about splitting names with titles in them, like Dr or Rev? Or can you be certain all the names will be in the same format?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 13, 2010 at 11:14 am
it takes a bit of manipulating, but you need to use substrings and charindex to find the slash, and grab the left and right parts of it; you have to do the same logic a second time on the "right" side of the string, and look for the space.
note my example where i added 'RUCKUS/BILLY JOE JIM BOB' what is his first name vs his middle name? you need to look at your data and see if there are any names with more than two spaces.
here's an example:
results:
LNAME FirstAndMiddle MNAME LNAME
------------------------ ------------------------ ------------------------ ------------------------
DOE JOHN F F JOHN
JONES MARY ALICE ALICE MARY
SMITH HAROLD HAROLD
RUCKUS BILLY JOE JIM BOB JOE JIM BOB BILLY
and here's the code:
SELECT 'DOE/JOHN F' AS FULLNAME INTO #TMP UNION ALL
SELECT 'JONES/MARY ALICE' UNION ALL
SELECT 'SMITH/HAROLD' UNION ALL
SELECT 'RUCKUS/BILLY JOE JIM BOB'
SELECT
--assumes the slash is ALWAYS in the data:
SUBSTRING(FULLNAME,1,CHARINDEX('/',FULLNAME) -1) AS LNAME, --minus one to remove the slash
SUBSTRING(FULLNAME,CHARINDEX('/',FULLNAME)+ 1,30) AS FirstAndMiddle, --add one to remove the slash
CASE
--if there is a space, ASSUME the second word is a middle name, and not a two part first name
WHEN CHARINDEX(' ',SUBSTRING(FULLNAME,CHARINDEX('/',FULLNAME)+ 1,30)) > 1
THEN SUBSTRING(SUBSTRING(FULLNAME,CHARINDEX('/',FULLNAME)+ 1,30),CHARINDEX(' ',SUBSTRING(FULLNAME,CHARINDEX('/',FULLNAME)+ 1,30)),30)
--else no middle name
ELSE ''
END AS MNAME,
CASE
--if there is a space, ASSUME the second word is a first name,
WHEN CHARINDEX(' ',SUBSTRING(FULLNAME,CHARINDEX('/',FULLNAME)+ 1,30)) > 1
THEN SUBSTRING(FULLNAME,CHARINDEX('/',FULLNAME)+ 1,(CHARINDEX(' ',FULLNAME)-1) - CHARINDEX('/',FULLNAME)+ 1)
--else whole thing is first name
ELSE SUBSTRING(FULLNAME,CHARINDEX('/',FULLNAME)+ 1,30)
END AS LNAME
from #TMP
Lowell
January 13, 2010 at 11:53 am
Thank you for your quick and thorough replies. When I ran
the 2nd post SQL, I received the following message
Invalid length parameter passed to the SUBSTRING function.
January 13, 2010 at 12:37 pm
that would occur if any space existed in the last name, before the slash.
a name like Mac Bride/Bob Lee would make that happen... you'll need to tweak it based on your data.
from your example, i assumed the only space was between first and middle names.
Lowell
January 13, 2010 at 12:40 pm
Thank you. I'll try and tweek the SQL you prepared based on that assumption.
January 13, 2010 at 12:57 pm
Lowell (1/13/2010)
that would occur if any space existed in the last name, before the slash.a name like Mac Bride/Bob Lee would make that happen... you'll need to tweak it based on your data.
from your example, i assumed the only space was between first and middle names.
The lack of a slash would cause that error as well...
Seems like it's going to be quite a bit of work to get all the special patterns covered...
But that's the price one has to pay when dealing with (over-)denormalized data, I guess...
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply