December 29, 2004 at 7:53 am
I have a routine in Excel that will parse out a column with a person’s full name. I need to break the column up into three separate columns with T-SQL. Here are the scripts that I am using:
This section describes several worksheet functions you can use to split full names into the first and last name components.
To return the last name of the full name in A2, use the following formula.
=LEFT(A1,IF(ISERROR(FIND(" ",A1,1)),LEN(A1),FIND(" ",A1,1)-1))
To return the first name of the full name in A2, use the following formula.
=TRIM(IF(ISERROR(FIND(" ",A1,1)),A1,MID(A1,FIND(" ",A1,1)+1,
IF(ISERROR(FIND(" ",A1,FIND(" ",A1,1)+2)),LEN(A1),
FIND(" ",A1,FIND(" ",A1,1)+2))-FIND(" ",A1,1))))
To return the middle name of the full name in A2, use the following formula.
=TRIM(RIGHT(A1,LEN(A1)-IF(ISERROR(FIND(" ",A1,
FIND(" ",A1,FIND(" ",A1,1)+2))),LEN(A1),
FIND(" ",A1,FIND(" ",A1,FIND(" ",A1,1)+2))-1)))
Any suggestions would be appreciated.
December 29, 2004 at 8:16 am
Have you searched the script section here already. This is a FAQ, so very likely you find a generic solution there.
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
December 29, 2004 at 8:21 am
Yes. I have found a few UDF that will split the field but I need something like this:
SELECT SUBSTRING(Col001,1,CHARINDEX(' ',Col001)) AS LName,
SUBSTRING(Col001,CHARINDEX(' ',Col001)+1, CHARINDEX(' ',Col001,CHARINDEX(' ',Col001)+1)-CHARINDEX(' ',Col001)+20) AS FName,
REVERSE(SUBSTRING(REVERSE(Col001),1,CHARINDEX('',REVERSE(Col001)))) AS MName
FROM table
I have not gotten this to work properly. Do you think this can work?
December 29, 2004 at 8:22 am
Can you post some sample data?
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
December 29, 2004 at 8:22 am
There might be a script to help in the scripts section as well. Not to toot my horn, but I wrote some articles, all entitled Tame Those Strings (search on that) about string manipulation. Basically the IFs are replaced by CASE statements, the FIND with CHARINDEX, RTRIM or LTRIM replace TRIM.
December 29, 2004 at 8:36 am
Raw Data
Col001
AABY TAMARA J
AADNESEN MARJORIE
AARANT MAUREEN R
AARON MICHELLEE
AARON PATRICIA J
ABBA DAVID W
ABBATE ANTHONY P
SELECT SUBSTRING(Col001,1,CHARINDEX(' ',Col001)) AS LName,
SUBSTRING(Col001,CHARINDEX(' ',Col001)+1, CHARINDEX(' ',Col001,CHARINDEX(' ',Col001)+1)-CHARINDEX(' ',Col001)+20) AS FName,
REVERSE(SUBSTRING(REVERSE(Col001),1,CHARINDEX('',REVERSE(Col001)))) AS MName
FROM table
Gives me this:
LName FName MName
AABY TAMARA J
AADNESEN MARJORIE
AARANT MAUREEN R
AARON MICHELLEE
AARON PATRICIA J
ABBA DAVID W
ABBATE ANTHONY P
The AS MName is what is killing me. How can I get the MName field populated? I have looked at the Tame Those Strings articles but I am still lost.
December 29, 2004 at 8:45 am
Just a question. You are aware that there is a built-in function in Excel that can do this quite nicely? In English I suspect it to be called "Text in columns" and it can be found in the Data menu in the menu bar.
Here's the quickely recorded Excel macro
Sub Makro1()
'
' Makro1 Makro
' Makro am 29.12.2004 von Frank Kalis aufgezeichnet
'
'
Selection.TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=True, Tab:=True, _
Semicolon:=False, Comma:=False, Space:=True, Other:=False, FieldInfo _
:=Array(Array(1, 1), Array(2, 1), Array(3, 1))
End Sub
Or does it need to be in T-SQL?
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
December 29, 2004 at 8:52 am
Yes. I do alot of my data scrubbing in Excel but Excel is limited to 65,000 lines. I need to do a couple of million records.
December 29, 2004 at 8:56 am
Umpf, yes that's really a serious limitation in Excel. Will see what I can dig out this evening. However, it certainly pays to have a close look at Steve's article anyway.
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
December 29, 2004 at 9:00 am
Thanks! This seems to work also for the LName but I may be making this harder than it is:
SELECT SUBSTRING(Col001,1,LEN(SUBSTRING(Col001,1,CHARINDEX(' ',Col001)))) AS LName FROM table
December 29, 2004 at 1:20 pm
SELECT
LName =
CASE
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 3)
IS NULL THEN PARSENAME(REPLACE(Col001, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(Col001, ' ', '.'), 3)
END,
FName =
CASE
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 3)
IS NULL THEN PARSENAME(REPLACE(Col001, ' ', '.'), 1)
ELSE PARSENAME(REPLACE(Col001, ' ', '.'), 2)
END,
MName =
CASE
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 3) IS NULL THEN ''
ELSE PARSENAME(REPLACE(Col001, ' ', '.'), 1)
END
FROM table
This seems to work also but I still am not getting the correct results.
December 29, 2004 at 1:51 pm
Based on your sample data, it really seems to work with the exception of names like 'VAN HALEN EDDIE'. How many row are we talking about? Is this a one-time action?
Btw, have you seen one of the recent articles on the homepage here?
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
December 29, 2004 at 2:19 pm
SELECT
Column1 =
CASE
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 4) IS NULL AND PARSENAME(REPLACE(Col001, ' ', '.'), 3) IS NULL THEN PARSENAME(REPLACE(Col001, ' ', '.'), 2)
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 4) IS NULL THEN PARSENAME(REPLACE(Col001, ' ', '.'), 3)
ELSE PARSENAME(REPLACE(Col001, ' ', '.'), 4)
END,
Column2 =
CASE
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 4) IS NULL AND PARSENAME(REPLACE(Col001, ' ', '.'), 3) IS NULL THEN PARSENAME(REPLACE(Col001, ' ', '.'), 1)
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 4) IS NULL THEN PARSENAME(REPLACE(Col001, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(Col001, ' ', '.'), 3)
END,
Column3 =
CASE
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 4) IS NULL AND PARSENAME(REPLACE(Col001, ' ', '.'), 3) IS NULL THEN ''
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 4) IS NULL THEN PARSENAME(REPLACE(Col001, ' ', '.'), 1)
ELSE PARSENAME(REPLACE(Col001, ' ', '.'), 2)
END,
Column4 =
CASE
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 4) IS NULL AND PARSENAME(REPLACE(Col001, ' ', '.'), 3) IS NULL THEN ''
WHEN PARSENAME(REPLACE(Col001, ' ', '.'), 4) IS NULL THEN ''
ELSE PARSENAME(REPLACE(Col001, ' ', '.'), 1)
END
FROM table
Gives me this:
A S CONSTRUCTION INC
A S CONSTRUCTION INC
A S CONSTRUCTION INC
AABY TAMARA J
AADNESEN MARJORIE
AARANT MAUREEN R
AARON MICHELLEE
AARON PATRICIA J
ABBA DAVID W
ABBATE ANTHONY P
ABBOTT CHARLES C
ABBOTT FRED
ABBOTT JAMES P
ABBOTT JULIA L
ABBOTT KEITH E
ABBOTT MIKE E
NULL NULL
As you can see, the elements are broken up into their own column but I can only assign 4. The last record will show up NULL NULL. This is going to be used constantly. I have millions of records I scrub threw all the time.
December 29, 2004 at 2:20 pm
Sorry. I guess the code tags did not work. By the way, the last record has 5 elements.
December 30, 2004 at 12:38 am
Yuk, while PARSENAME is really handy, you can split only up to 4 parts. As your possible outcomes ranges from 2 to apparently 5 parts, I suspect every T-SQL solution to be some kind of ugly. I think you can do this with DTS, but for that kind of question I'm the wrong one, since I don't use DTS. What about doing the scrubbing with some scripting language outside SQL Server and then import? Or use Access with its text import functionality (which is very similar to Excel) as an intermediate stage? Getting "better" raw data isn't possible, right?
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply