Trying to separate a name field into two columns

  • I have a table with AGENT_NAME that contains last and first name separated by a space. I need to separate the last and first name out into two fields.

    Currently it looks like:

    AGENT_NAME

    ==============

    smith james

    I need the it to look like this:

    AGENT_NAME LASTNAME FIRSTNAME

    ====================================================

    smith james smith james

  • Substring and Charindex can do that pretty well. But first, what do you want to do if the name is "James W Smith", or "Morgan Mac Donald"? Do you have any records that would have something like "Dr J W Smith"?

    If it's just first-space-last, you can use this:

    declare @Name varchar(100);

    select @Name = 'James Smith';

    select right(@Name, len(@Name)-charindex(' ', @Name, 1)), left(@Name, charindex(' ', @Name, 1));

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Ok, that works for one name. I have a table with over 56,000 records that need the name separated. How do I do this with the entire table?

  • Yes, I do have some fields with the middle intial. How do I handle that?

  • If you have a large number of names to process I think reading the SQL Server Central Article

    The "Numbers" or "Tally" Table: What it is and how it replaces a loop. http://www.sqlservercentral.com/articles/TSQL/62867/

    This article walks you through splitting a comma separated list. It will get you pretty far down the road simply by substituting the space character for the comma. Then I think you will have to count how many spaces you have in each full name to decide if the second "name" is a middle initial or a last name. If there are two spaces in the name "Todd P Payne" then the second word "P" is the middle initial. But All this can get really messy if you have names like King Henry VIII or Dr. Jim Mac Donald Jr. Deciding if There are surnames or multiple middle names or last Names that have spaces in them could get real tricky.

    Todd

  • RubyRed (4/9/2009)


    I have a table with AGENT_NAME that contains last and first name separated by a space. I need to separate the last and first name out into two fields.

    Currently it looks like:

    AGENT_NAME

    ==============

    smith james

    I need the it to look like this:

    AGENT_NAME LASTNAME FIRSTNAME

    ====================================================

    smith james smith james

    It's a logically impossible task.

    You may build something what will work with James Smith correctly, but it will always fail for such names as :

    Ludwig van Beethoven

    Wolfgang Amadeus Mozart

    or, must be you favorite, from Mexico:

    El. Guillermo del Toro,

    San Juan del Norte

    _____________
    Code for TallyGenerator

  • Of course Sergiy is right, The idea of building a name parser that is 100% accurate when there are no rules about the names to parse is absolute lunacy. But, because people live in the real world, people are often expected to do the impossible, or at least come close.

    I found one article at

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=56499

    The focus of this article is to parse and format the name. I guess it is trying to bring some order to the chaos. I like this article because the author of the code challenges the readers to submit names that will break the code.

    Even Microsoft has Vb code that could be used in a CLR function.

    http://support.microsoft.com/kb/168799

    This function is much simpler and is not nearly as refined. But a CLR approach may, as logic and string manipulation get more complicated, be the way to increase the performance. Or perhaps even a bulk export, parse, and import.

    But in the end Sergiy is right, so don't promise perfection. In the end you have to go with a "That's good enough approach." I don't know about the rest of you but I hate it when the real world messes up my models!!

  • Here's one way to go:

    Select parsename(Replace(NameCol, ' ', '.'), 0) as LastName

    , parsename(Replace(NameCol, ' ', '.'), 1) as FirstName

    From YourTable

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (4/10/2009)


    Here's one way to go:

    Select parsename(Replace(NameCol, ' ', '.'), 0) as LastName

    , parsename(Replace(NameCol, ' ', '.'), 1) as FirstName

    From YourTable

    Simple solution :).

    But the question is is this requirement from Front-end application??

  • Barry:

    Where did you get using 0 in parsename? I'd never seen that before, and it's not in BOL for 2k5, so I tried this:

    select parsename('gus.gwynne',0)

    Result is null. Put in a 1, and I get the correct last name, but even that's no good if the name were "Sven Svenson XXIII" (one of my favorite ever fake names). You'd get "XXIII" as the last name. Same goes for Jr, Sr, PhD, etc., tacked onto the end of the name.

    Ruby:

    On the one I posted, you just switch out the variable for the column name, in the select statement. As mentioned, it won't work except in very simple cases.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Todd:

    I just looked at the name parser at SQLTeam that you linked to. It's a useless toy. You have to enter the format codes for each name. By the time you've done that, you might as well hand-parse the whole table yourself. It doesn't save any work, it's just elaborate code for the sake of elaborate code.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Ruby:

    What I'm going to suggest is use a simple split, then select all rows where you end up with a space in the first or last name, after it's been split. That should narrow down the number you'll have to look at quite a bit, as most will probably be two-piece-names.

    Once you have it in that format, look at the variations you have. See if they follow patterns you can break up in a set-based fashion. Keep in mind that "Mac" is tricky, because it can be "MacDonald", or it can be "Macy", and those might follow different rules (something I found in name-case code).

    But if you do a simple split first, it should reduce the amount of work to do significantly.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (4/13/2009)


    Barry:

    Where did you get using 0 in parsename? I'd never seen that before, and it's not in BOL for 2k5, so I tried this:

    select parsename('gus.gwynne',0)

    From flawed reasoning and a failure to double-check.

    See, I wrote up my initial "solution"and then I thought "wait, I can't remember if it is 1's based or 0's based. Well I have to test it anyway, so I'll just test the one's" like so:Select parsename(Replace('smith james', ' ', '.'), 1) as LastName and it returns "james" so I think "OK, it's zero-based" and finish the query and then post it as you saw it finally thinking "I just checked it, so no need to check it again!"

    Heh. Of course, it is actually one-based, but from the right, not the left ... 🙁 So thanks for the catch, Gus.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Ah. Makes sense. Thanks.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Sorry you feel the coded is useless. I was thinking of using format Name function from SqlTeams posting something like the code below, . or perhaps altering the logic in the function to a stored procedure with output parameters for each of the name parts.

    something like

    [Code]

    --Create sample table

    CREATE TABLE [dbo].[FullNames]

    (

    [FullName] [varchar](255) NOT NULL,

    CONSTRAINT [PK_Names] PRIMARY KEY CLUSTERED ([FullName] ASC) ON [PRIMARY]

    );

    GO

    --add sample Data

    INSERT INTO dbo.FullNames(FullName) Values ('Todd Payne');

    INSERT INTO dbo.FullNames(FullName) Values ('Todd P. Payne');

    INSERT INTO dbo.FullNames(FullName) Values ('Todd Payne Jr.');

    INSERT INTO dbo.FullNames(FullName) Values ('Payne, Todd Jr.');

    INSERT INTO dbo.FullNames(FullName) Values ('Dr. Todd Payne Sr.');

    INSERT INTO dbo.FullNames(FullName) Values ('Todd P. Mac Donald');

    GO

    --Parse out First Last and middle initial using SQLTeams Function

    SELECT

    FullName,

    dbo.FormatName([FullName],'F') as FirstName,

    dbo.FormatName([FullName],'L') as LastName,

    dbo.FormatName([FullName],'m') as MiddleInitial

    FROM dbo.FullNames;

    Go

    --Results

    FullName FirstName LastName MiddleInitial

    -------------------- ---------- --------------- -------------

    Dr. Todd Payne Sr. Todd Payne

    Payne, Todd Jr. Todd Payne

    Todd P. Mac Donald Todd Mac Donald P

    Todd P. Payne Todd Payne P

    Todd Payne Todd Payne

    Todd Payne Jr. Todd Payne

    [/code]

    This example may not be the the most efficient method but I would find it very useful for a one time conversion on large data set. I used to have to perform data-conversions for Utility billing systems and I would have really liked to have some code like this when all I had was a single name field. It is not perfect but at least a step in the right direction. I guess I am just a hack but I would hate having to write the procedure from scratch.

    Todd

    Todd

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply