Get Last Name, First name and Middle Name from Full Name

  • We have a need to break out the first, middle and last names from a full name string. I've got the last and middle name/initial close enough for what we need but can't get the right code to extract the first name. Any help is greatly appreciated!

    /* Query to parse the full_name field and extract

    Last Name, First Name and MI

    nheyen 23-Nov-2009

    */

    CREATE TABLE #names (full_name varchar(254))

    INSERT INTO #names VALUES ('Simpson, Homer J')

    INSERT INTO #names VALUES ('Simpson, Bartholomew JoJo')

    INSERT INTO #names VALUES ('Simpson,Maggie')

    INSERT INTO #names VALUES ('Bouvier Simpson, Margorie "Marge"')

    INSERT INTO #names VALUES ('Simpson, Lisa M.')

    INSERT INTO #names VALUES ('Burns,Charles Montgomery "Monty"')

    INSERT INTO #names VALUES ('Smithers, Waylon Jr.')

    SELECT full_name

    ,SUBSTRING(full_name,1,(CHARINDEX(',',full_name))-1) AS [Last Name]

    ,CASE

    /* if no period in MI */

    WHEN RIGHT(full_name,2) LIKE ' [a-Z]'

    THEN LTRIM(SUBSTRING(SUBSTRING(full_name,CHARINDEX(',',full_name)+1,LEN(full_name)),1,CHARINDEX(' ',full_name)-2))

    /* if period after MI */

    WHEN RIGHT(full_name,1) LIKE '.'

    THEN LTRIM(SUBSTRING(SUBSTRING(full_name,CHARINDEX(',',full_name)+1,LEN(full_name)),1,CHARINDEX(' ',full_name)-3))

    /* if middle name used instead of MI */

    WHEN RIGHT(full_name,2) LIKE '[a-Z][a-Z]'

    THEN LTRIM(SUBSTRING(SUBSTRING(full_name,CHARINDEX(',',full_name)+1,LEN(full_name)),1,LEN(full_name)))

    /* no middle name or MI */

    ELSE LTRIM(SUBSTRING(SUBSTRING(full_name,CHARINDEX(',',full_name)+1,LEN(full_name)),1,LEN(full_name)))

    END AS [First Name]

    ,CASE

    WHEN RIGHT(full_name,2) LIKE ' [a-Z]' THEN RIGHT(full_name,1) /* no period after MI */

    WHEN RIGHT(full_name,3) IN ('Jr.','Sr.') THEN NULL

    WHEN RIGHT(full_name,2) IN ('Jr','Sr') THEN NULL

    WHEN RIGHT(full_name,1) = '.' THEN SUBSTRING(full_name,LEN(full_name)-1,1) /* has period after MI */

    ELSE NULL /* cannot determine */

    END AS [MI]

    FROM #names

    DROP TABLE #names

  • I should add this is part of a much larger extract to provide data to a third party, so I don't know if I can get by using variables. The speed isn't critical, it is only run once a month at most.

    Thanks!

    Norman

  • Can you please provide the results you expect based on that sample data?

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Sorry, should have done that - here is what I would like to get

    Simpson, Homer JSimpsonHomer J

    Simpson, Bartholomew JoJoSimpsonBartholomewNULL

    Simpson,MaggieSimpsonMaggieNULL

    Bouvier Simpson, Margorie "Marge"Bouvier SimpsonMargorieNULL

    Simpson, Lisa M.SimpsonLisa M

    Burns,Charles Montgomery "Monty"BurnsCharlesNULL

    Smithers, Waylon Jr.SmithersWaylonNULL

    Basically if there is no obvious middle initial given, return a NULL instead of trying to figure it out. What I'm trying to get is the first substring after the comma.

    Simpson Homer J

    simpson Bartholomew

    Simpson Maggie

    Bouvier Simpson Margorie

    Simpson Lisa M

    Burns Charles

    Smithers Waylon

  • To answer your question, you can get the first substring after the comma by doing something like this:

    CASE WHEN CHARINDEX(' ',full_name,CHARINDEX(',',full_name)+1) > 0

    THEN LTRIM(SUBSTRING(full_name,CHARINDEX(',',full_name)+1,CHARINDEX(' ',full_name,CHARINDEX(',',full_name)+1)))

    ELSE RIGHT(full_name,DATALENGTH(full_name)-CHARINDEX(',',full_name))

    END

    This uses the third (optional) parameter of CHARINDEX to set a starting point, and uses the , as the starting point (adds one to it so that you don't find the immediate space) and finds the first space after it. If there is no space, you want to use the rest of the name, so you have the case statement.

    Because I hadn't done any heavy string manipulation in a while, I took it a few steps further. This works for the sample data... but like most string manipulation, it will fail under certain circumstances. This is just off the top of my head with no real effort to cover situations outside of those presented by the sample data, so it probably won't really work for you, but I had fun writing it, and you're welcome to play with it if you want.

    I added the calculated columns for readability (Yes, believe it or not, that IS the easy to read version, without those calculated fields or CTE's (2000 forum), it'd be much, much worse.

    CREATE TABLE #names (full_name varchar(254))

    INSERT INTO #names VALUES ('Simpson, Homer J')

    INSERT INTO #names VALUES ('Simpson, Bartholomew JoJo')

    INSERT INTO #names VALUES ('Simpson,Maggie')

    INSERT INTO #names VALUES ('Bouvier Simpson, Margorie "Marge"')

    INSERT INTO #names VALUES ('Simpson, Lisa M.')

    INSERT INTO #names VALUES ('Burns,Charles Montgomery "Monty"')

    INSERT INTO #names VALUES ('Smithers, Waylon Jr.')

    ALTER TABLE #Names

    ADD TL AS DATALENGTH(REPLACE(full_name,', ',',')),-- Length of the original Name

    LN AS CHARINDEX(',',full_name)-1,-- Length of the Last Name

    TrimLN AS

    DATALENGTH(

    ISNULL(

    REPLACE(

    STUFF(full_name,

    CHARINDEX('"',full_name), -- First "

    CHARINDEX('"',full_name,CHARINDEX('"',full_name)) -- Second "

    ,'')

    ,', ',',')

    ,REPLACE(full_name,', ',','))

    ),

    TrimName AS

    ISNULL(

    REPLACE(

    STUFF(full_name,

    CHARINDEX('"',full_name), -- First "

    CHARINDEX('"',full_name,CHARINDEX('"',full_name)) -- Second "

    ,'')

    ,', ',',')

    ,REPLACE(full_name,', ',','))

    SELECTfull_name,

    LEFT(full_name,LN) LastName,

    CASEWHEN CHARINDEX(' ',RIGHT(TrimName, (TrimLN-LN)-1),1) > 0

    THEN LEFT(RIGHT(TrimName, (TrimLN-LN)-1),CHARINDEX(' ',RIGHT(TrimName, (TrimLN-LN)-1),1))

    ELSE RIGHT(TrimName, (TrimLN-LN)-1)

    END FirstName,

    CASE WHEN CHARINDEX(' ',LTRIM(RTRIM(RIGHT(TrimName, TrimLN-LN-1)))) = 0 THEN NULL

    WHEN REPLACE(RIGHT(TrimName, TrimLN - CHARINDEX(' ',TrimName)),'.','') IN ('Jr','Sr') THEN NULL

    ELSE RIGHT(TrimName, TrimLN - CHARINDEX(' ',TrimName))

    END MiddleName,

    CASE WHEN CHARINDEX(' ',LTRIM(RTRIM(RIGHT(TrimName, TrimLN-LN-1)))) = 0 THEN NULL

    WHEN REPLACE(RIGHT(TrimName, TrimLN - CHARINDEX(' ',TrimName)),'.','') IN ('Jr','Sr') THEN NULL

    ELSE LEFT(RIGHT(TrimName, TrimLN - CHARINDEX(' ',TrimName)),1)

    END MiddleInitial,

    CASE WHEN CHARINDEX(' ',LTRIM(RTRIM(RIGHT(TrimName, TrimLN-LN-1)))) = 0 THEN NULL

    WHEN REPLACE(RIGHT(TrimName, TrimLN - CHARINDEX(' ',TrimName)),'.','') IN ('Jr','Sr') THEN RIGHT(TrimName, TrimLN - CHARINDEX(' ',TrimName))

    ELSE NULL

    END Suffix,

    CASE WHEN CHARINDEX('"',full_name,CHARINDEX('"',full_name)+1) - (CHARINDEX('"',full_name)+1) > 0

    THEN

    SUBSTRING(full_name,

    CHARINDEX('"',full_name)+1, -- First "

    CHARINDEX('"',full_name,CHARINDEX('"',full_name)+1) - (CHARINDEX('"',full_name)+1)-- Second "

    )

    ELSE NULL END NickName

    FROM #names

    DROP TABLE #names

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Thank you Seth, new functions to learn about. I'll try this when I get back to work tomorrow. (We had a big snow storm and I'm at home today.)

    I've been working with this for a week or so and kind of running out of things to try.

    Norman

  • Divid'n'Conquer...

    CREATE TABLE #names (full_name varchar(254))

    INSERT INTO #names VALUES ('Simpson, Homer J')

    INSERT INTO #names VALUES ('Simpson, Bartholomew JoJo')

    INSERT INTO #names VALUES ('Simpson,Maggie')

    INSERT INTO #names VALUES ('Bouvier Simpson, Margorie "Marge"')

    INSERT INTO #names VALUES ('Simpson, Lisa M.')

    INSERT INTO #names VALUES ('Burns,Charles Montgomery "Monty"')

    INSERT INTO #names VALUES ('Smithers, Waylon Jr.')

    ;

    WITH

    cteStart AS

    (

    SELECT SUBSTRING(full_name,1,CHARINDEX(',',full_name)-1) AS LastName,

    LTRIM(SUBSTRING(full_name,CHARINDEX(',',full_name)+1,254)) AS FirstNameStart

    FROM #Names

    )

    ,

    cteMiddle AS

    (

    SELECT LastName,

    SUBSTRING(FirstNameStart,1,ISNULL(NULLIF(CHARINDEX(' ',FirstNameStart),0),254)) AS FirstName,

    REPLACE(LTRIM(SUBSTRING(FirstNameStart,NULLIF(CHARINDEX(' ',FirstNameStart),0),254)),'.','') AS MiStart

    FROM cteStart

    )

    SELECT LastName,

    FirstName,

    CASE WHEN LEN(MiStart) = 1 THEN MiStart ELSE NULL END AS MI

    FROM cteMiddle

    LastNameFirstNameMI

    SimpsonHomer J

    SimpsonBartholomew NULL

    SimpsonMaggieNULL

    Bouvier SimpsonMargorie NULL

    SimpsonLisa M

    BurnsCharles NULL

    SmithersWaylon NULL

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I recently had a need to accomplish the same task. I cam across this UDF that satisfied all of my criteria. Hope it helps.

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=56499

  • Thank you Jeff -

    When I copy and paste and run this code, I get errors. Is this SQL2000 code? Or does it need SQL2005?

    The errors are:

    Msg 156, Level 15, State 1, Line 11

    Incorrect syntax near the keyword 'WITH'.

    Msg 170, Level 15, State 1, Line 17

    Line 17: Incorrect syntax near ','.

    Any thoughts?

    Norman

  • Thank you Seth,

    I think this will do just what I need - apologies to those who read my erroneous message about errors.

    Thanks again,

    Norman

  • n.heyen (12/10/2009)


    Thank you Jeff -

    When I copy and paste and run this code, I get errors. Is this SQL2000 code? Or does it need SQL2005?

    The errors are:

    Msg 156, Level 15, State 1, Line 11

    Incorrect syntax near the keyword 'WITH'.

    Msg 170, Level 15, State 1, Line 17

    Line 17: Incorrect syntax near ','.

    Any thoughts?

    Norman

    Ahg....my apologies. I forgot I was in a 2k forum. I'll see if I can flip it to 2k code tonight.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/10/2009)


    Ahg....my apologies. I forgot I was in a 2k forum. I'll see if I can flip it to 2k code tonight.

    Way to try and cheat Jeff. :hehe:

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Heh... try hell! I did cheat! I just got caught. :blush:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here's a bit of code for SQL Server 2000 that's equivalent to the 2k5 code I "cheated" with.:-D It uses "derived tables" instead of CTE's. Both are sometimes referred to as "inline views".

    CREATE TABLE #names (full_name varchar(254))

    INSERT INTO #names VALUES ('Simpson, Homer J')

    INSERT INTO #names VALUES ('Simpson, Bartholomew JoJo')

    INSERT INTO #names VALUES ('Simpson,Maggie')

    INSERT INTO #names VALUES ('Bouvier Simpson, Margorie "Marge"')

    INSERT INTO #names VALUES ('Simpson, Lisa M.')

    INSERT INTO #names VALUES ('Burns,Charles Montgomery "Monty"')

    INSERT INTO #names VALUES ('Smithers, Waylon Jr.')

    SELECT LastName,

    FirstName,

    CASE WHEN LEN(MiStart) = 1 THEN MiStart ELSE NULL END AS MI

    FROM (

    SELECT LastName,

    SUBSTRING(FirstNameStart,1,ISNULL(NULLIF(CHARINDEX(' ',FirstNameStart),0),254)) AS FirstName,

    REPLACE(LTRIM(SUBSTRING(FirstNameStart,NULLIF(CHARINDEX(' ',FirstNameStart),0),254)),'.','') AS MiStart

    FROM (

    SELECT SUBSTRING(full_name,1,CHARINDEX(',',full_name)-1) AS LastName,

    LTRIM(SUBSTRING(full_name,CHARINDEX(',',full_name)+1,254)) AS FirstNameStart

    FROM #Names

    )dtStart

    )dtMiddle

    As a sidebar for those that need to make such a "backwards" conversion, I didn't have to change any of the SELECT's, really... I just had to rearrange them from being CTE's to being "derived tables". Of course, that won't work if you made a recursive CTE. And, if a CTE is used more than once, you'll have to make the same derived table more than once.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/10/2009)


    Heh... try hell! I did cheat! I just got caught. :blush:

    I mean, if you were gonna cheat, you could have at least done nickname as well. Poor Monty.:-D

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply