Split string field based on character position

  • I have a name column that is currently set-up as follows:

    NAME

    Doe, John H.

    I would like to split it into the following three columns:

    Last_Name

    Doe

    First_Name

    John

    Middle_Initial

    H.

    Does anyone have an algorithm or string function that can accomplish this task? (If there is a string function that can split a character string based on the position of a specific character (i.e., "," or "."), I think that I can figure this out. But any ideas are appreciated).

  • See the splitter example in the following article...

    http://www.sqlservercentral.com/articles/TSQL/62867/

    Also... if guaranteed to only have 3 parts, lookup PARSENAME in Books Online.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You just have to be little careful while using any Split function as you are dealing with 2 delimiters here - comma and space

    See this example

    EDIT NOTE : My XML tags in query are getting vanished, so replace the '[ ' and ' ]' with XML tags; after copying the query

    DECLARE @t TABLE (id INT, data VARCHAR(100))

    INSERT INTO @t SELECT 1,'Doe, John H.'

    INSERT INTO @t SELECT 2,'Pardeshi, Mangal'

    INSERT INTO @t SELECT 3,'Sharma, A B'

    ;WITH cte AS (

    SELECT

    id,

    CAST('[ i ]' + REPLACE(replace(data,',',''), ' ','[ /i ][ i ]') + '[ /i ]' AS XML) AS data[/size][/size]

    FROM @t

    )

    ,Numbers as

    (

    select 1 as number

    UNION ALL

    SELECT Number+1

    from Numbers

    WHERE Number <3

    )

    SELECT Id, [1] as LastName, [2] FirstName, [3] MiddleName

    FROM (

    SELECT

    c.id,

    p.number as Position,

    x.value('.','VARCHAR(100)') AS Name

    FROM cte c

    CROSS JOIN numbers p

    CROSS APPLY data.nodes('/i[position()=sql:column("number")]') n(x)

    ) m

    PIVOT

    (

    MAX(Name)

    FOR Position IN ([1],[2],[3])

    )P

    John Smith

  • Try this function;

    set ANSI_NULLS ON

    set QUOTED_IDENTIFIER ON

    go

    -- Select * From [dbo].[fnSplit] ('A,b' , ',')

    ALTER FUNCTION [dbo].[fnSplit]

    (@pString varchar(5000),@pSplitChar char(1))

    returns @tblTemp table (tid int,value varchar(1000))

    as

    begin

    declare @vStartPositionint

    declare @vSplitPositionint

    declare @vSplitValuevarchar(1000)

    declare @vCounterint

    set @vCounter=1

    select @vStartPosition = 1,@vSplitPosition=0

    set @vSplitPosition = charindex( @pSplitChar , @pString , @vStartPosition)

    if (@vSplitPosition=0 and len(@pString) != 0)

    begin

    INSERT INTO @tblTemp

    (

    tid,

    value

    )

    VALUES

    (

    1,

    @pString

    )

    return--------------------------------------------------------------->>

    end

    set @pString=@pString+@pSplitChar

    while (@vSplitPosition > 0 )

    begin

    set @vSplitValue = substring( @pString , @vStartPosition , @vSplitPosition - @vStartPosition )

    set @vSplitValue = ltrim(rtrim(@vSplitValue))

    INSERT INTO @tblTemp

    (

    tid,

    value

    )

    VALUES

    (

    @vCounter,

    @vSplitValue

    )

    set @vCounter=@vCounter+1

    set @vStartPosition = @vSplitPosition + 1

    set @vSplitPosition = charindex( @pSplitChar , @pString , @vStartPosition )

    end

    return

    end

    Hope this will help.

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Sometimes, winning is not an issue but trying.
    You can check my BLOG
    [font="Arial Black"]here[/font][/url][/right]

  • Here's a quick sample using Jeffs' suggestion of PARSENAME:

    DROP TABLE #Sample

    CREATE TABLE #Sample (id INT, PersonName VARCHAR(100))

    INSERT INTO #Sample (id, PersonName)

    SELECT 1,'Doe, John H.' UNION ALL

    SELECT 2,'Pardeshi, Mangal' UNION ALL

    SELECT 3,'Sharma, A B' UNION ALL

    SELECT 4,'Doe John ' UNION ALL

    SELECT 5,'Doe John A B ' UNION ALL

    SELECT 6,'Smiffy'

    SELECT *,

    CASE WHEN Surname = SUBSTRING(PersonName, 1, LEN(Surname)) THEN 'ok' ELSE 'check' END AS Status

    FROM (

    SELECT id, PersonName,

    REPLACE(REVERSE(PARSENAME(REPLACE(REPLACE(REVERSE(PersonName+'#'), ' ,', '.'), ' ', '.'), 2)), '#', '') AS Forename1,

    REPLACE(REVERSE(PARSENAME(REPLACE(REPLACE(REVERSE(PersonName+'#'), ' ,', '.'), ' ', '.'), 3)), '#', '') AS Forename2,

    REPLACE(REVERSE(PARSENAME(REPLACE(REPLACE(REVERSE(PersonName+'#'), ' ,', '.'), ' ', '.'), 4)), '#', '') AS Forename3,

    REPLACE(REVERSE(PARSENAME(REPLACE(REPLACE(REVERSE(PersonName+'#'), ' ,', '.'), ' ', '.'), 1)), '#', '') AS Surname

    FROM #Sample

    ) d

    Returns:

    id PersonName Forename1 Forename2 Forename3 Surname Status

    ---- ---------------- ---------- ---------- ---------- ---------- ------

    1 Doe, John H. John H Doe ok

    2 Pardeshi, Mangal Mangal NULL NULL Pardeshi ok

    3 Sharma, A B A B NULL Sharma ok

    4 Doe John John NULL Doe ok

    5 Doe John A B NULL NULL NULL NULL check

    6 Smiffy NULL NULL NULL Smiffy ok

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks everyone! I greatly appreciate the help. Now I have a few options to try.

  • gresford (3/2/2009)


    Thanks everyone! I greatly appreciate the help. Now I have a few options to try.

    Actually, I'd avoid any that use a Recursive CTE or While loop as they tend to hamper performance. The article I posted above shows why.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply