need help with loading a string

  • Hi,

    Please help me ,

    I have a string column in my source table which has the following sample data

    ----------------------------------------------------

    String

    ----------------------------------------------------

    122 campbell ave., west haven, CT 00000-0000

    248 mac arthur blvd., irving, TX 11111-1111

    ----------------------------------------------------

    the destination table has the following columns

    Address city state Zip ZipExtension

    I have to load the address in the adress field, city into city field, state into state, zip into zip, and zip extension into zip extension.

    Thanks in advance.

  • How consistent is the source data? Does it ALWAYS have commas where you've listed them in your samples?

    If so, substrings based on the commas should do what you need. But you'll need to verify the consistency of the data before going down that route.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Hi,

    Thank you for your reply

    The source data is consistent. the only problem is there is no ',' in the string between the state field and the zip and the zip extention

  • Try something like this:

    DECLARE @String VARCHAR(100);

    SELECT @String = '122 campbell ave., west haven, CT 00000-0000';

    SELECT

    SUBSTRING(@String,

    0,

    CHARINDEX(',', @String)),

    SUBSTRING(@String,

    CHARINDEX(',', @String)+1,

    CHARINDEX(',', @String, CHARINDEX(',', @String)+1)-CHARINDEX(',', @String)-1),

    SUBSTRING(@String,

    CHARINDEX(',', @String, CHARINDEX(',', @String)+1)+2, 2),

    SUBSTRING(@String,

    CHARINDEX(',', @String, CHARINDEX(',', @String)+1)+5, 5),

    SUBSTRING(@String,

    CHARINDEX(',', @String, CHARINDEX(',', @String)+1)+11, 4);

    It uses charindex to find the first and second commas, then uses those to determine string breaks, then breaks the up state and Zip based on length and position.

    It will only work if the data is completely consistent in formatting.

    Otherwise, I recommend moving out of T-SQL and into a RegEx in .NET. That'll probably be more efficient even if the data is consistent.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thank you soo much....

  • There is one more added complexity , in some recordds the state has for example

    Arizona

    New Jersy

    instead of just their abbreviation

  • sample data

    ----------------------------------------------------

    String

    ----------------------------------------------------

    122 campbell ave., west haven, Arizona 00000-0000

    248 mac arthur blvd., irving, New York 11111-1111

    123 xyzs, abcd, New Jersy 22222-2222

    ----------------------------------------------------

  • kevin4u06 (11/22/2010)


    There is one more added complexity , in some recordds the state has for example

    Arizona

    New Jersy

    instead of just their abbreviation

    Build a table of long names and short names and do a post split lookup.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • kevin4u06 (11/22/2010)


    There is one more added complexity , in some recordds the state has for example

    Arizona

    New Jersy

    instead of just their abbreviation

    The spaces in this are going to complicate things greatly here if you try to deal with it.

    In this solution, I'm initially skipping over the state to get the zip, then going back to get the state.

    This method is pretty similar to GSquared's. I'm using nested CTEs to get the position of the commas and dash, and then using substring to get the data.

    How does this work for you?

    -- See how this starts off with a table and data in it?

    -- If you had provided us the data in this format,

    -- it would have made things easier for all of the

    -- volunteers on this site to help you out.

    DECLARE @test-2 TABLE (String varchar(500));

    INSERT INTO @test-2

    SELECT '122 campbell ave., west haven, Arizona 00000-0000' UNION ALL

    SELECT '248 mac arthur blvd., irving, New York 11111-1111' UNION ALL

    SELECT '123 xyzs, abcd, New Jersy 22222-2222' UNION ALL

    SELECT '122 campbell ave., west haven, CT 00000-0000' UNION ALL

    SELECT '248 mac arthur blvd., irving, TX 11111-1111';

    WITH CTE1 AS

    (

    -- get the position of the first comma

    SELECT String, Pos1 = CharIndex(',',String)

    FROM @test-2 t1

    ), CTE2 AS

    (

    -- get the position of the second comma

    SELECT String, Pos1, Pos2 = CharIndex(',', String, Pos1+1)

    FROM CTE1

    ), CTE3 AS

    (

    -- get the position of the dash separating the zip code

    SELECT String, Pos1, Pos2, Pos3 = CharIndex('-', String, Pos2+1)

    FROM CTE2

    )

    -- get the individual strings

    SELECT String,

    Addr = LEFT(String, Pos1-1),

    City = Substring(String, Pos1+1, Pos2-Pos1-1),

    -- for the state, get the difference of the position of

    -- the dash, and the start of the city. This minus 6

    -- characters (the first 5 of the zip, and the space

    -- before) is the width of the state, regardless of

    -- it being 2 characters, or more, or with spaces.

    [State] = ltrim(Substring(String, Pos2+1, Pos3-Pos2-6)),

    Zip5 = Substring(String, Pos3-5, 5), -- assume always 5

    Zip4 = Substring(String, Pos3+1, 4) -- assume always 4

    FROM CTE3;

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Thanks a lot guyz that was helpful

  • WayneS (11/22/2010)


    kevin4u06 (11/22/2010)


    There is one more added complexity , in some recordds the state has for example

    Arizona

    New Jersy

    instead of just their abbreviation

    The spaces in this are going to complicate things greatly here if you try to deal with it.

    In this solution, I'm initially skipping over the state to get the zip, then going back to get the state.

    This method is pretty similar to GSquared's. I'm using nested CTEs to get the position of the commas and dash, and then using substring to get the data.

    How does this work for you?

    My favorite method of dealing with such a thing is taking the data provider out for a world famous pork chop dinner. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    Sorry for the late response. I was off for the thanks giving weekend. You didnt post your solution here.

  • kevin4u06 (11/29/2010)


    Hi Jeff,

    Sorry for the late response. I was off for the thanks giving weekend. You didnt post your solution here.

    Heh... I was off for the Thanks Giving weekend, myself. Besides, you've been given some good code examples by the others and some good suggestions like deploying a State Name table with both long names and abbreviations. Is there something else you require?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff. I finally got the code i need. Thanks evry1 for helping me out.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply