need help with loading a string

Question

need help with loading a string

kevin4u06

SSCrazy

Points: 2093
More actions
November 22, 2010 at 2:01 pm

#229540

Hi,
Please help me ,
I have a string column in my source table which has the following sample data
----------------------------------------------------
String
----------------------------------------------------
122 campbell ave., west haven, CT 00000-0000
248 mac arthur blvd., irving, TX 11111-1111
----------------------------------------------------
the destination table has the following columns
Address city state Zip ZipExtension
I have to load the address in the adress field, city into city field, state into state, zip into zip, and zip extension into zip extension.
Thanks in advance.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

GSquared SSC Guru Points: 260824 More actions · Answer 1

How consistent is the source data? Does it ALWAYS have commas where you've listed them in your samples?

If so, substrings based on the commas should do what you need. But you'll need to verify the consistency of the data before going down that route.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

kevin4u06 SSCrazy Points: 2093 More actions · Answer 2

Hi,

Thank you for your reply

The source data is consistent. the only problem is there is no ',' in the string between the state field and the zip and the zip extention

GSquared SSC Guru Points: 260824 More actions · Answer 3

Try something like this:

DECLARE @String VARCHAR(100);

SELECT @String = '122 campbell ave., west haven, CT 00000-0000';

SELECT

SUBSTRING(@String,

0,

CHARINDEX(',', @String)),

SUBSTRING(@String,

CHARINDEX(',', @String)+1,

CHARINDEX(',', @String, CHARINDEX(',', @String)+1)-CHARINDEX(',', @String)-1),

SUBSTRING(@String,

CHARINDEX(',', @String, CHARINDEX(',', @String)+1)+2, 2),

SUBSTRING(@String,

CHARINDEX(',', @String, CHARINDEX(',', @String)+1)+5, 5),

SUBSTRING(@String,

CHARINDEX(',', @String, CHARINDEX(',', @String)+1)+11, 4);

It uses charindex to find the first and second commas, then uses those to determine string breaks, then breaks the up state and Zip based on length and position.

It will only work if the data is completely consistent in formatting.

Otherwise, I recommend moving out of T-SQL and into a RegEx in .NET. That'll probably be more efficient even if the data is consistent.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

kevin4u06 SSCrazy Points: 2093 More actions · Answer 4

kevin4u06

SSCrazy

Points: 2093

November 22, 2010 at 2:45 pm

#1254077

Thank you soo much....

kevin4u06 SSCrazy Points: 2093 More actions · Answer 5

There is one more added complexity , in some recordds the state has for example

Arizona

New Jersy

instead of just their abbreviation

kevin4u06 SSCrazy Points: 2093 More actions · Answer 6

sample data

----------------------------------------------------

String

----------------------------------------------------

122 campbell ave., west haven, Arizona 00000-0000

248 mac arthur blvd., irving, New York 11111-1111

123 xyzs, abcd, New Jersy 22222-2222

----------------------------------------------------

Jeff Moden SSC Guru Points: 1004361 More actions · Answer 7

kevin4u06 (11/22/2010)
There is one more added complexity , in some recordds the state has for example
Arizona
New Jersy
instead of just their abbreviation

Build a table of long names and short names and do a post split lookup.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

WayneS SSC Guru Points: 95460 More actions · Answer 8

kevin4u06 (11/22/2010)
There is one more added complexity , in some recordds the state has for example
Arizona
New Jersy
instead of just their abbreviation

The spaces in this are going to complicate things greatly here if you try to deal with it.

In this solution, I'm initially skipping over the state to get the zip, then going back to get the state.

This method is pretty similar to GSquared's. I'm using nested CTEs to get the position of the commas and dash, and then using substring to get the data.

How does this work for you?

-- See how this starts off with a table and data in it?

-- If you had provided us the data in this format,

-- it would have made things easier for all of the

-- volunteers on this site to help you out.

DECLARE @test-2 TABLE (String varchar(500));

INSERT INTO @test-2

SELECT '122 campbell ave., west haven, Arizona 00000-0000' UNION ALL

SELECT '248 mac arthur blvd., irving, New York 11111-1111' UNION ALL

SELECT '123 xyzs, abcd, New Jersy 22222-2222' UNION ALL

SELECT '122 campbell ave., west haven, CT 00000-0000' UNION ALL

SELECT '248 mac arthur blvd., irving, TX 11111-1111';

WITH CTE1 AS

(

-- get the position of the first comma

SELECT String, Pos1 = CharIndex(',',String)

FROM @test-2 t1

), CTE2 AS

(

-- get the position of the second comma

SELECT String, Pos1, Pos2 = CharIndex(',', String, Pos1+1)

FROM CTE1

), CTE3 AS

(

-- get the position of the dash separating the zip code

SELECT String, Pos1, Pos2, Pos3 = CharIndex('-', String, Pos2+1)

FROM CTE2

)

-- get the individual strings

SELECT String,

Addr = LEFT(String, Pos1-1),

City = Substring(String, Pos1+1, Pos2-Pos1-1),

-- for the state, get the difference of the position of

-- the dash, and the start of the city. This minus 6

-- characters (the first 5 of the zip, and the space

-- before) is the width of the state, regardless of

-- it being 2 characters, or more, or with spaces.

[State] = ltrim(Substring(String, Pos2+1, Pos3-Pos2-6)),

Zip5 = Substring(String, Pos3-5, 5), -- assume always 5

Zip4 = Substring(String, Pos3+1, 4) -- assume always 4

FROM CTE3;

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

kevin4u06 SSCrazy Points: 2093 More actions · Answer 9

kevin4u06

SSCrazy

Points: 2093

November 22, 2010 at 3:47 pm

#1254105

Thanks a lot guyz that was helpful

Jeff Moden SSC Guru Points: 1004361 More actions · Answer 10

WayneS (11/22/2010)
kevin4u06 (11/22/2010)
There is one more added complexity , in some recordds the state has for example
Arizona
New Jersy
instead of just their abbreviation
The spaces in this are going to complicate things greatly here if you try to deal with it.
In this solution, I'm initially skipping over the state to get the zip, then going back to get the state.
This method is pretty similar to GSquared's. I'm using nested CTEs to get the position of the commas and dash, and then using substring to get the data.
How does this work for you?

My favorite method of dealing with such a thing is taking the data provider out for a world famous pork chop dinner. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

kevin4u06 SSCrazy Points: 2093 More actions · Answer 11

Hi Jeff,

Sorry for the late response. I was off for the thanks giving weekend. You didnt post your solution here.

Jeff Moden SSC Guru Points: 1004361 More actions · Answer 12

kevin4u06 (11/29/2010)
Hi Jeff,
Sorry for the late response. I was off for the thanks giving weekend. You didnt post your solution here.

Heh... I was off for the Thanks Giving weekend, myself. Besides, you've been given some good code examples by the others and some good suggestions like deploying a State Name table with both long names and abbreviations. Is there something else you require?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

kevin4u06 SSCrazy Points: 2093 More actions · Answer 13

Thanks Jeff. I finally got the code i need. Thanks evry1 for helping me out.