October 12, 2007 at 1:32 am
Hi There,
I have a table, holding both streetnames and housenumbers/Housenumberintervals/letters in the same column,
StreetnamesNumbersLetters
Appelstreet, city 345 A
James Hight 107-109
Upper 4 - 8
Sunstreet 2
and I need to seperate streetnames, housenumbers and letters in three seperate columns, like this
Streetnames Numbers Letters
Appelstreet, city 345 A
James Hight 107
James Hight 109
Upper 4
Upper 6
Upper 8
Sunstreet 2
I have no idea how to this, all help is appreciated.
joejoe
October 12, 2007 at 2:11 am
Information may have been lost during the concatenation/storing this data in a single column. Probably it will not always be possible to get this information back. E.g 13-19 may need to be expanded to 13, 15 A, 15 B, 17, 19. Do you have any data to validate the results? Is your data consistent, i.e. does every entry satisfy the simple rule of {street name} {number|range} [{letter}] format. If it does you can first expand the ranges, and then split the data based on the location of the number (CHARINDEX, LEFT, RIGHT). If not, you will need to identify the rows that do not follow the above syntax, and and decide what to do with them. If there are too many exceptions, see if these can be parsed using a different syntax.
It is a pain to check whether a particular column in a table satisfies a regular expression. You probably want to use an application written in a language like C# that has libraries supporting regular expression matching.
Regards,
Andras
October 12, 2007 at 2:43 am
Is the database really the best place to do this?
However providing the data is always in the format specified then this is how I would do it
Create a temp table and fill it with your data
DECLARE @temp TABLE (StreetnamesNumbersLetters varchar(100),Streetnames varchar(30),Numbers varchar(30),Letters varchar(30))
insert into @temp
select 'Appelstreet, city 345 A','','','' union
select 'James Hight 107-109','','','' union
select 'Upper 4 - 8','','','' union
select 'Sunstreet 2','','',''
Move any data with no number !!!!
UPDATE @temp
SET Streetnames = StreetnamesNumbersLetters,StreetnamesNumbersLetters = ''
WHERE StreetnamesNumbersLetters <> ''
AND PATINDEX('%[0-9]%',StreetnamesNumbersLetters) = 0
Move text up to first digit
UPDATE @temp
SET Streetnames = LEFT(StreetnamesNumbersLetters,PATINDEX('%[0-9]%',StreetnamesNumbersLetters)-1),
StreetnamesNumbersLetters = STUFF(StreetnamesNumbersLetters,1,PATINDEX('%[0-9]%',StreetnamesNumbersLetters)-1,'')
WHERE StreetnamesNumbersLetters <> ''
Move any data from letter onwards (if present)
UPDATE @temp
SET Letters = SUBSTRING(StreetnamesNumbersLetters,PATINDEX('%[a-z]%',StreetnamesNumbersLetters),255),
StreetnamesNumbersLetters = STUFF(StreetnamesNumbersLetters,PATINDEX('%[a-z]%',StreetnamesNumbersLetters),255,'')
WHERE StreetnamesNumbersLetters <> ''
AND PATINDEX('%[a-z]%',StreetnamesNumbersLetters) > 0
Move single numbers
UPDATE @temp
SET Numbers = StreetnamesNumbersLetters,StreetnamesNumbersLetters = ''
WHERE StreetnamesNumbersLetters <> ''
AND CHARINDEX('-',StreetnamesNumbersLetters) = 0
Create additional rows for number range
INSERT INTO @temp (StreetnamesNumbersLetters,Streetnames,Numbers,Letters)
SELECT '',t.Streetnames,CAST(n.number as varchar),t.Letters
FROM @temp t
INNER JOIN dbo.Numbers n
ON n.Number BETWEEN CAST(LTRIM(RTRIM(LEFT(t.StreetnamesNumbersLetters,CHARINDEX('-',t.StreetnamesNumbersLetters)-1))) as int)+1
AND CAST(LTRIM(RTRIM(SUBSTRING(t.StreetnamesNumbersLetters,CHARINDEX('-',t.StreetnamesNumbersLetters)+1,255))) as int)
AND n.Number % 2 = CAST(LTRIM(RTRIM(LEFT(t.StreetnamesNumbersLetters,CHARINDEX('-',t.StreetnamesNumbersLetters)-1))) as int) % 2
WHERE t.StreetnamesNumbersLetters <> ''
Move first number
UPDATE @temp
SET Numbers = LEFT(StreetnamesNumbersLetters,CHARINDEX('-',StreetnamesNumbersLetters)-1),StreetnamesNumbersLetters = ''
WHERE StreetnamesNumbersLetters <> ''
Far away is close at hand in the images of elsewhere.
Anon.
October 12, 2007 at 9:36 pm
If you want something slick to split addresses and validate them at the same time, look into a product called "ZP4".
--Jeff Moden
Change is inevitable... Change for the better is not.
October 17, 2007 at 5:23 pm
Hi Jeff,
I looked into the ZP4, and truely looks very handy. But I don't have any adresses in USA, only in Europe.
But thanks for the tip π
October 17, 2007 at 5:47 pm
David,
Thanks for posting the scripts that I needed. I understand most of it when I see it, but I doubt I ever have the skills to write anything like it!
I have now,
- moved any data with no number
- moved text up to first digit
- moved any data from letter onwards (if present)
- moved single numbers
But the "Create additional rows for number range" part fails with the following message:
Msg 208, Level 16, State 1, Line 39
Invalid object name 'msp.dbo.Numbers'.
I have tried to rename the Number column with several combinations (n.Numbers & Number & n.number) with no luck. Any suggestions?
Best joejoe
October 18, 2007 at 1:53 am
...but I doubt I ever have the skills to write anything like it!
Sure you will, we all started as beginners π
Now we try to elevate ourselves to Jeffs' level π
Invalid object name 'msp.dbo.Numbers'.
Sorry my bad, used one of my tables in testing and forgot to remove db name, I have removed the reference from the post
However the Numbers table is just that, a table of numbers, this site has many examples
Far away is close at hand in the images of elsewhere.
Anon.
October 18, 2007 at 8:47 pm
joe joe (10/17/2007)
Hi Jeff,I looked into the ZP4, and truely looks very handy. But I don't have any adresses in USA, only in Europe.
But thanks for the tip π
Ack... my bad... didn't realize the continent never mind the country π
--Jeff Moden
Change is inevitable... Change for the better is not.
October 18, 2007 at 8:48 pm
Now we try to elevate ourselves to Jeffs' level
Heh... nah... it's not worth all the nosebleeds due to lack of Oxygen π
--Jeff Moden
Change is inevitable... Change for the better is not.
October 19, 2007 at 2:23 am
I'm going back to Andras' post.
If you want to spilt the adderss 'Upper 4 - 8' you really do need to know the actual addresses used in 'Upper'. It may be that originally Upper had 3 buildings named 'Upper 4', 'Upper 6' and 'Upper 8', but these were replaced over time with a single building with one street door called 'Upper 4 - 8'. Equally, Upper may have had consecutive numbering ('Upper 4', 'Upper 5', etc.) - this is not uncommon in Europe.
Whatever was done to 'Upper' in the past, if you send the same mail to 'Upper 4', 'Upper 6', 'Upper 8' you will look less than professional.
I would advise against splitting addresses such as 'Upper 4 - 8' unless you know that such a split makes sense with what is on the ground today.
Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.
When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop HΓ©lder CΓ’mara
October 19, 2007 at 10:26 am
Unless you are going to use some sort of address validation software to verify your change you will most likely completely scramble the data more than humpty dumpty. I would discourage against it. Even getting people to enter data in a valid manor is difficult how will you maintain it? There are just too many possibilities.
As good as it sounds to break it up like that, any attempt with out some sort of method to validate your choice is going to possibly lose customer data.
October 19, 2007 at 10:30 am
A quick search found these guys http://www.grcdi.nl/dmtools.htm never used them so I have no recommendation to give, but they have a 30 day trial π How long will it take for you to re-format your data? However keeping it valid is the tougher part.
October 21, 2007 at 1:59 pm
Hi Guys,
Thanks for all the advices. No panic, I'm not a dba for a sql server in production, and I'm not using the addresses for contacting purposes. With the addresses in the split format I will be able to, validate and geocode them for visualisation.
But thanks for your concern π
October 21, 2007 at 2:03 pm
You bet... thanks for the feedback!
--Jeff Moden
Change is inevitable... Change for the better is not.
October 21, 2007 at 9:27 pm
In the U.S. I would use a service like Melissa Data that will accept a street address in pretty much any format and return the address broken down into it's USPS (postal service) components (e.g. house number, street, zip/postal code, etc.) I'm sure that there are similar services for the rest of the world (at least I would hope so) search for "data quality" and the country...
I would not spend a great deal of time trying to parse the addresses myself, too many variables/exceptions, you'll go crazy and probably end up with garbage.
Joe
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply