Efficient way of breaking addresses into numbers and words

  • Hi ALL,

    I have an address field which contains numbers and words.

    I have num1, num2 ,wrd1,wrd2......wrd8 columns i.e. 2 numbers and 8 words columns.

    Now from the address field i have to pick the largest no and insert into num1..and second largest no into num2. Similarily largest word (in length) into wrd1, 2nd largest into wrd2....etc.

    What would be the most effiecient way to do the above process ?

    Rgds,

    Rajesh

  • If there are no errors in your data (i.e each field contains 9 commas and the first two entries in your list are always positive integers), and if I understand what you want correctly, then you could try the following, which I believe is pretty efficient:

     

    -- Build and populate a Numbers table

    create table Numbers(n int identity(1, 1) primary key, Dummy char(1))

    go

    insert Numbers (Dummy) select top 8000 '' from sysobjects o1 cross join sysobjects o2

    go

    alter table Numbers drop column Dummy

    go

    -- Create test data

    declare @Adress table(id int identity(1, 1) primary key, adr varchar(8000))

    insert @Adress (adr) select '12,34534,fsdg,regsd,sdfg,sdfg,asd,dsf,r,hhhhhhhhhhh'

    insert @Adress (adr) select '123,45,fsxxdg,reagsd,sdssfg,sqdfg,asqd,dsaaf,ra,hhh'

    -- Split data and collect it in another table

    declare @AdressParts table(id int identity(1,1) primary key, adrid int, adrpart varchar(8000))

    insert @AdressParts (adrid, adrpart)

    select id, substring(adr, n, charindex(',', adr + ',', n) - n)

    from Numbers cross join @Adress

    where

    substring(',' + adr, n, 1) = ','

    order by id, n

    -- Find max number and max length

    select

    adrid,

    max(case when id % 10 in (1, 2) then cast(adrpart as int) else 0 end),

    max(case when id % 10 not in (1, 2) then len(adrpart) else 0 end)

    from @AdressParts

    group by adrid

     

  • ...Sorry, I misunderstood... 

    Replace the last part by

    select

    adrid,

    adrpart

    from @AdressParts

    order by

    adrid,

    case when id % 10 in (1,2) then 0 else 1 end,

    case when id % 10 in (1,2) then cast(adrpart as int) else 0 end,

    case when id % 10 not in (1, 2) then len(adrpart) else 0 end

    This orders the adress parts as requested (I think ). Concatenate it to obtain the ordered string back, if you need it.

     

  • The efficient way is to shop for software that does address cleanup for you. No sense in recreating this particular wheel.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply