April 6, 2011 at 10:27 am
Is there a way to get how similar two values are that are on the same row in the same table? I have a table that has address info that was joined from two tables. This table now has both addresses from the two tables and I would like to know how similar those addresses are. Here's an example of what I want to get
ID AddrFromTable1 AddrFromTable2 SimilarityForAddrFromTable1AndAddrFromTable2
1 123 Main St 123 Main St 1.00
2 1 Fake St 1 Fake Street .92
Is there a way to do this in SSIs using the Fuzzy Logic?
April 6, 2011 at 11:12 am
for addresses, there are a number of services that will standardize your address info for a fee;
if your cheap or just like a challenge (like me) what I've done is add a column which I update with the "CleanAddress"
i get that by updating the address through a collection of replaces so that any abbreviations I've encountered so var are converted to a standardized description; [st],[st.][street] and [str] all are replaced to a standard keyword.
from experience, the replaces sometimes need to occur in a specific order, as Saint / [St] / [St.] needs to be taken into consideration as well.
for example, for me, i stretch abbreviations out to their full names; i know i posted an example int he forums here, let me see if my google fu is up to the task of finding that example again.
Lowell
April 6, 2011 at 11:23 am
Thanks Lowell, I didn't need it to do any clean up so i was just curious on how do get the fuzzy number between the two addresses. I actually got it to work but had to use the two separate tables. In the Fuzzy Lookup section Link the two tables on ID as well as the address fields I wanted to have a fuzzy value. Then have a conditional split after the Fuzzy Lookup section where it returns only records where ID similarity = 1.
The result only returned where the ID similarity = 1 which was 215 records out of 215 records from original table, but now I wonder if it could potentially fuzzy match on an ID where it's not equal on both tables because the better address match was for a different ID on the lookup table. If this happen how to i explicit say in the Fuzzy Matching to link on exact ID (similarity = 1 all the time for this lookup column) and then give similarity values for the address lookup columns.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply