SSIS Fuzzy Lookup Logic

Question

modelenoir

SSC Journeyman

Points: 93
March 25, 2009 at 11:51 am

#202562

What logic/algorithms does the fuzzy lookup object use to reconcile matches? I can't really find anything about it on the web anywhere.
I accept the possibility that there just isn't an answer outside of Microsoft source code, so please tell me to forget about it if you think this isn't something that is available to the public.
Thanks,
Mark


albertarun SSC Veteran Points: 231 · Answer 1

Recently I happened to work face to face with Microsoft SSIS consultants to improve one of the fuzzy matching processes I have developed. But when it came to understanding the logic used to derive the _Similarity score from the individual column similarity scores, they didn't have much to explain. The only takeaway from the meeting was that the _Similarity score is based on the frequency of tokens appearing in the Error Tolerance Index table.

But if you are looking for surface-level knowledge of how it works, see this article:

http://msdn.microsoft.com/en-us/library/ms345128(SQL.90).aspx
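To make the token-frequency point above a bit more concrete, here is a rough Python sketch of one common way token frequency can feed into a similarity score: rare tokens count for more than common ones. This is only an illustration and is NOT Microsoft's actual algorithm. The function names, the whitespace tokenizer, and the weighting formula are all assumptions made up for this example.

```python
import math
from collections import Counter


def build_token_weights(reference_rows):
    """Weight each token by rarity in the reference data (hypothetical scheme)."""
    doc_freq = Counter()
    for row in reference_rows:
        # Count each token once per row, ignoring case.
        doc_freq.update(set(row.lower().split()))
    n = len(reference_rows)
    # Smoothed inverse-frequency weight: always >= 1, larger for rarer tokens.
    weights = {tok: math.log((n + 1) / (df + 1)) + 1.0
               for tok, df in doc_freq.items()}
    # Tokens never seen in the reference data are treated as maximally rare.
    unseen_weight = math.log(n + 1) + 1.0
    return weights, unseen_weight


def weighted_similarity(a, b, weights, unseen_weight):
    """Weighted Jaccard overlap of the token sets of a and b, in [0, 1]."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())

    def weight(tok):
        return weights.get(tok, unseen_weight)

    union = sum(weight(t) for t in tokens_a | tokens_b)
    if union == 0:
        return 0.0  # both inputs were empty
    shared = sum(weight(t) for t in tokens_a & tokens_b)
    return shared / union


reference = ["Acme Corp", "Beta Corp", "Gamma Corp"]
weights, unseen = build_token_weights(reference)
# Sharing the rare token "acme" scores higher than sharing the common "corp".
print(weighted_similarity("Acme Corp", "Acme Inc", weights, unseen))
print(weighted_similarity("Acme Corp", "Beta Corp", weights, unseen))
```

The idea is just that a match on a token that appears in almost every reference row tells you very little, so it contributes less to the final score.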

brian.anderson SSC Enthusiast Points: 146 · Answer 2

If I was a betting man, I would say they used the Jaro-Winkler distance metric

http://en.wikipedia.org/wiki/Jaro-Winkler

...if I was a betting man that is 😎
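For anyone who wants to experiment, here is a minimal Python sketch of the Jaro-Winkler metric as it is usually described. Nothing in this thread confirms that Fuzzy Lookup uses it; the function names and the defaults (prefix scale 0.1, prefix capped at 4 characters) are just the textbook values, assumed here for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only match if they are within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        start = max(0, i - window)
        end = min(len2, i + window + 1)
        for j in range(start, end):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as
    # half a transposition each.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```

The prefix boost is why this metric tends to suit short strings like names, where typos are less likely at the start of the word.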

brian.anderson SSC Enthusiast Points: 146 · Answer 3

or maybe even Levenshtein:

http://en.wikipedia.org/wiki/Levenshtein_distance

One of those pretty much gets the job done and is easy to implement as a class 😉
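As a rough illustration of how little code this takes, here is a minimal Python sketch of Levenshtein distance using the standard two-row dynamic programming approach. The `similarity` helper that scales the distance to a 0..1 score is an assumption added for this example; it is not something Fuzzy Lookup is known to do.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: scale edit distance to a 0..1 score."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))  # 3
```

Keeping only two rows of the table means memory grows with the shorter string rather than with the product of both lengths.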