March 25, 2009 at 11:51 am
What logic/algorithms does the fuzzy lookup object use to reconcile matches? I can't really find anything about it on the web anywhere.
I accept the possibility that there just isn't an answer outside of Microsoft source code, so please tell me to forget about it if you think this isn't something that is available to the public.
Thanks,
Mark
February 26, 2010 at 10:35 pm
Recently I worked face to face with Microsoft SSIS consultants to improve one of the fuzzy matching processes I have developed. But when it came to understanding the logic used to derive the _Similarity score from the individual similarity scores, they didn't have much to explain. The only takeaway from the meeting was that the _Similarity score is based on the frequency of tokens appearing in the Error Tolerance Index table.
But if you are looking for surface knowledge of how it works, see this article:
http://msdn.microsoft.com/en-us/library/ms345128(SQL.90).aspx
April 24, 2012 at 2:42 pm
If I was a betting man, I would say they used the Jaro-Winkler distance metric:
http://en.wikipedia.org/wiki/Jaro-Winkler
...if I was a betting man that is 😎
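For anyone who wants to try the metric on their own data, here is a minimal Python sketch of the textbook Jaro-Winkler similarity. It is not taken from SSIS, and nothing in this thread confirms that Fuzzy Lookup uses it. The function names, the 0.1 prefix scale, and the 4-character prefix cap are the usual textbook defaults, assumed here for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i in range(len1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1[:4], s2[:4]):
        if c1 != c2:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


if __name__ == "__main__":
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
```

The prefix boost is the reason this metric tends to be suggested for names: two strings that agree at the start score higher than two that agree only at the end.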
April 24, 2012 at 2:44 pm
or maybe even Levenshtein:
http://en.wikipedia.org/wiki/Levenshtein_distance
One of those pretty much gets the job done and is easy to implement as a class 😉
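To back up the "easy to implement" claim, here is a minimal sketch of the standard Levenshtein distance in Python (a pair of functions rather than a class). Again, this is the generic textbook algorithm, not anything confirmed to be inside Fuzzy Lookup. The `similarity` helper, which scales the distance into a 0 to 1 score, is my own assumption about how you might compare it with a _Similarity value.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance into a 0..1 score."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
```

Keeping only two rows of the table at a time holds memory proportional to the shorter string, which matters if you run this across a large reference table.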