October 7, 2002 at 8:24 pm
Is there any way to compare two string in the database..
let's say there is a table which contains company name .
for example.
ABC Inc
ABC Corp.
ABC LTD.
I like to compare one company to another and find out if it matches then display the name. comparing may be 50% to 60% .I may want to display the company name which are not 100% compatible.
appreciated if you could help me.
October 8, 2002 at 1:18 am
Exactly what do you mean by 50% If you mean the first half, then you can use something like this:
select * from yourtable
where substring(companyname,1,len(companyname)/2) = substring('ABC Corporation',1,len(companyname)/2)
But if you're just looking for those that are similar but not necessarily in the beginning of the name, then there aren't really any good alternative, but you can try SOUNDEX. Don't think you'll get to good hits though. Here is an example of what you can do with soundex (and it's companion function difference that returns the difference between two soundex-codes, with 0 being biggest difference, and 4 = closest match):
declare @a varchar(20), @b-2 varchar(20), @C varchar(20), @d varchar(20)
set @a = 'ABC Inc'
set @b-2 = 'ABC Corp.'
set @C = 'ABC LTD.'
set @d = 'Something else'
select soundex(@a), soundex(@b), soundex(@c), soundex(@d), difference(@a,@d)
--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
Edited by - chrhedga on 10/08/2002 01:21:14 AM
October 8, 2002 at 4:21 am
Soundex is decent for this. Not great, but built in and worth considering. What we do is preprocess the columns to remove "noise" words like company, corporation, co, etc, plus remove all spaces and punctuation. A lot of work so it makes sense to persist and index the result. Then we do the comparison. MUCH higher match rate. We also check various combinations of name + phone, name + address, since for some companies they have businesses everywhere (McDonalds for example).
Andy
October 8, 2002 at 4:29 am
Other than that you may try to build a cursor and do character by character comparisons, but this will be extremly slow and quite possibly memory intensize. You are getting into a realm of intrepretation here and it is always subject to what you actually mean by compatible.
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
October 8, 2002 at 11:07 am
Thanks for your help.
I don't think I can use soundex function for this. May be I can match all the characters in one string and then calculate how many character matches and also I can decide the level of matching like 50%,60%,70%.
October 8, 2002 at 4:31 pm
What exactly do you need to accomplish? Are you sure it's not enough with siundex and the difference-function?
Maybe you should do this outside of SQL, or in an extended proc or something. What you're looking for is fuzzy logic, comparing the words by 2- or 3-grams etc. We're doing this in our search engine software, but that's pretty advanced.
--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
Edited by - chrhedga on 10/08/2002 4:33:19 PM
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply