December 23, 2005 at 8:57 am
I have several fields for match, is possible to define a user function for compare and return Match score in a SQL statment ?
December 23, 2005 at 9:06 am
Not exactly sure what you mean by Score., But have you looked at functions Soundex, and Difference?
They return a value based off comparison, but not a score per say.
I have not worked with them much though.
December 23, 2005 at 9:25 am
For example, user input address, then list out the simular address with sorting by matching score.
December 25, 2005 at 10:50 pm
December 26, 2005 at 8:05 am
But there are not mentioned anything to generate Matching score,
in mysql, there has Match() function, in MSSQL it seems only able to get the search ranking by Full text search with Index Server.
December 27, 2005 at 10:54 am
Try a position compare of the two strings where you start with a max score, like 100, and subtract based on the number and/or size of the difference. For example, Bryan and Brian are similar, and soundex will mess with you on these, but are legitimate names. You may calculate a score based only on the single character being different. You may also want to incorporate a factor for the actual length of the data being compared.
If you measure the amount of difference between two characters, you may want to take into account keyboard location (for typos and such).
Anyway, basaed on how you interpret your information, you may establish that a 100 score is an exact match, and then set a low end for what would be a possible match. Anything below that would be defaulted as not a possible match.
January 3, 2006 at 7:42 pm
I made one as follows,
CREATE FUNCTION match (@p1 varchar(100), @p2 varchar(100))
RETURNS int as
BEGIN
declare @mark-3 int
declare @tstr varchar(100)
declare @f varchar(100)
declare @i int
select @mark-3 = 100
select @tstr = rtrim(@p1)
select @i = charindex(' ', @tstr)
if len(rtrim(@p1)) <> len(rtrim(@p2))
while len(@tstr)>0
begin
select @i = charindex(' ', @tstr)
if @i > 0
begin
select @f =upper( left(@tstr, @i))
select @tstr = substring(@tstr,@i+1,99)
end
else
begin
select @f = upper(@tstr+' ')
select @tstr=''
end
if charindex(@f , upper(@p2+' ')) =0
if charindex(@f, upper(@p1+' ')) <> charindex(@f , upper(@p2+' '))
end
return @mark-3
January 3, 2006 at 8:24 pm
You do not need to enable full text features to use SOUNDEX or DIFFERENCE...
DECLARE @Str1 VARCHAR(100)
DECLARE @Str2 VARCHAR(100)
SET @Str1 = '129 Green Ave'
SET @Str2 = 'Green St'
SELECT DIFFERENCE(@Str1,@Str2)
SET @Str2 = 'Green Ave'
SELECT DIFFERENCE(@Str1,@Str2)
SET @Str2 = '120 Green Ave'
SELECT DIFFERENCE(@Str1,@Str2)
The higher the number on DIFFERENCE, the better the match. 0 is absolutely no match... 4 is a nearly perfect match.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2006 at 8:35 pm
As the article warns, this method makes use of a correlated subquery that forms a triangular join which is half a cross join. It will crush any hopes of performance on tables over 10-20k rows. Might get a bit more out of it with properly indexed tables but this method of counting is definitely not scalable.
Also, I don't know why but the article warns "Some of these examples only work with Microsoft SQL Server 6.5 because they use derived tables in the FROM clause. " Be advised that derived tables in the FROM clause work just fine in version 7 and 2000 and well as all versions of MSDE. I can't speak for version 2005 but don't think it likely that they would destroy this important functionality. I strongly suspect that the article was first written when version 6.5 came out, hence the warning.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply