August 18, 2004 at 10:06 pm
I am looking for a way (or product) that will do a full text search as follows:
· Search logic allows wild cards and reverse order names. For example, name John Smith (i.e., %johnsmith%) returns:
- John Smith,
- John Smith, esq.
- Smith, John
- John Smith Bakery, Inc., etc.
I doesn't seem like a big problem until I found out that the users will be entering the name without spaces (i.e. johnsmith)
Any ideas would be appreciated.
August 19, 2004 at 8:05 am
Sorry I know of no tool that can do that kind of fuzzy logic. Might be a cool tool but you might be able to do in the background a search for each individual letter as opposed to the word. Then you just have to determine how to deal with situations where all the letters appear but not the words they may expect. You would have to come up with a weighting based on closeness of characters and the fewer sets together in a word the less likely the should appear.
August 19, 2004 at 1:46 pm
I'm not saying that this will run lightning fast, but I believe the following would work. Below is a SQL function definition that will take a string and create various combined mutexes. You can then perform a LIKE on the results within your where clause. If you want it to perform better you could store the results of this function as a new column in the table (update this new column whenever the source column(s) change), but it would still be a LIKE clause against a large varchar. Perhaps even a new table with a foreign key instead of a column (due to excessive row length and subsequent excessive page storage use). I only did 2 word mutexes here. To find "John B. Smith" from "Smith, John B." would take some more work but could be done. It will run even slower so only support it if there is a need to.
EXAMPLE QUERY:
SELECT * FROM MyTable WHERE dbo.NameMutate(ClientName) LIKE '%johnsmith%'
EXAMPLE FUNCTION RESULTS:
print dbo.NameMutate('John Smith')
print dbo.NameMutate('John Smith, esq.')
print dbo.NameMutate('Smith, John' )
print dbo.NameMutate('John Smith Bakery, Inc., etc.')
GO ----->
JohnSmith,JohnSmith,SmithJohn
JohnSmithesq,JohnSmith,Johnesq,SmithJohn,Smithesq,esqJohn,esqSmith
SmithJohn,SmithJohn,JohnSmith
JohnSmithBakeryIncetc,JohnSmith,JohnBakery,JohnInc,Johnetc,SmithJohn,SmithBakery,SmithInc,Smithetc,BakeryJohn,BakerySmith,BakeryInc,Bakeryetc,IncJohn,IncSmith,IncBakery,Incetc,etcJohn,etcSmith,etcBakery,etcInc
ALTER FUNCTION NameMutate
(@List varchar(200))
RETURNS varchar(8000)
AS
begin
declare @IndexStart smallint
declare @IndexEnd smallint
declare @Length smallint
declare @Delim char(1)
declare @Word varchar(200)
declare @Row smallint
declare @AllWords table ( AWord varchar(200), WordNumber smallint )
declare @Results varchar(8000)
set @Results = ''
set @IndexStart = 1
set @IndexEnd = 0
set @Delim = ','
--First, convert all possible delimeters to single delimeter
set @List = REPLACE( @List, ' ', @Delim )
set @List = REPLACE( @List, '.', @Delim )
-- Now compress out all consequtive instances of a delimeter.
set @List = RTRIM( @List ) -- just in case @Delim becomes a space since LEN does an implicite RTRIM
set @Length = LEN( @List )
set @List = REPLACE( @List, @Delim + @Delim, @Delim )
while @Length > LEN( @List )
begin
set @Length = LEN(@List)
set @List = REPLACE( @List, @Delim + @Delim, @Delim )
end
--Make certain the raw input becomes a result
set @Results = REPLACE( @List, @Delim, '' )
--Now parse out each word into a table
set @Row = 0
while @IndexStart <= @Length
begin
set @Row = @Row + 1
if @Row >= 999 return @Results -- hard limiter just in case
set @IndexEnd = charindex(@Delim, @List, @IndexStart)
if @IndexEnd = 0
set @IndexEnd = @Length + 1
set @Word = substring(@List, @IndexStart, @IndexEnd - @IndexStart)
set @IndexStart = @IndexEnd + 1
INSERT INTO @AllWords
VALUES ( @Word, @Row )
end
--Now create a combinatorial mutation of every word pairing (hope there are few)
set @IndexEnd = ( SELECT COUNT(*) FROM @AllWords )
set @Row = 1
while @Row <= @IndexEnd
begin
SELECT @Word = AWord
FROM @AllWords
WHERE WordNumber = @Row
SELECT @Results = @Results + @Delim + @Word + AWord
FROM @AllWords
WHERE WordNumber != @Row
set @Row = @Row + 1
end
return @Results
end
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply