September 5, 2005 at 9:15 am
Is there any way to perform Sql Server 2000 full text search using Soundex() function? I've developed a customized search engine in ASP.NET using Sql Server 2000, that works fine with exact and prefix searaching using CONTAINS() predicate, but I don't know how to use CONTAINS() with SOUNDEX().Is there any expert who could help me. Thanks a lot.
September 6, 2005 at 12:59 am
CONTAINS is Microsoft proprietry so I don't think you can use SOUNDEX with it. The code below may be what you can do with word sound. Run a search for CONTAINS in SQL Server BOL (books online). Hope this helps.
USE Northwind
GO
SELECT ProductName
FROM Products
WHERE CONTAINS(ProductName, ' FORMSOF (INFLECTIONAL, dry) ')
GO
Kind regards,
Gift Peddie
Kind regards,
Gift Peddie
September 6, 2005 at 1:21 am
Actually, CONTAINS is not proprietry to Microsoft as it is in the SQL/MM spec for Full Text Search. Is is how Microsoft SQL Server implments Full Text Search (FTS) within its RDBMS. No matter, SOUNDEX and FTS queries can be combined, for example:
-- CONTAINS query that Searches for names that start with "Mich".
select au_lname, au_fname FROM authors -- returns 0 rows
where contains(au_fname, '"Mich*"')
-- Combined SOUNDEX OR CONTAINS query that
-- searches for names that sound like "Michael".
select au_lname, au_fname FROM authors -- returns 2 rows
where contains(au_fname, '"Mich*"') or SOUNDEX(au_fname) = 'M240'
However, Full-text Search (FTS) does not use SOUNDEX directly, but
it can be used in combination with SOUNDEX (see above example). Additionally, you may want to review the following links as well as the below TSQL examples of combining CONTAINS & SOUNDEX:
You may want to look at some of the improved soundex algorithms as well as
the Levenshtein Distance algorithm You should be able to search Google to
find more code examples, for example: 'METAPHONE soundex "sql server" fuzzy
name search' and I quickly found - "Double Metaphone Sounds Great" at
http://www.winnetmag.com/Article/ArticleID/26094/26094.html You can freely
download the code in a zip file that has several a user-defined function
(UDF) that implement Double Metaphone.
Below are some additional SOUNDEX links:
http://www.merriampark.com/ld.htm
http://www.bcs-mt.org.uk/nala_006.htm
Hope that helps!
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
John T. Kane
September 6, 2005 at 1:58 am
I was not clear but CONTAINS as implemented in Microsoft FULL TEXT is proprietry. And combing it with SOUNDEX string function is redundant because some version of CONTAINS gives the same result as SOUNDEX. BTW when CONTAINS was implemented in SQL Server 7.0 in 1999 it was in the standard but not implemented in other RDBMS(relational database management systems). Hope this helps.
Kind regards,
Gift Peddie
Kind regards,
Gift Peddie
September 6, 2005 at 6:59 am
From a performance perspective, if you are running code like
select au_lname, au_fname FROM authors -- returns 2 rows
where contains(au_fname, '"Mich*"') or SOUNDEX(au_fname) = 'M240'
Try to have a calculated column that stores the value of SOUNDEX(au_fname) rather than calculating it on the fly every time as SQL will need to evaluate the function for every row! If you store the SOUNDEX (or metaphone) value in a colum in the table (you could use a trigger or populate during an insert in your Stored proc), then you can do a direct match with an index, making things MUCH faster.
So you would do...
select au_lname, au_fname FROM authors -- returns 2 rows
where contains(au_fname, '"Mich*"') or au_fname_soundex = 'M240'
with an index on the column au_fname_soundex.
Cheers
September 6, 2005 at 7:10 am
wouldn't it be quicker/better to do a union between the 2?
select au_lname, au_fname FROM dbo.authors where contains(au_fname, '"Mich*"')
UNION
select au_lname, au_fname FROM dbo.authors where au_fname_soundex = 'M240'
Assuming that it's actually going to be used to search through a lot of records (it's probably overkill for small DB's)
September 6, 2005 at 9:36 am
Gift,
While Microsoft's implementation of CONTAINS may be proprietry, CONTAINS is actually part of the SQL/MM standard, specificlly:
The SQL/MM spec for Full-Text Search is maintained under: "Information technology -- Database languages -- SQL multimedia and application packages -- Part 2: Full-Text" at http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=31368&scopelist= (for purchase). However, I already have a pdf version of this spec and I've reviewed it and CONTAINS is part of this standard, but FREETEXT is unique to SQL Server and is proprietry to Microsoft. Each RDBMS vendor is free to implement this ISO standard in with its own methods and while MS was first to implment this is debatable as Oracle IntraMedia/Context/Text also has simalar Full-text Search features at this time.
Combining the SOUNDEX string function with CONTAINS is not redundant as the two SQL query examples above proves. Finally, I worked at Microsoft during the early development of SQL Server 7.0 with the Full-text Search Dev team and FTS was first incorporated in Beta3 of SQL 7.0 in mid-1998.
Kind Regards,
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
John T. Kane
September 6, 2005 at 11:41 am
I am sorry I was off by a year 1998 and I did not say the other RDBMS(relational database management system ) vendors cannot implement CONTAINS but I know as of Oracle 10g release 1 CONTAINS is not in ORACLE.
And I don't pay for ANSI SQL the US Goverment pays for ANSI SQL so their copy is always on the web free, got my first copy pre Google days in the mid 1990s, the current version is ANSI SQL 2003 and yes I have a copy.
Kind regards,
Gift Peddie
Kind regards,
Gift Peddie
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply