CONTAINS showing results when shouldn't

  • I have the following query that returns 1 record.

    select * from Resume

    where Resume_ID = 4187300

    AND CONTAINS(Resume_Desc, '"Series 7"')

    That would be great but if I do a search of the 'Resume_Desc' field I do not have a phrase that is 'Series 7'.

    So I found another phrase without a numeric value and then removed one character 'n' from Centers.

    select * from Resume

    where Resume_ID = 4187300

    AND CONTAINS(Resume_Desc, '"AAMCO Ceters"')

    The result is as expected and no results show up.

    Any ideas why my first query is showing the result when it shouldn't?

    Thanks for any advice.

  • Very good question.

    Something important to understand about SQL Server's full text indexing is that not every word gets indexed. There are words that are intentionally filtered out because they generally waste space and aren't used for real searching. In SQL 2005, these were referred to as noise words, which were contained in text files for each language in the SQL Server install directory. This concept was replaced in SQL 2008 with stoplists, which basically perform the same function but are contained within the database itself instead of external files. The default system stoplist is in the resource database.

    Before going further, we should put two and two together and recognize that the relevant point is that '7' is, by default, listed as a noise word in 2005 and in the stoplist for 2008. All single digits are intentionally not indexed. So, your query basically completely ignores the '7' as it's not part of the full text index. You could replace '7' with any other single digit and you would get the same results.

    If we want to change that, and include these single digits in our index, we can do that. First we want to create a new stoplist based on the default system one so we have something we can edit. In my example, I'm working in a database named Dev with a single table named dbo.FTS_Test.

    use Dev

    go

    create fulltext stoplist Dev_FT_Stoplist from system stoplist;

    go

    Now one thing that might be interesting is to take a look and see what the stopwords are ...

    select sw.*

    from sys.fulltext_stopwords as sw

    join sys.fulltext_stoplists as sl

    on sl.stoplist_id = sw.stoplist_id

    where sw.language_id = 1033

    and sl.name = 'Dev_FT_Stoplist'

    I just want to stick to English (thus the 1033) ... obviously that can be changed as needed. So we can see all the single digits in there and we want to get rid of them:

    alter fulltext stoplist Dev_FT_Stoplist

    drop '0' language 1033;

    alter fulltext stoplist Dev_FT_Stoplist

    drop '1' language 1033;

    alter fulltext stoplist Dev_FT_Stoplist

    drop '2' language 1033;

    alter fulltext stoplist Dev_FT_Stoplist

    drop '3' language 1033;

    alter fulltext stoplist Dev_FT_Stoplist

    drop '4' language 1033;

    alter fulltext stoplist Dev_FT_Stoplist

    drop '5' language 1033;

    alter fulltext stoplist Dev_FT_Stoplist

    drop '6' language 1033;

    alter fulltext stoplist Dev_FT_Stoplist

    drop '7' language 1033;

    alter fulltext stoplist Dev_FT_Stoplist

    drop '8' language 1033;

    alter fulltext stoplist Dev_FT_Stoplist

    drop '9' language 1033;

    If we run our stopwords query again we can see that those single digits are now gone from our custom stoplist.

    The last step will be to associate our custom stoplist with the table the data in question resides in.

    alter fulltext index on dbo.FTS_Test

    set stoplist Dev_FT_Stoplist;

    As soon as I did this, the query began returning the literal results for '"Series 7"'. Due to settings with the full text index, it may be that you would need to rebuild it before it would reflect the changes in the stoplist.

    Good luck.

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Thanks for the info. That is a great reply.

    If I run the following shouldn't I see a row with 'name' having the value of 7? I haven't changed anything and both of these queries have no rows.

    select * from sys.fulltext_stopwords

    select * from sys.fulltext_stoplists

    I have also found that 'Series 66' is the same as 'Series 7'.

    Also, I tried to create stoplist but I get an error:

    Use databasename

    go

    create fulltext stoplist FT_ResumeSL FROM system stoplist;

    go

    Result was:

    Msg 102, Level 15, State 1, Line 1

    Incorrect syntax near 'stoplist'.

  • 😀

    Hello

    Use databasename

    go

    create fulltext stoplist FT_ResumeSL FROM system stoplist;

    go

    at previous sentence i found the problem in table name itself sould be in square bracktes [system stoplist]

    that's all man

  • I think I figured out the issue of creating the stop list . When I run:

    select * from sys.databases where name = 'dbname'

    The compatibility_level is 80 but the sql server using is SQL SERVER 10.50.1600.

    I believe I need to change to 100. Are there any issues with changing a production database from 80 to 100?

    Thanks.

  • Ok you can to try that ,But just make a recovery point before that So you can rollback if needed:w00t:

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply