A Google-like Full Text Search

  • I'm having some problems getting this to work with VS 2005 \ 2.0. If someone has a working sample would you mind sending me the files? Much appreciated!!!

  • I've downloaded the latest Irony build (42496), compiled Irony.dll and included the SearchGrammar.cs file from the Irony.Samples project, but am running into problems. Irony has dropped the Irony.Compiler namespace and I can't locate the replacement for the compiler.

    Any chance the sample will be updated?

  • Strange... if you take SearchGrammar from Irony.Samples, it should compile with the latest Irony. In fact, you can now convert queries right in the Grammar Explorer window. Don't take any code from the download in this article; use everything from Irony's latest source set, except for the part that executes the query. That part is commented out in SearchGrammar in the Irony samples project, but I think you can easily recover the code, since it should not have changed much.

  • The Irony project(s) compile fine. I've tried to update Mike's sample with the new Irony.dll and the new SearchGrammar.cs, without success. I understand you to say not to use Mike's sample, and that's OK too.

    My problem is in knowing how to hook my search entry text field up to Irony; that's why I was trying to use Mike's sample. It comes down to a couple of lines of code from the button click event in Mike's sample:

    AstNode root = _compiler.Parse(SourceQueryText.Text.ToLower());

    FtsQueryTextBox.Text = SearchGrammar.ConvertQuery(root, SearchGrammar.TermType.Inflectional);

    Here _compiler is of type LanguageCompiler in the sample. From a post on the Irony site I see that this was changed to a Compiler type, but I'm not finding that either. The AstNode type has changed too; I think I resolved that, but the compiler type change has me stumped.

    Once I get my search box string parsed by Irony I can call my sproc with the SearchGrammar.ConvertQuery results to perform the FTS.

  • Look at the RunSample method in SearchGrammar in the Irony download; it does this directly for Grammar Explorer.

  • Thanks for pointing me in the right direction. I've managed to get it working, both in my code and in Michael's sample.

    First I made a helper class:

    using Irony.Parsing;
    using Irony.Samples.FullTextSearch;

    namespace XXXXX
    {
        // Wraps the Irony parser so a search box string can be run through SearchGrammar.
        public class SearchGrammarHelper
        {
            SearchGrammar _grammar;
            LanguageData _language;
            Parser _parser;
            ParseTree _parseTree;

            public SearchGrammarHelper()
            {
                _grammar = new SearchGrammar();
                _language = new LanguageData(_grammar);
                _parser = new Parser(_language);
                _parseTree = null;
            }

            // Returns the result of SearchGrammar.RunSample for the entry,
            // or an empty string when the entry is null or blank.
            public string QueryStart(string searchEntry)
            {
                string qryParm = string.Empty;
                if (string.IsNullOrEmpty(searchEntry) == false
                    && searchEntry.Trim().Length > 0)
                {
                    _parser.Parse(searchEntry.Trim(), "<source>");
                    _parseTree = _parser.Context.CurrentParseTree;
                    qryParm = _grammar.RunSample(_parseTree);
                }
                return qryParm;
            }
        }
    }

    Then I replaced the Convert button click event code with this:

    SearchGrammarHelper gh = new SearchGrammarHelper();

    FtsQueryTextBox.Text = gh.QueryStart(SourceQueryText.Text.ToLower());

    As a test harness this got the sample running. Thanks again.

    Once I get my site up I'll write a blog post about the process and link back to both the Irony site and Michael's article. That'll be a few weeks down the road yet.

    Steve

  • I have successfully implemented the FTS in my web app using Irony, but I may have found a bug in the SearchGrammar.cs, or possibly a limitation.

    I have a table of quotations. Initially only the quote column was included in the FT index, but I've expanded this to include the author column as well; this is working fine on simple searches.

    I did a search on (John*) and got all the expected John, Johnson, etc. I then tried to narrow the results by looking for an additional word in one of the quotations. In this case I was anticipating John Dewey's "Education, therefore, is a process of living and not a preparation for future living."

    My search entry was (John* education), without the parens, of course. This produced the following search query "(\"john*\" AND FORMSOF (INFLECTIONAL, education) )" which produces no results in the app.

    Plugging the search query into SSMS produces an error - Syntax error near 'john*\' in the full-text search condition '(\"john*\" AND FORMSOF (INFLECTIONAL, education) )'.

    I don't know if this is an error or a limitation of the SearchGrammar routine. Just thought I'd pass it along in case anyone has a thought on how to overcome this, or a different way of formulating the search request.

    Steve
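A possible reading of the SSMS error above, offered as an assumption because the post does not say where the query text was copied from: the backslashes look like C# string-literal (or debugger) escaping rather than characters in the actual value. If so, the text to paste into SSMS would be the unescaped form, as this small sketch illustrates (the class and method names are hypothetical):

```csharp
using System;

public static class EscapeDemo
{
    // The C# literal needs \" to embed a double quote.
    // The backslashes are source-code syntax and are not part of the string value.
    public static string SampleQuery()
    {
        return "(\"john*\" AND FORMSOF (INFLECTIONAL, education) )";
    }
}
```

Printing `EscapeDemo.SampleQuery()` shows the condition with plain double quotes and no backslashes. This would only account for the syntax error in SSMS, not for the empty result in the app. One possibility, again unconfirmed, is that "john*" matches the author column while "education" matches the quote column, which a single full-text predicate does not combine (see Mike C's explanation of the "Vejle" and "17" case in this thread).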

  • Mike C (10/7/2008)


    Martin Nyborg (10/7/2008)


    Thanks for the great article. As soon as I read it, I started to implement full text search on our most used table.

    I have altered the code to use CONTAINSTABLE(tblsite,*,@ftsQuery) because I want to be able to search in many fields.

    But it is not working.

    I have this field list with full text index on (Site (PK), SiteName, Address, HouseNumber, Zip, City)

    I can search for "Vejle" (a Danish city) and I get results from columns SiteName and Address, so that works.

    Now I want to search for "Vejle" and "17". I want to find the city "Vejle" and all streets with house number "17", but the result set is empty. Can anyone help me out on this? I think I have tried all search combinations.

    The problem you're encountering is that you're searching for ("17" AND "Vejle"), and iFTS stipulates that they must exist together in the same column. In order to search for "17" in one column and "Vejle" in another column, you must create two FTS predicates ANDed together like this:

    CONTAINS (HouseNumber, '"17"')
    AND CONTAINS (City, '"Vejle"')

    This can be done, but adds considerable complexity to the query creation.

    Thanks

    Mike C

    This is good stuff.

    My question is: what if the user switches the two inputs (Vejle, 17)? The SQL would then compare the values against the wrong columns. Is there a way around this?

    Thanks

    Ken

  • Please help me.

    I want to remove stop words from the query (it's a custom stop word list).

    How can I do that?

  • careusa2003 (3/17/2010)


    This is good stuff.

    My question is: what if the user switches the two inputs (Vejle, 17)? The SQL would then compare the values against the wrong columns. Is there a way around this?

    Thanks

    Ken

    Hi Ken,

    Quick question - how does your program know that the inputs have been switched? That actually determines your options for handling it.

    Thanks

    Mike C
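One order-independent option for Ken's question, sketched here under assumptions the thread does not confirm: emit one CONTAINS predicate per term against all full-text indexed columns (`*`), so it no longer matters which term the user types first. The class name, method name, and `@pN` parameter names are hypothetical. The trade-off is that "17" is no longer tied to HouseNumber and may match any indexed column.

```csharp
using System;
using System.Collections.Generic;

public static class PerTermPredicateBuilder
{
    // Builds "CONTAINS(*, @p0) AND CONTAINS(*, @p1) ..." and stores the
    // quoted term for each parameter name in `parameters`.
    public static string BuildWhereClause(IList<string> terms,
                                          IDictionary<string, string> parameters)
    {
        var predicates = new List<string>();
        for (int i = 0; i < terms.Count; i++)
        {
            string name = "@p" + i;
            // Wrap each term in double quotes so it is treated as a simple term.
            // Embedded double quotes are removed to keep the condition well-formed.
            parameters[name] = "\"" + terms[i].Replace("\"", "") + "\"";
            predicates.Add("CONTAINS(*, " + name + ")");
        }
        return string.Join(" AND ", predicates.ToArray());
    }
}
```

For the terms "Vejle" and "17" in either order, this yields two predicates ANDed together, each of which can be satisfied by a different column of the same row. The values in `parameters` would then be bound as SQL parameters rather than concatenated into the statement.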

  • desmati (3/26/2010)


    Please help me.

    I want to remove stop words from the query (it's a custom stop word list).

    How can I do that?

    Hi Desmati,

    If you create a custom stopword list on SQL 2008 the server will remove the stopwords at index time and when it parses your full-text search queries. All you have to do is specify that your full-text index use the custom stoplist. No reason really to try to duplicate this functionality yourself... Just add a custom stoplist to your SQL 2008 database and specify it when you build the full-text index.

    Thanks

    Mike C

  • Thanks for your reply.

    Is that possible on SQL 2005 too?

  • I have a very fast function in both C# and T-SQL that returns whether a word is a stop word or not.

    I tried to change the EBNF and the ConvertQuery in various ways, but I failed!

    Would you please help me to develop this without SQL 2008?

    Thanks,

    Desmati.

  • desmati (3/26/2010)


    Thanks for your reply.

    Is that possible on SQL 2005 too?

    On SQL 2005 they have the concept of "noise word lists". These are essentially the same thing, but they're stored as text files in the file system. These files have names like "noiseenu.txt" (U.S. English noise word text file) and are located in a subdirectory of your SQL Server instance directory (C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\FTData\, for instance). You can edit it with any text editor and save again. I don't recall whether or not you need to bounce the service afterwards on 2005 (don't recall if the noiseword list is cached in memory, but you may as well bounce it to be sure). Then you have to rebuild your full-text indexes.

    So yes, it is possible to do something similar on SQL 2005, but it's not exactly the same. With 2008 you can create multiple custom stopword lists and they're all stored in the database instead of in the file system.

    Mike C

  • desmati (3/26/2010)


    I have a very fast function in both C# and T-SQL that returns whether a word is a stop word or not.

    I tried to change the EBNF and the ConvertQuery in various ways, but I failed!

    Would you please help me to develop this without SQL 2008?

    Thanks,

    Desmati.

    To be honest with you I don't see the benefit of duplicating the same functionality in this case. Basically you need to parse your tokens out of the input string using the rules of whatever language you've decided on and then you need to compare your custom stopword list to these tokens. You could use Irony to do this, but you'll also need to traverse your parse tree to stream the tokens or put them in an array of some sort. There are probably a lot of examples on how to do this out there if you Google them up. Check out the other samples on the Irony project website for examples of how to tokenize your input strings.

    Mike C
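The token-comparison step Mike C describes can be sketched roughly as follows. This is a minimal illustration, not code from the article or from Irony: it splits on whitespace instead of walking an Irony parse tree, so it does not understand quoted phrases or search operators, and the `isStopWord` delegate stands in for Desmati's own stop word function (the class and method names are hypothetical).

```csharp
using System;
using System.Collections.Generic;

public static class StopWordFilter
{
    // Returns the tokens of `query` for which `isStopWord` is false,
    // in their original order.
    public static List<string> RemoveStopWords(string query, Func<string, bool> isStopWord)
    {
        var kept = new List<string>();
        if (string.IsNullOrEmpty(query)) return kept;

        // A null separator array makes Split break on any whitespace.
        string[] tokens = query.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            if (!isStopWord(token)) kept.Add(token);
        }
        return kept;
    }
}
```

For example, `StopWordFilter.RemoveStopWords("the history of Vejle", w => w == "the" || w == "of")` keeps "history" and "Vejle". Applying the same check to the terms collected from the parse tree, rather than to a whitespace split, would respect the search grammar.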

