full text word counts

  • Hi, Is there any way to get statistic from a full text catalog?

    I have a full text catalog comprised of one varchar(4000) field that represent job posting type messages.  I have another table of 130 rows that contains vendor names (microsoft, cisco, etc)

    What I want to do is have a list of each vendor and the number of messages that contain the vendor name.

    I could use a cursor and just do a like %vednor name% but it seems to me that the full text catalog would already contain this information if I could just pull it out. If I can't get the information from the catalog I'd like to find a way I can build the results without resorting to a cursor. Not sure if I can do a join with a like statement.

    Any ideas appriciated.

     

     

    Dave

    Trainmark.com IT Training B2B Marketplace
    (Jobs for IT Instructors)

  • This was removed by the editor as SPAM

  • Unfortunately, with SQL Server 2000 there is no supported way to get either the unique non-noise word lists or statistics of these word lists (word frequency, etc.) directly from the MSSearch maintained FT Catalog :-(. However, starting with SQL Server 2005 Beta2, Microsoft will release a command-line utiltity called cidump.exe. This utility will dump or provide limited access to the contents of a FT catalog.

    In regards to getting a list of each vendor and the number of messages that contain the vendor name, at this time using a cursor most likely would be you best bet. However, I believe I have a better solution use a set based approch, but I need time to develop it. Could you provide more details on your tables, i.e., CREATE TABLE <table_name> structures?


    John T. Kane

  • Thanks I appriciate your help. The Tables have a very basic structure:

    Vendors

    uid  INT ident

    vendorName varchar(100)

    Messages

    email varchar(100)

    phone varchar(50)

    message varchar(2000)

    ===

    As a note: I am updating the message posting section of the site to ask the poster to explicitly identify the vendor and location so that these stats will be easier to compile.  However I would still like to be able to classify the over 12,000 job postings in the database to pre-populate these stats.

    The site this will be used on is

    http://www.trainmark.com

    IT Training Providers and Instructor brokers use this site to post about 250 training gigs a month.  I am trying to enhance a few features without breaking what is already working and these kind of stats are going to help.

    Cheers

    Dave

     

     

     

     

     

     

     

     

    Dave

    Trainmark.com IT Training B2B Marketplace
    (Jobs for IT Instructors)

  • I am working on something similar to this based on the way others get the word count in these comm:

    SELECT Col1, LEN(Col1) - LEN(REPLACE(Col1, ' ', '')) + 1

    FROM Table1

    ...where the thing to change is the ' ' in REPLACE and maybe add some mathematics.

    ---

    edit:

    I think I got it. I am just figuring out how to do my first article post here.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply