Wanted: Dictionary of "Articles of Speech"

  • I'm building a content management system for unstructured data. I need to parse a Google-style user query into its individual words. I don't care about words like he, she, the, them, those, etc. Does anyone know where I can get a list of all these types of words? I want to load it into a table so that I can exclude unimportant words from the user query. Thanks.

  • I think you need to search the net for "stopwords". An example I found is below. There are different kinds of stopword lists, since a stopword list can be any sort of exclusion list, so you might have to hunt around for one that meets your needs.

    Regards,

    David McKinney. 

    http://translate.e-khmer.net/index.php?title=Search_stopwords.txt

  • Thanks. That was a big help.
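
For anyone landing on this thread later, the stopword filtering discussed above could look roughly like the sketch below. It is not from the original posters: the function name `significant_words` and the tiny `STOPWORDS` set are made up for illustration, and a real system would load a full list (for example from the table mentioned in the question) instead.

```python
import re

# Tiny illustrative sample only (an assumption, not a complete list);
# a real list would be loaded from a stopword file or a database table.
STOPWORDS = {"he", "she", "the", "them", "those", "a", "an", "and", "of", "to", "in", "for"}


def significant_words(query, stopwords=STOPWORDS):
    """Split a free-text query into lowercase words and drop the stopwords."""
    # Lowercase first so that "The" and "the" are treated the same way.
    words = re.findall(r"[a-z0-9']+", query.lower())
    # Keep the original word order; only remove words found in the stopword set.
    return [w for w in words if w not in stopwords]


print(significant_words("The history of those Khmer temples"))
# prints ['history', 'khmer', 'temples']
```

Using a set keeps each lookup cheap even when the list grows to several hundred words. If the list lives in a table, the same idea applies: load it once into memory and filter each query against it.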
