Splitting varchar fields into separate nonclustered index fields

  • Hi all, extreme newbie here. MS has a limitation in SQL server 2005 that says you cannot have an index key greater than 900 bytes. Makes sense except when trying to index the field to improve performance 🙂 . The documentation also says that you can break the fields up into separate chunks to get around this limit. I have at times descriptions in fields that can be >900 bytes. I want to index these fields to improve performance. Would it make sense to create several separate 'chunks' of the 900 bytes just for indexing and then present the original descriptive field to the user after he/she performs the search? Or would this just be more I/O time? Any examples of how to create this (ok, being lazy here, but i know you wizards have probably done this a million times).

    thanks for any and all input

  • brd123 (4/20/2009)


    Would it make sense to create several separate 'chunks' of the 900 bytes just for indexing and then present the original descriptive field to the user after he/she performs the search?

    Probably not. Not unless you divide the field up based exactly on the way it's queried.

    Why are you looking to index large description fields? Do people often do exact text matches on 900+ character fields?

    Have you looked at full text indexing?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail thanks for the reply. Just looking for a shortcut to help improve system performance on searches. Full-text indexing unfortunately is not an option. Here is a real-life example.

    User wants to find all items/products in database (this is using a common web storefront product) that contain the words 'civil war'. Product name, description, id and stocking unit fields are searched using the charindex function. User can select whether to do partial word matching or by using word boundaries, and all words or any word, in any order ("civil war" or "war civil"): e.g., "civil warroom" in a partial word search would return a row but "civil war" on a word boundary search would not.

    It takes about 7 seconds to perform this search on >7k of items using a Dell XPS 400, dual processor. And that is just in testing, on one query. Once we go live, with many people running queries i can only imagine the performance hit. Eh, maybe I am worrying too much. What do you think? :crazy:

  • brd123 (4/20/2009)


    User wants to find all items/products in database (this is using a common web storefront product) that contain the words 'civil war'. Product name, description, id and stocking unit fields are searched using the charindex function. User can select whether to do partial word matching or by using word boundaries, and all words or any word, in any order ("civil war" or "war civil"): e.g., "civil warroom" in a partial word search would return a row but "civil war" on a word boundary search would not.

    The only thing that is going to help for that scenario is full text indexing. Normal SQL indexes are do not work for searching within strings. Why is it not an option?

    Have you done performance testing at full production load? If not, why not? 7 sec for a search is already pushing it. If that was me trying to find something to buy from you, 10 sec long searches will have me going elsewhere.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail, I guess the main reason is that I don't know enough about full-texting in SQL Server but I am assuming it probably has its own language thus necessitating the need to rewrite the search procedure.

    Do these fields that have full text capabilites on them work with SQL 92 syntax or whatever syntax sql server 2005 uses? I.e., can i perform something like this:

    SELECT description from product

    WHERE CHARINDEX(" civil war ",description) > 0

    Like I said, I am looking to shortcut this but, in the end it seems the more i ask, the more I am getting an answer where there is no free lunch. Dangggggg

  • for others reading brd's issue, we kind of kicked around searching via CHARINDEX on the same issue here:

    needing to develop a sql search for all words,...

    obviously, that would cost a table scan, so performance might suck, so I understand why he is revising it now and wants to try and index it.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • brd123 (4/20/2009)


    Hi Gail, I guess the main reason is that I don't know enough about full-texting in SQL Server but I am assuming it probably has its own language thus necessitating the need to rewrite the search procedure.

    Not it's own language, just different functions, but yes, you'll need to change the search procedures.

    SELECT description from product

    WHERE CHARINDEX(" civil war ",description) > 0

    SELECT description from product

    WHERE CONTAINS(description, 'Civil War')

    It's a lot more powerful that that though. These are all things that the full text can do too.

    SELECT description from product

    WHERE CONTAINS(description, 'American NEAR "Civil War"')

    SELECT description from product

    WHERE CONTAINS(description, '"Civil War" AND Texas')

    Check books online for more details. There's also the Containstable, FreeText and FreeTextTable functions.

    Like I said, I am looking to shortcut this but, in the end it seems the more i ask, the more I am getting an answer where there is no free lunch. Dangggggg

    TANSTAAFL

    Seriously, to do what you want you've got three options.

    1) Implement full text search and make minor changes to your search procedures

    2) Roll your own version complete with a keywords table. I tried this route once and gave up. It's a lot of work.

    3) Accept that the queries will be dog slow.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Lowell, the reply you came up with before works and IMHO is ingenius (but i am just a grasshopper) but yes, it is taking a bit of time to run the queries. Any suggestions on how to implement this in fulltext indexing would be most welcome and appreciated. You guys rock btw.

  • yeah i think this is a situation where you just have to jump right in with full text indexing. what you need is exactly what it was designed for, and I think you are hesitating just because it's unfamiliar.

    throw an hour or so at googling full text here on SSC, then create your first index on your description column.

    try some of Gail's examples and see if you can get a handle on the way it works; it's not all that scary after a couple of examples.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • if it helps, this article, because it has screenshots, let me set up a full text on one of my tables in about a minute:

    http://www.userscape.com/helpdesk/index.php?pg=kb.page&id=142

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • thanks, i'll take a look

  • thanks i'll take a look

  • thanks...once again (what the heck is wrong with this forum..this is the 3rd time i have tried to post this) i will take a look at this later on today...

  • brd123 (4/21/2009)


    (what the heck is wrong with this forum..this is the 3rd time i have tried to post this)

    Seeing as it's posted 3 times, I'd say absolutely nothing.... 😀

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply