Indexing nvarchar or varchar larger than 900 bytes

  • Hi there,

    I'm in the process of indexing a few columns in a few tables, most of which are of type nvarchar(515). This means that the byte size is over the 900-byte limit. I also have another column which is nvarchar(max). I was wondering how you go about indexing these data types?
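    Just to make the numbers concrete, here's a minimal sketch of the situation (table and column names are made up for illustration):

        -- Hypothetical table: an nvarchar(515) column can reach 1030 bytes (2 bytes per character)
        CREATE TABLE dbo.Customers
        (
            CustomerID int IDENTITY(1,1) PRIMARY KEY,
            LongName   nvarchar(515) NOT NULL
        );

        -- SQL Server creates this index but warns that the key can reach 1030 bytes;
        -- any INSERT/UPDATE whose value actually exceeds 900 bytes will then fail
        CREATE NONCLUSTERED INDEX IX_Customers_LongName
            ON dbo.Customers (LongName);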

    I'm thinking of looking at full-text indexing, but I'm just wondering overall what my options are. I've looked at creating a nonclustered index and adding my column in an INCLUDE() clause, but I'm not too sure that creating my index on a column I don't really need, just for the sake of including my other column, is a very smart choice...

    Thanks in advance for the help,

    Greg

  • ducon (12/13/2009)


    I'm in the process of indexing a few columns in a few tables, most of which are of type nvarchar(515). This means that the byte size is over the 900-byte limit. I also have another column which is nvarchar(max). I was wondering how you go about indexing these data types?

    You don't.

    The 900 byte limit is a hard and fast one. If you'll be doing partial matches (which is very likely with the varchar(max)), consider full text indexing.
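    If full-text does turn out to be the route, the setup looks roughly like this (names hypothetical; it assumes the table has a single-column, unique, non-nullable key index):

        -- Full-text needs a catalog and a unique key index on the table
        CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;

        CREATE FULLTEXT INDEX ON dbo.Documents (Body)  -- Body being the nvarchar(max) column
            KEY INDEX PK_Documents;                    -- the table's unique key index

        -- Partial matches then go through CONTAINS/FREETEXT rather than LIKE
        SELECT DocumentID
        FROM dbo.Documents
        WHERE CONTAINS(Body, N'"customer*"');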

    Will you be doing exact matches on that nvarchar column? Will there be other columns referenced in the where clause? Can you give some more detail on what you'll be doing with these columns?

    I've looked at creating a nonclustered index and adding my column in an INCLUDE() clause but I'm not too sure that creating my index on a column I don't really need for the sake of including my other column is a very smart choice...

    Bear in mind that the include columns are just there to allow the index to be covering. SQL cannot seek on a column that's in the include. It can only seek on a left-based subset of the index key.
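    To illustrate (again with made-up names): the key column is what the seek uses; the included column just rides along so the query doesn't need a lookup.

        -- Hypothetical covering index: seeks can only use CustomerID
        CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
            ON dbo.Orders (CustomerID)
            INCLUDE (LongDescription);

        -- This query can seek on CustomerID and return LongDescription from the index itself,
        -- but a predicate on LongDescription alone could not seek on this index
        SELECT LongDescription
        FROM dbo.Orders
        WHERE CustomerID = 42;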

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi,

    Thanks for the reply. I'm going to talk with a developer of the application and see exactly what the plans for the data are. I have a general idea (e.g. "lookup using this field") but not the exact query patterns that will be used, which would help in that case.

    I'm also going to look at whether I can reduce the length of some of those columns; some were made rather large for no other reason than 'just in case', so that might be something I can do too.
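    For reference, the change itself would look something like this (hypothetical names; existing data has to fit the new length, and the column's original NULL/NOT NULL setting should be restated):

        -- nvarchar(450) is exactly 900 bytes, so the column can still be an index key
        ALTER TABLE dbo.Customers
            ALTER COLUMN LongName nvarchar(450) NULL;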

    I was thinking the same regarding the INCLUDE() clause for index creation: the column will simply be included but not used for a seek operation. That makes sense.

    Greg

  • This may be of use to you. http://www.sqlservercentral.com/articles/Indexing/68439/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
    Thank you. Although I am quite familiar with most of the things covered there, I'm going to have a look at the nonclustered index section and its INCLUDE paragraph.

    Regards,

    Greg

  • Ummmm.... do you mean that you're going to do a join on a >900 character VARCHAR or that someone using a GUI is going to type more than 900 characters to do a speedy lookup?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

    For a couple of them it's a join (custom ETL process). But I've talked with the devs and the length can be reduced to varchar(50); the original size was decided arbitrarily and 50 is what the application supports, so those are not much of an issue anymore.

    The other ones would be used to build a dynamic IN() statement by a custom reporting application that does not support joins on one-to-many/many-to-many tables for now. This will eventually change, but for now that column would basically contain a list of comma-separated IDs. I have already raised my concerns about those, but without much success, so I'm trying to deal with it the best I can.
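    Just to illustrate the pattern I'm describing (names invented): the wide column holds something like '1042,1187,2001', and the reporting tool parses that out and generates a query of this shape:

        -- Hypothetical shape of the generated query; the IDs come from the comma-separated column
        SELECT r.ReportName
        FROM dbo.Reports AS r
        WHERE r.ReportID IN (1042, 1187, 2001);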

    Greg
