Indexing first n positions of a VARCHAR field?

  • A colleague mentioned in passing the other day that he thought it's possible to just build an index on the first few characters of a VARCHAR field, but wasn't sure of how to do it.

    Is he right? Is it possible, and if so, how? We have a table similar to the following, which is crying out for smaller indexes.

    CREATE TABLE [demo] (

     [some_ID] [int] IDENTITY (1, 1) NOT NULL ,

     [varcharfield1] [varchar] (50) NULL ,

     [varcharfield2] [varchar] (200) NOT NULL

    )

    The varchar fields are generally not filled (for example, varcharfield2 has length 200, but is only generally filled to 20 or 30 characters), and I would like to build an index on the first section of the field.

    Can anyone help?

    Regards from Heidelberg, Germany

    Jon

  • Jon

    The number of bytes occupied in the varchar field will be equal to the number of characters stored. So, if you insert a value which is 20 characters long, then only 20 bytes are stored, not 200 (as would be the case with a char(200) field).

    Similarly, the index entry will only occupy 20 bytes (plus pointer information).

    Therefore, go ahead and build your indexes - SQL Server behaves just the way you want it to.

    Regards

    Rob

  • The other option that you have is to use a computed column and place an index on it.

    Say for example you have the desire to create an index on the first 20 characters of the varchar column you would add a new column

    ALTER TABLE DEMO ADD LEFT_VARCHAR_FIELD1 AS (LEFT(VARCHARFIELD1,20))

    CREATE INDEX LEFT_VC1 ON DEMO(LEFT_VARCHAR_FIELD1)

    By creating an index on the computed column you store the actual values (whereas otherwise the values would be computed on the fly) in that column and makes it quick and searchable. I use this a lot as there are many queries that have a left(col, 3) and a right(col, 2) in production here.

    You have to change a couple of options for users connecting to the server which can cause one or two problems that need to be rectified, but all in all it's worth it, I took a query down from 12 mins to under 15 seconds by adding the Indexed computed column.



    Shamless self promotion - read my blog http://sirsql.net

  • stacenic,

    By using an index on a computed column you will materialize that column and it wont be calculated on the fly.

    So an additional 20 bytes will be stored within your table.

    In addition by using a seprate indexed column you will have to rewrite all of your queries (in our xase several thousends).

    In fact me too I would have needed that feature. I think Oracle is providing that.



    Bye
    Gabor

  • Hi all

    Many thanks for the help. We're talking about very few queries and users on the table, so I can probably live with adding the indexed computed column.

    Regards

    Jon

  • What about indexing on len(field_name)?

    I'm interested in speeding up queries that order by len(field_name) without changing the application.

    Can a computed field be created on len(field_name)? I guess it probably can. But would the query use that field to do the order by or would the query have to be changed to specifically order by (new_computer_field_name)?

    Thanks!

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply