to INCLUDE or not to INCLUDE

  • which is better and why?

    create index index1 on table1 (col1, col2, col3)

    or create index on table1 (col1)

    include(col2, col3)

    imagine 2 scenarios:

    1. col1, col2 and col3 are all in the where clause

    2. col1, col2 and col3 are all in the select clause.

    thanks.

  • winston Smith (1/7/2009)


    which is better and why?

    create index index1 on table1 (col1, col2, col3)

    or create index on table1 (col1)

    include(col2, col3)

    Depends what you're trying to accomplish

    The first fully covers and supports queries of these forms:

    SELECT col1, col2, col3 FROM Table1 WHERE col1 = @var1 AND col2 = @var2 AND col3 = @var3

    SELECT col1, col2, col3 FROM Table1 WHERE col1 = @var1 AND col2 = @var2

    SELECT col1, col2, col3 FROM Table1 WHERE col1 = @var1

    The second only supports and covers a query of this form

    SELECT col1, col2, col3 FROM Table1 WHERE col1 = @var1

    If the columns are needed for filtering they need to be in the index key. If they're only needed to be retrieved (select only) then they can be in the include.

    Columns in the include are only present in the leaf levels of the nonclustered index. Columns in the key are present at all levels.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I'm not sure of the benefits of INCLUDE here. Why wouldn't you put all the columns in the index? This has puzzled me from day one. The index size is essentially the same in either case.

  • Steve Jones - Editor (1/7/2009)


    Why wouldn't you put all the columns in the index? This has puzzled me from day one. The index size is essentially the same in either case.

    Depends on the size of the columns. If they're all INTs, there's little gain. The main benefit comes when the columns that need to be in the index (to cover some really nasty query) are large. There's that irritating 900 byte, 16 column limitation on index keys, so if the index absolutely has to have the varchar(1000) column in it, otherwise the 250 million row table gets scanned, it has to go as an include column.

    The other benefit comes in keeping the higher levels of the index as small as possible. If the index key is really large, that's at all level, leaf, intermediate and root. The bigger the index keys are, the more pages are needed at the intermediate levels and the deeper the index gets. Deeper indexes mean more page reads in a seek, which emans more IO impact and slower queries.

    Personally, I like my indexes as small as possible, so unless a column is going to be searched on, I'd rather have is as an include than a key column.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I'm going to pretty much echo Gail on this one. If the column will be used in a Where or Join clause, put it in the key, but if it won't, put it in the included columns list.

    I've tested this a lot, and it makes a difference in size and speed, depending on the data type.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • one more question. take the first index:

    create index on table1 (col1)

    include(col2, col3)

    in this index, the index key is Col1 and in the leaf level, the data from col2 and col3 are stored, so if a query is:

    select col2, col3 from table1 where col1 = 555 ,

    the actual clustered index/heap never needs to be accessed. am i correct in this idea?

    take the other index:

    create index on table1 (col1,col2, col3)

    if this is the only index on the table bar the clustered index, and we run the same query, and the optimizer happens to pick this index, is all the data from col1,col2 and col3 stored in the index or will the table/heap/clustered index have to be queried?

  • winston Smith (1/8/2009)


    the actual clustered index/heap never needs to be accessed. am i correct in this idea?

    Correct, all the info is in the index.

    if this is the only index on the table bar the clustered index, and we run the same query, and the optimizer happens to pick this index, is all the data from col1,col2 and col3 stored in the index or will the table/heap/clustered index have to be queried?

    All three columns (as you specified) are part of the index key.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply