Index design

  • Hi there,

    I have a doubt related with Indexes.

    While deciding upon columns to be part of Index, what should be considered a wise decision:

    Selectivity

    OR

    Usage ?

    Usage as in... The columns which are used in JOIN/WHERE clauses.

    By Selectivity, I mean column like date or some auto-incremental values etc

    For eg. I may have 1 datetime column which has very high selectivity. But its involvement in queries is less as compared with another column of char data type.

    I think usage is more crucial considering performance.

    It could be that for latter consideration, there are multiple columns involved which is not the case with former.

    Thanks

  • Well, that depends on whether you want your indexes to be selective or do you want them to be useful and used.

    The clustered index benefits from unique values and an ever increasing column. Nonclustered indexes, in my opinion, should be tailored to the queries that will be run on the table. That said, indexing a column that has only 2 values in a million row table is probably a waste of time, even if queries do filter by it. If it's not fairly selective or covering, then SQL may not use it.

    If you have queries that filter on 4 columns, then an index on those 4 columns is more useful to SQL, for that query, than 4 indexes each containing 1 column.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Use is most important. What's the point to indexing a column you never select by?

    A good way to do it is pick the columns that appear in Where/Join the most, and index those, and put the one with the highest selectivity at the front of the index.

    Then, check your execution plans, and make sure the index is doing what it needs to, in terms of avoiding bookmark lookups, seeks instead of scans, etc.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • How about Selectivity AND Usage.

    Gail sums up all the reasons better than I will. You need to take both into account when designing your indexes. Simple usage is not enough without also taking into account selectivity. The most selective possible column is a waste of space & time if the column isn't used in JOIN and WHERE predicates.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • GSquared (10/14/2008)


    A good way to do it is pick the columns that appear in Where/Join the most, and index those, and put the one with the highest selectivity at the front of the index.

    I prefer to pick the order of columns based on how the queries filter on them.

    If I have several queries that filter (equality) on A and B, a couple that filter on A alone and one or two that filter on A, B, C, then the best order for columns for an index to support all of those is A, B, C, regardless of which is more selective

    Similarly, if a couple of frequently run queries filter A = @a and C > @C, then I want the index columns to be ordered A, C even iff C is the most selective of the two columns.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (10/14/2008)

    I prefer to pick the order of columns based on how the queries filter on them.

    Similarly, if a couple of frequently run queries filter A = @a and C > @C, then I want the index columns to be ordered A, C even iff C is the most selective of the two columns.

    Thanks for your inputs. The picture is much clearer. 🙂

    Suppose, I have Year, Month as columns.

    In a table, they are ordered as Y,M

    the WHERE clause of query goes like this:

    year=2007

    ANDmonth=2

    queries are based on such pattern.

    So, the index should be (Y,M). Right?

  • So, the index should be (Y,M). Right?

    Or (M, Y). It won't make a difference with this query, since it's an equality.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply