SQL choosing to use the worst possible index

  • Hello All

    I have a box with SQL Server 2008 R2 at the latest SP and CU, with 16 cores and 196 GB of RAM (I also tested against a copy of the db on SQL Server 2012; same behavior).

    I have:

    one table with around 300 million rows

    a primary key and clustered index on one column (mainid in this example)

    two other (nonclustered) indexes, on two other columns (id2 and id3), with no included columns or anything like that

    I ran a query like this:

    SELECT mainid, id2
    FROM bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit

    The number of rows between those two limits is only ~40 ids.

    When I check the exec plan, SQL decides to use one of the other two indexes.

    Before you suggest it: I already rebuilt all those indexes, so there is no fragmentation in there now, and still the same behavior.

    If I let it, the query runs forever, since it is scanning the whole table.

    If I force the use of the clustered index, I get the results in less than one second (a sketch of both forms is at the end of this post).

    I already flushed all the stats as well

    Any ideas why it is behaving this way?

    Thanks in advance
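
    For reference, here is a minimal sketch of the setup and of the two ways I have been running the query. All names and values are the placeholders from this post (the real ones are different), and varchar(23) is just the assumed type for the sketch:

    -- Placeholder schema matching the description above (real names differ)
    CREATE TABLE dbo.bigtable
    (
        mainid varchar(23) NOT NULL CONSTRAINT PK_bigtable PRIMARY KEY CLUSTERED,
        id2    varchar(23) NOT NULL,
        id3    varchar(23) NOT NULL
    );
    CREATE NONCLUSTERED INDEX IX_bigtable_id2 ON dbo.bigtable (id2);
    CREATE NONCLUSTERED INDEX IX_bigtable_id3 ON dbo.bigtable (id3);

    -- Problem form: local variables; the optimizer picks one of the nonclustered indexes and scans
    DECLARE @lowerlimit varchar(23) = 'AAA00000000000000000001',   -- placeholder values
            @upperlimit varchar(23) = 'AAA00000000000000000050';

    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit;

    -- Workaround: forcing the clustered index returns in under a second
    SELECT mainid, id2
    FROM dbo.bigtable WITH (INDEX (PK_bigtable))
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit;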

  • Can you please attach the actual execution plan? I have seen cases where big tables and small selections make the normal statistics suboptimal; filtered statistics might be an option. Could you also post the DDL of the table and the indexes?
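
    If filtered statistics look worth a try, a minimal sketch could be something like this (table, column, and the boundary value are placeholders from the example in the first post, not real names):

    -- Hypothetical: a filtered statistic covering only the range of mainid that misbehaves
    CREATE STATISTICS st_bigtable_mainid_hotrange
    ON dbo.bigtable (mainid)
    WHERE mainid >= 'AAA00000000000000000000'   -- placeholder boundary value
    WITH FULLSCAN;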

  • ricardo_chicas (3/14/2015)


    a primary key and clustered index on one column (mainid in this example)

    Is the clustered index UNIQUE or "just" a clustered index?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It is also the primary key, so yes, it is unique

  • Check that the parameters are the identical datatype to the column datatype. I encountered a situation where the parameter was an nvarchar and the column was varchar; it was table scanning instead of seeking. When I changed the parameter to varchar, it did an index seek. Not sure if int/bigint could do something similar.
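
    For anyone who wants to see that effect, a small sketch (names and values are made up; the point is the nvarchar variable against a varchar column):

    -- Mismatched type: the varchar column gets implicitly converted to nvarchar,
    -- which can turn a clustered index seek into a scan
    DECLARE @p_bad nvarchar(23) = N'AAA00000000000000000001';
    SELECT mainid, id2 FROM dbo.bigtable WHERE mainid >= @p_bad;

    -- Matching type: a straightforward range seek on the clustered index
    DECLARE @p_good varchar(23) = 'AAA00000000000000000001';
    SELECT mainid, id2 FROM dbo.bigtable WHERE mainid >= @p_good;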

  • Thanks,

    The parameters are of the same data type

    Funny thing: if I use the values directly instead of the parameters, it produces a correct exec plan; that is why I flushed the proc cache.

    I am at a loss on this one
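
    Just for completeness, the kind of commands involved in the flushing I mentioned (shown against the placeholder table name; heavy-handed on a production box):

    DBCC FREEPROCCACHE;                              -- throw away all cached plans
    UPDATE STATISTICS dbo.bigtable WITH FULLSCAN;    -- refresh stats on the (placeholder) table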

  • Try it parameterized but with a RECOMPILE hint. SQL will then not store a plan, but will compile a plan optimized for the specified parameters. If the clustered index is then used, it is clear that SQL has trouble finding an optimal plan otherwise. Without the RECOMPILE hint, SQL compiles a more 'conservative' plan, anticipating other parameters than the ones specified at compilation.

    Using hints is generally not good practice. In this case it's a trade-off between compile time and performance improvement. The query seems simple enough that the penalty of recompilation should be small.
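
    In other words, something along these lines (using the placeholder names from the first post):

    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit
    OPTION (RECOMPILE);   -- plan built fresh for the current variable values, not cached for reuse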

  • The recompile works, the same as forcing the proper index with a hint, but what I really want to figure out is what SQL is thinking: if all the stats are fresh and the indexes are not fragmented, why is it behaving this way?

    To add another factor: I have 4 other databases with the same structure, and on those an equivalent query uses the appropriate index.

    To me this really seems like a bug

    I can't just force the index or add a recompile in there because I do not know how many other cases like this could be in there....
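
    One thing I can at least look at is the histogram the optimizer is estimating from (the statistics name below is a guess; normally it is the one tied to the clustered PK):

    -- Inspect the histogram on mainid and compare the RANGE_HI_KEY steps
    -- against the @lowerlimit/@upperlimit range that misbehaves
    DBCC SHOW_STATISTICS ('dbo.bigtable', 'PK_bigtable') WITH HISTOGRAM;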

  • It almost sounds like a "simple" case of "bad" parameter sniffing.

    --Jeff Moden

  • Jeff Moden (3/15/2015)


    It almost sounds like a "simple" case of "bad" parameter sniffing.

    That sounds right, but what can I do about it?

    I can't just go and change tons of code because of this...

  • Maybe plan guides?
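
    As a rough sketch of what that could look like, assuming the statement text can be matched exactly as the application submits it (names, text, and hint below are assumptions, not a tested guide):

    EXEC sp_create_plan_guide
        @name            = N'pg_bigtable_range_recompile',
        @stmt            = N'SELECT mainid, id2
    FROM bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit',
        @type            = N'SQL',
        @module_or_batch = NULL,      -- NULL: the batch is assumed to consist of this statement alone
        @params          = NULL,
        @hints           = N'OPTION (RECOMPILE)';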

  • Is the range you are using in recently added rows? Recently added values don't appear in the statistics for the table, so the optimiser can't use the stats to estimate the record count. However, the way I fixed this in the past was to give the optimiser a hint telling it the values to optimise for, using values already in the statistics for the table. Use the OPTIMIZE FOR query hint with a couple of values that are close. They don't need to be recent, just a couple of values the optimiser will recognise in the stats so that it builds the plan you want.
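
    In other words, something like this, where the values can be anything the histogram already knows about (the ones shown are placeholders):

    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit
    OPTION (OPTIMIZE FOR (@lowerlimit = 'AAA00000000000000000001',    -- placeholder values that
                          @upperlimit = 'AAA00000000000000000050'));  -- already exist in the stats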

  • Are @lowerlimit and @upperlimit local variables or parameters? If they're local variables, you're probably seeing, well, the opposite of parameter sniffing: it's using average values rather than the specific values against the statistics for your table. That's why, if you hard-code the values in, you get a better plan. That's parameter sniffing being helpful (which it is most of the time).

    To fix this, change those variables to parameters so that you can take advantage of the sniffing process, or put a RECOMPILE hint on there and then local variables will be sniffed as well. Make sure your statistics are up to date. Other parameter sniffing fixes are to use a query hint with OPTIMIZE FOR, either for a specific set of values or for UNKNOWN to get generic values; not sure that would help in this case.
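
    For illustration, the difference in a minimal sketch (procedure, table, and value literals are all made up):

    -- Local variables: the batch is compiled before the values are assigned,
    -- so the optimizer has to guess at the row count instead of using the histogram
    DECLARE @lowerlimit varchar(23) = 'AAA00000000000000000001',
            @upperlimit varchar(23) = 'AAA00000000000000000050';
    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit;
    GO

    -- Parameters: the values are sniffed at compile time, the histogram is used,
    -- and the narrow range gets a narrow plan
    CREATE PROCEDURE dbo.GetMainidRange
        @lowerlimit varchar(23),
        @upperlimit varchar(23)
    AS
    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit;
    GO

    EXEC dbo.GetMainidRange 'AAA00000000000000000001', 'AAA00000000000000000050';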

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Can we see the query and execution plan please? There's lots of guessing happening, but hard to say what's really happening without seeing details.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

  • Yes, those are local variables

    The OPTION (RECOMPILE) or the OPTIMIZE FOR hint works, but MS SQL shouldn't be doing this. I have a clustered index for a reason; I need to trust that it will work as expected, not show this random behavior.

    I left an update stats of all the tables running, since rebuilding all the indexes on that table didn't work; I will check the effects in a couple of hours.

    Sorry, I can't share an exec plan or the query; I am bound by contract not to release the smallest piece of code. The example I provided is very close to the actual thing; the only thing I left out is that those variables are varchar(23).

    Since it seems I can't trust the optimizer anymore, even on a query as simple as this, is there a way to look for this behavior by using the DMVs?
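
    The kind of thing I have in mind is a rough search of the plan cache, something like this (the LIKE filters are crude string matches over the plan XML, and bigtable is just the placeholder name from the first post):

    -- Cached statements that touch the table and whose plan contains an index scan
    SELECT TOP (50)
           qs.execution_count,
           qs.total_worker_time,
           qs.total_logical_reads,
           st.text  AS statement_text,
           qp.query_plan
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
    WHERE st.text LIKE '%bigtable%'
      AND CAST(qp.query_plan AS nvarchar(max)) LIKE '%Index Scan%'
    ORDER BY qs.total_logical_reads DESC;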
