SQL choosing to use the worst possible index

  • Hello All

    I have a box with SQL Server 2008 R2 at the latest SP and CU, with 16 cores and 196 GB of RAM (I also tested against a copy of the db on SQL Server 2012; same behavior).

    I have:

    one table with around 300 million rows

    a primary key and clustered index on one column (mainid in this example)

    two other (nonclustered) indexes, on two other columns (id2 and id3), with no included columns or anything like that

    I ran a query like this:

    SELECT mainid, id2
    FROM bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit

    The number of rows between those two limits is only ~40 ids.

    When I check the exec plan, SQL decides to use one of the other two indexes.

    Before you suggest it: I already rebuilt all those indexes, so there is no fragmentation in there now, and still the same behavior.

    If I let it, the query runs forever, since it is scanning the whole table.

    If I force the use of the clustered index, I get the results in less than one second (a sketch of both forms is at the end of this post).

    I already flushed all the stats as well

    Any ideas why it is behaving this way?

    Thanks in advance
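
    For reference, here is a minimal sketch of the setup and of the two ways I have been running the query. All names and values are the placeholders from this post (the real ones are different), and varchar(23) is just the assumed type for the sketch:

    -- Placeholder schema matching the description above (real names differ)
    CREATE TABLE dbo.bigtable
    (
        mainid varchar(23) NOT NULL CONSTRAINT PK_bigtable PRIMARY KEY CLUSTERED,
        id2    varchar(23) NOT NULL,
        id3    varchar(23) NOT NULL
    );
    CREATE NONCLUSTERED INDEX IX_bigtable_id2 ON dbo.bigtable (id2);
    CREATE NONCLUSTERED INDEX IX_bigtable_id3 ON dbo.bigtable (id3);

    -- Problem form: local variables; the optimizer picks one of the nonclustered indexes and scans
    DECLARE @lowerlimit varchar(23) = 'AAA00000000000000000001',   -- placeholder values
            @upperlimit varchar(23) = 'AAA00000000000000000050';

    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit;

    -- Workaround: forcing the clustered index returns in under a second
    SELECT mainid, id2
    FROM dbo.bigtable WITH (INDEX (PK_bigtable))
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit;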

  • Can you please attach the actual execution plan? I have seen cases where big tables and small selections make the normal statistics suboptimal; filtered statistics might be an option. Could you also post the DDL of the table and the indexes?
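
    If filtered statistics look worth a try, a minimal sketch could be something like this (table, column, and the boundary value are placeholders from the example in the first post, not real names):

    -- Hypothetical: a filtered statistic covering only the range of mainid that misbehaves
    CREATE STATISTICS st_bigtable_mainid_hotrange
    ON dbo.bigtable (mainid)
    WHERE mainid >= 'AAA00000000000000000000'   -- placeholder boundary value
    WITH FULLSCAN;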

  • ricardo_chicas (3/14/2015)


    a primary key and clustered index on one column (mainid in this example)

    Is the clustered index UNIQUE or "just" a clustered index?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It is also the primary key, so yes, it is unique

  • Check that the parameters are the identical datatype to the column datatype. I encountered a situation where the parameter was an nvarchar and the column was varchar; it was table scanning instead of seeking. When I changed the parameter to varchar, it did an index seek. Not sure if int/bigint could do something similar.
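
    For anyone who wants to see that effect, a small sketch (names and values are made up; the point is the nvarchar variable against a varchar column):

    -- Mismatched type: the varchar column gets implicitly converted to nvarchar,
    -- which can turn a clustered index seek into a scan
    DECLARE @p_bad nvarchar(23) = N'AAA00000000000000000001';
    SELECT mainid, id2 FROM dbo.bigtable WHERE mainid >= @p_bad;

    -- Matching type: a straightforward range seek on the clustered index
    DECLARE @p_good varchar(23) = 'AAA00000000000000000001';
    SELECT mainid, id2 FROM dbo.bigtable WHERE mainid >= @p_good;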

  • Thanks,

    The parameters are of the same data type

    Funny thing: if I use the values directly instead of the parameters, it produces a correct exec plan; that is why I flushed the proc cache.

    I am at a loss on this one
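
    Just for completeness, the kind of commands involved in the flushing I mentioned (shown against the placeholder table name; heavy-handed on a production box):

    DBCC FREEPROCCACHE;                              -- throw away all cached plans
    UPDATE STATISTICS dbo.bigtable WITH FULLSCAN;    -- refresh stats on the (placeholder) table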

  • Try it parameterized but with a RECOMPILE hint. SQL will then not store a plan, but will compile a plan optimized for the specified parameters. If the clustered index is then used, it is clear that SQL has trouble finding an optimal plan otherwise. Without the RECOMPILE hint, SQL compiles a more 'conservative' plan, anticipating other parameters than the ones specified at compilation.

    Using hints is generally not good practice. In this case it's a trade-off between compile time and performance improvement. The query seems simple enough that the penalty of recompilation should be small.
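
    In other words, something along these lines (using the placeholder names from the first post):

    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit
    OPTION (RECOMPILE);   -- plan built fresh for the current variable values, not cached for reuse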

  • The recompile works, the same as forcing the proper index with a hint, but what I really want to figure out is what SQL is thinking: if all the stats are fresh and the indexes are not fragmented, why is it behaving this way?

    To add another factor: I have 4 other databases with the same structure, and on those an equivalent query uses the appropriate index.

    To me this really seems like a bug

    I can't just force the index or add a recompile in there because I do not know how many other cases like this could be in there....
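
    One thing I can at least look at is the histogram the optimizer is estimating from (the statistics name below is a guess; normally it is the one tied to the clustered PK):

    -- Inspect the histogram on mainid and compare the RANGE_HI_KEY steps
    -- against the @lowerlimit/@upperlimit range that misbehaves
    DBCC SHOW_STATISTICS ('dbo.bigtable', 'PK_bigtable') WITH HISTOGRAM;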

  • It almost sounds like a "simple" case of "bad" parameter sniffing.

    --Jeff Moden

  • Jeff Moden (3/15/2015)


    It almost sounds like a "simple" case of "bad" parameter sniffing.

    That sounds right, but what can I do about it?

    I can't just go and change tons of code because of this...

  • Maybe plan guides?
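
    As a rough sketch of what that could look like, assuming the statement text can be matched exactly as the application submits it (names, text, and hint below are assumptions, not a tested guide):

    EXEC sp_create_plan_guide
        @name            = N'pg_bigtable_range_recompile',
        @stmt            = N'SELECT mainid, id2
    FROM bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit',
        @type            = N'SQL',
        @module_or_batch = NULL,      -- NULL: the batch is assumed to consist of this statement alone
        @params          = NULL,
        @hints           = N'OPTION (RECOMPILE)';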

  • Is the range you are using in recently added rows? Recently added values don't appear in the statistics for the table, so the optimiser can't use the stats to estimate the record count. However, the way I fixed this in the past was to give the optimiser a hint telling it the values to optimise for, using values already in the statistics for the table. Use the OPTIMIZE FOR query hint with a couple of values that are close. They don't need to be recent, just a couple of values the optimiser will recognise in the stats so that it builds the plan you want.
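
    In other words, something like this, where the values can be anything the histogram already knows about (the ones shown are placeholders):

    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit
    OPTION (OPTIMIZE FOR (@lowerlimit = 'AAA00000000000000000001',    -- placeholder values that
                          @upperlimit = 'AAA00000000000000000050'));  -- already exist in the stats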

  • Are @lowerlimit and @upperlimit local variables or parameters? If they're local variables, you're probably seeing, well, the opposite of parameter sniffing: it's using average values rather than the specific values against the statistics for your table. That's why, if you hard-code the values in, you get a better plan. That's parameter sniffing being helpful (which it is most of the time).

    To fix this, change those variables to parameters so that you can take advantage of the sniffing process, or put a RECOMPILE hint on there and then local variables will be sniffed as well. Make sure your statistics are up to date. Other parameter sniffing fixes are to use a query hint with OPTIMIZE FOR, either for a specific set of values or for UNKNOWN to get generic values; not sure that would help in this case.
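
    For illustration, the difference in a minimal sketch (procedure, table, and value literals are all made up):

    -- Local variables: the batch is compiled before the values are assigned,
    -- so the optimizer has to guess at the row count instead of using the histogram
    DECLARE @lowerlimit varchar(23) = 'AAA00000000000000000001',
            @upperlimit varchar(23) = 'AAA00000000000000000050';
    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit;
    GO

    -- Parameters: the values are sniffed at compile time, the histogram is used,
    -- and the narrow range gets a narrow plan
    CREATE PROCEDURE dbo.GetMainidRange
        @lowerlimit varchar(23),
        @upperlimit varchar(23)
    AS
    SELECT mainid, id2
    FROM dbo.bigtable
    WHERE mainid >= @lowerlimit AND mainid < @upperlimit;
    GO

    EXEC dbo.GetMainidRange 'AAA00000000000000000001', 'AAA00000000000000000050';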

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Can we see the query and execution plan please? There's lots of guessing happening, but hard to say what's really happening without seeing details.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

  • Yes, those are local variables

    The OPTION (RECOMPILE) or the OPTIMIZE FOR hint works, but MS SQL shouldn't be doing this. I have a clustered index for a reason; I need to trust that it will work as expected, not show this random behavior.

    I left an update stats of all the tables running, since rebuilding all the indexes on that table didn't work; I will check the effects in a couple of hours.

    Sorry, I can't share an exec plan or the query; I am bound by contract not to release the smallest piece of code. The example I provided is very close to the actual thing; the only thing I left out is that those variables are varchar(23).

    Since it seems I can't trust the optimizer anymore, even on a query as simple as this, is there a way to look for this behavior by using the DMVs?
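
    The kind of thing I have in mind is a rough search of the plan cache, something like this (the LIKE filters are crude string matches over the plan XML, and bigtable is just the placeholder name from the first post):

    -- Cached statements that touch the table and whose plan contains an index scan
    SELECT TOP (50)
           qs.execution_count,
           qs.total_worker_time,
           qs.total_logical_reads,
           st.text  AS statement_text,
           qp.query_plan
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
    WHERE st.text LIKE '%bigtable%'
      AND CAST(qp.query_plan AS nvarchar(max)) LIKE '%Index Scan%'
    ORDER BY qs.total_logical_reads DESC;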
