March 20, 2013 at 5:14 am
I've just had a designer come to me with a really odd problem. A query, joining two tables on a primary key, example below:
SELECT <fields> FROM <table1> INNER JOIN <table2> ON <table1.primarykey> = <table2.indexed field>
The query runs optimally, within a couple of seconds .. until .. an additional field is added to the SELECT fields from <table2>.
I can see what is happening. Until the additional field is added, all of the fields in the SELECT are available in an index; it is an old index migrated from SQL2000 with the fields as part of the index list and not simply INCLUDEd. Stats show the index is being used and to good use.
However the moment the new field from table 2 is added, which is not being covered by the index, rather than the optimizer looking for another qualifying index, it is changing from the non-clustered index to a clustered index scan of table 2 which is processing nearly 2 million rows instead of the small subset it was doing before, and the execution time increases twenty fold.
As this is running under a development environment we have tried everything from
1: Creating a similar index with all the fields (a very wide index and not my first choice), but this is ignored.
2: Adding the additional fields to the existing index in the INCLUDE - also ignored.
3: Dropped the previously working index - This did cause it to revert back to a better existing index - but obviously any other queries benefiting from it would now degrade, and the moment we recreated the index it was back to it's primary key scan.
I've cleared the cache plans, updated statistics, rebuilt the indexes, everything I can think of.
It will work if we put an index hint in place and tell it the index we think it should be using, and the performance is back to a couple of seconds. I know hints are the last resort, but I can't think of anything else I can investigate before we choose this as the final solution.
Any advice would be greatly appreciated. 😀
---------------------------------------
It is by caffeine alone I set my mind in motion.
It is by the Beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.
March 20, 2013 at 6:19 am
Did you update statistics with FULLSCAN after the 2000 upgrade? It's a compulsory step.
Other than that, we'd need full table and index definitions and the actual execution plans to see for sure.
SQL Server will definitely favour scans over seek+key lookup if it estimates (using statistics) that you're returning anything over a few percent of the overall table size, as key lookups are very expensive, but a fully covering index should be used in most cases...
I'm assuming the query is actually more complicated than the example posted above, so really just guessing without the actual scenario.
March 20, 2013 at 8:42 am
Yup, all tables had their statistics updated with FULLSCAN after migration, and the migration actually occurred 4 years ago and statistics have been updated frequently since then to ensure that they are as up to date as possible. I just reran it with FULLUPDATE as well just in case.
The query actually isn't that much more complex, it is literally selecting 5 fields from one table and 3 from the other using an INNER JOIN on the primary clustered key for the second table. Quite literally the query reads:
SELECT event.eventid, event.eventdate, event.chargediv, booking.bookingid, booking.type, booking.fee, booking.issubscription, booking.iscancelled
FROM event
INNER JOIN booking
ON event.eventid = booking.eventid
WHERE event.eventdate >= '2013-01-01'
event.eventid is a primary clustered key, as is booking.bookingid
an index exists on the booking table on the fields eventid, eventdate, bookingid, type, fee
another index exists on the booking table on the field eventid, eventdate
and one last one on event as the eventdate
By removing the issubscription and iscancelled fields the plan the server comes up with is an index scan on the first index, runs in 2 seconds and has 5700 logical reads and 5700 read aheads.
The plan is an index scan on event followed by a computer scaler, and index seek on booking, joined via an inner join nested loop. Both pipes are carrying around 3k-4k records, cpu and i/o costs are minimal.
Add the extra fields and rather than using the index on eventid and eventdate, it uses the index on eventid, eventdate, it scans the bookingid primary key (approximately 1, 800, 000 records), logical reads is not 64600 and read aheads is 64600 (approx), taking just over 20 minutes.
The plan now doesn't change with regards to the event, but the booking changes to a clustered index scan carrying all 1.8m records, joined via a hash match inner join. The I/O for the clustered scan and the cpu for the scan and hash match are very high.
The total number of rows returned is just under 4000.
Adding an hint to tell it to use the index drops the reads back and reduces the time back to 2 seconds.
As daft as it sounds, it is almost as though the optimiser sees the longer index as the best match until it spots the additional field requested, at which point it doesn't look for the shorter index, but instead falls back to it's scan, and we can replicate it time and again by adding and removing the field from the select. I've even dropped all indexes, recreated the ones needed from scratch, calculated the spread of data in the indexes and the statistics, and everything points to the right index - except the optimiser!
I'll keep digging, thanks for the thoughts though.
---------------------------------------
It is by caffeine alone I set my mind in motion.
It is by the Beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.
March 20, 2013 at 8:47 am
Please post the following:
DDL for the two tables, including the indexes (leave off extended properties if you have those)
Actual execution plans for the good and bad query executions. Save the plans as .SQLPLAN files and attach them to a post on this thread.
March 20, 2013 at 9:00 am
Agreed, we need the actual execution plans and DDL to help.
March 20, 2013 at 5:21 pm
event.eventdate (as any other column used for a range selection, i.e. >=, <, BETWEEN, etc.) must be made a 1st column of a clustered index on the table. Or - if there are 2 or more such columns in the table - 1st column of a covering index.
_____________
Code for TallyGenerator
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply