I ran into a peculiarly puzzling situation with a query I have been tuning of late. The query runs some aggregations and performs a range scan based on a date value passed to the query. The primary target for this query is to be consumed by reporting services. The query executes (from a stored proc) three times in a single report. This is another case were the report currently outperforms the previous version by several fold. I just want it to run a little faster – it is an executive report afterall.
While tuning the query itself, I removed some of the aggregations (not consumed by the report and not needed for accuracy), and that bought me a small performance gain. Next was to remove a couple of “or” operations. Again, that only bought me a small performance gain.
The next step was to check my indexing. Looking at the maintenance plan for the query, indexes were being used – or so it seemed. Based on the execution plan, the query was performing two Index Scans prior to a Hash Match. In the first Index Scan, the query was expecting an Output of two columns (SalesOppID and LastUpdate) from IDX_SalesOpportunity_SalesStatusIDSalesContactID. This is depicted in the next two images.
This makes a little sense since the index does contain the LastUpdate column. But the index is not covering, and we do not have a Key lookup.
CREATE NONCLUSTERED INDEX [IDX_SalesOpportunity_SalesStatusIDSalesContactID] ON [dbo].[SalesOpportunity]
(
[SalesStatusID] ASC,
[SalesContactID] ASC,
[ProspectID] ASC,
[CampaignID] ASC,
[LastUpdate] ASC
) ON [PRIMARY]
Out of curiosity, I decided to see what would happen if I commented out the lines that required the LastUpdate field. I ended up with the following:
Interesting, the query optimizer decided that we should use IDX_LeadTypeID which has the following definition:
CREATE NONCLUSTERED INDEX [IDX_LeadTypeID] ON [dbo].[SalesOpportunity]
(
[LeadTypeID] ASC
) ON [PRIMARY]
Once again, I have a query that is using a bad index. There is no key lookup and the index is not covering. I find this very peculiar. Thus I proceeded to test a little more. I dropped that Index and reran the query. With each group of tests, it is important to note, I cleared the ProcCache. After dropping the index, my execution plan showed no performance increase and still showed a funny result. The query moved onto a new index that was not covering and there was no key lookup.
Now, let’s make this a little more interesting. I do have a Clustered Index on SalesOppID – defined as follows:
ALTER TABLE [dbo].[SalesOpportunity] ADD CONSTRAINT [PK_SalesOpportunity] PRIMARY KEY CLUSTERED
(
[SalesOppID] ASC
)WITH (
PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF
, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90
) ON [PRIMARY]
I am puzzled as to why my query will not use the Clustered Index. I decided to do the standard stuff – DBCC CheckDB, DBCC UpdateUsage etc. There is no corruption, and the query still tries to perform the same way. I could recreate the clustered index on the LastUpdate field. Since there are Foreign Key relationships that I do not want to drop and recreate, I will just create an NC index on the appropriate two fields. After doing that, I get the following:
The index is covering and pulling the correct columns. With this index the query is about 3 times faster than previously. Some of that is due to the change from a Hash Match to a Merge Join from the two tables (there is still a hash match but it is after the merge and is now less costly).
The moral is, even though an index may be used by the query – it may not be a good index choice. I am still working on tuning this query, I think I can squeeze better performance out of it.
Execution Plans can be downloaded here.