In Part I, I discussed the peculiarities of, and the troubleshooting around, an unusual execution plan. In this post, I will discuss what I did to further tune the query.
First, a little more background on the design of this query. The query uses dynamic SQL to build a WHERE clause based on the value passed to a single parameter. I need to be able to create three different WHERE clauses, each of which could essentially create a range scan on a date field. Two of the three could require an inequality date comparison rather than an equality comparison. The specific requirement is for the report to return one of the following: 1) 2010 data only, 2) pre-2010 data, or 3) all data. This was initially accomplished with a query similar to:
CASE WHEN CONVERT(VARCHAR(4), @RecordDate, 112) = '2010'
         THEN ' convert(varchar(4), RecordDate, 112) = ''2010'''
     WHEN CONVERT(VARCHAR(4), @RecordDate, 112) < '2010'
         THEN ' isnull(RecordDate, ''1/1/2000'') < ''1/1/2010'''
     ELSE ' isnull(RecordDate, ''1/1/2001'') > ''1/1/2000'''
END
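For context, here is a minimal sketch of how the CASE result feeds the dynamic SQL. The table name (dbo.SalesOpportunity) and the SELECT list are assumptions for illustration; only the CASE expression itself comes from the report.

DECLARE @RecordDate DATETIME;
DECLARE @Where NVARCHAR(500);
DECLARE @Sql NVARCHAR(MAX);

SET @RecordDate = '1/1/2010';

-- Abbreviated; the full CASE is shown above.
SET @Where = CASE WHEN CONVERT(VARCHAR(4), @RecordDate, 112) = '2010'
                      THEN ' convert(varchar(4), RecordDate, 112) = ''2010'''
                  ELSE ' isnull(RecordDate, ''1/1/2001'') > ''1/1/2000'''
             END;

-- The chosen predicate (which begins with a space) is appended to the WHERE.
SET @Sql = N'SELECT COUNT(*) FROM dbo.SalesOpportunity WHERE' + @Where;
EXEC sp_executesql @Sql;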
One of the first comments might be, “That query is not SARG-able.” True, it is not, but SARG-ability will be the last item to evaluate. The slowness of this query has less to do with its SARG-ability than with the inequality scan. The puzzle was how to make the query an equality comparison rather than a < or > comparison. The above query uses the indexes as discussed in the prior post; it is just not as fast as it could be.
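For reference, a SARG-able rewrite of the 2010 branch compares the bare column against a range rather than wrapping it in a function. This is a sketch of the eventual goal, not what the report runs today:

-- Non-SARG-able: the function wrapped around the column blocks an index seek.
WHERE convert(varchar(4), RecordDate, 112) = '2010'

-- SARG-able equivalent: a closed-open range on the bare column.
WHERE RecordDate >= '1/1/2010' AND RecordDate < '1/1/2011'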
It is at this point that I am reminded of an integral part of my last two presentations at my local PASS meetings: an understanding of the data is truly necessary. Most of the RecordDate values are NULL, since the column was not backfilled after its implementation, and prior to its implementation there was no date control on the table. The table is in a warehouse and only contains data since the beginning of 2009. It is also important to know that there is minimal chance of records being added to the table after 2010. With that knowledge, we now have a method to get the equality comparisons. The following might satisfy the need:
CASE WHEN CONVERT(VARCHAR(4), @RecordDate, 112) = '2010'
         THEN ' convert(varchar(4), RecordDate, 112) = ''2010'''
     WHEN CONVERT(VARCHAR(4), @RecordDate, 112) < '2010'
         THEN ' dateadd(yy, datediff(yy, 0, isnull(RecordDate, ''1/1/2009'')), 0) = ''1/1/2009'' '
     ELSE ' isnull(RecordDate, ''1/1/2009'') = isnull(RecordDate, ''1/1/2009'') '
END
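The DATEADD/DATEDIFF pair in the second branch is a common trick for flooring a date to the first day of its year: DATEDIFF(yy, 0, date) counts the whole years between the 1900-01-01 base date and the value, and DATEADD adds that many years back onto the base, discarding month, day, and time. Since this table only holds 2009-and-later data and NULLs default to '1/1/2009', every pre-2010 row collapses to '1/1/2009', which allows the equality comparison. For example:

-- Both of these return 2009-01-01 00:00:00.000
SELECT DATEADD(yy, DATEDIFF(yy, 0, '3/15/2009'), 0);
SELECT DATEADD(yy, DATEDIFF(yy, 0, '12/31/2009'), 0);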
This change now uses a clustered index seek, and that seek is the most costly piece of the execution plan. The overall runtime of the query, however, has improved by about one second.
Thinking I could still improve the query, I decided that a new column called ReportYear would help the cause. The change would make the query SARG-able: the new column would hold the first day of the year recorded in the RecordDate field (populated by a default), which I figured would still permit an equality search in my WHERE clause.
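A sketch of what that change might look like; the table name is an assumption, and while I describe a default above, a backfilling UPDATE shows the intent just as well:

-- Hypothetical sketch: add the ReportYear column and floor each
-- RecordDate to the first day of its year (NULLs fall into 2009).
ALTER TABLE dbo.SalesOpportunity ADD ReportYear DATETIME NULL;
GO

UPDATE dbo.SalesOpportunity
SET ReportYear = DATEADD(yy, DATEDIFF(yy, 0, ISNULL(RecordDate, '1/1/2009')), 0);

-- The dynamic WHERE clause can then use a SARG-able equality, e.g.:
--   WHERE ReportYear = '1/1/2010'

Let's see the results: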
Wow, I have just come full circle back to the first post in this series. I lost the clustered index seek, but I did not lose any of the performance. To see whether I gained anything by making the query SARG-able, let's look at the IO stats between the two.
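For anyone wanting to reproduce the comparison, SET STATISTICS IO is the usual way to capture logical reads:

SET STATISTICS IO ON;
-- run the query under test; logical reads appear on the Messages tab
SET STATISTICS IO OFF;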
The top image represents the first query and the bottom image represents the second query. We have an improvement of about 600,000 logical reads, which is substantial, and it comes even with an index scan on a non-covering index. Now, to tune the index. Much as in the first article, I added an index to the SalesOpportunity table on LastUpdate and SalesOppID. This creates a covering index and removes the hash match join in favor of a merge join in the execution plan; a sketch of the index follows.
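The index name and column order here are assumptions; only the key columns come from the text above.

-- Sketch of the covering index described above.
CREATE NONCLUSTERED INDEX IX_SalesOpportunity_LastUpdate_SalesOppID
    ON dbo.SalesOpportunity (LastUpdate, SalesOppID);

Another effect shows in the IO stats, as displayed below.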
We can see here that the logical reads improved once again, by roughly another 10,000. Overall, both queries run in the same amount of time, but the second query uses fewer resources. And again, if I changed the clustered index on this table, I am certain I would see an improvement.
I am still in search of the reason behind the use of a bad index for this query.
Execution Plans can be downloaded here.