April 12, 2016 at 6:58 am
Grant Fritchey (4/12/2016)
Doug.F (4/11/2016)
I was thinking that as well, so I went ahead and ran a:UPDATE STATISTICS dbo.TableName WITH FULLSCAN
On all tables that are being joined with the SQL query, unfortunately this had no appreciable impact on the execution plan that is being selected.
I also went ahead and did a rebuild on the indexes that I am joining on, that did not change the execution plan either.
I can include the actual T-SQL query if that would add any additional insight into my problem.
Rebuilding an index is done to defragment the index. That won't affect the execution plan. However, rebuilding the index will result in updated statistics. If you think your statistics are out of date, either update the statistics, or rebuild the index. Doing both results in two updates to your statistics.
The first question I have is, is the data the same in both environments? 100 rows in each table, etc. If not, that alone could be the problem.
Thank you for the information in regards to rebuilding indexes, updating the statistics. The main reasoning, was simply that it wouldn't hurt for me to rebuild the indexes, even though the statistics had already been updated.
In regards to the data between the two environments, there was a slight difference in the number of records within the two - while testing this the number does not differ significantly. They both have nearly the same number of records in the source, and the destination and other tables it is being joined against have the same number of records.
April 12, 2016 at 7:31 am
Grant Fritchey (4/12/2016)
Both plans are timing out. That means that, depending on the CPU cycles and estimates from the statistics, you could see different plans on different days on either environment. It won't necessarily be consistent. As everyone else has pointed out, the estimates are off. That alone could easily explain what's going on. Statistics are the principal driver for the optimizer to make choices on the plans. Combine these differences with the timeout and you've largely explained what's going on.
That's exactly what I was concerned with, especially considering I wasn't understanding the underlying problem itself. I wasn't honestly paying enough attention to the estimates, even though I know that they are based upon the statistics for a table and make the determination on what type of Execution Plan the Query Optimizer selects.
Grant Fritchey (4/12/2016)
You have lots of implicit conversion warnings in there. You might want to double check how this query is put together. You have an estimated cost of 9 & 13 (dev & prod), which is awfully low for parallel execution. I suspect you're still operating with the default cost threshold for parallelism of 5. I'd bump that number up some. These functions, LEFT(SPUV.strFirstName,1) and CAST(SPUV.dteDOB AS DATE), are causing table scans and preventing good index use.
I will look into the default cost threshold for parallelism shortly, I can almost assure you it's the default setting - I've read about it previously, but obviously not enough.
In regards to the functions that are causing table scans, that was intended. As much as I would absolutely love not to do this, the nature of the existing ETL processes and systems, prevents me from doing as I would like - which is correctly configuring the datatypes and basically doing a bunch of data cleanup and reindexing. This was actually a compromise that I had to arrive at with my Head DBA, just for this much matching.
Grant Fritchey (4/12/2016)
Considering the size of this query overall, it's not that big, the combination of the functions and the parallel execution are what's leading to the timeout in the optimizer. Cleaning those up will help arrive at a more consistent plan. Then, getting better row estimates by ensuring the statistics are up to date will also get you a better plan. For example, production is doing a LOOP join against 25,000 rows because it thinks it's only getting 187.
I'm certain now, that the underlying cause was that the Estimated Number of Rows was being thrown off due to a filter being applied based upon a GROUP BY's COUNT() function. Which actually should be a fairly straightforward fix, but I've definitely gotten some good ideas on things I can further research.
If you have any recommendations on material concerning this subject, I'm always interested in learning more. Specifically, what I've been looking for recently is a more comprehensive understanding, or how the Estimated Execution Plan is arrived at - so that I can avoid this type of situation in the future >.<
April 12, 2016 at 8:04 am
Doug.F (4/12/2016)
If you have any recommendations on material concerning this subject, I'm always interested in learning more. Specifically, what I've been looking for recently is a more comprehensive understanding, or how the Estimated Execution Plan is arrived at - so that I can avoid this type of situation in the future >.<
Check my signature.
I am updating the executions plan book, but it's a few months away from completion.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 3 posts - 16 through 17 (of 17 total)
You must be logged in to reply to this topic. Login to reply