This month, Robert Pearl is helping to host a Performance Story contest. The contest is being sponsored by Safe Peak, and you can see more about it – here.
The timing on this contest actually fits well with something I have already been working on. The only delay has been related to getting everything scrubbed and obfuscated. There is a lot that could be included with this particular experience, I plan on touching on two of the key areas.
Backstory
I have been working a lot lately with building reports in SSRS. These reports had been working great from IE hitting directly to the SSRS Server in one environment. When we migrated the reports to the production environment – we started seeing some serious timeout issues.
There was one report in particular that was giving more grief than others. I had already gone through everything I could find with regards to SSRS in trying to help speed these reports up to acceptable times.
Having gone through what could be done in SSRS, I next turned my attention to the main stored procedure creating the data-set for the report.
Stored Procedure
Although the report was not even making it to this particular stored procedure – I wanted to take a look at it and make any improvements there that I could. It made sense to eliminate it as a possible source of the problem.
I opened up the stored procedure and found a fair amount if Business Logic. I came to a section of the proc that contained code that at first sight, made my head spin a bit. I thought for sure I was seeing double because of all of the logic.
Once I finally got my head wrapped around it, I noticed a lot of similarities and wanted to start from that particular section with my tuning. Here is an obfuscated version of what I saw.
[codesyntax lang=”tsql”]
DECLARE @CompareParam INT = 200 SELECT tccv.MyColumn from dbo.table1 as cb INNER JOIN dbo.table2 tcfbt on cb.column1 = tcfbt.column1fk INNER JOIN dbo.table3 as cpd on cb.column2 = cpd.column2fk INNER JOIN dbo.table4 as tcfcpt on cpd.column3 = tcfcpt.column3fk INNER JOIN dbo.table5 as tcfcpct on cpd.column4 = tcfcpct.column4fk INNER JOIN dbo.table6 cp on cb.column5 = cp.column5fk INNER JOIN dbo.table7 cal on cp.column6 = cal.column6fk INNER JOIN dbo.table8 tgmc on cal.column7 = tgmc.column7fk INNER JOIN dbo.table9 tccv on tgmc.column8 = tccv.column8fk INNER JOIN dbo.table10 fps on cp.column9 = fps.column9fk where cb.column5 = @CompareParam and ( tcfbt.CodeVal = 'In-Progress' -- use cases 1,2,3 and ( ( -- use case 1 cp.column5fk = @CompareParam and fps.CodeVal not in ('Val1','Val2') and tccv.Date1 between cp.Date2 and ISNULL(cp.Date3, cp.Date4) ) --use case 2 or ( cp.column5fk = @CompareParam and tccv.Date1 < cp.Date2 and not exists ( select 1 from dbo.table11 cvcp INNER JOIN dbo.table6 prevcp on cvcp.column5 = prevcp.column5fk INNER JOIN dbo.table10 prevfps on prevfps.column9fk = prevcp.column9 where prevcp.column6 = cp.column6 and prevcp.Date4 < cp.Date2 and cvcp.MyColumnfk = tccv.MyColumn and prevfps.CodeVal in ('Val1','Val2') ) and exists ( select 1 from dbo.table6 as pastcp INNER JOIN dbo.table10 as pastfps on pastcp.column9 = pastfps.column9fk where pastcp.column6 = cp.column6 and pastcp.Date4 < cp.Date2 and tccv.Date1 between pastcp.Date2 and pastcp.Date4 and pastfps.CodeVal in ('Val1','Val2') ) ) ) ) or ( -- same as the other side of 'OR' just above tcfbt.CodeVal = 'Live' and not exists ( select 1 from dbo.table12 cvh where tccv.MyColumn = cvh.MyColumnfk ) and ( ( -- use case 1 cp.column5fk = @CompareParam and fps.CodeVal not in ('Val1','Val2') and tccv.Date1 between cp.Date2 and ISNULL(cp.Date3, cp.Date4) ) --use case 2 or ( cp.column5fk = @CompareParam and tccv.Date1 < cp.Date2 and not exists ( select 1 from dbo.table11 cvcp INNER JOIN dbo.table6 prevcp on cvcp.column5fk = prevcp.column5fk INNER JOIN dbo.table10 as prevfps on prevcp.column9 = prevfps.column9fk where prevcp.column6 = cp.column6 and prevcp.Date4 < cp.Date2 and cvcp.MyColumnfk = tccv.MyColumn and prevfps.CodeVal in ('Val1','Val2') ) and exists ( select 1 from dbo.table6 as pastcp INNER JOIN dbo.table10 as pastfps on pastcp.column9 = pastfps.column9fk where pastcp.column6 = cp.column6 and pastcp.Date4 < cp.Date2 and tccv.Date1 between pastcp.Date2 and pastcp.Date4 and pastfps.CodeVal in ('Val1','Val2') ) ) ) )
[/codesyntax]
I want you to look closely at both sides of that OR condition in the where clause before proceeding to look at the image of the execution plan that this query generates.
Again, my head started spinning when I first looked at the execution plan. It was the size of the overall plan that was causing it this time. Looking through it though, I quickly saw what I thought was the main point of slowness with this query (yes it was slow). If you open the image of that execution plan and compare the red section to the yellow section, you will see that it is duplicated. These sections also correlate to each side of the OR condition that was already mentioned.
When browsing this execution plan in SSMS, I was also lured into these sections due to a bad estimate on the query cost (which you don’t see in this image). All of the Index Spools and Index scans/seeks had a cost of 89% attached to them (within the red and yellow regions). Seeing that both sections were identical – I was certain that the OR condition could be optimized.
DeDuping
Since the queries were soooo similar, I knew there had to be a way to combine both sides and make it less costly. After working on it a bit, I came up with the following query.
[codesyntax lang=”tsql”]
DECLARE @CompareParam INT = 200 SELECT tccv.MyColumn from dbo.table1 as cb INNER JOIN dbo.table2 tcfbt on cb.column1 = tcfbt.column1fk INNER JOIN dbo.table3 as cpd on cb.column2 = cpd.column2fk INNER JOIN dbo.table4 as tcfcpt on cpd.column3 = tcfcpt.column3fk INNER JOIN dbo.table5 as tcfcpct on cpd.column4 = tcfcpct.column4fk INNER JOIN dbo.table6 cp on cb.column5 = cp.column5fk INNER JOIN dbo.table7 cal on cp.column6 = cal.column6fk INNER JOIN dbo.table8 tgmc on cal.column7 = tgmc.column7fk INNER JOIN dbo.table9 tccv on tgmc.column8 = tccv.column8fk INNER JOIN dbo.table10 fps on cp.column9 = fps.column9fk where cb.calendarPeriodID = @CompareParam and ( ( tcfbt.CodeVal = 'In-Progress' -- use cases 1,2,3 OR (tcfbt.CodeVal = 'Live' and not exists ( select 1 from dbo.table12 cvh where tccv.MyColumn = cvh.MyColumnfk ) ) ) and ( ( -- use case 1 cp.column5fk = @CompareParam and fps.CodeVal not in ('Val1','Val2') and tccv.Date1 between cp.Date2 and ISNULL(cp.Date3, cp.Date4) ) --use case 2 or ( cp.column5fk = @CompareParam and tccv.Date1 < cp.Date2 and not exists ( select 1 from dbo.table11 cvcp INNER JOIN dbo.table6 prevcp on cvcp.column5fk = prevcp.column5fk INNER JOIN dbo.table10 prevfps on prevfps.column9fk = prevcp.column9 where prevcp.[calendarID] = cp.[calendarID] and prevcp.Date4 < cp.Date2 and cvcp.MyColumnfk = tccv.MyColumn and prevfps.CodeVal in ('Val1','Val2') ) and exists ( select 1 from dbo.table6 as pastcp INNER JOIN dbo.table10 as pastfps on pastcp.column9 = pastfps.column9fk where pastcp.column6 = cp.column6 and pastcp.Date4 < cp.Date2 and tccv.Date1 between pastcp.Date2 and pastcp.Date4 and pastfps.CodeVal in ('Val1','Val2') ) ) ) )
[/codesyntax]
Now take a close look at the first part of the where clause.
[codesyntax lang=”tsql”]
where cb.calendarPeriodID = @CompareParam and ( ( tcfbt.CodeVal = 'In-Progress' -- use cases 1,2,3 OR (tcfbt.CodeVal = 'Live' and not exists ( select 1 from dbo.table12 cvh where tccv.MyColumn = cvh.MyColumnfk ) ) ) and
[/codesyntax]
That little change represents the only difference between the two sides of the Or condition in the prior query. The result set is exactly the same. Now we get a new execution plan that looks like this.
In this execution plan, you can see that there is now only one section matching the red and yellow sections from the previous query. This is a very good thing. We can see that the plan looks simpler than the last one. What about if we execute the two queries in a single batch for comparison?
The improvements we see are substantial. Look at the difference in cost comparison between the two!! The old query consumes 97% of the cost. To further illustrate the improvements by making this little change in logic, let’s take a look at the time.
Looking at the time it takes to run these queries should also be a strong indicator of the performance gains made by simplifying that where clause. Let’s also take a look at another couple of screen-shots that helps to show some of the different improvements made by simplifying that “OR” condition.
Notice the old plan has an index scan, sort operation and 61 parallel operations. The new plan has a key lookup. Now look at the glaring difference between the two in estimated rows. That is significant.
That is not all that was done with the query itself to improve performance. Other things like index tuning and more work on the rest of the query was also done with some improvements in overall execution of the query. But none were as significant as this.
TCP/IP
The other place where significant improvement was made in the performance of this query came from something outside of SQL Server. A key part of this exercise has not yet been mentioned. The reports were all taking 30-60 seconds to even load without any parameters having been selected. Then they would take another 30 seconds or so after that to bring up the next parameter.
We ran a trace between the two servers to see what we could find. After some more troubleshooting, we decided to try disabling the TCP Chimney. This change had a significant impact in loading the reports from SSRS. From within IE, these reports were now immediately loading and the prompt flow became very responsive.
Conclusion
Taking a little bit of extra time when writing out the logic for your queries can have a profound impact on the responsiveness of your queries. In this case, simplifying the code but retaining the logic had a tremendous effect on overall performance.
In addition to what can be done in SQL Server, sometimes one needs to look outside of SQL to the network or disk layer for further tuning. Working with the appropriate teams to accomplish those goals can go miles to having a positive effect on co-workers and application experience by the business users.