This month, Robert Pearl is helping to host a Performance Story contest. The contest is being sponsored by Safe Peak, and you can see more about it – here.
The timing on this contest actually fits well with something I have already been working on. The only delay has been related to getting everything scrubbed and obfuscated. There is a lot that could be included with this particular experience, I plan on touching on two of the key areas.
Backstory
I have been working a lot lately with building reports in SSRS. These reports had been working great from IE hitting directly to the SSRS Server in one environment. When we migrated the reports to the production environment – we started seeing some serious timeout issues.
There was one report in particular that was giving more grief than others. I had already gone through everything I could find with regards to SSRS in trying to help speed these reports up to acceptable times.
Having gone through what could be done in SSRS, I next turned my attention to the main stored procedure creating the data-set for the report.
Stored Procedure
Although the report was not even making it to this particular stored procedure – I wanted to take a look at it and make any improvements there that I could. It made sense to eliminate it as a possible source of the problem.
I opened up the stored procedure and found a fair amount if Business Logic. I came to a section of the proc that contained code that at first sight, made my head spin a bit. I thought for sure I was seeing double because of all of the logic.
Once I finally got my head wrapped around it, I noticed a lot of similarities and wanted to start from that particular section with my tuning. Here is an obfuscated version of what I saw.
DECLARE @CompareParam INT = 200
SELECT tccv.MyColumn
FROM dbo.table1 AS cb
INNER JOIN dbo.table2 tcfbt
ON cb.column1 = tcfbt.column1fk
INNER JOIN dbo.table3 AS cpd
ON cb.column2 = cpd.column2fk
INNER JOIN dbo.table4 AS tcfcpt
ON cpd.column3 = tcfcpt.column3fk
INNER JOIN dbo.table5 AS tcfcpct
ON cpd.column4 = tcfcpct.column4fk
INNER JOIN dbo.table6 cp
ON cb.column5 = cp.column5fk
INNER JOIN dbo.table7 cal
ON cp.column6 = cal.column6fk
INNER JOIN dbo.table8 tgmc
ON cal.column7 = tgmc.column7fk
INNER JOIN dbo.table9 tccv
ON tgmc.column8 = tccv.column8fk
INNER JOIN dbo.table10 fps
ON cp.column9 = fps.column9fk
WHERE
cb.column5 = @CompareParam
and
(
tcfbt.CodeVal = 'In-Progress' -- use cases 1,2,3
and
(
( -- use case 1
cp.column5fk = @CompareParam
and fps.CodeVal not in ('Val1','Val2')
and tccv.Date1 between cp.Date2 and ISNULL(cp.Date3, cp.Date4)
)
--use case 2
or
( cp.column5fk = @CompareParam
and tccv.Date1 < cp.Date2
and not exists (
SELECT 1
FROM dbo.table11 cvcp
INNER JOIN dbo.table6 prevcp
ON cvcp.column5 = prevcp.column5fk
INNER JOIN dbo.table10 prevfps
ON prevfps.column9fk = prevcp.column9
WHERE prevcp.column6 = cp.column6
and prevcp.Date4 < cp.Date2
and cvcp.MyColumnfk = tccv.MyColumn
and prevfps.CodeVal in ('Val1','Val2')
)
and exists (
SELECT 1
FROM dbo.table6 AS pastcp
INNER JOIN dbo.table10 AS pastfps
ON pastcp.column9 = pastfps.column9fk
WHERE pastcp.column6 = cp.column6
and pastcp.Date4 < cp.Date2
and tccv.Date1 between pastcp.Date2 and pastcp.Date4
and pastfps.CodeVal in ('Val1','Val2')
)
)
)
)
or
(
-- same as the other side of 'OR' just above
tcfbt.CodeVal = 'Live'
and not exists (
SELECT 1
FROM dbo.table12 cvh
WHERE tccv.MyColumn = cvh.MyColumnfk
)
and
(
( -- use case 1
cp.column5fk = @CompareParam
and fps.CodeVal not in ('Val1','Val2')
and tccv.Date1 between cp.Date2 and ISNULL(cp.Date3, cp.Date4)
)
--use case 2
or
(
cp.column5fk = @CompareParam
and tccv.Date1 < cp.Date2
and not exists (
SELECT 1
FROM dbo.table11 cvcp
INNER JOIN dbo.table6 prevcp
ON cvcp.column5fk = prevcp.column5fk
INNER JOIN dbo.table10 AS prevfps
ON prevcp.column9 = prevfps.column9fk
WHERE prevcp.column6 = cp.column6
and prevcp.Date4 < cp.Date2
and cvcp.MyColumnfk = tccv.MyColumn
and prevfps.CodeVal in ('Val1','Val2')
)
and exists (
SELECT 1
FROM dbo.table6 AS pastcp
INNER JOIN dbo.table10 AS pastfps
ON pastcp.column9 = pastfps.column9fk
WHERE pastcp.column6 = cp.column6
and pastcp.Date4 < cp.Date2
and tccv.Date1 between pastcp.Date2 and pastcp.Date4
and pastfps.CodeVal in ('Val1','Val2')
)
)
)
)
I want you to look closely at both sides of that OR condition in the where clause before proceeding to look at the image of the execution plan that this query generates.
Again, my head started spinning when I first looked at the execution plan. It was the size of the overall plan that was causing it this time. Looking through it though, I quickly saw what I thought was the main point of slowness with this query (yes it was slow). If you open the image of that execution plan and compare the red section to the yellow section, you will see that it is duplicated. These sections also correlate to each side of the OR condition that was already mentioned.
When browsing this execution plan in SSMS, I was also lured into these sections due to a bad estimate on the query cost (which you don’t see in this image). All of the Index Spools and Index scans/seeks had a cost of 89% attached to them (within the red and yellow regions). Seeing that both sections were identical – I was certain that the OR condition could be optimized.
DeDuping
Since the queries were soooo similar, I knew there had to be a way to combine both sides and make it less costly. After working on it a bit, I came up with the following query.
DECLARE @CompareParam INT = 200
SELECT tccv.MyColumn
FROM dbo.table1 AS cb
INNER JOIN dbo.table2 tcfbt
ON cb.column1 = tcfbt.column1fk
INNER JOIN dbo.table3 AS cpd
ON cb.column2 = cpd.column2fk
INNER JOIN dbo.table4 AS tcfcpt
ON cpd.column3 = tcfcpt.column3fk
INNER JOIN dbo.table5 AS tcfcpct
ON cpd.column4 = tcfcpct.column4fk
INNER JOIN dbo.table6 cp
ON cb.column5 = cp.column5fk
INNER JOIN dbo.table7 cal
ON cp.column6 = cal.column6fk
INNER JOIN dbo.table8 tgmc
ON cal.column7 = tgmc.column7fk
INNER JOIN dbo.table9 tccv
ON tgmc.column8 = tccv.column8fk
INNER JOIN dbo.table10 fps
ON cp.column9 = fps.column9fk
WHERE
cb.calendarPeriodID = @CompareParam
and
(
( tcfbt.CodeVal = 'In-Progress' -- use cases 1,2,3
OR (tcfbt.CodeVal = 'Live'
and not exists (
SELECT 1
FROM dbo.table12 cvh
WHERE tccv.MyColumn = cvh.MyColumnfk
)
)
)
and
(
( -- use case 1
cp.column5fk = @CompareParam
and fps.CodeVal not in ('Val1','Val2')
and tccv.Date1 between cp.Date2 and ISNULL(cp.Date3, cp.Date4)
)
--use case 2
or
( cp.column5fk = @CompareParam
and tccv.Date1 < cp.Date2
and not exists (
SELECT 1
FROM dbo.table11 cvcp
INNER JOIN dbo.table6 prevcp
ON cvcp.column5fk = prevcp.column5fk
INNER JOIN dbo.table10 prevfps
ON prevfps.column9fk = prevcp.column9
WHERE prevcp.[calendarID] = cp.[calendarID]
and prevcp.Date4 < cp.Date2
and cvcp.MyColumnfk = tccv.MyColumn
and prevfps.CodeVal in ('Val1','Val2')
)
and exists (
SELECT 1
FROM dbo.table6 AS pastcp
INNER JOIN dbo.table10 AS pastfps
ON pastcp.column9 = pastfps.column9fk
WHERE pastcp.column6 = cp.column6
and pastcp.Date4 < cp.Date2
and tccv.Date1 between pastcp.Date2 and pastcp.Date4
and pastfps.CodeVal in ('Val1','Val2')
)
)
)
)
Now take a close look at the first part of the where clause.
WHERE
cb.calendarPeriodID = @CompareParam
and
(
( tcfbt.CodeVal = 'In-Progress' -- use cases 1,2,3
OR (tcfbt.CodeVal = 'Live'
and not exists (
SELECT 1
FROM dbo.table12 cvh
WHERE tccv.MyColumn = cvh.MyColumnfk
)
)
)
and
That little change represents the only difference between the two sides of the Or condition in the prior query. The result set is exactly the same. Now we get a new execution plan that looks like this.
In this execution plan, you can see that there is now only one section matching the red and yellow sections from the previous query. This is a very good thing. We can see that the plan looks simpler than the last one. What about if we execute the two queries in a single batch for comparison?
The improvements we see are substantial. Look at the difference in cost comparison between the two!! The old query consumes 97% of the cost. To further illustrate the improvements by making this little change in logic, let’s take a look at the time.
Looking at the time it takes to run these queries should also be a strong indicator of the performance gains made by simplifying that where clause. Let’s also take a look at another couple of screen-shots that helps to show some of the different improvements made by simplifying that “OR” condition.
Notice the old plan has an index scan, sort operation and 61 parallel operations. The new plan has a key lookup. Now look at the glaring difference between the two in estimated rows. That is significant.
That is not all that was done with the query itself to improve performance. Other things like index tuning and more work on the rest of the query was also done with some improvements in overall execution of the query. But none were as significant as this.
TCP/IP
The other place where significant improvement was made in the performance of this query came from something outside of SQL Server. A key part of this exercise has not yet been mentioned. The reports were all taking 30-60 seconds to even load without any parameters having been selected. Then they would take another 30 seconds or so after that to bring up the next parameter.
We ran a trace between the two servers to see what we could find. After some more troubleshooting, we decided to try disabling the TCP Chimney. This change had a significant impact in loading the reports from SSRS. From within IE, these reports were now immediately loading and the prompt flow became very responsive.
Conclusion
Taking a little bit of extra time when writing out the logic for your queries can have a profound impact on the responsiveness of your queries. In this case, simplifying the code but retaining the logic had a tremendous effect on overall performance.
In addition to what can be done in SQL Server, sometimes one needs to look outside of SQL to the network or disk layer for further tuning. Working with the appropriate teams to accomplish those goals can go miles to having a positive effect on co-workers and application experience by the business users.