Bad performance of a query

  • Thank you. I will make the changes tomorrow morning, as it's 1 AM here and I am finding it hard to concentrate now. I will post the plan as soon as I reach the office tomorrow. Thanks for your valuable time.

  • weitzera (8/23/2011)


    On or around line 287 you have a bunch of subqueries selecting counts from the @ipOrderStatusCodes table variable with no outer references (meaning the subquery does not reference anything in the main query, so it will return the same value for every row).

    I think you should change these lines like this:

    ((SELECT count(*) FROM @ipOrderStatusCodes WHERE StatusCode='O' AND SubCode='Declined')=0 OR (a.OrderStatus = 'O' AND DispatchStatus='Declined')) AND ...

    becomes:

    (@ipOrderStatusCodeCountODeclined=0 OR (a.OrderStatus = 'O' AND DispatchStatus='Declined')) AND ...

    On line 193 where you calculate the total order status count, you can precalculate each of these values:

    DECLARE @ipOrderStatusCodeCountODeclined INT;

    SELECT @ipOrderStatusCodeCountODeclined=count(*) FROM @ipOrderStatusCodes WHERE StatusCode='O' AND SubCode='Declined';

    and so on...

    Again, I don't think this is affecting your performance, but it is making it harder to read the plan.

    I am not a developer by any means, but your suggestions taught me a lot. Following your advice, I removed all aggregations from the WHERE clause, pre-calculated them, and just used the variables that stored the values directly in the WHERE clause. I am attaching the SQL plan now. Thanks.

  • My guess is that your proc still behaves poorly.

    The reason is the massive WHERE clause which is trying to use short-circuiting, but this usually doesn't work in SQL Server. I've read that a certain CU for SQL Server 2008 (R2) will take short-circuiting into consideration, but I can't remember any details regarding which CU or how it works.

    Take a look at the execution plan, and you'll see that the first step is an index scan on IX_Orders_ShipToID_Dispatched. It estimates 855 rows, but the actual row count is 1,775,982 (!!).

    Because of that estimate, it does a nested loop join to the clustered index for each of the 1,775,982 rows to fetch a single column, "OrderStatus". The rest of the columns seem to be included in the index.

    Can you post the definition of this index? I assume it is huge, so perhaps adding OrderStatus to it as an included column would get rid of the nested loop. It won't help much, though, since there is a series of other nested loops over the 1,775,982 rows in the rest of the plan.

    My suggestion stands: rewrite this as dynamic SQL. In addition, I would take a thorough look through the indexes to identify missing ones that can help your query (after the rewrite to dynamic SQL). You'll need to identify common parameter patterns in order to do so. The sys.dm_db_missing_index* DMOs can be a good help in this task; see the query below.
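
    As a starting point, something along these lines will surface what the optimizer has been asking for. This is a generic sketch against the missing-index DMOs, not tuned to your database:

    SELECT mid.statement AS table_name,
           mid.equality_columns,
           mid.inequality_columns,
           mid.included_columns,
           migs.user_seeks,
           migs.avg_total_user_cost,
           migs.avg_user_impact
    FROM sys.dm_db_missing_index_details AS mid
    JOIN sys.dm_db_missing_index_groups AS mig
        ON mig.index_handle = mid.index_handle
    JOIN sys.dm_db_missing_index_group_stats AS migs
        ON migs.group_handle = mig.index_group_handle
    ORDER BY migs.user_seeks * migs.avg_total_user_cost * migs.avg_user_impact DESC;

    Treat the output as hints rather than gospel; the DMOs don't consider overlap between suggested indexes, and the counters reset when the instance restarts.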

  • Nils Gustav Stråbø (8/24/2011)


    My guess is that your proc still behaves poorly.

    The reason is the massive WHERE clause which is trying to use short-circuiting, but this usually doesn't work in SQL Server. I've read that a certain CU for SQL Server 2008 (R2) will take short-circuiting into consideration, but I can't remember any details regarding which CU or how it works.

    Take a look at the execution plan, and you'll see that the first step is an index scan on IX_Orders_ShipToID_Dispatched. It estimates 855 rows, but the actual row count is 1,775,982 (!!).

    Because of that estimate, it does a nested loop join to the clustered index for each of the 1,775,982 rows to fetch a single column, "OrderStatus". The rest of the columns seem to be included in the index.

    Can you post the definition of this index? I assume it is huge, so perhaps adding OrderStatus to it as an included column would get rid of the nested loop. It won't help much, though, since there is a series of other nested loops over the 1,775,982 rows in the rest of the plan.

    My suggestion stands: rewrite this as dynamic SQL. In addition, I would take a thorough look through the indexes to identify missing ones that can help your query (after the rewrite to dynamic SQL). You'll need to identify common parameter patterns in order to do so. The sys.dm_db_missing_index* DMOs can be a good help in this task.

    I am not so good at changing the whole procedure myself. :-) I can definitely send your recommendation to use dynamic SQL to the developers. Whatever changes I am able to make are thanks to the helpful people here.

    I restarted the dev server so that I can view the missing index details from scratch. I added the 'OrderStatus' column to the index 'IX_Orders_ShipToID_Dispatched' as an included column to avoid the key lookup; roughly the statement sketched below. I ran the query again with the actual plan included and found that it was now using a new index which was rather small, so I added 'OrderStatus' there too. Results will come in 3-4 minutes and I will post the plan again.
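
    The key columns here are a guess based on the index name, since I have not posted the original definition yet:

    CREATE NONCLUSTERED INDEX IX_Orders_ShipToID_Dispatched
        ON dbo.Orders (ShipToID, Dispatched)   -- assumed key columns, taken from the index name
        INCLUDE (OrderStatus)                  -- added to avoid the key lookup; keep any columns that were already included
        WITH (DROP_EXISTING = ON);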

    I want to highlight that all these values come from a view, which in turn reads the base tables. The view joins 5 tables with NOLOCK hints. I hope that doesn't skew the plan.

    I am seeing that most of the indexes are doing seek operations. I am suffering because of the massive WHERE clause, but I will keep trying. I will post the execution plan in 5-10 minutes.

    Thanks

    Chandan

  • Here is the plan. Please let me know if you see anything that can be improved at the DBA end.

  • Now it looks even worse 🙁

    I don't know how many columns the Orders table has. My fear is that the index I was talking about now contains almost the same columns as the base table, which really isn't of any use.

    The nested loop is still there, but now there are no Output columns, which confuses me.

    Your developers need to start rewriting this into dynamic SQL, unless breaking the ownership chain would cause problems for you.

    The way it is now (with the monster WHERE) will cause SQL Server to scan no matter what you do. There are no indexes that will help you due to the nature of the WHERE clause. SQL Server will try to find the index with the least cost to scan.
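
    To illustrate the dynamic SQL rewrite: build the WHERE clause only from the parameters that were actually supplied, and keep it parameterized through sp_executesql so plans can be reused. The parameter and column names below are placeholders, not your real ones:

    DECLARE @sql nvarchar(max) = N'SELECT a.OrderID, a.OrderStatus, a.DispatchStatus
    FROM dbo.Orders AS a
    WHERE 1 = 1';

    IF @ipShipToID IS NOT NULL                        -- hypothetical input parameter
        SET @sql += N' AND a.ShipToID = @ShipToID';

    IF EXISTS (SELECT * FROM @ipOrderStatusCodes
               WHERE StatusCode = 'O' AND SubCode = 'Declined')
        SET @sql += N' AND a.OrderStatus = ''O'' AND a.DispatchStatus = ''Declined''';

    EXEC sys.sp_executesql
        @sql,
        N'@ShipToID int',
        @ShipToID = @ipShipToID;

    This way the optimizer only sees the predicates that matter for the current call, and each distinct predicate combination gets its own plan.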

    Sorry that I can't be of any more help.

    Yeah, even though the indexes you modified now cover the output of the index scan on Orders, they don't cover all the possible combinations of predicates; for example, if you search on the order GUID.

    I think you need to update your statistics; I suspect that when you do, you may find the plan reverts to a clustered index scan instead of the inner join it's using now to get the order data.

    I think you've got one more subquery that should be refactored out:

    When you insert into @headers, you select the count of records in your CTE, orderHeaders.

    I think you should get rid of that column in @headers, and instead use @@rowcount:

    SET @recordCount = @@rowcount;

    SELECT @recordCount AS RecordCount, * FROM @headers;
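
    In other words, a slightly fuller sketch; the column names and CTE body below are placeholders for whatever you actually have:

    DECLARE @headers TABLE (OrderID int, OrderDate datetime);  -- placeholder shape
    DECLARE @recordCount int;

    WITH orderHeaders AS
    (
        SELECT OrderID, OrderDate FROM dbo.Orders              -- stands in for your real CTE
    )
    INSERT INTO @headers (OrderID, OrderDate)
    SELECT OrderID, OrderDate FROM orderHeaders;

    SET @recordCount = @@ROWCOUNT;   -- must be read immediately after the INSERT

    SELECT @recordCount AS RecordCount, h.* FROM @headers AS h;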

    The plans you've shown suggest that the proc should be running faster than it was previously. Are you noticing any speedup?




  • weitzera (8/24/2011)


    Yeah, even though the indexes you modified now cover the output of the index scan on Orders, they don't cover all the possible combinations of predicates; for example, if you search on the order GUID.

    I think you need to update your statistics; I suspect that when you do, you may find the plan reverts to a clustered index scan instead of the inner join it's using now to get the order data.

    I think you've got one more subquery that should be refactored out:

    When you insert into @headers, you select the count of records in your CTE, orderHeaders.

    I think you should get rid of that column in @headers, and instead use @@rowcount:

    SET @recordCount = @@rowcount;

    SELECT @recordCount AS RecordCount, * FROM @headers;

    The plans you've shown suggest that the proc should be running faster than it was previously. Are you noticing any speedup?

    Yes, the speed is better now. It takes 2 minutes now, and previously it was always 3-4 minutes. I will try to remove the subquery in @headers and will let you know how it performs. I ran EXEC sp_updatestats today. Do you want me to update the statistics with the FULLSCAN option?

    I ran the query and noted the number of logical reads: more than 1.5 billion. I get a feeling that because we are querying a view, it can lead to unnecessary reads from base tables that are not required for a particular column but are part of the view definition.

    Also, Grant had pointed out the costly XML and asked me to bring it into a temp table, but my developer says that the data is dynamic, so it has to be read directly from the XML.

    Ninja, Nils, and Sean had pointed out that there is a function which is called 80k+ times because it is a scalar function, and that it should be changed to an inline function. It could save me 8-10 seconds for sure. I have asked the developer to write that; the pattern is sketched below.
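
    For reference, the pattern they described is roughly this (the function and table names here are invented, not our real ones):

    -- scalar function: executed once per row
    CREATE FUNCTION dbo.fnOrderTotal (@OrderID int)
    RETURNS money
    AS
    BEGIN
        RETURN (SELECT SUM(Amount) FROM dbo.OrderLines WHERE OrderID = @OrderID);
    END;
    GO

    -- inline table-valued equivalent: the optimizer can fold it into the main plan
    CREATE FUNCTION dbo.itvfOrderTotal (@OrderID int)
    RETURNS TABLE
    AS
    RETURN (SELECT SUM(Amount) AS OrderTotal FROM dbo.OrderLines WHERE OrderID = @OrderID);
    GO

    -- the call site changes from dbo.fnOrderTotal(a.OrderID)
    -- to CROSS APPLY dbo.itvfOrderTotal(a.OrderID) AS ot, selecting ot.OrderTotal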

    But I am very concerned about the number of reads. Please suggest something here. I can create even more indexes than strictly needed to find the right one, as my DB is read-only and used for reporting.

    Thanks

    Chandan

  • Yes, please update the stats with a full scan, so that we can rule out bad stats as a cause for bad performance.
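
    Something along these lines, run for each table the view touches (dbo.Orders here stands in for the real table names):

    UPDATE STATISTICS dbo.Orders WITH FULLSCAN;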

    After you do that, post the updated code and the new plan, and we'll see if we can improve it any more.

    In the last plan you posted, about 70% of the "cost" was in shredding the XML into table variables.

    (note that cost is a very inaccurate statistic, and it might have actually been much cheaper.)

    What Grant suggested was to create a table with an XML column, load the XML into it, and create XML indexes on it before shredding the XML. This would all be done in the stored proc, and it may or may not help you. XML in SQL Server is very slow.
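
    A minimal sketch of that idea; the table name, parameter name, element names, XPath, and data types here are all guesses, since we haven't seen the actual shredding code:

    -- stage the XML in a temp table with a clustered PK (required for a primary XML index)
    CREATE TABLE #xmlStage
    (
        id int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
        payload xml NOT NULL
    );

    INSERT INTO #xmlStage (payload) VALUES (@ipXml);   -- @ipXml: placeholder for the proc's XML parameter

    CREATE PRIMARY XML INDEX PXML_xmlStage ON #xmlStage (payload);

    -- shred from the indexed column instead of the raw parameter
    INSERT INTO @ipOrderStatusCodes (StatusCode, SubCode)
    SELECT n.value('(StatusCode/text())[1]', 'varchar(10)'),
           n.value('(SubCode/text())[1]',    'varchar(20)')
    FROM #xmlStage
    CROSS APPLY payload.nodes('/Codes/Code') AS t(n);  -- the path is a guess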

    (Edited for spelling)




  • Thank you. I will update the statistics with a full scan, then try changing the code as you suggested earlier, and then post both the code and the execution plan.

    Regards

    Chandan
