Tuning a query

  • Ninja's_RGR'us (8/9/2011)


    I stopped asking why many moons ago... it just gives you headaches ;-).

    Do you have the new execution plan? At least we can see how we can help there.

    The last table is left; it has been running for the last 20 minutes. Once it is done, do you want me to give you the execution plan for the original short query or for the whole procedure?

  • chandan_jha18 (8/9/2011)


    Ninja's_RGR'us (8/9/2011)


    I stopped asking why many moons ago... it just gives you headaches ;-).

    Do you have the new execution plan? At least we can see how we can help there.

    The last table is left; it has been running for the last 20 minutes. Once it is done, do you want me to give you the execution plan for the original short query or for the whole procedure?

    Let's start with the same one.

    The "correct" way to go about this is to run the procedure once with profiler (after data has been cached if possible) on and trace the sp_statement completed event (of sql:statement for a sql batch). Then start by tuning the longest running (using cpu and reads as meters, duration includes wait time so it's less accurate). Assuming you did something similar to this to pick this query to optimize then we're on the right track.

    I didn't use the profiler. I ran the whole procedure once and found that each batch was taking almost 20-25 percent in the execution plan, so I just picked one batch and tried to concentrate on it.

    I will run the original statement ASAP and then post the new plan. I don't know why, but with the FULLSCAN option things are moving very slowly.
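
    For reference, a full-scan statistics update of the kind being run here is normally issued along these lines; the table name below is just the work table mentioned later in the thread, so substitute whichever tables the procedure actually touches:

    -- Recompute statistics by reading every row rather than a sample;
    -- slower, but it gives the optimizer more accurate row estimates.
    UPDATE STATISTICS dbo.CardData_temp WITH FULLSCAN;

    -- sp_updatestats covers the whole database, but only with default sampling.
    EXEC sp_updatestats;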

    The execution plan percentages are based on the estimated costs, which are based on the row estimations, which are incorrect in this case.

    Use Profiler, use STATISTICS IO, use STATISTICS TIME. The exec plan costs are estimates.
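
    A minimal example of turning those measurements on around the statement being tuned (the SELECT below is only a placeholder for the real query):

    SET STATISTICS IO ON;    -- logical/physical reads per table
    SET STATISTICS TIME ON;  -- parse/compile and execution CPU + elapsed time

    SELECT COUNT(*) FROM dbo.CardData_temp;  -- placeholder for the query under test

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;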

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • chandan_jha18 (8/9/2011)


    I didn't use the profiler. I ran the whole procedure once and found that each batch was taking almost 20-25 percent in the execution plan, so I just picked one batch and tried to concentrate on it.

    I will run the original statement ASAP and then post the new plan. I don't know why, but with the FULLSCAN option things are moving very slowly.

    That's not the most accurate way to do it (but it's better than nothing).

    But there's nothing like actual reads, CPU and duration (once you exclude the waits... and that could start a whole new discussion in and of itself).

    There are basically two types of tuning. #1 is to get the query running as well as it can on its own. #2 is, when you find it still doesn't run as expected, to find out what's blocking it (or what it's waiting on).

    We're on #1 atm.
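
    As an aside, one quick way to peek at the waits while the procedure runs in another session is the requests DMV; this is only a sketch, not a full wait-stats methodology:

    -- What each active request is doing, what it's waiting on, and who is blocking it.
    SELECT r.session_id,
           r.status,
           r.wait_type,
           r.wait_time,            -- ms spent in the current wait
           r.blocking_session_id,
           r.cpu_time,
           r.logical_reads,
           t.text AS running_sql
    FROM   sys.dm_exec_requests AS r
    CROSS  APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
    WHERE  r.session_id <> @@SPID;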

  • GilaMonster (8/9/2011)


    The execution plan percentages are based on the estimated costs, which are based on the row estimations, which are incorrect in this case.

    Use Profiler, use STATISTICS IO, use STATISTICS TIME. The exec plan costs are estimates.

    I am going mad :-( Do you mean that when I run a query with the "include actual execution plan" option, the plan shown is not real? Then why does everyone stress learning to read execution plans? 🙁

  • GilaMonster (8/9/2011)


    The execution plan percentages are based on the estimated costs, which are based on the row estimations, which are incorrect in this case.

    Use Profiler, use STATISTICS IO, use STATISTICS TIME. The exec plan costs are estimates.

    And there's also a bug. Sometimes <often> the total % for all the queries will exceed 100% (by a heck of a lot). Sometimes, even within a single query plan, you'll see a 150% total (I've seen 300%+ a couple of times).

    So all in all, your system doesn't seem fit for this technique!

  • Ninja's_RGR'us (8/9/2011)


    chandan_jha18 (8/9/2011)


    Ninja's_RGR'us (8/9/2011)


    I stopped asking why many moons ago... it just gives you headaches ;-).

    Do you have the new execution plan? At least we can see how we can help there.

    The last table is left; it has been running for the last 20 minutes. Once it is done, do you want me to give you the execution plan for the original short query or for the whole procedure?

    Let's start with the same one.

    The "correct" way to go about this is to run the procedure once with profiler (after data has been cached if possible) on and trace the sp_statement completed event (of sql:statement for a sql batch). Then start by tuning the longest running (using cpu and reads as meters, duration includes wait time so it's less accurate). Assuming you did something similar to this to pick this query to optimize then we're on the right track.

    Please find the new plan for the query I gave at the start of the thread. It is the one from after the stats were updated.

  • chandan_jha18 (8/9/2011)


    GilaMonster (8/9/2011)


    The execution plan percentages are based on the estimated costs, which are based on the row estimations, which are incorrect in this case.

    Use Profiler, use STATISTICS IO, use STATISTICS TIME. The exec plan costs are estimates.

    I am going mad :-( Do you mean that when I run a query with the "include actual execution plan" option, the plan shown is not real? Then why does everyone stress learning to read execution plans? 🙁

    They are real. The optimizer has to "best guess" its way to a good plan (checking every possible plan could take months, especially on a monster proc like this). It uses stats to do so, among a lot of other things. So if those are wrong, every decision based on them is also likely the wrong one.
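
    One quick, rough way to see how stale the stats might be is to check when they were last updated and what the histogram looks like; the table and statistics names below are only illustrative:

    -- When were the statistics on this table last updated?
    SELECT name AS stats_name,
           STATS_DATE(object_id, stats_id) AS last_updated
    FROM   sys.stats
    WHERE  object_id = OBJECT_ID('dbo.CardData_temp');

    -- Histogram and density details for one statistics object (name assumed for the example).
    DBCC SHOW_STATISTICS ('dbo.CardData_temp', 'IX_CardData_temp_Compagny');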

  • <Almost> no change. Bottom line is that every single estimate is wrong, at every level.

    Let's try to clean this mess up.

    Take the first query. Start using joins, remove the group bys / distinct (one step at a time) and see if you keep getting the same results at every step. You might need to call in the data architect to ask him about the data and if those queries should return the same data.

    There's a slim chance it'll help the optimiser. At the very least, it'll have less processing to do, so it can only help.

    Is there any way you can put a clustered index on CardData_temp Compagny? If this table is constantly being deleted from / inserted into, it's only going to help anyway (there's a nasty "feature" where pages don't get reused correctly in a heap, which can leave a buttload of pages unused but still scanned when selecting from that table).
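
    A sketch of what that could look like; the key column is a guess based on the names in this thread, so adjust it to whatever the table actually holds:

    -- Turning the heap into a clustered table so pages get reused after the
    -- constant delete/insert cycle. Compagny is assumed to be the column name.
    CREATE CLUSTERED INDEX CIX_CardData_temp_Compagny
        ON dbo.CardData_temp (Compagny);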

    Edit the query as much as you can without changing the results.

    If you see that the distinct is required, then ask the data architect if you are missing a join to a table, or maybe a column.

    I have a well designed ERP here and the only time I need to use group by is to do reporting.

    Once you're done post the new execution plan and then I'll send more options to try.

  • chandan_jha18 (8/9/2011)


    I am going mad :-( Do you mean that when I run a query with the "include actual execution plan" option, the plan shown is not real? Then why does everyone stress learning to read execution plans? 🙁

    Of course the plan's real. It was what the query execution engine used to run the query.

    The costs however, are estimates. They are generated at compile time. They are based on the optimiser's calculations and the row estimations. Those costs are used by SSMS to generate the %s that you see. Hence they too are estimates, and they can be wrong.

    We encourage people to learn to read execution plans so they don't get caught up on things like this - thinking the % and costs are real, hard numbers.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Best book around for that... and free => http://www.sqlservercentral.com/articles/books/65831/

  • Ninja's_RGR'us (8/9/2011)


    <Almost> no change. Bottom line is that every single estimate is wrong, at every level.

    Let's try to clean this mess up.

    Take the first query. Start using joins, remove the group bys / distinct (one step at a time) and see if you keep getting the same results at every step. You might need to call in the data architect to ask him about the data and if those queries should return the same data.

    There's a slim chance it'll help the optimiser. At the very least, it'll have less processing to do, so it can only help.

    Is there any way you can put a clustered index on CardData_temp Compagny? If this table is constantly being deleted from / inserted into, it's only going to help anyway (there's a nasty "feature" where pages don't get reused correctly in a heap, which can leave a buttload of pages unused but still scanned when selecting from that table).

    Edit the query as much as you can without changing the results.

    If you see that the distinct is required, then ask the data architect if you are missing a join to a table, or maybe a column.

    I have a well designed ERP here and the only time I need to use group by is to do reporting.

    Once you're done post the new execution plan and then I'll send more options to try.

    I can remove the group by, try running it, and track the execution with Profiler. The distinct cannot be avoided, as the result set varies without it, so I need to keep it 🙁

    I will do whatever I can following the advice from you people and post the results tomorrow. It's 1 AM here in India and I need to sleep. I will keep you updated about the progress. Thank you.

    OK, then check whether the join logic is correct. Distinct / group by <usually> should not be required for this if the table design is correct. It might be OK here after re-checking the list of tables.
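
    A tiny illustration of the kind of thing to check; the table and column names are made up for the example. If a join only matches on part of the key, each row fans out and DISTINCT is left hiding the duplication; joining on the full key removes the need for it:

    -- Joining only on CardNumber fans out one row per matching company row,
    -- and DISTINCT is then needed to hide the duplicates.
    SELECT DISTINCT c.CardNumber, c.Amount
    FROM   dbo.CardData_temp AS c
    JOIN   dbo.CompanyCards  AS cc ON cc.CardNumber = c.CardNumber;

    -- Joining on the full key (CardNumber + Compagny here, assuming that pair is
    -- unique in CompanyCards) returns one row per card without DISTINCT.
    SELECT c.CardNumber, c.Amount
    FROM   dbo.CardData_temp AS c
    JOIN   dbo.CompanyCards  AS cc ON cc.CardNumber = c.CardNumber
                                  AND cc.Compagny   = c.Compagny;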

    Also, in a few sentences, what's the goal behind putting this data in a temp table? What list of things is it supposed to save?

