Use your own optimizer to tune your queries

  • Gullimeel (5/18/2012)


    Thanks.

    Do not worry I will not create my own sort.

    However, people who query dm_exec_query_stats to find costly queries miss this sort cost. For example, with this SalesOrderHeader query, it will show just 706 logical IOs, which is only about 10% of the actual cost. Thus we end up comparing 10% of one query's cost against 100% of another's, and concluding that the query with no sort is more costly than the one with the sort, when the sort query might actually be much more costly than the original one.

    Thus, when looking for the queries that are putting pressure on SQL Server, we should consider the exec_query_stats DMV along with the memory grants DMV.
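    Something along these lines is what I mean (a rough sketch only; note that sys.dm_exec_query_memory_grants shows only queries that are executing or waiting for a grant right now, so the join is best-effort rather than a historical record):

        -- Sketch: runtime counters from the query stats DMV alongside live
        -- memory grant figures. Grant rows exist only while a query runs.
        SELECT qs.total_worker_time, qs.total_elapsed_time,
               qs.total_logical_reads,
               mg.requested_memory_kb, mg.granted_memory_kb, mg.used_memory_kb,
               st.text
        FROM sys.dm_exec_query_stats AS qs
        LEFT JOIN sys.dm_exec_query_memory_grants AS mg
               ON mg.plan_handle = qs.plan_handle
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
        ORDER BY qs.total_worker_time DESC;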

    But you're still just comparing logical I/O in that case. Don't get stuck there. Take everything into account: I/O, CPU, execution time. Combined, these indicate a costly query. Any one on its own doesn't provide a complete enough picture of what's going on with the queries.

    And as far as the fix goes, it's good that you're getting some benefit from it, but it's not an approach that very many people are going to be able to apply. I used to manage about 50 production servers with several hundred different databases on them (not a huge install by any means). Trying to achieve that level of understanding of the data in any one table in order to implement your approach would have consumed all my time. Instead, I learned and used standard tuning methods that are more widely applicable and can be applied quickly and easily. Plus, since they're part of the standard methods, they're more likely to work in more cases, because they work with the optimizer rather than attempting to bypass it.

    It's a good experiment, but its applicability is severely questionable.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Gullimeel (5/18/2012)


    However, people who query dm_exec_query_stats to find costly queries miss this sort cost.

    No they don't (well, not unless they only look at logical IO). The sort cost is included in the CPU (worker time) and duration, sorts are CPU intensive. If the sort spills, the IOs will be included in the IO counters. Cross apply to sys.dm_exec_query_plan and you can see the query cost (which includes the sort) and the sort operator.
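    For example, something like this (a sketch only; ordering by worker time is one reasonable choice among several):

        -- Runtime counters plus the cached plan; open query_plan to see the
        -- sort operator and the cost the optimizer assigned to it.
        SELECT TOP (10)
               qs.total_worker_time, qs.total_elapsed_time,
               qs.total_logical_reads, qs.total_logical_writes,
               qp.query_plan
        FROM sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
        ORDER BY qs.total_worker_time DESC;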

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • In addition to the logical IO you seem to be fixated on, you must also consider the TYPE of IO. Random IO on rotating media, which is what you will get for the index seek/bookmark lookup plan you are "forcing", is HORRIBLY INEFFICIENT compared to the sequential IO you should get from a scan. That is why the optimizer costs those IO types VERY differently, and the math leads to changing from seek/lookup to scan at a SURPRISINGLY LOW number of estimated rows compared to the total number of rows in the table.
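    To put rough numbers on that (using the community-documented cost constants of roughly 1/320 per random IO and 1/1350 per additional sequential page; these are reverse-engineered approximations, not official figures, and the page/row counts below are made up for illustration):

        -- Back-of-envelope IO costing with community-derived constants.
        DECLARE @pages int = 700;   -- pages in the table (illustrative)
        DECLARE @rows  int = 1600;  -- estimated rows needing bookmark lookups

        SELECT scan_io_cost   = 0.003125 + (@pages - 1) * 0.00074074, -- sequential
               lookup_io_cost = @rows * 0.003125;                     -- random, per lookup

    For a 700-page table the scan comes out around 0.52 while 1600 lookups come out at 5.0, which is why the plan flips to a scan long before the estimated row count approaches the table's total.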

    Now, if SSDs were in play AND the optimizer was updated to KNOW they were available - the calculus would be quite different due to the incredible efficiency with which they perform random IO. But that isn't the case with the SQL Server optimizer in any current implementation that I am aware of.

    As an aside, I must say that I find it laughable that you are going on and on, butting heads with quite a few of the best SQL Server relational engine tuning experts in the world: people who have not just literally written the book, but have taught it AND done it in the real world. I won't even bother adding my voice to the "you are going down the wrong path" chant, since you clearly have no desire/ability to listen to exceptionally well stated reasoning and logic.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on Google's mail service

  • But you're still just comparing logical I/O in that case. Don't get stuck there. Take everything into account: I/O, CPU, execution time. Combined, these indicate a costly query. Any one on its own doesn't provide a complete enough picture of what's going on with the queries.

    I am not comparing just the logical IO. I compare four parameters: logical IO, physical IO, CPU time, and memory used. I just mentioned one. But I wish Microsoft provided a new column in that DMV, say 'cost', calculated from all the criteria used in computing the estimated cost but using the actual values of those criteria.

    I am not saying that you should change all your procedures to use my code. I have mentioned in the script itself that only a person who understands the data very well should give it a try, and only when standard tuning doesn't work. I have tried this in just four procedures in the last three years.

    GulliMeel

    Finding top n Worst Performing queries
    Improve the performance of Merge Join (special case)
    How to Post Performance Problems - Gail Shaw

  • [highlight=""]No they don't (well, not unless they only look at logical IO). The sort cost is included in the CPU (worker time) and duration, sorts are CPU intensive. If the sort spills, the IOs will be included in the IO counters. Cross apply to sys.dm_exec_query_plan and you can see the query cost (which includes the sort) and the sort operator.

    [/highlight]

    Yes, it is included in the CPU cost, but again, nothing related to memory is reported anywhere.

    GulliMeel

    Finding top n Worst Performing queries
    Improve the performance of Merge Join (special case)
    How to Post Performance Problems - Gail Shaw

  • In addition to the logical IO you seem to be fixated on, you must also consider the TYPE of IO. Random IO on rotating media, which is what you will get for the index seek/bookmark lookup plan you are "forcing", is HORRIBLY INEFFICIENT compared to the sequential IO you should get from a scan. That is why the optimizer costs those IO types VERY differently, and the math leads to changing from seek/lookup to scan at a SURPRISINGLY LOW number of estimated rows compared to the total number of rows in the table.

    Yes, I know that, and that is why I wanted the threshold to be less than 1570. At around 1570 rows the IOs are the same for the table scan and the index seek. I thought the actual tipping point would be less than that, but since I had not taken the CPU and other costs into account, I changed the threshold to 1600. The actual tipping point was somewhere between 1800 and 1900.
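    This is roughly how you can probe for the tipping point empirically (an AdventureWorks-style illustration; the ranges and the exact flip point will differ on any other system):

        SET STATISTICS IO ON;

        SELECT SalesOrderID, OrderDate
        FROM Sales.SalesOrderHeader
        WHERE CustomerID BETWEEN 11000 AND 11100; -- few rows: expect seek + key lookup

        SELECT SalesOrderID, OrderDate
        FROM Sales.SalesOrderHeader
        WHERE CustomerID BETWEEN 11000 AND 20000; -- many rows: expect clustered index scan

        SET STATISTICS IO OFF;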


    As an aside, I must say that I find it laughable that you are going on and on, butting heads with quite a few of the best SQL Server relational engine tuning experts in the world: people who have not just literally written the book, but have taught it AND done it in the real world. I won't even bother adding my voice to the "you are going down the wrong path" chant, since you clearly have no desire/ability to listen to exceptionally well stated reasoning and logic.


    I know who these guys are (I am reading one of Grant's books), and they have been in this area much longer than I have, but that's not the point here.

    GulliMeel

    Finding top n Worst Performing queries
    Improve the performance of Merge Join (special case)
    How to Post Performance Problems - Gail Shaw

  • Gullimeel (5/19/2012)


    However, people who query dm_exec_query_stats to find costly queries miss this sort cost.

    The DMV provides a number of runtime performance metrics to compare (e.g. elapsed time, processor time, reads, writes). Why do you say the sort cost will be missed? As Gail said, the sort will affect elapsed time and processor time, and the I/O measures too (if the sort exceeds the memory assigned to it at runtime). Actual memory usage might be an interesting extra piece of information for this DMV to report, but that is not the case today. Is that what you mean?

    Gullimeel (5/19/2012)


    For example, with this SalesOrderHeader query, it will show just 706 logical IOs, which is only about 10% of the actual cost. Thus we end up comparing 10% of one query's cost against 100% of another's, and concluding that the query with no sort is more costly than the one with the sort, when the sort query might actually be much more costly than the original one.

    You've lost me here. I see the 706 logical I/Os (which do not tell the whole story as we have discussed at length) but I don't see where the 10% figure comes from. Are you comparing logical I/Os for this query to your original test query?

    Gullimeel (5/19/2012)


    But I wish Microsoft provided a new column in that DMV, say 'cost', calculated from all the criteria used in computing the estimated cost but using the actual values of those criteria.

    You could always suggest this on the Connect website, if you can make the case that it would be useful. I don't see much value in calculating an 'actual' cost, since it is just an abstract number used for modelling purposes. Other metrics like elapsed time and processor usage are probably more useful. There are certainly other things that could be reported (like actual row count and size versus estimated values) but we don't have that today.

    Gullimeel (5/19/2012)


    I am not saying that you should change all your procedures to use my code. I have mentioned in the script itself that only a person who understands the data very well should give it a try, and only when standard tuning doesn't work. I have tried this in just four procedures in the last three years.

    I don't think you made that at all clear in your previous posts. One of the 'standard techniques' is to conditionally choose a hinted query based on inspection of a runtime parameter. Your idea is an extension of that, and could have value in some very specific extreme cases, but nothing I have ever encountered. Perhaps your initial post was a little over-excited. It probably would have helped if you had strongly emphasised that the technique you describe is a very last resort, applicable to very few situations, and then only with extreme care (and I still don't like it!)
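    For reference, a minimal sketch of that standard technique (the procedure, table, index, and threshold here are illustrative, not taken from your script):

        -- Branch on a runtime row count and hint only the branch that needs it.
        CREATE PROCEDURE dbo.GetOrdersForCustomer
            @CustomerID int
        AS
        BEGIN
            DECLARE @rows int;

            SELECT @rows = COUNT(*)
            FROM Sales.SalesOrderHeader
            WHERE CustomerID = @CustomerID;

            IF @rows < 1600 -- below the observed tipping point: favour the seek
                SELECT SalesOrderID, OrderDate
                FROM Sales.SalesOrderHeader WITH (INDEX (IX_SalesOrderHeader_CustomerID))
                WHERE CustomerID = @CustomerID;
            ELSE            -- above it: let the optimizer choose the scan
                SELECT SalesOrderID, OrderDate
                FROM Sales.SalesOrderHeader
                WHERE CustomerID = @CustomerID;
        END;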

  • The DMV provides a number of runtime performance metrics to compare (e.g. elapsed time, processor time, reads, writes). Why do you say the sort cost will be missed? As Gail said, the sort will affect elapsed time and processor time, and the I/O measures too (if the sort exceeds the memory assigned to it at runtime). Actual memory usage might be an interesting extra piece of information for this DMV to report, but that is not the case today. Is that what you mean?

    Yes.

    You've lost me here. I see the 706 logical I/Os (which do not tell the whole story as we have discussed at length) but I don't see where the 10% figure comes from. Are you comparing logical I/Os for this query to your original test query?

    No, this 10% is the cost of the clustered index scan in the plan you posted.

    Regarding the new column in that DMV: it would help in comparing the performance of queries. Currently, we compare based on logical IO, elapsed time, CPU time, and physical IOs separately. I would also like a counter for rows processed or rows affected, as this would be quite useful. A query that processes 1 million rows might reasonably do, say, 1 million IOs, so 1 million IOs there should not be much of a concern. But if another query processes only a few rows and clocks, say, 20,000 IOs, then that query is the one that needs some tuning.

    In my next posts I will clearly mention the scenarios where you should use it.

    GulliMeel

    Finding top n Worst Performing queries
    Improve the performance of Merge Join (special case)
    How to Post Performance Problems - Gail Shaw

  • Gullimeel (5/19/2012)


    Yes.

    Well you have the memory grant reported in an 'actual' execution plan, and the DMV (while the query is executing). The extra information might be useful for some specific tuning tasks, but it's not needed in general. If you want to suggest it to Microsoft, file a suggestion and justification at connect.microsoft.com.
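    For example (requires VIEW SERVER STATE, and rows appear only while queries are executing or waiting for a grant):

        SELECT session_id, requested_memory_kb, granted_memory_kb,
               used_memory_kb, max_used_memory_kb
        FROM sys.dm_exec_query_memory_grants;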

    No, this 10% is the cost of the clustered index scan in the plan you posted.

    So this is just another argument against using logical I/O as a primary tuning metric. Remember, your original post used logical I/O alone; that's why we're discussing it. And this is why, for general tuning, you need to consider and balance all the available performance metrics (including elapsed time and processor usage). The lack of memory usage information in the query stats DMV does not prevent us from tuning query execution well.

    I would also like a counter for rows processed or rows affected, as this would be quite useful.

    SQL Server 2012 has total, last, min, and max row counts in sys.dm_exec_query_stats.
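    For example:

        -- New in SQL Server 2012: row count columns in the query stats DMV.
        SELECT TOP (10)
               qs.execution_count,
               qs.total_rows, qs.last_rows, qs.min_rows, qs.max_rows,
               qs.total_logical_reads,
               st.text
        FROM sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
        ORDER BY qs.total_logical_reads DESC;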

    In my next posts I will clearly mention the scenarios where you should use it.

    I think it's a bit late now. Very few people are likely to read this far in future. You could edit your first post instead? Just a suggestion.

  • I mentioned that logical IOs should be one of the criteria for tuning. I usually prefer the amount of data processed, logical IO, physical IO, CPU time, memory, and elapsed time.

    GulliMeel

    Finding top n Worst Performing queries
    Improve the performance of Merge Join (special case)
    How to Post Performance Problems - Gail Shaw

  • I usually prefer the amount of data processed, logical IO, physical IO, CPU time, memory, and elapsed time for performance comparison.

    GulliMeel

    Finding top n Worst Performing queries
    Improve the performance of Merge Join (special case)
    How to Post Performance Problems - Gail Shaw

  • I can see this discussion going on forever. I think it is time to stop as it is quite apparent that the author is not going to be swayed. It is my hope that people will heed the arguments presented in this discussion before considering implementing this solution. I can't see myself using it, as I personally would not want to maintain it or train others in how to do so.

  • Lynn Pettis (5/19/2012)


    I can see this discussion going on forever. I think it is time to stop as it is quite apparent that the author is not going to be swayed.

    Thanks for your insight Lynn, but I don't share your assessment. I think we have explored quite a number of interesting aspects here, and I would like to think that some progress has been made. Feel free to unsubscribe from this thread if the continuing discourse bothers you.

  • SQL Kiwi (5/19/2012)


    Lynn Pettis (5/19/2012)


    I can see this discussion going on forever. I think it is time to stop as it is quite apparent that the author is not going to be swayed.

    Thanks for your insight Lynn, but I don't share your assessment. I think we have explored quite a number of interesting aspects here, and I would like to think that some progress has been made. Feel free to unsubscribe from this thread if the continuing discourse bothers you.

    It has been interesting, and I have learned quite a bit. I just don't see the author changing his opinion. I may be reading it wrong, and I hope I am, but all I see is him continuing to defend his solution. I do see that he uses it very sparingly, but he still insists that it should be a viable solution. I would not want to see an inexperienced DBA try to implement this solution, and it isn't one I can see myself using.

  • Lynn

    I have added four points at the start of the script, and I hope they will be enough to make clear when this can and cannot be used.

    GulliMeel

    Finding top n Worst Performing queries[/url]
    Improve the performance of Merge Join(special case)
    How to Post Performance Problem -Gail Shaw[/url]
