There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. - Donald Knuth. Structured Programming with go to Statements, ACM Computing Surveys, Vol 6, No. 4, Dec. 1974
What premature optimisation is not
The above quote is not an excuse to write bad code. It is not an excuse to write lazy code. It is not an excuse to write code you know to be inefficient and problematic. It is not an excuse for sloppy decision-making. It is not an excuse skimp on design (eg, ‘let’s make all columns varchar(8000) and tune later if necessary’). In short, it’s not an excuse for taking shortcuts.
Databases should be properly designed, data types chosen to match the data being stored, queries should be written according to all good coding practices, indexes created to support those queries. That’s what database development is, good design, good coding practices. We do take pride in our work after all.
What should be avoided though is the random attempts at ‘tuning’ when there’s no idea what the problem is or even whether there’s a problem at all.
Measure, then tune
As always, the golden rule of optimization is profile first. - Bob Nystrom. Game Programming Patterns (https://twitter.com/munificentbob)
I keep seeing questions, on these forums and elsewhere along the lines of ‘How do I get rid of this hash join?’ or ‘How do I stop this query from doing a scan?’
The first question in situations like that has to be ‘Is that operation a problem?’ The second question should be ‘Is that query a problem?’ There’s often no answer to either.
The first phase of any performance tuning exercise must involve gathering performance-related metrics and identifying problematic queries/processes.
This doesn’t mean grabbing whatever queries show up in sys.dm_exec_requests when the users are complaining. It means doing sufficient investigation and analysis to determine what queries are part of the problem and which ones just happened to be running at the same time. The investigation may require plenty of time and effort.
Tuning queries randomly without knowing whether or not they are a performance problem is an excellent waste of time. It’s not good for much else though. It’s definitely not much good at solving overall performance problems. Without profiling the system first to identify which queries run frequently, which queries are causing the blocking, using high CPU, performing a lot of physical reads or whatever else has been identified as a problem, the chance of fixing the problem by tuning queries at random is somewhere close to zero.
The other reason to gather metrics first is so that you have values to check against to see whether you’ve actually achieved anything. I know it sounds obvious, but I’ve seen way too many ‘performance tuning exercises’ which cost a great deal of money and time but achieved little to nothing once the recommended changes were implemented.
Measure at every step of the process. Measure the system’s overall performance, measure the individual query performance before starting to tune, measure the query’s performance at every step of the tuning, measure the final performance after finishing the tuning, measure the system’s overall performance after implementing changes.
That ensures that the problems are identified, the problematic queries are the ones addressed, the changes are effective and that the system’s overall performance has improved after the changes were implemented.
Any changes which did not achieve a performance gain should be discarded. This is why measuring at every step is important. If a query’s performance is tested, fifteen changes made to the query, table and indexes and then the performance tested again, which of the fifteen changes were responsible for the change in performance? There’s no way to tell. That means that all fifteen of those changes have to be tested, deployed to source control, code reviewed and deployed to production and supported, even if most of the changes had no effect at all.
Detailed before and after metrics also make a damn nice report to send to the boss to show how good you are at your job. 🙂
Effort vs reward vs ongoing effort
Anytime the optimization sacrifices maintainability, it is highly suspect. Indeed, many such optimizations are superficial and may not actually improve the resulting code’s performance. - Joe Duffy. The Premature Optimisation is Evil Myth
Query tuning takes time and effort, often a lot of time and effort. Database professionals, like many IT professionals, in general have more work than they have time to do that work. As such, it is important that we spend time on that which is important.
Spending a day tuning a query to get it from 30 seconds to 5 seconds might sound like a good choice. However if the query only runs once a month, that was probably not an effective use of time. (Unless it’s a report that the big boss in the corner office runs. In which case it’s an excellent choice, and make sure he knows you tuned the code 😉 )
At every point in tuning, the effort expended (and expected effort) must be evaluated against the gains and expected gains. Don’t spend more time on tuning a query than it’s worth. This goes back to Knuth’s quote, critical code should be carefully examined and tuned where necessary, after that critical code has been identified. Critical code, i.e. code which is time-sensitive or which runs very frequently, is where the majority of tuning time should in general be spent, for the best overall gains.
To give an example, last week I was working on a stored procedure that ran around 7 hours. The procedure runs twice a day. I took half a day, rewrote part of it and got it down to 10 minutes. That’s a good gain for the effort expended. I could have spent another half a day and probably got it down to under five minutes. That would not have been a good gain, there are better tuning opportunities in the database, other queries I could spend that half day on which would get a better gain than saving 10 minutes a day.
Beware of the Sunk Cost fallacy. Having spent time trying to tune a query is not a good reason to continue to try and tune the query. If you’re not getting anywhere, consider an alternative approach, asking for advice or looking elsewhere for easier gains.
Unfortunately being able to judge likely gains and estimate time required comes with experience and, like any software time/cost estimation, is complex. It’s still something that should be kept in mind.
The other gain vs effort which must be kept in mind is the effort of future maintenance/changes. If an optimisation shaves 5% off the duration of the query, but makes it so complex that a future developer would be hard-pressed to understand the code if they need to change it, that optimisation is suspect and possibly should not be implemented. The decision whether or not to implement such an optimisation depends on how critical the code is, how likely future changes are and how well the complexities can be documented. Sacrificing maintainability for small performance gains is not, in general, a good trade off.
In Conclusion
Measure early, measure often.
Database performance tuning is a process which should be approached scientifically, with clear metrics which show where it is needed, where it will have the biggest impact and what that impact was.
Acknowledgements
Thanks to Jason Brimhall (b|t) , Wayne Sheffield (b|t) and Lynn Pettis (b|t) for reviewing this article.