Tom LaRock has a new meme for Meme Monday. It’s all about the problems caused in your system other than disks. Thankfully, despite the title, I don’t have to list 99 separate things, only 9, but you know what, 99 is possible. I’m going to present the problems. You find the solutions on your own today. Let’s go.
Recompiles
I’ve seen queries so big that they take more than three minutes to compile. That’s the edge case, but as an edge case it is educational. The most important thing to remember about recompiles is that they are driven by data changes. Once a threshold is reached on any given set of statistics, all queries referencing that set of statistics gets marked for recompile. The key words and tricky phrase here is “any given set of statistics.” My edge case was a query against 86 tables. Any of the statistics on any of the 86 tables changed, and the query went into it’s three minute recompile. The problems come in because while it’s recompiling, it’s locked and everyone has to wait. And waiting is the performance problem.
Parallelism
The default cost threshold for parallelism is 5. This means that any query plan that has an estimated cost greater than 5 could be marked as a parallel query plan. Parallel execution is a good thing, when the costs of breaking the code into disparate streams and gathering them back together is offset by the use of multiple processors. The problem is that cost is generally much higher for the break apart and gathering than the benefits of bringing multiple processors into play.
Deadlocks
Yeah, deadlocks are a performance problem. You have to deal with the fact that at least one process goes through some part of the steps it was doing and then has to roll back. The rollback process is costly and then it has to be resubmitted for processing again. These things can seriously kill performance.
ORM
These things are great. These things are evil. I think of ORM tools like a firearm. In the proper hands, used the proper way, they’re an excellent tool. In the wrong hands and used the wrong way, mayhem ensues. Same thing goes with most of the ORM tools. Further, many of the ORM tools, very well built pieces of software, are advertised and pushed in a way that makes people use them incorrectly. It would be like someone showing off hand gun twirling with loaded pistols as an example of safe gun handling. More education and more appropriate expectations are needed here.
Dynamic Queries
To a degree these are an offshoot of ORM tools, but they’ve been around a lot longer. You’ve all seen them and I think most of us have built them at one point or another; the catch-all query, the report with infinite flexibility, etc. They fill memory up and flush things out of cache and cause untold grief.
Database Engine Tuning Advisor
If Microsoft is going to put something out there that advises people on how to improve performance, you would expect it to work. It doesn’t. Oh, it can. I’ve seen it catch some of the simplest queries and fix them. I’ve also seen it miss the simplest queries and what it suggests for complex queries can be really, really scary. It will have you adding indexes and individual sets of statistics all over the place. You’ll duplicate existing indexes. All sorts of stuff. And, because it’s from Microsoft, people take what it tells them as gospel and implement it, with no testing or evaluation at all.
TSQL
I don’t think the language is all that hard. But, evidently it is. And it’s easy to muck it up and write code that is truly not good.
Backups
No, not the act of backing up databases, but the fact that people can’t seem to take backups. Don’t think that’s a performance issue? How fast are queries running when everything is offline? How good is performance when the database is offline because it’s corrupt?
Query Optimizer
And no, it’s not that I think the optimizer itself is a performance problem, but that so many people rely on it to perform miracles that it just can’t perform. I have an 86 table join, why isn’t everything running fast? I’ve got 36 deep nested views, how much the query is bad? It’s the expectations that the optimizer can and will figure out anything for them quickly, efficiently, and accurately, so that they don’t have do the things that they should that becomes a problem. And I understand it. The optimizer does incredible amounts of processing in very short periods of time. It’s truly an amazing piece of software. But, because it does all this stuff so well, I think it’s taught people to rely on it for too much and it just can’t deliver everything. It doesn’t have a “run faster” button that many people seem to think it has.