Guidance is hard.
Seriously, you’d think it would be easy. You’d think you say things like, don’t shrink your database, most tables should have a clustered index, never go against a Sicilian when death is on the line, don’t mix sharks and tornados, and that would be it. You’d be done. But it’s not that easy. Even worse, it’s SHOCKINGLY easy to get stuff wrong.
An example.
I was looking at information over at Microsoft Developers Network (MSDN) in the SQL Server Books Online. I was reading through information about wait types when I found this little beauty:
Occurs when trying to synchronize the query processor exchange iterator.
Know what that is? Yeah, OK, a few of you who memorize wait stats do (shut up Tom), but most of us won’t recognize the CXPACKET wait. Maybe if Bob Ward explained it:
Used to synchronize threads involved in a parallel query. This wait type only means a parallel query is executing.
Oh, well, heck, that makes sense. But that’s not what I’m going off about here. No. What I’m freaked about is the next sentence over at Microsoft:
You may consider lowering the degree of parallelism if contention on this wait type becomes a problem.
Go run this query:
SELECT * FROM sys.dm_os_wait_stats;
(or you can use sys.dm_db_wait_stats on a WASD database)
There’s a pretty good chance, on most servers that have multiple CPUs, that you’re going to see CXPACKET within the top 5 or maybe top 10 waits. Now, using Microsoft’s guidance, you should immediately go and reduce the parallelism on your server, right?
NO! No, no, no, no, no, no. NO!
There are a ton of things you should do first. For example, as we see from Bob Ward’s explanation, this just indicates that you have parallel queries. It doesn’t mean that you have a problem with parallel queries. Parallel queries are good. Some people (Adam, looking at you here) go out of their way to try to get parallel queries (where appropriate). So no, you shouldn’t be setting MAXDOP = 1 the first time you see CXPACKET waits.
Instead, check to see if you have other indications of excessive CPU usage. Do you have queues on the CPU? Do you have queries that are using excessive amounts of CPU?
Let’s say you do have high CPU. Time for MAXDOP = 1 right? No! Stop! Let’s try this first. Let’s go see what the Cost Threshold for Parallelism is set to on your server. The default is 5. That’s an INSANELY low number. It’s using the cost estimates of the query to arrive at which plans might benefit from parallelism. So, if we think we’re suffering from excessive parallelism, instead of turning it off, let’s first try raising the threshold a bit, say to 35. Run with that for a week on the server and see how things stand now. We can also try, oh, I don’t know, tuning the query, making sure we have good indexes, making sure we have good statistics, in short, a whole bunch of stuff before we resort to MAXDOP = 1.
Guidance is hard. But you really can’t afford too be this far off the mark. Go and read through the guidance offered by Bob Ward. It’s not one sentence. It can’t be. Offering guidance requires quite a lot of information because there is seldom a one sentence answer that adequately covers guiding someone through a problem, or even identifying if you have a problem. We, the crazy people who try to tell others what to do, need to be careful when providing glib and simple answers to complex problems.
Oh, and avoid land wars in Asia too.
UPDATE: While this was good information, things sometimes do move quickly on the internet. Microsoft has updated their entry on CXPACKET waits. Read all about it here.
I suddenly feel like the Man in Black fighting Fezzik:
I just want you to feel you’re doing well.