May 26, 2011 at 5:12 pm
I'm using a server with 4 Quad-Core processors, 96GB of relatively dated RAM, and a 2-disk RAID array for a decently-sized ETL (~700GB) for a project that is currently in the development phase. I can expect a particular step in the process to take between 12-14 hours in its original design configuration. The other day, I happened to notice that this step had massive waits due to our old friend, CXPACKET. Thinking that I could reduce the run-time on this step by getting rid of the waits, I set MAXDOP to 8, 4, and then 2...each one still resulted in lots of CXPACKET waits. It was only when I reduced to 1 that the waits went away. Now, instead of the step taking 12-14 hours, it's barely 10% complete at the same marker. While processing, CPU time is way down (obviously, I guess), there aren't any waiting tasks, and disk I/O is very low (10-13mb/sec).
I've read that CXPACKET waits can come from effectively having too much CPU bandwidth versus spindle throughput and RAM. This makes sense to me, given my configuration... Unfortunately, as this is a POC (proof of concept, not piece of crap - though, that description works, too) that I'm working on, this is the most that I can work with until we move forward on the project.
Is it reasonable to expect to use "Max Degree of Parallelism" to completely eliminate CXPACKET waits in this context?
I've gone back and set MAXDOP=8 for this step and, while I do have a lot of waits, it does complete in the same timeframe as before. I am (perhaps errantly) convinced that I'm just dealing with a hardware limitation at this point...that said, has anyone experienced any negative side effects of these waits piling up? From my perspective, until I can obtain a beefier server, it's better to have the damned thing finish with CXPACKET waits than to crawl along without them.
May 26, 2011 at 5:24 pm
CXPacket waits are not bad things directly.
It means that the internal optimizer isn't balancing the load between the threads perfectly, and some end up waiting. It could be they're waiting on the 'feeder' that occurs at the beginning, or that some previously split components are more chewy then others.
Now, if you're getting significant splits in the wait differences between the threads, or the parallelism (and the subsequent re-integration of the streams) is slowing you down, then you drop maxdop down and test, as you've done. It doesn't always help, and usually hurts except in smaller queries. The other reason to drop it down is to give threads back to the rest of the system. 😉
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply