One of the biggest mistakes we can make when troubleshooting is to confuse observing that activity is occurring with identifying what the problem is. It’s not enough to accurately identify that STUFF is happening when you perceive that an issue exists. You must first identify what issue, if any, EXISTS, and then you must identify the CAUSE(S) of that issue.
A well-configured, well-provisioned RDBMS and the resources that support it can sustain large numbers of concurrent transactions. That in itself is a good thing, not a sign of an issue. Think of a highway: one could look at it and say “There are CARS! Lots of ’em! On a HIGHWAY!” Well, yeah. There are. That’s why we built it. That in itself doesn’t signify any problem. Now if you say “There’s a tractor trailer flipped over and it is blocking three lanes!” then we have a problem. But what if we have heavy volume during rush hour going into New York City? Is it meaningful to say: “Hey, there are a LOT of people trying to drive into New York today and they ALL seem to be in a hurry!” Maybe, but probably not. This happens every day, and I don’t think NYC is building any new bridges any time soon.
The challenge is to collect data, apply knowledge and experience, and see if the hypothesis you posit withstands logical reasoning.
Paul Randal is starting a series of posts on Avoiding Knee-Jerk Performance Troubleshooting. That’s a good place for all of us to start making our troubleshooting skills more effective.