April 30, 2004 at 3:20 pm
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnist
May 5, 2004 at 2:15 am
A nice concise article as an introduction to the mega-complex area of performance tuning - just the sort of thing I need.
Much appreciated.
Jamie Thomson
Conchango UK
Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
May 5, 2004 at 4:28 am
Very good guide to the basic counters; looking forward to the rest of the series.
Darren Woodcock
--------------------------------------------------------------------------------------
Recommended articles on how to help us help you and solve commonly asked questions:
Forum Etiquette: How to post data/code on a forum to get the best help by Jeff Moden
Managing Transaction Logs by Gail Shaw
How to post Performance problems by Gail Shaw
Help, my database is corrupt. Now what? by Gail Shaw
May 5, 2004 at 5:12 am
Very helpful list and explanations.
I wondered, though, if perhaps the Process object should have been the Processor object?
-Caleb
May 5, 2004 at 7:00 am
Steve,
Great article.
Can you elaborate on the rules of thumb for disk queue length and processor queue length? In the past, I have heard that anything (sustained) above 2 is a high number for the disk, and that anything above 10 is high for processor queue length. You say 8 and 2, respectively.
Thanks,
Ben Reeder
May 5, 2004 at 9:52 am
Thanks everyone, glad you liked it.
As far as the CPU % goes, it is the System object. Apologies!
Disk Q - A tough one, and honestly I have to admit my experience with the latest and greatest is a little old. I've been working with SANs quite a bit lately, and the counters don't translate as well. However, I've usually gone with a rule-of-thumb guesstimate of 8 per disk. When I see this, usually in conjunction with longer lock wait times while the number of reads/writes isn't growing, I get worried. Of course, the complaints from the users are the main benchmark.
For the CPU Q, the standard benchmark has been 2 per CPU sustained over time, which for me means more than 10-15 minutes, before I get worried. If it's an isolated occurrence because of some report or query, I can let that go. But if it happens 4 times a day, then I need to consider upgrading the CPUs or adding more. Of course that's tough, so this metric often serves more as support for an argument for a change. If the Q stacks up to 5, 10, or more per CPU and the CPUs are at 100%, then you likely have a bottleneck here. If the CPU is at 100% and the Q is low, then it's just busy.
Now with hyperthreading, the 2 per CPU may need to be modified to 3 or 4 per physical CPU. Honestly, I haven't done enough comparison testing here. Hopefully I'll get to do some over time.
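As a rough cross-check from inside SQL Server (a sketch of a related technique, not something from the article): sessions sitting in the 'runnable' state in sysprocesses are waiting for CPU, so a sustained count well above the number of CPUs points the same way as a high processor queue in Perfmon.

-- A minimal sketch against master.dbo.sysprocesses (SQL Server 2000/2005).
-- 'runnable' sessions have work to do but are waiting for a CPU; a count
-- that stays well above the number of CPUs suggests CPU pressure.
SELECT COUNT(*) AS runnable_sessions
FROM   master.dbo.sysprocesses
WHERE  status = 'runnable';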
May 5, 2004 at 2:51 pm
Here is the list of counters I used to use when I was a DBA. It is based on the SQL Server Operations Guide from Microsoft (http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlops0.mspx). Many of them match the counters in Steve's article, but there are some on this list that you might not see an immediate need for. The purpose of some of these is to establish a performance baseline, so that when a user complains about how performance has "suddenly" gone to pot, you can compare the current counters to the baseline and see whether the problem really is related to SQL Server.
Here is my list (a query sketch for capturing the SQL Server-side counters follows it):
Memory
Network Interface
Physical Disk
Process
Processor
SQL Server: Access Methods
SQL Server: Buffer Manager
SQL Server: Cache Manager
SQL Server: Databases (All Instances)
SQL Server: General Statistics
SQL Server: Latches
SQL Server: Locks
SQL Server: Memory Manager
System
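One way to capture the SQL Server side of such a baseline from inside the engine is to snapshot master.dbo.sysperfinfo (the system table behind the SQL Server Perfmon counters; sys.dm_os_performance_counters on SQL Server 2005) on a schedule. This is only a sketch: the table name dbo.PerfBaseline is illustrative, and the OS-level objects (Memory, Network Interface, Physical Disk, Process, Processor, System) are not exposed there, so they still need a Perfmon log.

-- A minimal sketch: snapshot the SQL Server counters so "suddenly slow"
-- complaints can be compared against normal values. Most */sec counters
-- are cumulative since startup, so rates come from differencing samples.
SELECT  GETDATE() AS sample_time,
        object_name,
        counter_name,
        instance_name,
        cntr_value
INTO    dbo.PerfBaseline      -- first run creates the table
FROM    master.dbo.sysperfinfo;

-- Later samples would use INSERT INTO dbo.PerfBaseline SELECT ... instead.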
[font="Tahoma"]Bryant E. Byrd, BSSE MCDBA MCAD[/font]
Business Intelligence Administrator
MSBI Administration Blog
November 11, 2005 at 7:33 am
Greetings,
Nice article. After years of monitoring Oracle, DB2 and SQL Server databases, I built a three-level performance monitoring checklist for each of those RDBMSs. The first level is a day-to-day monitoring list. On each server I run a perfmon session that logs the results to a CSV file, and I do some trending with Excel. It is a 3-minute process per server.
When I notice something abnormal, I run the second level. 95% of the time I can positively identify that there is a problem and begin to address it. If I don't see anything at level 2, I go on to level 3. Frankly, I have gone to level 3 only 7-8 times in the last 5 years. (A query sketch for the SQL Server-side counters follows the lists.)
Level 1:
Processor >> % Total Processor time
Physical Disk >> % Disk utilization
Physical Disk >> Avg. Disk Queue Length
System >> Processor Queue Length
System Object : Avg. Disk Queue Length
Free Memory (KBs) (Task Manager)
Total memory available (Task Manager)
SQL Memory Manager >> Total server memory
Memory Object : Available MBytes
Level 2:
Check SQL connectivity issues
Memory >> Pages/sec
SQL Buffer Manager >> Page Writes/sec
SQL Buffer Manager >> Page Reads/sec
SQL Buffer Manager >> Buffer Cache Hit Ratio
SQL Access Methods >> Index Searches/sec
SQL Server Databases Methods: Transactions/Sec
SQL Access Methods >> Page Splits/sec
SQL blocking situations
SQL errors or warning messages in the log
SQL abnormal activity in Current Activity or Profiler (logging errors and warnings)
Level 3:
SQL Server Locks Object : Average Wait Time
SQL Locks >> Lock Waits
SQL Locks >> Number of Deadlocks
Server Network Interface Object >> Bytes Total/sec
SQL SQL Statistics >> SQL Compilations/sec
SQL SQL Statistics >> SQL Re-Compilations/sec
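The SQL Server-side counters in levels 2 and 3 can also be read from inside the engine. Below is a minimal sketch against master.dbo.sysperfinfo (sys.dm_os_performance_counters on SQL Server 2005), not Alain's actual script. Keep in mind that the OS-level objects (Processor, Physical Disk, System, Memory) are not in this view, most of the /sec values are cumulative since startup (sample twice and difference them to get a rate), and Buffer Cache Hit Ratio has to be divided by its 'base' counter.

-- A minimal sketch: read the SQL Server-side counters from the lists above
-- without leaving Query Analyzer / Management Studio.
SELECT  RTRIM(object_name)   AS object_name,
        RTRIM(counter_name)  AS counter_name,
        RTRIM(instance_name) AS instance_name,
        cntr_value
FROM    master.dbo.sysperfinfo
WHERE   counter_name IN ('Page reads/sec', 'Page writes/sec',
                         'Buffer cache hit ratio', 'Buffer cache hit ratio base',
                         'Index Searches/sec', 'Page Splits/sec',
                         'Transactions/sec', 'Average Wait Time (ms)',
                         'Lock Waits/sec', 'Number of Deadlocks/sec',
                         'SQL Compilations/sec', 'SQL Re-Compilations/sec');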
Thank you
Alain Gagne OCP, DB2 expert, MCDBA, MCSE, MCSD
November 11, 2005 at 8:34 am
In general, I've noticed the following about disk Qs.
1. There are 2 schools of thought to follow for a general rule of thumb.
a. No more than 2 per CPU-- that means for every CPU on the box, you can withstand a Q up to 2. So the Q for a 4-way box would be 8 tops.
b. No more than 2 per spindle-- that means for every HD on the box you can withstand a Q of up to 2. This requires you to know how many spindles are in your raid set, but it's easy info to come by.
In general, I've noticed that they are both valid under different circumstances. The first rule is handy for small to mid sized boxes with direct attached storage. The 2nd is good for SANs and larger boxes, though it will work well with mid size boxes on a SAN as well.
Of course, these are merely rules of thumb to see where you should start looking for trouble. The tie breaker of course is fn_virtualfilestats. This is where you'll get your most valuable info.
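For example, here is a minimal sketch of the sort of fn_virtualfilestats query meant here (SQL Server 2005 syntax; on SQL 2000 pass explicit dbid/fileid values instead of NULLs), ranking database files by average I/O stall:

-- IoStallMS is the cumulative time, in ms, that users waited on I/O for
-- each file since startup; high and growing averages back up what the
-- disk queue counters are hinting at.
SELECT  DB_NAME(vfs.DbId) AS database_name,
        vfs.FileId,
        vfs.NumberReads,
        vfs.NumberWrites,
        vfs.IoStallMS,
        vfs.IoStallMS / NULLIF(vfs.NumberReads + vfs.NumberWrites, 0) AS avg_stall_ms_per_io
FROM    ::fn_virtualfilestats(NULL, NULL) AS vfs
ORDER BY avg_stall_ms_per_io DESC;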
--Sean.
Watch my free SQL Server Tutorials at:
http://MidnightDBA.com
Blog Author of:
DBA Rant – http://www.MidnightDBA.com/DBARant
November 11, 2005 at 8:45 am
Nice article. You inspired me to add a few counters (to my cluttered mess!) that I was missing.
Three Points
1. I believe the "System Object : Avg. Disk Queue Length" actually belongs to Physical Disk.
2. While you mention it, the importance of looking at individual disks and processors, and not just the totals, cannot be overstated. The devil is often in the details.
3. It also helps to look at the usage of the paging file.
Terry Duffy
November 11, 2005 at 8:46 am
I liked the article; it was concise and gave good tips for where to start. Is anyone out there using the SQLH2 tool available from Microsoft? I am using it and it is very helpful, as it provides reports (SQL RS) that allow you to track usage over time. It is not as user friendly as some of the other tools available from Idera and others, but it is free, which at my company makes it the only one I can use for monitoring.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
November 11, 2005 at 12:07 pm
Good job, Steve.
It will be nice for newcomers, and a nice recap for old dummies.
I monitor more counters, but my list includes all the counters you have listed.
Frankly, I had to go the hard route to come up with my list of counters. Where were you till now?
Five smileys for you.
Jambu
November 11, 2005 at 12:32 pm
I keep trying not to be a DBA after having done it since 1982. I am more interested in developing products, but I keep falling back into the trap that data needs to be migrated, databases need to be managed, and performance needs to be improved. So, now that I do not have the responsibility but the database isn't taking care of itself, it's nice to have some stuff to plug in so I can at least look like I am doing that job, too! Thanks, Steve. It is a pleasure to know you, and your efforts are appreciated.
December 19, 2005 at 10:33 pm
A number of contributors have added some useful monitoring checklists to supplement this good article. Great work everyone!
One thing I would be interested in finding out: what happens next? For example, if monitoring shows excessive disk queuing, how can it be fixed? Are there articles/books that discuss both the monitoring and the resolution of the problem?
July 6, 2006 at 12:42 am
Hi Steve
Thanks for the article. Nice piece.
I did notice the following differences in the counter objects:
You wrote: System Object : Avg. Disk Queue Length
... I only found it in the Physical Disk Object : Avg. Disk Queue Length
You wrote: SQL Server Databases Methods Object : Transactions/Sec
... was actually SQL Server Databases Object : Transactions/Sec
and the last one: SQL Server Buffer Manager Object : Cache Hit Ratio
... was SQL Server Buffer Manager Object : Buffer Cache Hit Ratio
I liked Alain's three-phase approach to monitoring. It gave me an idea of what I could do better.
Regards
__________________________________
Searching the KB articles is like
picking your nose. You never know
what you'll find.