April 29, 2008 at 5:37 am
Hello guys!
For a project for school we need to implement some stored procedures in SQLServer 2005, at a later time the stored procedures will be implemented in Nagios, the Nagios part is no problem, but the problem is we don't know what is important to monitor by a DBA?
What we have:
- check Uptime
- Check failed logins
- check log size
- check #connections
- check failed jobs last x days
- check logon settings
- check for exists of sample databases
We need more check, but we don't know which, at the moment I'm installing MoM to see what they do.
Anyone some ideas?
Kind regards!
April 29, 2008 at 8:06 am
This is what comes to my mind. This is not complete list, I am sure other members will add some more.
- blockings
- tempdb size
- empty space left on hard disk
- buffer cache hit ratio
- object cahce hit ratio
- processes consuming the most of resources
- CPU usage
- memory usage
April 29, 2008 at 8:42 am
Availability, consistency, and recovery seem to be a good things in the database world. For some reason -- the users have come to expect such silly things. 🙂
So, can't have recovery without backups. Can't have availability without the server running and can't guarantee consistency without running verification.
For all of my systems, I make certain that I always have a full or differential backup < 1 day old for EVERY database. This applies to dev, test, and production.
For some production databases, my data loss interval is 15 minutes, so if I don't have a t-log backup every 15 minutes, I want a message. In general, however, if I don't have a t-log backup every hour (on production databases), I want a message.
You can verify the consistency by checking the last known good checkdb state. If I haven't noticed the checkdb jobs failing, I want a message if integrity has failed for two iterations (some weekly, some daily -- depending on db size and requirements)
You already mentioned availability, so I figure you got that covered.
Kyle
April 29, 2008 at 8:52 am
I like Kyle's thoughts. Backups are the most important. Be sure you know that every database has been backed up or you get a notification and can do something immediately.
I've never really worried about database size, but I do track backup size. If the backup grows by more than 20%, someone should look at it. Tracking this over time helps to figure out trending as well.
Disk space is good, and cpu/memory are good for tracking what's normal. I'd probably stop with buffer cache ratio, users, trans/sec as gross metrics, but if you add a few more, then it shouldn't be an issue.
I've rarely had tempdb problems, but that's not a bad one to track, especially as code changes can really cause issues in tempdb, so knowing what's normal is good.
Uptime, not sure. If it's a metric you can sell, then fine. Otherwise, who cares. It works when it's needed or your phone rings. That one's easy to know. If you have perceived issues, perhaps this helps to show things aren't bad.
April 29, 2008 at 10:53 am
Thank for al this answers, I'll implement a few of them tomorrow, the cpu, memory monitorring is easy as hell in Nagios. For us it's more the part of database monitorring..
If someone else have other ideas, let them know !
April 30, 2008 at 12:37 am
you can also change your sa passwords on regular basis to keep the security in palce.
April 30, 2008 at 4:50 am
Also you can Track changes of your Production database like Schema changes, credential changed etc. by using DDL AND DML Triggers
---------------------------------------------------
"Thare are only 10 types of people in the world:
Those who understand binary, and those who don't."
April 30, 2008 at 5:09 am
Adding along to the list..
Check the health of your db by running dbcc check db. Or you can schedule the same by creating a maintenance plan.
"Keep Trying"
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply