As an old school DBA and always having cheap frugal employers, I have rolled my own management and monitoring solutions. I’ve shared bits and pieces of it over the years to colleagues and to the community but never the whole thing at one go. I did try and build a commercial version. With the birth of my son, doing contract work and building a company from scratch just wasn’t in the cards. So, my loss is your gain! I’ve decided to take what I have built up over the years, clean it up and post it as a series of articles. The code will be hosted here and possibly github. It may take a while to unwind all the pieces, update it for SQL Server 2008 and clean it up for general consumption but when it is done you should have a solid foundation for a solution of your own.
My goal as a DBA is to provide secure, accurate, consistent and efficient data access to all the databases under my care.
To meet those goals I should:
- Provide monitoring and alerting for critical errors.
- Provide reliable maintenance on all databases.
- Perform backup and recovery operations, including testing backups to guarantee recoverability.
- Analyze performance issues and be proactive about possible issues.
- Secure the databases and monitor for any breaches in security policies and procedures.
If you have a few SQL Server instances to manage a single person may be able to do these things without any level of automation. As you scale to dozens or hundereds of instances it becomes impossible to throw enough people at the problem. You get inconsistency across servers in configuration and management. Introduce inaccuracy in your monitoring and long term growth projections. Add a whole new level of unreliability as some servers just fall off the radar and your DBA team spends all their time reacting to problems.
My core philosophy for dealing with these issues is pretty strieght forward.
Manage By Exception
This means setting up your monitoring and alerting systems to work for you not against you. To many alerts are just as deadly as too few. As you handle problems feed the solution back into your automated system.
Automate Everything
Make your servers work for you and not against you! There is always a cost up front to automate the management of your environment. The time you get back by not having your very valuable and highly skilled workers not doing menial tasks more than makes up for it.
Standardize Across Servers
As servers come on line you may find that they aren’t setup to the company policies. Fixing all of this by hand is also a huge time waister. Define a standard and modify it as needed. Through automation you can make a single change and then deploy it across your entire environment.
To meet these needs I’ve written several modules, a centralized system for managing them and the servers in my care. I will update this list with links as the articles come on line. If I use any code that I haven’t written I’ll be very clear who did write it, if I made changes to it and what the license to redistribute the code falls under.
Monitoring
- Server Blocking
- Backup Status
- Check Backup Space
- Daily Status
- Database Drive Detail
- Server DB Configuration
- Server Database I/O
- Server Database Transaction Rate
- Server Poll
- Server Wait Statistics
- System Database Roles
- Job Schedules
- Table Statistics
- Long Running Queries
- Long Running Stored Procedures
- Log Shipping
Administration
- Backup Database
- Update Statistics
- Re-indexing
- QA Refresh
- Test Backups
- Log Shipping
Reporting
- Daily Status
- Meta Data
- Long Running Queries
- Long Running Stored Procedures
Trend Analysis
- Backup/Restore Failure Analysis
- Backup Growth
- Drive Space Detail
- Server Blocking
- Database I/O
- Database Transaction Rate
- Server Wait Statistics
- Long Running Queries
- Long Running Stored Procedures