Best practices for monitoring databases-related issues?

  • 1) What you monitor regarding your databases?

    2) How you monitor them (internal SQL function)?

    3) How do you generate alerts or reports (email notifications, external nms)?

    We're feeling the need for more monitoring in our environment. For instance, we created a data warehouse around 8 months ago. From what I understand, it's fairly straightforward - we replicate one of our major production databases and then ETLs do the rest. However, one of the ETLs stopped working a few weeks ago. Also, I don't think replication was restarted after this weekend's infrastructure maintenance. In both cases our users came to my office with reports containing incorrect data. Not good!

    So we're just looking for some direction on the above questions. Any advice anyone may have will be greatly appreciated!

  • You can also build your own monitoring solution, and sql server has a lot of built in monitoring and notification parts.

    There a bunch of different 3rd party solutions to monitoring. A lot of them are fantastic products and have MVP's that frequent this forum, a great one even sponsors this forum. They all have trials you can use to determine what fits your needs.

    There's also some good freeware out there as well.

    My own shop uses a mixture of 3rd party solutions, and solutions I've built myself. I feel I have a pretty comprehensive view of my environment, but you can never have everything monitored all the time, and there's always a tradeoff the more agressive you get.

    The first thing you need to do is clearly define what you want to monitor, and how you want to be notified. Do you want to monitor overall health? Drive space? Query response time? Cache hit ratios? I/O throughput during backups? Do you want an email? A dashboard? Does this need to be archived in a repository for historical analysis or so future trending can be done?

    Then figure out what fits your environment, budget, and sla's.

    Something like getting an email notification from an etl job's success, run from the sql agent can be done natively and easily provided you have a smtp server.

    Having a historical record of exactly what was happening to your server so you can trouble shoot a vauge "slowness" issue tuesday at 6:32 am 2 weeks ago may be more cost effective to buy a 3rd party solution.

  • I've often cobbled together some type of monitoring solution myself, and it has worked OK. However someone noted the other day that implementing monitoring means you have to maintain it, which can be problematic. Buying a third party tool, and doing minimal integration, can be a better ROI if time is valuable to you.

    SQL Monitor (http://www.red-gate.com/products/dba/sql-monitor/) from Red Gate, was designed for alerting and monitoring, letting you know to look at things in your server when you have issues, or potential issues. It works well for a monitoring solution, but isn't a deep dive troubleshooting tool. Some people expect that. there are other third party solutions that do deeper dives, and monitoring differently.

    Whatever you do, there are basics to monitor. Disk space, log usage, data size, corruption alerts, basic performance counters (disk/ram/cpu). You want to catch those simple things before they are noticed by others.

    SQL Server does include alerting and it's easy to set up, but lots of things to do manually. If you build enough scripts, and maintain them, perhaps it's a better choice, but allow for a learning curve over time. You will have more holes than a third party initially, but more control over the long term.

    Disclosure, I work for Red Gate Software.

  • This is some really good advice. Thanks to all. I'm going to take what you've given and do some research.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply