IT departments may have a DBA, but how do they know if they've implemented the operational processes for running an effective database administration function? And for that matter, what are the "effective" operational processes? Of course, since we generally know what good DBAs do, then all we need is a framework or checklist to rate the database administration operations area. Sounds easy, doesn't it? Then why aren't there any good templates available? A quick Internet search on SQL Server audits will produce a lot of results including best practices and worst practices, tips and tricks, performance audits, and security audits--unfortunately nothing on operational audits. Sure, Microsoft will champion their MOF assessment, but what I really want is a database specific checklist or framework that provides an objective assessment of the things a Database Administration function needs to do in order to effectively run database operations.
Defining the Audit Items
Not having found a suitable checklist I decided to create my own. First let me define the purpose of an operational audit: to rate an organization's database administration area against known best practices in an objective manner. My Internet search wasn't completely fruitless as I did turn up one item that I'll use in developing a checklist: a Powerpoint presentation from Microsoft TechEd 2003 entitled "Building an Operations Guide" by Cathan Cook. In her presentation, Cook makes several suggestions--including using the Microsoft SQL Server 2000 Operations guide as an outline for creating a run book. I read the Microsoft SQL Server 2000 Operations guide when it first came out and even kept a copy on my bookshelf, where it remained for the past four years. At the time I read it, the usefulness of the guide did not occur to me until I viewed Cook's presentation. It was all written right there in the Operations guide. Each of the seven chapters lists key operational processes:
- Change, Configuration, and Release Management
- Security Administration
- System Administration
- Monitoring and Control
- Capacity and Storage Management
- Problem and Incident Management
- Service Management
Re-reading the Microsoft SQL Server 2000 Operations guide, it seems almost complete. However, there was one item, which although not a process, I felt should be included in an operational audit--SQL Server licensing. The reason I believe licensing should be included is that it is often misunderstood and sometimes you'll find SQL Servers that are either unlicensed or improperly licensed. However, it's not always bad. Sometimes
you'll find cases where SQL Servers are running with a higher end SQL license than what is supported by the hardware or user population, presenting an opportunity to save money. A good analogy for software licensing is that it's like taxes: you want to be honest and pay your fees, but at the same time you don't want to over pay.
Creating the Audit Report Format
Now that we have a framework let's discuss what is the best way to
present the
information contained in our audit. Cook's presentation mentions
creating
your own operations guide or a run book, which isn't exactly what I had
in mind. Another Internet search turned up some excellent auditing
documents from GIAC--
specifically Graham Thompson's paper on
Auditing a SQL Server 2000 Server.
Looking at his document, I was so impressed with his overall
organization that I decided to adapt it to my own audit document
creating the
following sections:
- Overview and Identification
- Checklist
- Summary
The overview section will contain a short description of the purpose of
the operational audit, a description of the department or
company
(including IT infrastructure) and a list of all SQL Servers that will
be included in the audit. For the checklist I'll use a
concise one page table, per check. The table is
formatted as follows:
Reference | Used to "cite your sources" by providing a description and URL for the operational check. Here you want to include a link to the SQL Server 2000 Operations Guide chapter. Also note you can include other references which fall under the operational category, for example under Security you could include a link to the operations manual and a link to a good SQL Server security article. |
Objective | Describes the objective of the operational process. |
Risk | Documents the risk of not having the operational process. |
Compliance/ Expected Results | Documents the expected outcome of the operational process. |
Review Details | Describes how data will be collected. |
Review Results | Documents the review results: Compliant, Partially Compliant, or Non-Compliant. |
Reviewer Notes | Documents the reviewer's comments and provides more detailed information on results. |
Notice I'm not using the word "audit"; I'll explain why later. Finally,
the summary section includes a one page executive summary of the
findings. The executive summary can take the format of bulleted text or
alternatively, a dashboard like scorecard with red, yellow and green
checkmarks. The sample SQL Server Operational
Assessment contains
both formats.
Collecting the Data
The data for the audit will be collected by conducting interviews, reviewing
the existing documentation, and querying systems configuration. Some of
the key processes will only require interviewing and reviewing
documentation, while other key process data can be collected through
scripts or other automated processes. For example, you can't run a script to
collect data for Service Management to determine whether SLAs have been implemented or whether issues and bugs
are properly tracked. You'll need to talk to people to determine
whether these operational processes have been adopted. For the
operational process data which can be collected through scripts, try to
automate as much as possible. There are some good scripts on SQL Server
Central or commercial software or you may want to create your own scripts. I found Microsoft's
SQL Health and History Checker (SQLH2)
to be a
fast and free way to collect a lot of information on multiple SQL
Servers quickly; however its reliance on Reporting
Services for producing reports doesn't make it very lightweight.
In the end you'll probably end up with a series of scripts. Choose the
scripts and utilities that work best for you, and thoroughly test the
scripts in your own
environment.
Executing the Audit
On my first attempt to use the operational audit, I quickly discovered
people didn't like the word audit. It seems to have a
negative
connotation, especially in financial circles. So a quick global replace
of the word "audit" with "review" and the overall project is now an
"assessment" and instead of an "audit." I know this seems petty, but having dealt
with auditors myself (no offense, I know auditors are just doing their job) I
could see their point. As for collecting the data
for the assessment, keep in mind your goal is to audit and report
findings not to create a run book or fix issues that are found. You can
always plan for addressing issues as follow-on work. Also remember
that nobody likes to be criticized, so constructive criticism
should be delivered with tact and diplomacy. When I ran through the audit
with an external customer, I often reminded them that my own database
administration department wouldn't pass certain key operational
processes, and in general most database administration departments would not
pass all key operational processes however, it is
worth striving
towards. Ultimately, the more best practices adopted, the easier it will
be to support database operations.
Conclusion
This article demonstrates a framework for evaluating the
effectiveness of database operations using existing guides and best
practices. By using the provided framework you can extend your evaluation
to include nearly any additional item within the eight key operational
processes provided. The operational audit template will provide you with a
useful method for evaluating an organization's database operations.