Auditing Your SQL Server - Part 3 - Managing Audit Data
Introduction
Auditing anything generates a tremendous amount of data. In the past, line printers were used in many applications, merely printing a new line on a piece of paper each time an event occurred, someone entered a door, a phone call was completed, etc. If you've ever used a system like this you have a great appreciation for the amount of paper that is consumed.
In Part 2 - Using a Mirror, I used a mirror table, essentially a copy of the original table to store copies of the data rows at points in time to track the changes that were made. This type of auditing generates large numbers of rows, sometimes a large multiple of rows relative to the original table. So how do you manage this data?
Separation
One of the ways that I've used to manage this data is to store the audit data in a separate database. In many applications, you will find that the audit data is examined very rarely. In fact, you will usually find that most of the audit data is never looked at again.
Using a separate database is very easy, simply create a database, continuing the example from Part 2, I'd create a Northwind_Audit database, potentially on a separate physical disk if the amount of activity was high. When creating this database, keep in mind that your users will need access to this database to write the audit data, so user management becomes slightly more complex.
If I were auditing the Orders table, as in Part 2, I'd next create my Orders_Audit table in the Northwind_Audit database. The structure would be the same, but the location would be different. My triggers would stay the same, but I'd add an additional object to Northwind:
create view Orders_Audit as select * from Northwind_Audit.dbo.Orders_Audit go
By using a view, I've abstracted out the location of the audit table. I've also used this type of structure to "move" my audit data from the main database to another one as it grew. If I implemented linked servers, this technique would also allow me to move the audit data to another server without disturbing the application. I would merely recompile the view.
Using a separate database also brings some additional benefits. First the data is stored in another file, so this data can be moved without disturbing the main data. In the event of a disaster, I can more quickly restore the application data because I do not have to restore the audit data at the same time.
Archiving
Generating this much data will eventually become a problem for everyone. At some point, the amount of archive data can quickly grow larger than the amount of real data. Dealing with this data can be another problem in and of itself.
Not having an archive plan is a worst practice, but doubly so for audit data. This is something that's easy to implement and then you can forget about it, but you need to do it. I've got a couple of different methods of doing this.
First you can merely detach the audit database and create a new audit database periodically. This works well and keeps the data handy in a form that it can easily be restored. I did this quarterly for one system and it allowed us to attach the database back as Northwind_Audit_2002_Q3, or whatever the date was and then access the data. The downsides were that our application could not easily find this data and custom queries were required to report on the audit.
Another solution that worked much better was to BCP or DTS out some amount of old data periodically. In most cases I've found that the last 30 or 60 days of audit data was most often accessed. After that it was rare that someone needed to track the history of a record, and even then, they usually knew the time period that needed to be audited.
To handle this, I wrote a script that would BCP out data that was older than 60 days into a file that was named for the time period. Using the Northwind.dbo.Orders example, I'd bcp out the data from December 2002 as "Northwind_Orders_Audit_200212.txt. I like text files because I can easily view them in Notepad or any editor. This script was set to run monthly and would also zip up the text file, which was then backed up to tape. If someone needed older audit data, we could BCP it back into the Northwind_Audit.dbo.Orders_Audit table, allow them to query it from their application, and then delete it out of the table. We still had the text file, so this could be repeated as needed.
The downside to this was that we had to take manual action to restore the data as well as remember to remove it (otherwise the next time the script ran, it would duplicate audit data). This also could be cumbersome if someone wanted a large amount of audit data. You might spend your entire data doing BCPs.
Conclusions
Building an archive plan and managing your audit data is something that invariable is a custom solution. You need to understand the audit review requirements to know what frequency to archive data and whether or now a separation of the audit data from the source data is needed.
In Part 4, I'll look at selective auditing of particular types of data, a custom solution that makes the audit data management easier to deal with.
As always, I welcome feedback on this article, including your suggestions, ideas, and criticisms.
Steve Jones
©dkRanch.net March 2003