Knowing when an event occurred within the database environment is important, and being able to act quickly when certain events occur is just as important. Sometimes we don't find out about an event for days or weeks, and then we are asked to figure out the who, when, why, and how of it. Hopefully, there was enough foresight in place to plan for such a request to travel back in time.
Recently, while networking at a large database-related conference, I had the chance to discuss some of this with a semi-desperate DBA. In his environment, a database can apparently be taken offline at random times without the involvement of the DBA team. He was looking for a solution to prevent it from happening. To me, the simplest solution is to revoke the permissions from the person doing it. If you can't revoke the permissions, then at least capture the pertinent information to document who is taking the databases offline and when.
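As an aside, here is a rough sketch of that permissions approach. It assumes the offending principal is a known login that was granted ALTER ANY DATABASE directly at the server level; the login name and the grant are hypothetical, so adjust to whatever was actually granted in your environment.

/* Hypothetical example: remove the server-level permission that allows
   ALTER DATABASE ... SET OFFLINE. The login name, and the assumption that
   the permission was granted directly rather than through a role such as
   dbcreator or sysadmin, are illustrative only. */
USE [master];
GO
REVOKE ALTER ANY DATABASE FROM [DOMAIN\SuspectLogin];
GO
/* If the right comes from server role membership instead, dropping the
   member would be the fix, e.g.: */
-- ALTER SERVER ROLE [dbcreator] DROP MEMBER [DOMAIN\SuspectLogin];

Keep in mind that a member of sysadmin or the owner of the database can still take it offline, so the audit approach below is worth having in place regardless.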
Auditing and Facts
I am a huge fan of auditing the database environment to gather facts. There is comfort to be gained in knowing that if a change happens to a database or server under your purview, you can report some facts back up the chain when a post-mortem is required. I recently shared another example of such an auditing scenario – here.
There is a technique in that article that may be of interest to you if you are restricted in how you can implement this kind of audit. I won't go into detail about those poor man's audit techniques here, so it is worth reading that previous article.
When looking for data about a database being taken offline, one could capture the data from the default trace or from the SQL Server Error Log. While this article will show how to capture this data from both of those sources, it is important to understand that the data might not be available in either location, depending on the server's settings and activity combined with how old the offline event is.
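For instance, a quick check like the following (a minimal sketch; the default trace uses only a handful of small rollover files, so a busy server can age events out quickly) shows whether the default trace is even enabled and how far back its oldest available event currently reaches.

/* Is the default trace enabled? */
SELECT name, value_in_use
    FROM sys.configurations
    WHERE name = 'default trace enabled';

/* Oldest event still available in the default trace files */
DECLARE @Path VARCHAR(512);

SELECT @Path = REVERSE(SUBSTRING(REVERSE([path]), CHARINDEX('\', REVERSE([path])), 260)) + N'LOG.trc'
    FROM sys.traces
    WHERE is_default = 1;

SELECT MIN(StartTime) AS OldestTraceEvent
    FROM ::fn_trace_gettable(@Path, DEFAULT);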
Let's first take a look at the default trace. That one should be easy enough.
/* from default trace */
DECLARE @Path VARCHAR(512)
       ,@StartTime DATE
       ,@EndTime DATE = GETDATE();

SET @StartTime = DATEADD(dd, DATEDIFF(dd, 0, @EndTime) - 1, 0);

SELECT @Path = REVERSE(SUBSTRING(REVERSE([path]), CHARINDEX('\', REVERSE([path])), 260)) + N'LOG.trc'
    FROM sys.traces
    WHERE is_default = 1;

SELECT @@SERVERNAME AS SvrName
     , gt.StartTime AS EventTimeStamp
     , tc.name AS EventCategory
     , gt.spid
     , tv.subclass_name
     , gt.LoginName
     , gt.DBUserName
     , gt.HostName
     , gt.DatabaseName
     , gt.ObjectName
     , gt.TargetUserName
     , gt.TargetLoginName
     , gt.ParentName AS SchemaName
     , gt.RoleName
     , te.name AS TraceEvent
     , gt.TextData
    FROM ::fn_trace_gettable(@Path, DEFAULT) gt
        INNER JOIN sys.trace_events te
            ON gt.EventClass = te.trace_event_id
        INNER JOIN sys.trace_categories tc
            ON te.category_id = tc.category_id
        INNER JOIN sys.trace_subclass_values tv
            ON gt.EventSubClass = tv.subclass_value
                AND gt.EventClass = tv.trace_event_id
    WHERE 1 = 1
        AND gt.ObjectType = 16964 /* DATABASE */
        AND te.name = 'Object:Altered'
    ORDER BY gt.StartTime DESC;
And a snippet of some possible output looks like the following:
If you look closely at the output, you will probably notice that I don't get a lot of information. I find out that there was an ALTER event against a database, but I do not know what the change was, nor do I see the text of the SQL statement that caused it. So, at best, I get a partial view of the picture by looking at just the default trace.
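If you want to confirm that for yourself, a small sketch using the standard trace catalog views and fn_trace_geteventinfo lists the columns the default trace actually collects for the Object:Altered event; if TextData is not in that list, that explains why the statement text is missing from the output above.

/* Which columns does the default trace collect for Object:Altered? */
DECLARE @TraceID INT;

SELECT @TraceID = id
    FROM sys.traces
    WHERE is_default = 1;

SELECT te.name AS TraceEvent
     , tcol.name AS CapturedColumn
    FROM sys.fn_trace_geteventinfo(@TraceID) ei
        INNER JOIN sys.trace_events te
            ON ei.eventid = te.trace_event_id
        INNER JOIN sys.trace_columns tcol
            ON ei.columnid = tcol.trace_column_id
    WHERE te.name = 'Object:Altered'
    ORDER BY tcol.name;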
With that in mind, let’s look at the error log and see if that provides better information.
/* from error logs */
SET NOCOUNT ON;
GO

IF OBJECT_ID('tempdb.dbo.#EnumLogs','U') IS NOT NULL
BEGIN
    DROP TABLE #EnumLogs;
END;

IF OBJECT_ID('tempdb.dbo.#Logger','U') IS NOT NULL
BEGIN
    DROP TABLE #Logger;
END;

-- Search all sql error logs at once
DECLARE @SearchString1 NVARCHAR(MAX)
       ,@SearchString2 NVARCHAR(MAX)
       ,@LogType INT;

-- ----------------------------------------------------------
SET @SearchString1 = 'offline'; -- leave blank for everything
SET @SearchString2 = '';        -- leave blank for everything
SET @LogType = 1;               -- 1 = SQL Server log, 2 = SQL Agent log
-- ----------------------------------------------------------

CREATE TABLE #EnumLogs
    (
      ArchiveNum INT
    , LogDate DATETIME
    , LogSizeInBytes INT
    );

INSERT INTO #EnumLogs
    EXECUTE xp_enumerrorlogs @LogType;

-- Iterate through all the logs
CREATE TABLE #Logger
    (
      LogDate DATETIME
    , ProcessInfo NVARCHAR(MAX)
    , LogText NVARCHAR(MAX)
    );

DECLARE @Counter INT;

DECLARE errlogs CURSOR LOCAL STATIC FORWARD_ONLY READ_ONLY
FOR
    SELECT el.ArchiveNum
        FROM #EnumLogs el;

OPEN errlogs;
FETCH NEXT FROM errlogs INTO @Counter;

WHILE @@FETCH_STATUS = 0
    BEGIN
        INSERT INTO #Logger
            EXECUTE xp_readerrorlog @Counter, @LogType, @SearchString1, @SearchString2;

        FETCH NEXT FROM errlogs INTO @Counter;
    END;

CLOSE errlogs;
DEALLOCATE errlogs;

-- Fetch rows
SELECT LogDate
     , ProcessInfo
     , LogText
    FROM #Logger
    ORDER BY LogDate DESC;

IF OBJECT_ID('tempdb.dbo.#EnumLogs','U') IS NOT NULL
BEGIN
    DROP TABLE #EnumLogs;
END;

IF OBJECT_ID('tempdb.dbo.#Logger','U') IS NOT NULL
BEGIN
    DROP TABLE #Logger;
END;
This is pretty cool. I am enumerating all of the error log files available on the SQL Server instance and scanning each of them for the search term "offline."
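If you only care about the most recent log, you can skip the cursor entirely. A single call to xp_readerrorlog (the first four parameters, in the order used above, are the log number, the log type, and the two search strings) does the same search against just the current log:

/* Quick one-off: search only the current SQL Server error log (log 0)
   for the word 'offline' */
EXECUTE xp_readerrorlog 0, 1, N'offline', N'';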
Ok, this shows a little bit of promise. I can see that the database was indeed set to OFFLINE (the log records that the option OFFLINE was set to ON), along with a spid and a timestamp. Unfortunately, the data shown here is far from useful in a post-mortem: the error log does not show who took the database offline.
What if there was a way to combine these two queries into a single result set and correlate the results into something far more useful? That seems possible, given that we have a spid and a timestamp in each result set.
Let’s try it and see.
/* Pulling it all together */
SET NOCOUNT ON;
GO

-- Search all sql error logs at once
DECLARE @SearchString1 NVARCHAR(MAX)
       ,@SearchString2 NVARCHAR(MAX)
       ,@LogType INT
       ,@Path VARCHAR(512)
       ,@StartTime DATE
       ,@EndTime DATE = GETDATE();

-- ----------------------------------------------------------
SET @SearchString1 = 'offline'; -- leave blank for everything
SET @SearchString2 = '';        -- leave blank for everything
SET @LogType = 1;               -- 1 = SQL Server log, 2 = SQL Agent log
SET @StartTime = DATEADD(dd, DATEDIFF(dd, 0, @EndTime) - 1, 0);
-- ----------------------------------------------------------

SELECT @Path = REVERSE(SUBSTRING(REVERSE([path]), CHARINDEX('\', REVERSE([path])), 260)) + N'LOG.trc'
    FROM sys.traces
    WHERE is_default = 1;

IF OBJECT_ID('tempdb.dbo.#EnumLogs','U') IS NOT NULL
BEGIN
    DROP TABLE #EnumLogs;
END;

IF OBJECT_ID('tempdb.dbo.#Logger','U') IS NOT NULL
BEGIN
    DROP TABLE #Logger;
END;

CREATE TABLE #EnumLogs
    (
      ArchiveNum INT
    , LogDate DATETIME
    , LogSizeInBytes INT
    );

INSERT INTO #EnumLogs
    EXECUTE xp_enumerrorlogs @LogType;

-- Iterate around all the logs gathering results
CREATE TABLE #Logger
    (
      LogDate DATETIME
    , ProcessInfo NVARCHAR(MAX)
    , LogText NVARCHAR(MAX)
    );

DECLARE @Counter INT;

DECLARE errlogs CURSOR LOCAL STATIC FORWARD_ONLY READ_ONLY
FOR
    SELECT el.ArchiveNum
        FROM #EnumLogs el;

OPEN errlogs;
FETCH NEXT FROM errlogs INTO @Counter;

WHILE @@FETCH_STATUS = 0
    BEGIN
        INSERT INTO #Logger
            EXECUTE xp_readerrorlog @Counter, @LogType, @SearchString1, @SearchString2;

        FETCH NEXT FROM errlogs INTO @Counter;
    END;

CLOSE errlogs;
DEALLOCATE errlogs;

SELECT @@SERVERNAME AS SvrName
     , gt.StartTime AS EventTimeStamp
     , L.LogDate
     , tc.name AS EventCategory
     , gt.spid
     , tv.subclass_name
     , gt.LoginName
     , gt.DBUserName
     , gt.HostName
     , gt.DatabaseName
     , gt.ObjectName
     , gt.TargetUserName
     , gt.TargetLoginName
     , gt.ParentName AS SchemaName
     , gt.RoleName
     , te.name AS TraceEvent
     , L.LogText
    FROM ::fn_trace_gettable(@Path, DEFAULT) gt
        INNER JOIN sys.trace_events te
            ON gt.EventClass = te.trace_event_id
        INNER JOIN sys.trace_categories tc
            ON te.category_id = tc.category_id
        INNER JOIN sys.trace_subclass_values tv
            ON gt.EventSubClass = tv.subclass_value
                AND gt.EventClass = tv.trace_event_id
        INNER JOIN #Logger L
            ON REPLACE(REPLACE(L.ProcessInfo, 'spid', ''), 's', '') = gt.spid
                AND CONVERT(SMALLDATETIME, L.LogDate) = CONVERT(SMALLDATETIME, gt.StartTime)
    WHERE 1 = 1
        AND gt.ObjectType = 16964 /* DATABASE */
        AND te.name = 'Object:Altered'
    ORDER BY gt.StartTime DESC;

IF OBJECT_ID('tempdb.dbo.#EnumLogs','U') IS NOT NULL
BEGIN
    DROP TABLE #EnumLogs;
END;

IF OBJECT_ID('tempdb.dbo.#Logger','U') IS NOT NULL
BEGIN
    DROP TABLE #Logger;
END;
If I use the spid and the timestamp from each result set, I can join the two results together. As you can see, that is exactly what I did. I had to massage the spid and the timestamp a little bit (stripping the "spid" prefix from ProcessInfo and rounding both timestamps to the minute), but that is OK. With that little bit of magic, I now get a result set similar to the following.
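To see what that normalization does to each join key, here is a tiny illustration; the spid and timestamp values below are made up for the example.

/* Hypothetical values, for illustration only */
DECLARE @ProcessInfo NVARCHAR(50) = N'spid67'               -- error log ProcessInfo
       ,@TraceTime DATETIME = '2020-06-15 14:23:41.990';    -- default trace StartTime

SELECT REPLACE(REPLACE(@ProcessInfo, 'spid', ''), 's', '') AS SpidFromLog  -- '67'
     , CONVERT(SMALLDATETIME, @TraceTime) AS TimeRoundedToMinute;          -- 2020-06-15 14:24:00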
Now I can easily see who put the database into offline mode, what they executed to take it offline, and the timestamp of when it happened. This is usable data when it comes time for a post-mortem or a fact-finding request.
The Wrap
Is this the most elegant method of trapping this kind of data? Absolutely not, but it is usable. For something sleeker and more modern, I would recommend a better tool such as Extended Events. I will show how to do this same task with greater ease and reliability through the use of XE in my next article.
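As a preview, and only as a rough sketch rather than the full solution from that next article, an XE session along these lines could capture the who, what, and when in one place. The session name and file target name are placeholders, and it assumes the object_altered event fires for database-level ALTER statements just as the trace's Object:Altered event does.

/* Rough sketch only: the session and file names are placeholders, and the
   object_altered event is assumed to cover ALTER DATABASE statements. */
CREATE EVENT SESSION [AuditDatabaseAltered] ON SERVER
ADD EVENT sqlserver.object_altered
    ( ACTION ( sqlserver.server_principal_name
             , sqlserver.client_hostname
             , sqlserver.database_name
             , sqlserver.sql_text ) )
ADD TARGET package0.event_file
    ( SET filename = N'AuditDatabaseAltered' )
WITH ( STARTUP_STATE = ON );
GO

ALTER EVENT SESSION [AuditDatabaseAltered] ON SERVER STATE = START;
GO

Reading the file target with sys.fn_xe_file_target_read_file would then surface the login, host, and statement text without any of the joining gymnastics shown above.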
Oh, and if you are interested in any of my other articles about Auditing or Extended Events, I recommend you start here (for Auditing) or here (for XE).