One of the more gut-wrenching experiences of a DBA is to have a database get stuck in recovery. This typically happens when the server was rebooted unexpectedly and typically coincides with a rather large transaction.
The reboot and sudden stop of the transaction requires the transaction to be replayed or to experience an undo. This could take hours or days depending on worst case scenarios. Sitting there and waiting for it to complete just makes you sweat.
This happens to have happened to a client recently so I dusted off an old script (which you can find if you click this link), and started investigating and monitoring the server. While waiting and investigating, I decided to take on my challenge (quoted hereafter) from the first time I posted this query (here).
Unfortunately, this query does not demonstrate the time remaining for the rollback nor the percent complete without needing to query the error log. Those would be awesome additions if you know how to do it (and let me know), other than via the error log.
I did not remove the dependency on the error log, rather I leveraged that dependency to allow me to get more information from the DMVs and link it to the error log messages to give a bigger more complete picture of the recovery process from a simple query.
This script is now enhanced from the 2014 version and works through 2017.
DECLARE @ErrorLog AS TABLE ( [LogDate] DATETIME , [ProcessInfo] VARCHAR(64) , [TEXT] VARCHAR(MAX) ); INSERT INTO @ErrorLog EXEC sys.xp_readerrorlog 0, 1, "Recovery of database"; SELECT DB_NAME(dt.database_id)AS DBName , GETDATE()AS currenttime , at.transaction_begin_time , dt.transaction_id , er.command , er.status , er.wait_type , er.last_wait_type , er.wait_time , CASE WHEN cx.RecoveryPhase = 1 THEN 'Analysis' WHEN cx.RecoveryPhase = 2 THEN 'Redo' WHEN cx.RecoveryPhase = 3 THEN 'Undo' ENDAS RecoveryPhase -- , cx.RecoveryPhase , at.nameAS TranName , CASE dt.database_transaction_type WHEN 1 THEN 'Read/write transaction' WHEN 2 THEN 'Read-only transaction' WHEN 3 THEN 'System transaction' ENDAS db_tran_type , cx.PercentComplete , cx.MinutesRemaining --, estimated_completion_time , CASE WHEN estimated_completion_time < 36000000 THEN '0' ELSE '' END + RTRIM(estimated_completion_time / 1000 / 3600) + ':' + RIGHT('0' + RTRIM((estimated_completion_time / 1000) % 3600 / 60), 2) + ':' + RIGHT('0' + RTRIM((estimated_completion_time / 1000) % 60), 2) AS [Time Remaining] , percent_complete AS ExecReqPercentComplete , er.task_address , er.reads , d.log_reuse_wait_desc , database_transaction_log_record_count , database_transaction_log_bytes_used , database_transaction_next_undo_lsn , CASE at.transaction_state WHEN 0 THEN 'Not Completely Initialized' WHEN 1 THEN 'Initialized but Not Started' WHEN 2 THEN 'Transaction is Active' WHEN 3 THEN 'Read-Only tran has Ended' WHEN 4 THEN 'Distributed Tran commit process has been initiated' WHEN 5 THEN 'In prepared state and waiting resolution' WHEN 6 THEN 'Transaction has been committed' WHEN 7 THEN 'Transaction is being rolled back' WHEN 8 THEN 'Transaction has been rolled back' ENDAS TranState FROM sys.dm_tran_database_transactions AS dt LEFT OUTER JOIN sys.dm_tran_active_transactions AS at ON dt.transaction_id = at.transaction_id INNER JOIN master.sys.databases AS d ON d.database_id = dt.database_id CROSS APPLY ( SELECT TOP 1 [LogDate] , SUBSTRING( [TEXT] , CHARINDEX(') is ', [TEXT]) + 4 , CHARINDEX(' complete (', [TEXT]) - CHARINDEX(') is ', [TEXT]) - 4 )AS PercentComplete , CAST(SUBSTRING( [TEXT] , CHARINDEX('approximately', [TEXT]) + 13 , CHARINDEX(' seconds remain', [TEXT]) - CHARINDEX('approximately', [TEXT]) - 13 ) AS FLOAT) / 60.0AS MinutesRemaining , DB_NAME(SUBSTRING([TEXT], CHARINDEX('(', [TEXT]) + 1, CHARINDEX(')', [TEXT]) - CHARINDEX('(', [TEXT]) - 1))AS DBName , CAST(SUBSTRING([TEXT], CHARINDEX('(', [TEXT]) + 1, CHARINDEX(')', [TEXT]) - CHARINDEX('(', [TEXT]) - 1) AS INT) AS DBID , SUBSTRING(el.ProcessInfo, CHARINDEX('d', el.ProcessInfo) + 1, 2)AS session_id , SUBSTRING(el.[TEXT], CHARINDEX('Phase', el.[TEXT]) + 6, 1)AS RecoveryPhase FROM @ErrorLog AS el ORDER BY [LogDate] DESC ) AS cx INNER JOIN sys.dm_exec_requests AS er ON er.session_id = cx.session_id WHERE d.state_desc <> 'online' AND cx.DBID = dt.database_id;
Wrap
SQL Server recovery is a safeguard to protect the data in the event of an unexpected failure. The recovery process is necessary and has several phases to roll through in order to bring the database back online. It will require patience and this script can help achieve that patience.