How to check progress of database (in recovery)

Question

How to check progress of database (in recovery)

Viewing 4 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply

dbaxter-1000885 Right there with Babe Points: 716 More actions · Answer 1

thank you for your help. I will read over that blog post and hopefully that will give us some ideas so we can reduce the time it takes to recover next time.

chrisharrod76 SSC Enthusiast Points: 101 More actions · Answer 2

This is a good query that gives %complete and helps estimate the finishing time also.

SELECT r.session_id,r.command,CONVERT(NUMERIC(6,2),r.percent_complete)

AS [Percent Complete],CONVERT(VARCHAR(20),DATEADD(ms,r.estimated_completion_time,GetDate()),20) AS [ETA Completion Time],

CONVERT(NUMERIC(10,2),r.total_elapsed_time/1000.0/60.0) AS [Elapsed Min],

CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0) AS [ETA Min],

CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0/60.0) AS [ETA Hours],

CONVERT(VARCHAR(1000),(SELECT SUBSTRING(text,r.statement_start_offset/2,

CASE WHEN r.statement_end_offset = -1 THEN 1000 ELSE (r.statement_end_offset-r.statement_start_offset)/2 END)

FROM sys.dm_exec_sql_text(sql_handle)))

FROM sys.dm_exec_requests r WHERE command IN ('DB Startup')

SQLRNNR SSC Guru Points: 281344 More actions · Answer 3

Here is what I prefer to use:

http://bit.ly/DBRecovery2

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Gregory Lienard Newbie Points: 7 More actions · Answer 4

We had one of our biggest databases (>100 Tb) being suspect. Analysing the logs said it was caused by to some failing disk operation during a checkpoint. Which is kind of strange as we are using a SQL cluster with Storage Direct Spaces containing only NVMe disks.

I knew the last backup contained consistency errors. As we restore every backup to another server and check it there. So restoring this one might introduce the same problem again. So I ran DBCC CheckDB with REPAIR_ALLOW_DATA_LOSS. It took about 2,5 days to finish.

A few days later I noticed that the log backups did not truncate the log files and had to add a new log file. This happened over and over again. Log backup didn't do the job. Putting it into SIMPLE recovery model neither. It kept on saying that there was an active transaction although dbcc opentran didn't give anything. Closing all connections didn't do the job either.

After 1 week the transaction logs grew to 10 TB. It was time to take another action. So I backed it up, and then moved the SQL role to another node (which is more or less the same as restarting sql server service).

All databases on the instance were up and running except the big database was now In Recovery. There were only a few interesting items in the log:

Recovery of database 'Keyboost' (19) is 0% complete (approximately 12 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.

3434 transactions rolled forward in database 'Keyboost' (19:0). This is an informational message only. No user action is required.

Except for a bunch of Deleting unrecoverable checkpoint table row there seemed to be no activity except for logon failures. Note that if you want to get rid of all logon failures in your sql log, use this script:

DROP TABLE IF EXISTS #SQLErrorLog; --assuming SQL Server 2016+...otherwise use the typical if object exists drop

CREATE TABLE #SQLErrorLog
(
        LogDate     DATETIME
       ,ProcessInfoNVARCHAR(12)
       ,LogText     NVARCHAR(3999)
)

INSERT INTO #SQLErrorLog
(
LogDate
   ,ProcessInfo
   ,LogText
)
EXEC sp_readerrorlog;

SELECT * FROM #SQLErrorLog
where processinfo <>'logon'
order by logdate desc

None of the scripts above gave anything except this one:

SELECT r.session_id,r.command,CONVERT(NUMERIC(6,2),r.percent_complete)
AS [Percent Complete],CONVERT(VARCHAR(20),DATEADD(ms,r.estimated_completion_time,GetDate()),20) AS [ETA Completion Time],
CONVERT(NUMERIC(10,2),r.total_elapsed_time/1000.0/60.0) AS [Elapsed Min],
CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0) AS [ETA Min],
CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0/60.0) AS [ETA Hours],
CONVERT(VARCHAR(1000),(SELECT SUBSTRING(text,r.statement_start_offset/2,
CASE WHEN r.statement_end_offset = -1 THEN 1000 ELSE (r.statement_end_offset-r.statement_start_offset)/2 END)

FROM sys.dm_exec_sql_text(sql_handle)))
FROM sys.dm_exec_requests r WHERE command IN ('DB Startup')

The percentage completed staid at 3% for hours and hours, increasing the ETA from hours to days and even weeks. Then I noticed what Gail Shaw said. Going over the 10 Tb of log files would indeed take some time, even on our flash only cluster. I went to see in the Disk Resource Monitor, normally on your sql server, but in our case it resides on the node where the SQL cluster share volume is located, and I noticed that the logs were being read at about 2 Gb/s. Then I was reassured that at least it was doing something.

After about 6 hours the next message appeared in the log stating that the recovery was completed. Now, 4 hours later it is still clearing out the log files, but it is progressing. When this is finished, I still need to remove all additional log files I created, switch to Full recovery mode and do a full backup again.

Gregory Liénard
Microsoft Data Engineer
https://seopageoptimizer.com/: analyses billions of webpages
https://keyboost.com/: analyses trillions of links

If the statistics are boring, you’ve got the wrong numbers.