July 18, 2012 at 7:41 am
Our database crashed this morning and went into recovery mode. Can anyone provide more details about how I can track the progress of the recovery to determine how long it might take? I read is some other posts and forums that there should be an entry in the error log that shows the percentage but I don't see anything.
The error log shows that it started up all the databases, then shows the recovery messages fr the msdb database, then shows that sql server is ready for client connections. I don't see any messages about my database recovery or the number of transactions to roll forward or background. If i run the sys.sp_readerrorlog and search for my database name, the only line returned is the starting up database message.
I do expect the database to take a while to recovery as it is about 8TB, there is plenty of free disk space about 3TB.
The database started recovery while a transaction log backup was running so that backup failed,the last transaction log backup was taken 2 hours before recovery started. The last full backup completed about 5 days ago. The transaction log backup occurs every 2 hours and is typically around 16GB.
Again i now it might take a while to recovery, i am just looking for some insight to determine how long it might be.
Thank you
July 18, 2012 at 7:43 am
Keep your eye on the SQL error log, it should post the current step and estimated time left on the recovery process.
July 18, 2012 at 7:49 am
You can take a look at the DMO sys.dm_exec_requests. For a backup or restore it will show a percentage complete. It's not a perfectly accurate measure, but it will be updated regularly, so it's the one place I know to try to understand what's currently going on.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 18, 2012 at 7:57 am
This is what I always use :
if OBJECT_ID('tempdb..#tmp_Restores') is null
begin
SELECT @@servername as ServerName
, r.command as [CMD_Type]
, r.start_time
, r.percent_complete
, dateadd(second, r.estimated_completion_time / 1000, getdate()) as est_completion_time
/* Remark: 1900-01-01 00:00:00.000 offset !!! */
, dateadd(ss, DATEDIFF(SS, r.start_time, getdate()), 0) as running_time_1900_01_01
, dateadd(ss, r.estimated_completion_time / 1000, 0) as est_time_to_go_1900_01_01
, dateadd(ss, DATEDIFF(SS, r.start_time, dateadd(second, r.estimated_completion_time / 1000, getdate())), 0) as est_elaps_time_1900_01_01
, es.login_name
, es.host_name
, r.wait_type
, r.wait_time
, r.wait_resource
, r.cpu_time
, r.total_elapsed_time
, r.reads
, r.writes
, r.logical_reads
, s.text as SQLStmt
into #tmp_Restores
FROM sys.dm_exec_requests r
inner join sys.dm_exec_sessions es
on es.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) s
WHERE r.command in ( 'RESTORE DATABASE', 'BACKUP DATABASE', 'RESTORE LOG', 'BACKUP LOG' )
end
Insert into #tmp_Restores
SELECT @@servername as ServerName
, r.command as [CMD_Type]
, r.start_time
, r.percent_complete
, dateadd(second, r.estimated_completion_time / 1000, getdate()) as est_completion_time
/* Remark: 1900-01-01 00:00:00.000 offset !!! */
, dateadd(ss, DATEDIFF(SS, r.start_time, getdate()), 0) as running_time_1900_01_01
, dateadd(ss, r.estimated_completion_time / 1000, 0) as est_time_to_go_1900_01_01
, dateadd(ss, DATEDIFF(SS, r.start_time, dateadd(second, r.estimated_completion_time / 1000, getdate())), 0) as est_elaps_time_1900_01_01
, es.login_name
, es.host_name
, r.wait_type
, r.wait_time
, r.wait_resource
, r.cpu_time
, r.total_elapsed_time
, r.reads
, r.writes
, r.logical_reads
, s.text as SQLStmt
-- into #tmp_Restores
FROM sys.dm_exec_requests r
inner join sys.dm_exec_sessions es
on es.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) s
WHERE r.command in ( 'RESTORE DATABASE', 'BACKUP DATABASE', 'RESTORE LOG', 'BACKUP LOG' )
-- order by percent_complete desc
Select *
from #tmp_Restores
order by percent_complete desc
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
July 18, 2012 at 8:09 am
Thank you for the replies,
I tried looking at the dm_exec_requests early but i was not completely sure what i was looking for. There are two rows that are DB STARTUP, one of them has a wait type of SLEEP_DBSTARTUP and the other NULL. The one that is NULL, the cpu_time is increasing but the reads, writes, and logical_reads is not changing. the percent complete is 0 for all the rows in the view. Is there something else i should be looking for in this view? I am attaching the output of the view in case that would be easier.
Johan, i looked over that query and ran it but it looks like it is setup for backups and restores, my database is in recovery from a crash. I tried to change the options for the r.command in to ('RECOVER') but still got no results, do you know what the command should be for a recovery?
Again that you for help
Derek
July 18, 2012 at 8:14 am
Ah, I don't think it displays status for that recovery process. I think you'll have to wait. Sorry.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 18, 2012 at 8:16 am
Check the error log. Crash recovery logs in there.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 18, 2012 at 8:28 am
When i checked the error log the only thing it shows regarding my database is that it is starting.
I ran the the following:
EXEC sys.sp_readerrorlog 0, 1, 'prod_db'
LogDate ProcessInfo Text
2012-07-18 07:22:56.590spid29sStarting up database 'prod_db'.
I was expecting to see something more like the following that i received for msdb
EXEC sys.sp_readerrorlog 0, 1, 'msdb'
LogDate ProcessInfo Text
2012-07-18 07:22:56.590spid19sStarting up database 'msdb'.
2012-07-18 07:22:57.160spid19s99 transactions rolled forward in database 'msdb' (4). This is an informational message only. No user action is required.
2012-07-18 07:22:57.290spid13s0 transactions rolled back in database 'msdb' (4). This is an informational message only. No user action is required.
2012-07-18 07:22:57.290spid13sRecovery is writing a checkpoint in database 'msdb' (4). This is an informational message only. No user action is required.
July 18, 2012 at 8:55 am
You should see records like this
Recovery of database 'DBName' (DBID) is 0% complete (approximately 124308 seconds remain). Phase 1 of 3. This is an informational
message only. No user action is required.
Recovery of database 'DBName' (DBID) is 5% complete (approximately 1234678 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
July 18, 2012 at 9:03 am
I agree with you completely, that i should be seeing that.
Looking back at the when this happened in January when the Database was around 5TB in size it took about 54 minutes from the Starting up database message, to the first Recovery of database 'prod_db" is ##% complete.
So far I have been waiting about 2.5 hours.
Do you know what it does between those two steps? I know it is checking for uncommitted entries in the log, but does it also have to scan the entire database?
July 18, 2012 at 10:02 am
Gut feel here - severe VLF fragmentation and the recovery hasn't started (SQL's still reading over the log). When you get this back up, read Kimberly Tripp's blog post on transaction log throughput.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 18, 2012 at 10:20 am
After 3 hours and 19 minutes I am now getting the messages inthe logs at 10:41 that it is recovery the database.
Does anyone know what is actually happening during the 3 hour period it took from startup to when it began recovery? What affects the amount of time it will take? Like i mentioned when this happened in january it only took 54 minutes and the database size was 5TB. Before that 32 minutes when it was about 3TB. This time it was 199 minutes and the DB size was 8TB.
I understand the larger the database the longer it might take to restart but form the numbers i was thinking about 10 minutes per TB, so I would have guessed around 80 - 100 minutes
The database was in the middle of a log backup when it crashed this time, do you think that might account for the extra time?
July 18, 2012 at 10:39 am
We are back online now. The recover started at 10:41 and finished about 11:30.
So what does SQL do from 7:22 until 10:41 between the time the database starts and the recovery begins.
I am just trying to figure out if there is a way to determine how long it might take in the future if it happens again.
Back in january when it happened it took around 50 minutes when the DB size was 5TB and last year it took around 30 minutes when the DB size was 3TB. So i was expecting around 80 minutes this time size it was about 8TB.
Would the fact that it crashed while it was doing a transaction log backup account for the increased amount of time?
thank you all for you help.
July 18, 2012 at 10:41 am
sorry about the double post, i stepped away and didn't see the post so i thought i cancelled it.
July 18, 2012 at 10:51 am
dbaxter-1000885 (7/18/2012)
So what does SQL do from 7:22 until 10:41 between the time the database starts and the recovery begins.
Reading over the log, it has to check every VLF's header. You have severe VLF fragmentation, I'll bet lunch on it.
Would the fact that it crashed while it was doing a transaction log backup account for the increased amount of time?
Not likely, no.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply