March 25, 2017 at 6:37 pm
Last Friday, we had an issue with our production SQL cluster that uses AlwaysOn Availability Groups. I am using SQL 2016 Enterprise.
The issue caused a few of the databases to go into (Not Synchronizing / Recovery Pending) on our secondary replica node. I was able to Resume Data Movement on all of them except one. It would never complete due to some block or lock on the database. I believe it has some system level lock on it.
I was able to remove the database from the AG from the primary node. But the replica copy stayed in the same state of (Not Synchronizing / Recovery Pending). I no longer need this database but I can't drop it, detach it, or take it offline on the secondary node. It gives me errors that the database is inaccessible.
The database_state_desc is RECOVERY_PENDING and suspend_reason_desc is SUSPEND_FROM_REDO.
There is a process connected to the database: Status=BACKGROUND, Login=sa, Command=DB STARTUP
I believe this is also causing our TempDB to continue to grow in size and never release space.
How can I regain access to this database so I can drop it?
SQL error logs:03/17/2017 23:04:19,spid13s,Unknown,SQL Server has encountered 328 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [D:\SQLSVR\Data\RedGateMonitor.mdf] in database id 24. The OS file handle is 0x0000000000001048. The offset of the latest long I/O is: 0x000002e3600000
03/17/2017 23:04:00,spid109s,Unknown,During redoing of a logged operation in database 'RedGateMonitor'<c/> an error occurred at log record ID (20670:13759:5). Typically<c/> the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup<c/> or repair the database.
03/17/2017 23:04:00,spid109s,Unknown,Always On Availability Groups data movement for database 'RedGateMonitor' has been suspended for the following reason: "system" (Source ID 2; Source string: 'SUSPEND_FROM_REDO'). To resume data movement on the database<c/> you will need to resume the database manually. For information about how to resume an availability database<c/> see SQL Server Books Online.
03/17/2017 23:04:00,spid107s,Unknown,During redoing of a logged operation in database 'RedGateMonitor'<c/> an error occurred at log record ID (20670:13767:16). Typically<c/> the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup<c/> or repair the database.
03/17/2017 23:04:00,spid110s,Unknown,During redoing of a logged operation in database 'RedGateMonitor'<c/> an error occurred at log record ID (20670:13694:3). Typically<c/> the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup<c/> or repair the database.
03/17/2017 22:59:16,spid13s,Unknown,SQL Server has encountered 178 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [D:\SQLSVR\Data\RedGateMonitor.mdf] in database id 24. The OS file handle is 0x0000000000001048. The offset of the latest long I/O is: 0x000000fe200000
Errors at the time of the incident in Windows event log:SQL Server has encountered 178 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [D:\SQLSVR\Data\RedGateMonitor.mdf] in database id 24. The OS file handle is 0x0000000000001048. The offset of the latest long I/O is: 0x000000fe200000
A time-out occurred while waiting for buffer latch -- type 4, bp 000000203018ABC0, page 1:1034642, stat 0xf, database id: 24, allocation unit Id: 72057594059161600, task 0x0000001FD88764E8 : 0, waittime 300 seconds, flags 0x1a, owning task 0x0000001FD9D7A8C8. Not continuing to wait.
A time-out occurred while waiting for buffer latch -- type 4, bp 000000202332C840, page 1:6398334, stat 0xf, database id: 24, allocation unit Id: 72057594063290368, task 0x0000001FDCDC6CA8 : 0, waittime 300 seconds, flags 0x1a, owning task 0x0000001FD9D7A8C8. Not continuing to wait.
Time-out occurred while waiting for buffer latch type 4 for page (1:6398334), database ID 24.
During redoing of a logged operation in database 'RedGateMonitor', an error occurred at log record ID (20670:13694:3). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.
During redoing of a logged operation in database 'RedGateMonitor', an error occurred at log record ID (20670:13767:16). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.
Always On Availability Groups data movement for database 'RedGateMonitor' has been suspended for the following reason: "system" (Source ID 2; Source string: 'SUSPEND_FROM_REDO'). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online.
March 27, 2017 at 4:33 am
The situation may have resolved itself now, but when I had something similar I think I was able to restore over the database (then drop it)- is that an option for you at all?
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply