May 4, 2011 at 2:39 pm
The Testing and Production databases have lost their connection to the transaction log two days in a row now. Today, my prod db was marked suspect after the issue - SCARY. The other databases did not lose their connection, possibly because they had no activity at that moment.
No errors in the SQL log, only in the Windows logs. Server resources were not necessarily being hammered. I will be scouring the web, but wanted to reach out to all of you as well. See info and errors below.
Plenty of available drive space for Log, db, and tempdb partitions. 144gb RAM
SQL Server 2008 SP1; Enterprise (64-bit)
OS: Win Server 2008 R2 Enterprise
Win app logs:
error1- LogWriter: Operating system error 21(The device is not ready.) encountered.
error2 - The log for database (testing) is not available. Check the event log for related error messages. Resolve any errors and restart the database.
info mess3- Database was shutdown due to error 9001 in routine 'XdesRMFull::Commit'. Restart for non-snapshot databases will be attempted after all connections to the database are aborted.
2 seconds later prod db goes down:
error4 - The log for database is not available. Check the event log for related error messages. Resolve any errors and restart the database.
error5 - During undoing of a logged operation in database, an error occurred at log record ID (86400:39070:17). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.
error6 - fcb::close-flush: Operating system error (null) encountered.
error7 - An error occurred during recovery, preventing the database (PRODUCTION :w00t:) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
info mess8 -CHECKDB for database finished without errors on 2011-03-14 12:12:41.503 (local time). This is an informational message only; no user action is required.
May 4, 2011 at 2:44 pm
Looks like a hardware issue. Server is losing contact with the underlying drives.
SAN storage?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 4, 2011 at 2:47 pm
p.s. If I were you, I'd be doing checkDB more often than once in 2 months.
Gail Shaw
May 4, 2011 at 2:54 pm
Yes, SAN storage. Just found out the tempdb partition and the transaction log partition share the same physical drive. I can move the tempdb file to the larger database drive immediately (well, late tonight).
Could this disk contention be causing the hardware error?
It's been running this way for a while now, but being hit harder lately. FYI - I will schedule CHECKDB's weekly, maybe daily now.
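For readers following along, here is a minimal T-SQL sketch of the two actions mentioned in this post: relocating tempdb and running CHECKDB on a schedule. The target path `E:\SQLData\` and the database name `MyDB` are placeholders, not values from this thread, and `tempdev`/`templog` are only the default logical file names, so confirm them with the first query before changing anything.

```sql
-- Sketch only: the path and database name are hypothetical placeholders.

-- 1. Confirm tempdb's logical file names and current locations.
SELECT name, physical_name
FROM sys.master_files
WHERE database_id = DB_ID(N'tempdb');

-- 2. Point tempdb at the new location (tempdev/templog are the defaults).
--    The change only takes effect after the SQL Server service restarts.
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, FILENAME = N'E:\SQLData\tempdb.mdf');
ALTER DATABASE tempdb
    MODIFY FILE (NAME = templog, FILENAME = N'E:\SQLData\templog.ldf');

-- 3. Integrity check suitable for a scheduled SQL Agent job step.
--    NO_INFOMSGS suppresses informational output; ALL_ERRORMSGS reports every error.
DBCC CHECKDB (N'MyDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```

The folder in step 2 must already exist and be writable by the SQL Server service account, otherwise the service may not start after the restart.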
May 4, 2011 at 3:02 pm
Just stumbled into a VERY similar thread on this site:
http://www.sqlservercentral.com/Forums/Topic355924-5-1.aspx#bm482958
May 4, 2011 at 3:19 pm
I wouldn't say disk contention (but I'm not a SAN expert). Check the physical connections, switch, anything between the server and SAN. From the Windows error log, the OS can't see the disk at points.
Moving the DB will help, but you still need to find the root cause.
Gail Shaw
June 1, 2011 at 10:56 am
June 1, 2011 at 11:53 am
January 28, 2013 at 10:03 pm
Hi
I know this is an old post, but I had the same problem. We run a SAN and VMware, so I thought it was related to that, but then I found another post telling me to take the db offline and bring it online again, and that actually worked. Very simple in the end.
March 19, 2013 at 1:53 am
Restart the SQL service and it will start working.
The same thing happened to me, and I did the same.
Now it's working fine.
May 30, 2013 at 6:56 am
Hi,
I just got the same error on one of my databases and would like to share how I recovered from it.
I troubleshot it with the following steps:
(1) Take the database offline.
(2) Bring the database online.
(3) Change the database property to Auto Close = False.
(4) We need to run DBCC with a repair option, which requires the db to be in single-user mode, so change the database property to single user mode.
(5) Run the DBCC command below:
dbcc checkdb ('db', REPAIR_REBUILD)
(6) It should now run fine, and the log-related error above should no longer appear.
(7) Change the database property back to "Multi User mode".
Note that we run DBCC weekly and this error did not appear at those times. I also have plenty of free disk space, so I am still not clear why the error appeared.
In my case the above method worked. If there is any better way to troubleshoot this, then let me know.
Can someone let me know why this error appears? I mean, what causes it?
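The steps above can be sketched in T-SQL roughly as follows. This is an illustration under assumptions, not the poster's exact script: `MyDB` is a placeholder name, `WITH ROLLBACK IMMEDIATE` is added here so the state changes are not blocked by open connections, and a plain CHECKDB is run before any repair option because repair is generally meant as a last resort.

```sql
-- Sketch only: MyDB is a hypothetical placeholder database name.
USE master;

-- (1) and (2): cycle the database offline and back online.
-- ROLLBACK IMMEDIATE (an addition here) disconnects active sessions.
ALTER DATABASE [MyDB] SET OFFLINE WITH ROLLBACK IMMEDIATE;
ALTER DATABASE [MyDB] SET ONLINE;

-- (3): turn Auto Close off.
ALTER DATABASE [MyDB] SET AUTO_CLOSE OFF;

-- Check first without repairing; this does not need single-user mode.
DBCC CHECKDB (N'MyDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;

-- (4) to (7): only if the check reports errors that REPAIR_REBUILD can fix.
ALTER DATABASE [MyDB] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CHECKDB (N'MyDB', REPAIR_REBUILD);
ALTER DATABASE [MyDB] SET MULTI_USER;
```

If the first CHECKDB comes back clean, the repair section can be skipped entirely; the offline/online cycle alone is what earlier replies in this thread reported as sufficient.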
May 30, 2013 at 7:29 am
"Device not ready" usually means the server lost a disk. If your disk is a SAN device, check with your SAN admin to see what happened.
You can confirm it by looking in the Windows system event log; you should see messages from the SAN vendor driver (e.g. EMC drivers complaining that a path/port/device just died).
Once you get your disk back (after a few seconds or after intervention) you can try: ALTER DATABASE [MyDB] SET ONLINE
If that fails, the error log will give you more info on what's going on.
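As a rough sketch of that sequence in T-SQL, assuming the placeholder name `MyDB`: check the database state, try to bring it online, then search the current SQL Server error log. Note that `sp_readerrorlog` is an undocumented procedure, so treat its parameters as an assumption that may vary between versions.

```sql
-- Sketch only: MyDB is a hypothetical placeholder database name.

-- See what state the database is in (ONLINE, SUSPECT, RECOVERY_PENDING, ...).
SELECT name, state_desc
FROM sys.databases
WHERE name = N'MyDB';

-- Once the disk is visible to the OS again, try bringing the database online.
ALTER DATABASE [MyDB] SET ONLINE;

-- If that fails, search the current SQL Server error log for the database name.
-- Arguments: 0 = current log, 1 = SQL Server log (not Agent), then a search string.
EXEC sp_readerrorlog 0, 1, N'MyDB';
```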
February 13, 2014 at 7:43 pm
We have also received this alert from SCOM:
Alert: The log for database is not available Resolution state: New
Thanks