May 11, 2009 at 5:27 pm
Hi All,
Here's the issue: I am not able to back up one of our production databases. This is the error it gives:
BackupIoRequest::WaitForIoCompletion: read failure on backup device 'E:\Database File\abc.mdf'. Operating system error 23(Data error (cyclic redundancy check).).
What can I do? This is an urgent production issue. Please let us know your views as soon as possible.
May 11, 2009 at 10:05 pm
Looks like the .MDF file is corrupted. Run DBCC CHECKDB.
May 11, 2009 at 10:56 pm
I agree that this points to a physical disk problem on your server. Your only recourse at this point may be to restore from your previous backup. Check your SQL Server Error Log for 824 and 825 errors. Run drive diagnostics and then CHECKDB to find the extent of the damage.
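A minimal sketch of how the error-log check could be done from a query window, assuming SQL Server 2005 or later (the thread does not state the version) and read access to msdb. `sp_readerrorlog` is an undocumented procedure, so its parameters here are an assumption, and a plain search for the digits may also return unrelated lines.

```sql
-- Pages SQL Server has already recorded as suspect.
SELECT  DB_NAME(database_id) AS database_name,
        file_id,
        page_id,
        event_type,          -- 1 = 823/824 error, 2 = bad checksum, 3 = torn page
        error_count,
        last_update_date
FROM    msdb.dbo.suspect_pages
ORDER BY last_update_date DESC;

-- Search the current SQL Server error log (log 0, log type 1) for the two error numbers.
EXEC sp_readerrorlog 0, 1, '824';
EXEC sp_readerrorlog 0, 1, '825';
```

An empty `suspect_pages` result does not prove the database is clean; it only means SQL Server has not yet hit a bad page on a read it logged.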
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
My Blog | Twitter | MVP Profile
Training | Consulting | Become a SQLskills Insider
Troubleshooting SQL Server: A Guide for Accidental DBAs
May 12, 2009 at 1:36 am
Please run the following and post all the results here.
DBCC CHECKDB (<Database Name>) WITH NO_INFOMSGS, ALL_ERRORMSGS
Take a look at this article. http://www.sqlservercentral.com/articles/65804/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 12, 2009 at 7:26 am
The database size is 320 GB. I started running DBCC CHECKDB on that database yesterday, and it has been running for 20 hours. Any idea why it is taking so long and why it has not completed yet?
In Activity Monitor it is showing the wait type FCB_REPLICA_WRITE. What can I do now?
Please answer, as this is a production issue and our application is currently down.
Thanks in advance.
May 12, 2009 at 7:33 am
Unless you are normally running CHECKDB (a recommended practice to catch these problems early on and reduce downtime/risk of data loss) and can look at the historical runs for how long it takes to run, there isn't a whole lot that you can do except wait it out. If you stop it, you won't get the information that you need to help resolve the problems.
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
May 12, 2009 at 7:37 am
Thanks for your reply. I was a little bit worried because this database is not very large, but the process has already been running for a day. The thread in Activity Monitor is suspended and the wait type is 'FCB_REPLICA_WRITE'.
Is this normal?
May 12, 2009 at 7:41 am
Per Books Online, that wait type signals the following:
Occurs when the pushing or pulling of a page to a snapshot (or a temporary snapshot created by DBCC) sparse file is synchronized.
http://msdn.microsoft.com/en-us/library/ms179984.aspx
Based on that I'd say yes it is normal.
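To watch the wait type and progress without relying on Activity Monitor, a query along these lines may help. It is a sketch that assumes SQL Server 2005 or later and VIEW SERVER STATE permission; `percent_complete` is reported for DBCC CHECKDB but is only a rough estimate.

```sql
-- Current wait and approximate progress of any running DBCC command.
SELECT  r.session_id,
        r.command,
        r.status,
        r.wait_type,
        r.wait_time                  AS wait_time_ms,
        r.percent_complete,
        r.total_elapsed_time / 60000 AS elapsed_minutes
FROM    sys.dm_exec_requests AS r
WHERE   r.command LIKE 'DBCC%';
```

If `percent_complete` keeps moving between runs of this query, the check is still making progress even though the session shows as suspended at any given instant.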
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
May 12, 2009 at 7:47 am
RPSql (5/12/2009)
It has been running for 20 hours. Any idea why it is taking so long and why it has not completed yet?
Maybe because there is corruption.
The CHECKDB algorithms are written in such a way that they can tell quickly whether there is corruption or not, but if there is, SQL Server has to go back and do extra, detailed searches. It's called a 'deep-dive', and it can make the CHECKDB time go up massively.
Wait until it's finished. To tell what's wrong we need the results. If you stop it now you're just going to have to run it to completion later.
Gail Shaw
May 12, 2009 at 7:49 am
The file might be corrupted.
+++BLADE+++
May 12, 2009 at 7:51 am
Just a thought too, but while the DBCC continues you might want to start looking at backups and ensuring that you have one available and ready to go if the corruption is such that it can't be fixed. So, find the most recent good backup and get it available (if on tape, get it off tape). This can save you some time at the end of this process and allow you to get back online faster if you do have to perform a restore.
Hopefully you won't have to use it though.... 🙂
David
@SQLTentmaker
“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
May 12, 2009 at 7:54 am
Hi Gilamonster,
Thanks for your help. I will wait until this process completes, and so will our application team. I will post the output here; please help if you can.
One more thing: the last backup I have of this database is four days old. We didn't get notified because we are using IBM Tivoli to take SQL backups directly to tape. Our storage people notified us only yesterday, which was after three days!
Thanks again.
May 12, 2009 at 7:55 am
David Benoit (5/12/2009)
Just a thought too, but while the DBCC continues you might want to start looking at backups and ensuring that you have one available and ready to go if the corruption is such that it can't be fixed. So, find the most recent good backup and get it available (if on tape, get it off tape). This can save you some time at the end of this process and allow you to get back online faster if you do have to perform a restore. Hopefully you won't have to use it though.... 🙂
You might look at replacement hardware as well. CRC failures generally indicate physical disk failure and require replacing the bad disk(s) to rectify the problem.
Jonathan Kehayias | Principal Consultant | MCM: SQL Server 2008
May 12, 2009 at 7:56 am
Yeah, I have a report that looks at the date of the last backup of each database to avoid things like that. Hopefully you won't have to use the backup. Regardless, have them make sure the tape is available.
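A report of that kind could look roughly like the query below. This is only a sketch, not the report mentioned above: it assumes the backup tool records its backups in msdb history (third-party agents usually do, but verify this for your setup), and the one-day threshold is an arbitrary choice.

```sql
-- Databases with no full backup on record, or none within the last day.
SELECT  d.name                    AS database_name,
        MAX(b.backup_finish_date) AS last_full_backup,
        DATEDIFF(DAY, MAX(b.backup_finish_date), GETDATE()) AS days_since_backup
FROM    sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
      AND b.type = 'D'            -- 'D' = full database backup
WHERE   d.name <> 'tempdb'        -- tempdb cannot be backed up
GROUP BY d.name
HAVING  MAX(b.backup_finish_date) IS NULL
     OR MAX(b.backup_finish_date) < DATEADD(DAY, -1, GETDATE())
ORDER BY last_full_backup;
```

Scheduling something like this as a daily job with an alert would surface a missed backup within a day rather than after several.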
David
May 12, 2009 at 8:09 am
David Benoit (5/12/2009)
Just a thought too, but while the DBCC continues you might want to start looking at backups and ensuring that you have one available and ready to go if the corruption is such that it can't be fixed. So, find the most recent good backup and get it available (if on tape, get it off tape). This can save you some time at the end of this process and allow you to get back online faster if you do have to perform a restore.
I'd also start checking system event logs, RAID controller/SAN logs, etc. Corruption's usually an IO problem.
Gail Shaw