December 1, 2008 at 9:31 am
Hello,
I have a 3TB data warehouse on SQL Server 2005. Two of our three most recent ETL procedures have returned two errors of the type:
Error: 605, Severity: 21, State: 3
Attempt to fetch logical page (3:156826) in database 6 failed. It belongs to allocation unit 72057621942501376 not to 72057595989655552.
After the first time they appeared, I repaired the affected tables and reloaded the affected data. But now that it has appeared a second time, I am worried this will be a recurring problem.
Our ETL procedures make extensive use of partition swapping, and, from what I understand, these errors are mostly produced because of issues with the I/O subsystem. We have been using the current hardware configuration for 1.5+ years, and have had no issues or this sort with it until this. I have looked through the error logs for our RAID array, windows and SQL Server but could not find anything that would direct me to a cause.
I am mostly concerned right now with tracking down the source of this error and correcting it. Any advice on how to do this or places to start looking would be much appreciated.
Thanks.
December 1, 2008 at 1:41 pm
When you can, run a full checkDB with no_infomsgs, all_errormsgs and see if it gives any errors back.
What build/patch level for 2005?
Do you have page checksums on for that DB?
If you have largish maintenance windows, I'd suggest you get hold of SQLIOSim (google for it's location) and run that against your IO system, see if it picks up any problems.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 3, 2008 at 9:21 am
Gail,
Thanks for your reply. I am using SS2005 SP2, build 9.0.3042, and page checksums are not turned on.
I was able to run CHECKDB and have attached the output as a text file. I used the PHYSICAL_ONLY option, so I would be sure it would finish in the allotted window.
I will look in using SQLIOSim.
December 3, 2008 at 10:05 am
If you run a full DBCC CHECKDB you'll see a *lot* more corruption errors as a result of the errors you've already posted.
You've got some interesting corruption in one of your clustered indexes - without backups, you will likely lose data if you run repair as these bad pages will be deallocated.
Definitely turn on page checksums and check the integrity of the I/O subsystem.
Thanks
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
December 3, 2008 at 1:31 pm
Thanks Paul.
Of what sort are the extra corruption errors a full DBCC CHECKDB will produce (i.e. what checks does PHYSICAL_ONLY turn off)? It's tough to do a full checkdb on a 3TB database.
I plan to drop the affected tables, run another CHECKDB to make sure the db is back in a consistent state, recreate them, and finally reload the data. I just want to be sure I don't run into more corruption I didn't originally detect, and that it doesn't spread.
Am I safe in assuming that a PHYSICAL_ONLY check will bring to my attention all corruption that is not data-related?
December 3, 2008 at 1:44 pm
A PHYSICAL_ONLY check will check the integrity of the allocation bitmaps and then read every allocated page in the database and do a page audit. It skips all inter-page checks. See CHECKDB From Every Angle: Complete description of all CHECKDB stages for (much, much) more details.
If you can reload the data, that's great. My concern is with you I/O subsystem and that next time the corruption may hit some pages that cannot be repaired in any way - losing the whole database or large portions of it. See CHECKDB From Every Angle: Can CHECKDB repair everything?
Thanks
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply