March 22, 2009 at 6:14 pm
Hi all,
My db has a corrupted page (checksum mismatch) that is not getting repaired by DBCC CHECKTABLE(tReports, REPAIR_ALLOW_DATA_LOSS). Other errors are successfully getting repaired, just not this one. It is on the clustered index.
Using various (great!) blog posts from Paul Randal, I have isolated it to 11 contiguous rows. I also tried to restore the data from a known good backup to that page using DBCC PAGE. Unfortunately I was running in SIMPLE recovery (silly me) so that method is not working.
There are no foreign key constraints or triggers, and the data is not critical, so I am perfectly happy to lose the data.
I very much want the table to be clean again and pass DBCC. Is there a way to simply clear out, remove or otherwise eradicate the bad page?
Here are some of the relevant messages, first from DBCC CHECKTABLE(tReports):
DBCC results for 'treports'.
Msg 8976, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 1, partition ID 72057594103398400, alloc unit ID 72057594127253504 (type In-row data). Page (1:476884) was not seen in the scan although its parent (1:986630) and previous (1:476883) refer to it. Check any previous errors.
Msg 8978, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 1, partition ID 72057594103398400, alloc unit ID 72057594127253504 (type In-row data). Page (1:476885) is missing a reference from previous page (1:476884). Possible chain linkage problem.
There are 1228440 rows in 151616 pages for object "treports".
CHECKTABLE found 0 allocation errors and 2 consistency errors in table 'treports' (object ID 552389037).
repair_rebuild is the minimum repair level for the errors found by DBCC CHECKTABLE (db2008.dbo.treports).
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Next, running DBCC CHECKTABLE(tReports, REPAIR_REBUILD) doesn't correct the issue.
Then running DBCC CHECKTABLE(tReports, REPAIR_ALLOW_DATA_LOSS):
Msg 8945, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 1 will be rebuilt.
Could not repair this error.
.....
Msg 8986, Level 16, State 1, Line 1
Too many errors found (201) for object ID 552389037. To see all error messages rerun the statement using "WITH ALL_ERRORMSGS".
There are 1249643 rows in 155212 pages for object "treports".
CHECKTABLE found 0 allocation errors and 240 consistency errors in table 'treports' (object ID 552389037).
CHECKTABLE fixed 0 allocation errors and 240 consistency errors in table 'treports' (object ID 552389037).
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Msg 824, Level 24, State 2, Line 1
SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xe5e045ba; actual: 0x15bc2a83). It occurred during a read of page (1:476884) in database ID 9 at offset 0x000000e8da8000 in file 'M:\SQLData\MSSQL10.SQL2008ENT\MSSQL\DATA\db2008.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
Thanks,
Neil
March 23, 2009 at 1:39 am
Can you post the full output of the following command please. I'd like to get a full picture of what's wrong before recommending anything.
DBCC CHECKDB (< Database Name > ) WITH NO_INFOMSGS, ALL_ERRORMSGS
If you have a clean backup, why repair and not restore?
Take a look at this article. http://www.sqlservercentral.com/articles/65804/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 23, 2009 at 4:50 am
Thanks for replying. Here is the output. I will also re-run using REPAIR_ALLOW_DATA_LOSS and show you that output momentarily...
Msg 8976, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 1, partition ID 72057594103398400, alloc unit ID 72057594127253504 (type In-row data). Page (1:476884) was not seen in the scan although its parent (1:986630) and previous (1:476883) refer to it. Check any previous errors.
Msg 8978, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 1, partition ID 72057594103398400, alloc unit ID 72057594127253504 (type In-row data). Page (1:476885) is missing a reference from previous page (1:476884). Possible chain linkage problem.
CHECKDB found 0 allocation errors and 2 consistency errors in table 'treports' (object ID 552389037).
CHECKDB found 0 allocation errors and 2 consistency errors in database 'db2008'.
repair_rebuild is the minimum repair level for the errors found by DBCC CHECKDB (db2008).
March 23, 2009 at 5:24 am
Also, the last backup without this issue is from over a month ago, so I can't simply restore and lose a months worth of data.
Here are the results of running: dbcc checktable(treports, repair_allow_data_loss) with no_infomsgs, all_errormsgs
Repair: The Nonclustered index successfully rebuilt for the object "dbo.treports, IX_tReports_OrgDraftDateIncl" in database "db2008".
Repair: The Nonclustered index successfully rebuilt for the object "dbo.treports, IX_tReports_CallerNum" in database "db2008".
Repair: The Nonclustered index successfully rebuilt for the object "dbo.treports, IX_tReports_Comprehensive" in database "db2008".
Repair: The Nonclustered index successfully rebuilt for the object "dbo.treports, IX_tReports_VolNumOrgNumDraftDateFeedbackCountOfChild" in database "db2008".
Msg 8945, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 1 will be rebuilt.
Could not repair this error.
Msg 8976, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 1, partition ID 72057594103398400, alloc unit ID 72057594127253504 (type In-row data). Page (1:476884) was not seen in the scan although its parent (1:986630) and previous (1:476883) refer to it. Check any previous errors.
The error has been repaired.
Msg 8978, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 1, partition ID 72057594103398400, alloc unit ID 72057594127253504 (type In-row data). Page (1:476885) is missing a reference from previous page (1:476884). Possible chain linkage problem.
The error has been repaired.
Msg 8945, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 15 will be rebuilt.
The error has been repaired.
Msg 8945, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 16 will be rebuilt.
The error has been repaired.
Msg 8945, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 17 will be rebuilt.
The error has been repaired.
Msg 8945, Level 16, State 1, Line 1
Table error: Object ID 552389037, index ID 29 will be rebuilt.
The error has been repaired.
CHECKTABLE found 0 allocation errors and 2 consistency errors in table 'treports' (object ID 552389037).
CHECKTABLE fixed 0 allocation errors and 2 consistency errors in table 'treports' (object ID 552389037).
Msg 824, Level 24, State 2, Line 1
SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xe5e045ba; actual: 0x15bc2a83). It occurred during a read of page (1:476884) in database ID 7 at offset 0x000000e8da8000 in file 'M:\SQLData\MSSQL10.SQL2008ENT\MSSQL\DATA\db2008.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
March 23, 2009 at 5:52 am
I'm not sure why that's not repairing. I'm sure Paul will drop by soon and explain in detail.
In the meantime...
You say you don't mind losing data. If that's the case, I can see two possible fixes, both pretty manual.
1) Restore that month-old backup alongside the current DB. Drop the table that's got the error and copy the data over from the other DB.
2) Export as much data from that table as possible. It won't be easy as a SELECT * FROM will probably fail due to the corruption. Once you've got all the data you can get, drop the table, recreate it and reload the data that you copied out.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 23, 2009 at 6:53 am
Thanks VERY MUCH Gail for responding, it would be great to have Paul chime in on this too.
If there's nothing else he can suggest, I'll probably go with your option 2. Using other methods Paul had blogged about, I've narrowed the affected rows down to a single, 11-row range, so I can simply do all the rows before, then all the rows after.
March 23, 2009 at 7:13 am
neil (3/23/2009)
Thanks VERY MUCH Gail for responding, it would be great to have Paul chime in on this too.
He's at SQLConnections at the moment so probably rather busy, but I know he does watch this newsgroup.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 23, 2009 at 10:28 am
Hmm - I'm concerned that CHECKDB couldn't repair the error. It's obviously a leaf-level page in the clustered index that's corrupt, but that should be a simple repair - deallocate the page and then rebuild all the indexes.
The 824 is from it trying to rebuild the clustered index and hitting the page checksum error. Ah - I'm guessing the page is marked as deallocated in the PFS page but is still linked into the clustered index - so the rebuild is trying to read it. And that also explains why there are only two errors - nothing about it not being able to read that page - only that linkages are missing from it.
ok- now how to repair this? If you don't want to export/import the data, then the only thing I can think of is the following:
1) turn of page checksums in the database
2) drop the clustered index
3) possibly delete the 11 rows on the affected page - depends what data on the page the corruption changed - that's up to you to determine
4) recreate the clustered index
5) root cause analysis of I/O subsystem
Interesting case... let us know how you get on.
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
March 23, 2009 at 3:22 pm
Paul, thanks very much for your reply - I'm going to experiment with the export/import method, as well as your method, in our labs tomorrow. I will certainly let you know how it goes.
March 24, 2009 at 6:31 am
Well, "pleased" would be too mild a word. I am thrilled to report Paul that your method worked perfectly. Gail I also compared it to the export/import method, but that took considerably longer to perform. (The table has 1.5 million rows and is pretty wide. )
Interesting to note that the rows on the corrupted page could be neither SELECTed nor DELETEd after the clustered index was dropped.
Small question: Does it matter at what point I turn CHECKSUMs back on? I presume I should do it just before rebuilding the clustered index.
We are upgrading the hardware this Saturday morning, so we'll be leaving behind whatever issues the I/O subsystem has that caused this.
Thanks again to both of you - I owe you the beverage of your choice if ever we cross paths in person.
Neil
March 24, 2009 at 7:01 am
Cool!
I'd turn page checksums back on as soon as the broken page is no longer part of your table.
Don't worry about the drink - I had enough last night (unfortunately) :sick:
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
August 21, 2017 at 5:12 am
Please note: 8 year old thread, and the linked page just says to run repair (well, and buy some magic software), which the OP had already tried.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply