February 3, 2010 at 4:45 am
I Recently ran dbcc checkdb on one of our SQLServer boxes (SS2005 Ent SP3, x64) and a DB generated this:
Msg 8909, Level 16, State 1, Line 1
Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:8088) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).
Msg 8998, Level 16, State 2, Line 1
Page errors on the GAM, SGAM, or PFS pages prevent allocation integrity checks in database ID 77 pages from (1:8088) to (1:16175). See other errors for cause.
Msg 8909, Level 16, State 1, Line 1
Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:16176) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).
Msg 8998, Level 16, State 2, Line 1
Page errors on the GAM, SGAM, or PFS pages prevent allocation integrity checks in database ID 77 pages from (1:16176) to (1:24263). See other errors for cause.
Msg 8909, Level 16, State 1, Line 1
Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:24264) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).
Msg 8998, Level 16, State 2, Line 1
Page errors on the GAM, SGAM, or PFS pages prevent allocation integrity checks in database ID 77 pages from (1:24264) to (1:32351). See other errors for cause.
CHECKDB found 6 allocation errors and 0 consistency errors not associated with any single object.
CHECKDB found 6 allocation errors and 0 consistency errors in database 'DbSSSS'
So far db is accessible by users and 'Select * from ...' works against all 91 tables. No repair attempted yet. I restored from prior backups as a different dbname on different SAN but same page errors. I appreciate your guidance to resolve this.
Thanks in advance..
Othman
February 3, 2010 at 5:21 am
You have what looks like errors in the allocation pages. Because of those errors, checkDB can't do a full check of the DB. There is no repair for this.
Either restore a clean backup of the database along with all log backups (if your backup strategy is good, you'll be able to do that without data loss), or, if you don't have a clean backup then you're going to need to script all objects, export all data and recreate the DB.
You may be lucky and the damaged allocation pages refer to empty sections of the DB, however that doesn't mean these are benign, they are a fairly serious form of corruption.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 3, 2010 at 11:20 am
You've got corruption in two PFS pages - this isn't repairable, even with undocumented DBCC options. You need to restore, and also figure out why this happened. Select * can work fine sometimes depending on table structure when this happens.
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
February 3, 2010 at 11:55 am
I'm curious. How did you determine that they were PFS pages, not GAM/SGAM? Something about the page ID or is it the size of the range of pages that can't be checked?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 3, 2010 at 3:11 pm
February 3, 2010 at 3:18 pm
Interesting - that Books Online page is wrong (it's based on a document I wrote while inside MS). A PFS page maps 8088 pages, not 7990. I'll get it fixed.
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
February 3, 2010 at 3:22 pm
Gail - you mean you don't have all the allocation bitmap page IDs and intervals burned into your brain? Lucky you 🙂
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
February 4, 2010 at 2:13 am
Paul Randal (2/3/2010)
Gail - you mean you don't have all the allocation bitmap page IDs and intervals burned into your brain? Lucky you 🙂
Not yet, I'm working on it though.
Can you tell a GAM from SGAM from PFS on the basis of page ID alone? For that matter, with an error like this, can you tell a GAM from an SGAM and, if so how?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 4, 2010 at 8:49 am
I should do a blog post at some point, or you should do the MCM class 🙂
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
February 4, 2010 at 10:17 am
Paul Randal (2/4/2010)
I should do a blog post at some point
Please?
or you should do the MCM class 🙂
Would love to. As soon as I persuade someone to pay for it for me or I win the lottery.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 4, 2010 at 11:16 am
Gail / Paul ,
appreciate your followup.
As suggested:hehe:, I exported data to a new db but noticed that exporting VIEWs causes correponding TABLEs to be created so it seems they have to be scripted then explicitly created.
Few questions:
1)As DBCC CHECKDB wasn't run before, this page corruption seems to have been present since weeks ago:w00t:. Does this entail necessity of running CHECKDB on daily basis during afterhours to be aware of similar corruption at early stage?
2)If required diagnostics arranged to be run against the affected SAN drive, can it also 'fix' or will it only expose damaged area?
Thanks and Regards,
Othman
February 4, 2010 at 11:24 am
abo_moaaz (2/4/2010)
1)As DBCC CHECKDB wasn't run before, this page corruption seems to have been present since weeks ago:w00t:. Does this entail necessity of running CHECKDB on daily basis during afterhours to be aware of similar corruption at early stage?
Absolutely! Maybe not daily, but often enough that, should corruption be detected, restore from a clean backup is always an option. Hence your backup strategy will determine the minimum frequency for CheckDB.
For example, if you only retain 2 days of backups and hence cannot restore to further back than 2 days, checkDB must run at least every 2 days. Personally I'd recommend running it before you run your full backups. No point in backing up a corrupt DB.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 10, 2010 at 6:00 am
Hi ,
For further analysis :cool:(note that actual db with 8998 error was exported earlier with no loss of data and the old one was renamed ), with that said,
1)If I run backup with checksum on the corrupted db, it still finishes successfully does this confirm that the corrupted area has no data?
2) the storgae group informed me that as long as the SAN volume is accessible, from their side it's 'healthy'. Should we as a precaution, replace the volume by a new one ?
Thanks and Regards
Othman
February 10, 2010 at 8:40 am
1) No. It just means that the corrupt pages don't have page checksums on them for some reason. Was the database upgraded from 2000?
2) Your storage group are fools. Is a SAN volume with poor IO throughput 'healthy' just becase it's accessible? No. Is a SAN volume that has drives causing corruption 'healthy'? No. You need to educate the storage group of what healthy means - 'does not corrupt the data' is the number 1 requirement and it's not being met here. Just because a drive can be accessed in no way guarantees that there aren't integrity problems with it.
Thanks
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
February 10, 2010 at 10:18 am
Hi Paul,
just to let you know that i really enjoyed your teched video:-). As for 1) this was a pure SS2005 db (actually, one of the SharePorint DBs). For the storage group part, what i think they meant that their available tools may not tell them much other than accessibility or performance throughput. If it helps, i can mention the brand name of the storage.
Another insight, it was determined early this week that the installed multipath s/w for the SAN was not compatible with the OS (WS2008 ENT R2 to which we upgraded last October), so it was yesterday upgraded to the compatible level, but the checkdb would still show 8998 on that db.
Thanks and Regards
Othman
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply