September 7, 2009 at 12:03 pm
Lets say i have a full backup, and 3 tran log backups in the chain. I get an error telling me of page corruption.
I run dbcc checkdb and find its just one page causing the issue.
1. How do i find out where in the backup the page is (i.e, is it in the full backup, or one of the tran log backups)?
2. if the page is in the full backup, do i still have to restore all of the tran logs, or does sql server restore the page from the full backup, and just check if there are any changes to that page in the logs, and if so apply the changes?
September 8, 2009 at 1:03 am
All restores (single page, file or full) have to start from a full backup. For full details of restoring a page, see http://msdn.microsoft.com/en-us/library/ms186858(SQL.90).aspx and http://msdn.microsoft.com/en-us/library/ms175168%28SQL.90%29.aspx
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 8, 2009 at 2:07 am
winston Smith (9/7/2009)
Let's say I have a full backup, and 3 tran log backups in the chain. I get an error telling me of page corruption. I run dbcc checkdb and find its just one page causing the issue.
Ok. It would be great to see the full text of the error!
winston Smith (9/7/2009)
1. How do i find out where in the backup the page is (i.e, is it in the full backup, or one of the tran log backups)?
Specify the STANDBY option at every stage of the RESTORE. This will allow read-only access to the database between restore steps. You can run DBCC CHECKDB with the database in this state. If no errors are reported after a stage, you know that the corruption occurs somewhere in the following log restores.
If the original backup statements specified the WITH CHECKSUM option (and the page was last written to at a time when the database-level PAGE_VERIFY option was set to CHECKSUM) I would have expected the RESTORE statement to report an error. Was this not the case?
The first thing to do, however, is to check the suspect_pages table in the msdb database. This will give you information about the page and allow us to determine whether it is a data page, an index page, or a system page. If you are lucky, the page will be part of a non-clustered index. In that case, you can repair the damage yourself without a backup by rebuilding the index (assuming the base data is undamaged).
It would be great to see the information currently held in the suspect_pages table.
winston Smith (9/7/2009)
2. if the page is in the full backup, do i still have to restore all of the tran logs, or does sql server restore the page from the full backup, and just check if there are any changes to that page in the logs, and if so apply the changes?
Wherever the damaged page is, you may not be able to get the database back to full health without some extra work. This may involve a repair, but let's not go down that track just yet.
Please post the full error message from DBCC CHECKDB, and the contents of msdb..suspect_pages.
Paul
September 8, 2009 at 2:33 am
Actually, I might have misread the question first time around - I thought you were asking what to do if you received the DBCC message after restoring the database from the backups.
If you were just asking a theoretical question, the answers are in the links Gail provided, though this entry: Performing Page Restores probably contains just the information you were looking for.
Paul
September 8, 2009 at 2:54 am
Paul White (9/8/2009)
If no errors are reported after a stage, you know that the corruption occurs somewhere in the following log restores.
Log backups generally will only have corruption in them if the log itself is corrupt (and in that case the backup usually fails). While a full/diff can easily back up a corrupted page, a log backup doesn't read the actual data pages (except for bulk-operations) and hence won't contain the corruption.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 8, 2009 at 3:03 am
GilaMonster (9/8/2009)
Paul White (9/8/2009)
If no errors are reported after a stage, you know that the corruption occurs somewhere in the following log restores.Log backups generally will only have corruption in them if the log itself is corrupt (and in that case the backup usually fails). While a full/diff can easily back up a corrupted page, a log backup doesn't read the actual data pages (except for bulk-operations) and hence won't contain the corruption.
Exactly. "Except for bulk-logged operations" where a very large number of extents can easily be included in the log.
Since we don't know that no bulk-logged operations occurred, I fail to see the point you are making.
We have to restore step-by-step anyway, so the incremental cost in time and effort is practically nil.
Or am I missing something?
September 8, 2009 at 3:12 am
Paul White (9/8/2009)
GilaMonster (9/8/2009)
Paul White (9/8/2009)
If no errors are reported after a stage, you know that the corruption occurs somewhere in the following log restores.Log backups generally will only have corruption in them if the log itself is corrupt (and in that case the backup usually fails). While a full/diff can easily back up a corrupted page, a log backup doesn't read the actual data pages (except for bulk-operations) and hence won't contain the corruption.
Exactly. "Except for bulk-logged operations" where a very large number of extents can easily be included in the log.
Well, excluding bulk-logged operations in bulk-logged recovery model.
Pages modified by bulk operations are included in the log backup straight after they were created. Corruption's usually the result of the IO subsystem mangling existing pages (SQL doesn't corrupt it's own file). For a page affected by a bulk operation to be corrupt when the log backup occurs, the corruption has to happen during the write (and if the IO subsystem is corrupting on writes, there's a far wider problem) or in the short interval between the write and the log backup. If that does happen, if the log backup encounters a corrupt page while backing up extents modified by a bulk-operation, the log backup will fail (unless continue_after_error is specified).
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 8, 2009 at 3:35 am
GilaMonster (9/8/2009)
Well, excluding bulk-logged operations in bulk-logged recovery model.
You make it sound as if backing up the log, switching to the BULK_LOGGED recovery model, performing minimally-logged operations (like an index rebuild), returning to FULL recovery, and backing up the log again...is somehow unusual 😛
GilaMonster (9/8/2009)
Pages modified by bulk operations are included in the log backup straight after they were created.
Yes. I know. 🙂
GilaMonster (9/8/2009)
Corruption's usually the result of the IO subsystem mangling existing pages (SQL doesn't corrupt it's own file).
So, you are saying there are absolutely no circumstances where SQL Server can directly cause page corruption in its files? The word 'usually' suggests not - in which case you are accepting the alternative (that is, anything else).
GilaMonster (9/8/2009)
For a page affected by a bulk operation to be corrupt when the log backup occurs, the corruption has to happen during the write (and if the IO subsystem is corrupting on writes, there's a far wider problem) or in the short interval between the write and the log backup. If that does happen, if the log backup encounters a corrupt page while backing up extents modified by a bulk-operation, the log backup will fail (unless continue_after_error is specified).
So, you are saying there are absolutely no circumstances under which a corrupted extent can appear in the log file?
Unless you are saying that undetected data corruptions can never occur, I still fail to see your problem with my original comment:
If no errors are reported after a stage, you know that the corruption occurs somewhere in the following log restores.
September 8, 2009 at 3:56 am
Paul White (9/8/2009)
So, you are saying there are absolutely no circumstances where SQL Server can directly cause page corruption in its files? The word 'usually' suggests not - in which case you are accepting the alternative (that is, anything else).
There are bugs, they are rare. I believe Paul Randal talks about 99.9% of all corruptions he's seen are IO subsystem problems.
If no errors are reported after a stage, you know that the corruption occurs somewhere in the following log restores.
Do not agree.
If no errors are reported after a stage then you know that the full (and diff if applicable) backups were clean and that either the corruption is included in a future log backup (unlikely and only possible if there are bulk operations and bulk-logged recovery) or that it was not caught in any log backup.
Your comment can be easily read to imply that corruption is picked up by log backups and recreated by log restores which is certainly not true in full recovery and is going to require a number of conditions to be possible in bulk-logged.
Yes, there's a slim chance and there's nothing wrong with doing a checktable after each log restore but to say "you know that the corruption occurs somewhere in the following log restores." is incorrect.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 8, 2009 at 4:50 am
GilaMonster (9/8/2009)
There are bugs, they are rare. I believe Paul Randal talks about 99.9% of all corruptions he's seen are IO subsystem problems.
That's just a fancy way to say that it is perfectly possible.
GilaMonster (9/8/2009)
If no errors are reported after a stage then you know that the [...] backups were clean and that either the corruption is included in a future log backup (unlikely and only possible if there are bulk operations and bulk-logged recovery) or that it was not caught in any log backup.
If there are no errors are a stage, you know that there are no errors, or none were detected. Unless you stop in STANDBY and do a DBCC CHECKDB, you can't be as sure (we don't even know that CHECKSUMs were written for the data pages).
It is indeed possible that it was not caught in one of the log backups. I had assumed that a full log chain (including the tail) was available and contained in the set described. A reasonable assumption on my part - but you do have a point.
GilaMonster (9/8/2009)
Your comment can be easily read to imply that corruption is picked up by log backups and recreated by log restores which is certainly not true in full recovery and is going to require a number of conditions to be possible in bulk-logged.
So it is perfectly possible then - with bulk logged data - a very common thing. (As far as full recovery is concerned - are you now stating with absolute certainty that a log record restore can never contain an undetectable error? Really?).
GilaMonster (9/8/2009)
Yes, there's a slim chance and there's nothing wrong with doing a checktable after each log restore but to say "you know that the corruption occurs somewhere in the following log restores." is incorrect.
It is only incorrect in one very narrow and nit-picky sense: if you know somehow that the OP's log backups are incomplete and do not include the tail of the log. We cannot know that, but I accept the logical possibility quite happily.
There is nothing 'slim' about the chance. Undetected page corruptions can, and do, happen all the time. Even if CHECKSUM page validation is ON - and for all we know there is nothing of this sort.
I am glad that you recognize that running a check after each stage of the restore is worthwhile, and that corruption in the log is quite possible. So your only point was that the corruption might not be in the backup set?
Paul
September 8, 2009 at 5:16 am
Paul White (9/8/2009)
GilaMonster (9/8/2009)
Yes, there's a slim chance and there's nothing wrong with doing a checktable after each log restore but to say "you know that the corruption occurs somewhere in the following log restores." is incorrect.It is only incorrect in one very narrow and nit-picky sense: if you know somehow that the OP's log backups are incomplete and do not include the tail of the log. We cannot know that, but I accept the logical possibility quite happily.
No, not at all.
It's perfectly possible to have a backup set, from full backup to tail log, with the full backup clean and some corruption having appeared in the data file after the full backup and to have every single log backup clean and undamaged and for a restore using those backups to restore a clean, undamaged backup. In fact, it's the norm.
I'm not ignoring the possibility of log corruption. I'm saying that in the vast majority of cases, corruption that occurs to the data file will not at all be reflected in the log backups. So to say that after restoring the full and X log backups that if the DB is clean, the corruption occurs somewhere in the remaining logs is not valid. It's likely not in any of the log backups at all.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 8, 2009 at 6:02 am
GilaMonster (9/8/2009)
...stuff...
Oh for the love of small fluffy bunnies! :laugh:
I see what has happened here...
Me, before your first post: "Actually, I might have misread the question first time around - I thought you were asking what to do if you received the DBCC message after restoring the database from the backups."
Me, quoted by you in that post: "Specify the STANDBY option at every stage of the RESTORE. This will allow read-only access to the database between restore steps. You can run DBCC CHECKDB with the database in this state. If no errors are reported after a stage, you know that the corruption occurs somewhere in the following log restores."
See? Ha. I wonder how long we would have gone on? How funny.
And yes, for the record, if the error wasn't received as part of the restore sequence, it could well have escaped backup.
😎
September 8, 2009 at 6:36 am
Paul White (9/8/2009)
I thought you were asking what to do if you received the DBCC message after restoring the database from the backups."
You know, I've had that once, and not because there was corruption in any of the log backups. The disk system was so foobar that it was corrupting the database as it was been restored. The backups were good (we restored them elsewhere and the resulting DB was fine), all the backups succeeded without error. A checkDB after the restore was complete turned up more corruptions than the DB that we restored over.
Generally, first thing I'd try if I got corruption after restoring (if the DB backup was known to be clean) is to try out the entire restore sequence on a different server, eliminate the possibility of a failing drive adding new corruptions in.
I wonder how long we would have gone on? How funny.
Not much longer. The apparent tone in some of the replies was starting to feel like something I'd use on a particularly slow 5-year old.
p.s. I've asked someone about the possibility of a corrupt page ending up in a log backup. If the answer's something I can post in public, I'll update here when/if I get an answer.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 8, 2009 at 6:49 am
I just caught the pre-edit version of that. Please understand that your initial response was frustrating to me as well. Twenty minutes had elapsed since my post outlining how I misread the question, so I think you had the better chance of spotting the cross-purposes. All I saw was a fairly terse post from you - apparently picking on what seemed like a minor point from quite a long post.
Disappointing.
Paul
September 8, 2009 at 11:04 am
Can you two be nice or find a room somewhere. I got a headache just trying to read this topic
for the sake of fluffy bunnies everywhere...
--------------------------------------------------------------------------------------
[highlight]Recommended Articles on How to help us help you and[/highlight]
[highlight]solve commonly asked questions[/highlight]
Forum Etiquette: How to post data/code on a forum to get the best help by Jeff Moden[/url]
Managing Transaction Logs by Gail Shaw[/url]
How to post Performance problems by Gail Shaw[/url]
Help, my database is corrupt. Now what? by Gail Shaw[/url]
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply