August 17, 2017 at 8:46 am
We are suddenly getting a lot of errors in a 160GB database along the lines of SQL Server detected a logical consistency-based I/O error
This seems to have started happening around 02:00 on 14/08/17 when a 'Rebuild Index' job failed
Date 14/08/2017 02:00:00
Log Job History (Epro Index Rebuilding)
Step ID 1
Server EPROLIVESQL
Job Name Epro Index Rebuilding
Step Name Rebuild Indexes
Duration 00:11:46
Sql Severity 24
Sql Message ID 3621
Operator Emailed
Operator Net sent
Operator Paged
Retries Attempted 0
Message
Executed as user: NT AUTHORITY\SYSTEM. SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:20529816; actual 0:0). It occurred during a read of page (1:20529816) in database ID 7 at offset 0x00002728530000 in file 'F:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\EproPat.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online. [SQLSTATE HY000] (Error 824) Executing ALTER INDEX ux_patients_history_versions ON dbo.t_patients_history REBUILD - fragmentation currently 30.2339% [SQLSTATE 01000] (Error 0) Executing ALTER INDEX in_clinic_appointments_patient ON dbo.t_clinic_appointments REBUILD - fragmentation currently 29.7339% [SQLSTATE 01000] (Error 0) Executing ALTER INDEX ux_patients_history_previousVersions ON dbo.t_patients_history REBUILD - fragmentation currently 29.0101% [SQLSTATE 01000] (Error 0) The statement has been terminated. [SQLSTATE 01000] (Error 3621). The step failed.
I've imported the latest backup onto a test SQL instance and run dbcc_checkdb and attached the results of that.
Not sure what has caused this, or if I can repair it easily!
August 17, 2017 at 9:07 am
I have also just done a REPAIR_ALLOW_DATA_LOSS on the test instance of the DB and it came back with the attached results...but I can't tell if any actual user data was lost as part of this repair?
August 17, 2017 at 10:08 am
Yes. Data was lost.
Repairing, especially with allow_data_loss, is a last resort for when there's no other way of fixing things. Best method is to restore from a clean backup. With proper backup strategy, you won't lose any data.
When did CheckDB last run without error?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 18, 2017 at 12:46 am
This was removed by the editor as SPAM
August 18, 2017 at 2:47 am
Many thanks for the replies,
We do have good backups of the database and Full recovery mode on the DB so hourly transaction logs are generated, so is it a case of restoring the last known good 'full' database backup then restore all the transaction log files after this point up until the most recent in time? (Not had to do something like this in a very long time!) 🙂
August 18, 2017 at 1:24 pm
Yes, if you have a good full backup (you may want to restore that full on your test server and run DBCC CheckDB to ensure it's clean), start with restoring that, with NORECOVERY, then (assuming you don't have any differential backups) restore the transaction log backups up to the point where you're sure the database is consistent. The best approach is going to be time consuming, but worth it...you'll want to restore the full backup with RECOVERY, run DBCC CheckDB. Then, drop the database, restore again with NORECOVERY, restore the first t-log backup with RECOVERY, run DBCC CheckDB. Drop database...and repeat until you get to the point where DBCC CheckDB encounters a consistency or allocation error...I think it indicates a time...or you can look it up when it displays the page where the problem is, and then you know where to stop the recovery at...the transaction log either before the problem occurs, or the transaction log where the problem occurs, up to a time before the problem occurs.
The restore options (and how to deal with DBCC CheckDB errors when encountered) can be easily found online.
Alan H
MCSE - Data Management and Analytics
Senior SQL Server DBA
Best way to ask a question: http://www.sqlservercentral.com/articles/Best+Practices/61537/
August 18, 2017 at 8:29 pm
john.round - Friday, August 18, 2017 2:47 AMMany thanks for the replies,We do have good backups of the database and Full recovery mode on the DB so hourly transaction logs are generated, so is it a case of restoring the last known good 'full' database backup then restore all the transaction log files after this point up until the most recent in time? (Not had to do something like this in a very long time!) 🙂
Yes, that's it:
1. Restore from last known 'good' backup.
2. Restore every transaction log backup taken since then, including taking log backups from the corrupted database before shutting it down.
Please continue taking transaction log backups on the damaged database while you prepare your restore. It has suffered significant loss, but you can recover everything, including capturing all changes made to the database, if you keep taking log backups.
Disk corruption on data files (at least the kind shown in your CHECKDB output) is not written to the transaction log, unless the log file got damaged in the incident. Because the transaction log does not contain the instructions on how to cause the damage to the data files, restoring from a full backup taken before the incident and using log backups from before and after the incident won't include the damage.
If there was damage, but it occurred to an area of the log that had already been backed up, there will be no data loss.
If the event also wiped your backups, then you're pretty much out of options.
Eddie Wuerch
MCM: SQL
August 19, 2017 at 3:17 am
john.round - Friday, August 18, 2017 2:47 AMMany thanks for the replies,We do have good backups of the database and Full recovery mode on the DB so hourly transaction logs are generated, so is it a case of restoring the last known good 'full' database backup then restore all the transaction log files after this point up until the most recent in time? (Not had to do something like this in a very long time!) 🙂
Yup, that's pretty much that. Take a final log backup before you start, restore that as the final log restore.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 19, 2017 at 10:33 am
SQL_Hacker - Friday, August 18, 2017 1:24 PMThe best approach is going to be time consuming, but worth it...you'll want to restore the full backup with RECOVERY, run DBCC CheckDB. Then, drop the database, restore again with NORECOVERY, restore the first t-log backup with RECOVERY, run DBCC CheckDB. Drop database...and repeat until you get to the point where DBCC CheckDB encounters a consistency or allocation error...I think it indicates a time...or you can look it up when it displays the page where the problem is, and then you know where to stop the recovery at...the transaction log either before the problem occurs, or the transaction log where the problem occurs, up to a time before the problem occurs.
There's no need to do that.
That kind of 'creep forward' is needed for recovering from data loss (dropped table, accidental delete, etc). Corruption, in the vast majority of cases you can just restore all log backups up to most recent, because a misbehaving IO-subsystem (most common cause of corruption) doesn't write transaction log records for what it mangled
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 19, 2017 at 10:39 am
Thanks Gail. I obviously didn't read the original question well enough. Thanks for catching that.
Alan H
MCSE - Data Management and Analytics
Senior SQL Server DBA
Best way to ask a question: http://www.sqlservercentral.com/articles/Best+Practices/61537/
August 21, 2017 at 7:45 am
I've had to go back a while to the full database backup at 08/08/17 00:48 which passes DBCC CheckDB with no consistency errors.
So just to confirm the first log after that is at 01:00, then on the hour every hour up until the most recent (at time of post 14:00, then the tail log). I am assuming all the daily full database backups taken from 09/08/17 onwards at 00:48 won't have any impact on the transaction log restores? (Hopefully that makes sense!)
August 21, 2017 at 11:58 am
john.round - Monday, August 21, 2017 7:45 AMSo just to confirm the first log after that is at 01:00, then on the hour every hour up until the most recent (at time of post 14:00, then the tail log). I am assuming all the daily full database backups taken from 09/08/17 onwards at 00:48 won't have any impact on the transaction log restores? (Hopefully that makes sense!)
Correct.
I would recommend, once this is fixed, scheduling more frequent checkDB runs. At least once a week is my preference.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 22, 2017 at 3:23 am
So, I successfully tested yesterday on a test instance using the following script (as there are over 250 transaction log files!)
RESTORE DATABASE [EproPat] FROM DISK = 'D:\tmp\EproPat_backup_2017_08_08_000106_1477461.bak' WITH NORECOVERY, REPLACE
RESTORE LOG [EproPat] FROM DISK = 'D:\tmp\EproPat_backup_2017_08_08_010007_1198079.trn' WITH NORECOVERY
RESTORE LOG [EproPat] FROM DISK = 'D:\tmp\EproPat_backup_2017_08_08_020006_9747574.trn' WITH NORECOVERY
RESTORE LOG [EproPat] FROM DISK = 'D:\tmp\EproPat_backup_2017_08_08_030005_2635503.trn' WITH NORECOVERY
(and so on for all the other transaction logs up until the latest one)
RESTORE DATABASE [EproPat] WITH RECOVERY
So, on the live system if I wait for the, for example, midnight (00:00) transaction log backup to complete then kick off the restore script at 00:01 there is no need for a 'tail log' backup to take place? (I choose midnight as an example as nobody would be using the system at this time!) then once the database was up and running again it would be consistent as of the latest 00:00 transaction log?
August 22, 2017 at 5:36 am
Tail log is to ensure that absolutely nothing is missed, taking it switches the source DB into RESTORING state, so that nothing more can be done to it.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 26, 2018 at 10:53 pm
This was removed by the editor as SPAM
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply