Data Consistency Error

  • Hi All,

    After running

    dbcc checkdb ('workflowdb') with all_errormsgs, no_infomsgs

    I got the following errors. Pls let me know if there is a way to recover besides restoring from previous backups.

    Any help is appreciated.

    =============================================

    Server: Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID 0, page ID (1:161844). The PageId in the page header = (0:0).

    Server: Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID 0, page ID (1:161845). The PageId in the page header = (0:0).

    Server: Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID 0, page ID (1:161846). The PageId in the page header = (0:0).

    Server: Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID 0, page ID (1:161847). The PageId in the page header = (0:0).

    Server: Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID 0, page ID (1:161848). The PageId in the page header = (0:0).

    Server: Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID 0, page ID (1:161849). The PageId in the page header = (0:0).

    Server: Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID 0, page ID (1:161850). The PageId in the page header = (0:0).

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 2121058592, index ID 0: Page (1:161843) could not be processed. See other errors for details.

    Server: Msg 8939, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 0, page (1:161843). Test (IS_ON (BUF_IOERR, bp->bstat) &&bp->berrcode) failed. Values are 2057 and -1.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 2121058592, index ID 0: Page (1:161844) could not be processed. See other errors for details.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 2121058592, index ID 0: Page (1:161845) could not be processed. See other errors for details.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 2121058592, index ID 0: Page (1:161846) could not be processed. See other errors for details.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 2121058592, index ID 0: Page (1:161847) could not be processed. See other errors for details.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:67198) is missing a reference from previous page (1:161844). Possible chain linkage problem.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:70290) is missing a reference from previous page (1:161845). Possible chain linkage problem.

    CHECKDB found 0 allocation errors and 7 consistency errors not associated with any single object.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:98664) is missing a reference from previous page (1:161847). Possible chain linkage problem.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:161843) was not seen in the scan although its parent (1:178432) and previous (1:252797) refer to it. Check any previous errors.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:161844) was not seen in the scan although its parent (1:217851) and previous (1:300486) refer to it. Check any previous errors.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:161845) was not seen in the scan although its parent (1:54790) and previous (1:191250) refer to it. Check any previous errors.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:161846) was not seen in the scan although its parent (1:188015) and previous (1:159782) refer to it. Check any previous errors.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:161847) was not seen in the scan although its parent (1:276811) and previous (1:206779) refer to it. Check any previous errors.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:237833) is missing a reference from previous page (1:161846). Possible chain linkage problem.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 1. Page (1:241199) is missing a reference from previous page (1:161843). Possible chain linkage problem.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 3. Page (1:29240) is missing a reference from previous page (1:161850). Possible chain linkage problem.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 3. Page (1:51823) is missing a reference from previous page (1:161848). Possible chain linkage problem.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 3. Page (1:131278) is missing a reference from previous page (1:161849). Possible chain linkage problem.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 2121058592, index ID 3: Page (1:161848) could not be processed. See other errors for details.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 3. Page (1:161848) was not seen in the scan although its parent (1:37909) and previous (1:201903) refer to it. Check any previous errors.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 2121058592, index ID 3: Page (1:161849) could not be processed. See other errors for details.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 3. Page (1:161849) was not seen in the scan although its parent (1:242023) and previous (1:242022) refer to it. Check any previous errors.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 2121058592, index ID 3: Page (1:161850) could not be processed. See other errors for details.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 2121058592, index ID 3. Page (1:161850) was not seen in the scan although its parent (1:178957) and previous (1:288854) refer to it. Check any previous errors.

    CHECKDB found 0 allocation errors and 25 consistency errors in table 'DSMEMBERSHIP' (object ID 2121058592).

    CHECKDB found 0 allocation errors and 32 consistency errors in database 'workflowdb'.

    repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (workflowdb ).

    =============================================



    Pradeep Singh

  • On Checking system event viewer, i found that there was a disk related error on 20th May.

    Error1 - The driver detected a controller error on \Device\Harddisk0.

    Error2 - The device, \Device\Scsi\vmscsi1, is not ready for access yet.

    Could this be the reason for this issue?

    We've backup retention period of 15 days which means all backups are corrupted as well.

    This is a developement server and hence this issue was overlooked for this period.

    What are my options now?

    Edit - Am running 2000 Standard Edition SP3.



    Pradeep Singh

  • It looks like an issue due to hardware ((BUF_IOERR, bp->bstat)) Check the error logs from that time and see if you find error 823 there from the time near 20th.

    it looks like only one object is corrupted [2121058592].

    So first try to rebuild the clustered index in case it has one .

    This should automatically fix the broken pointers .

    If this doesnot help then see if you can drop this object and recreate it .

    Repair allow data loss will also delete some pages that are corrupt .but you will not know what data will be removed .

    In any case just see if only one object is corrupt and decide there onwards.

    Abhay Chaudhary
    Sr.DBA (MCITP/MCTS :SQL Server 2005/2008 ,OCP 9i)

  • Thanks for the reply Abhay.

    Rebuilding the index stopped with the following error.

    ====================

    Server: Msg 823, Level 24, State 2, Line 1

    I/O error (torn page) detected during read at offset 0x0000004f066000 in file 'e:\test\test2_Data.MDF'.

    Connection Broken

    ====================



    Pradeep Singh

  • First take the database into emergency mode and then try to rebuild the index .

    I think there is one more way to rebuidl the index and that is "with drop existing " try that as well .

    Regards

    Abhay Chaudhary
    Sr.DBA (MCITP/MCTS :SQL Server 2005/2008 ,OCP 9i)

  • Rebuild's not going to help. There's damage to the clustered index (ie the table itself). That's not redundant data hence there's nowhere for an index rebuild to read the damaged data from.

    Dropping and recreating indexes only fix corruption in two cases.

    1) The damage is solely in nonclustered indexes

    2) The damage is only in the non-leaf levels of the cluster.

    In both of those cases, checkDB would state that repair_rebuild was the level required to fix the error. The error here states that repair_allow_data_loss is the minimum level to repair.

    Take a look at this article. http://www.sqlservercentral.com/articles/65804/. It details most of the common corruptions seen and how to recovery from them (and if it's possible to recover from them)

    ps: You have two options.

    1) Find a clean backup and restore that. If you have log backups as well, you can restore them and get the DB right up to date

    2) Run checkDB with the repair_allow_data_loss option. It will lose data. 8 pages of it, to be specific, from the DSMEMBERSHIP table. Since it's the cluster, it may be possible to find approximately what data was deleted by repair, though you won't be able to retrieve the actual data.

    See Paul Randal's blog: http://www.sqlskills.com/BLOGS/PAUL/post/CHECKDB-From-Every-Angle-Using-DBCC-PAGE-to-find-what-repair-will-delete.aspx

    Since you've already noticed some IO subsystem problems, I'd suggest you get your sysadmins/storage engineers to do some detailed diagnostics to ensure that there aren't serious, ongoing problems with the drives.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • hi_abhay78 (6/29/2009)


    First take the database into emergency mode and then try to rebuild the index .

    Emergency mode is for recovering a database that's suspect.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail for the reply.

    Will try out your suggestions tommorrow at office and post the results.

    Also, will try out DBCC PAGE as mentioned in Paul's blog to see if there is any meaningful data that i can fetch.



    Pradeep Singh

  • You will not be able to see the corrupted pages even with DBCC Page. They're damaged, SQL won't read them. Besides, from looking at the errors, it looks like a 64k sector of the disk has been zeroed out, so there's no data left there to read.

    What you can do is read the page before the corrupt set and the page after and work out, based on the values of the clustering key, what's missing.

    You can use the errors that mention "missing reference from previous page" and "page not seen but parent and previous refer to it" to get the pages before and after in the clustered index chain.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • We ran dbcc checkdb against all databases on that server and found one more database which had acquired similar issues. 2 data pages belonging to 2 different tables were damaged. we lost 1 row in one of these tables. Am still wondering why we didnt lost any data from one of the damaged pages that belonged to the other table.

    From the dsmembership table of previous databases, we lost around 70 rows of data.



    Pradeep Singh

  • ps (7/1/2009)


    Am still wondering why we didnt lost any data from one of the damaged pages that belonged to the other table.

    Without seeing the output of checkDB for that database, no idea.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Server: Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID 0, page ID (1:399191). The PageId in the page header = (0:0).

    Server: Msg 8909, Level 16, State 1, Line 1

    Table error: Object ID 0, index ID 0, page ID (1:399192). The PageId in the page header = (0:0).

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 1310627712, index ID 0: Page (1:399192) could not be processed. See other errors for details.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 1310627712, index ID 1. Page (1:36631) is missing a reference from previous page (1:399192). Possible chain linkage problem.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 1310627712, index ID 1. Page (1:399192) was not seen in the scan although its parent (1:25816) and previous (1:399207) refer to it. Check any previous errors.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 1342627826, index ID 0: Page (1:399190) could not be processed. See other errors for details.

    Server: Msg 8939, Level 16, State 1, Line 1

    Table error: Object ID 1342627826, index ID 0, page (1:399190). Test (IS_ON (BUF_IOERR, bp->bstat) &&bp->berrcode) failed. Values are 2057 and -1.

    Server: Msg 8928, Level 16, State 1, Line 1

    Object ID 1342627826, index ID 0: Page (1:399191) could not be processed. See other errors for details.

    Server: Msg 8978, Level 16, State 1, Line 1

    Table error: Object ID 1342627826, index ID 1. Page (1:399189) is missing a reference from previous page (1:399190). Possible chain linkage problem.

    Server: Msg 8980, Level 16, State 1, Line 1

    Table error: Object ID 1342627826, index ID 1. Index node page (1:420444), slot 202 refers to child page (1:399190) and previous child (1:399191), but they were not encountered.

    Server: Msg 8976, Level 16, State 1, Line 1

    Table error: Object ID 1342627826, index ID 1. Page (1:399191) was not seen in the scan although its parent (1:420444) and previous (1:399224) refer to it. Check any previous errors.

    CHECKDB found 0 allocation errors and 2 consistency errors not associated with any single object.

    CHECKDB found 0 allocation errors and 3 consistency errors in table 'plumdbuser.PTJOBOPHISTORY' (object ID 1310627712).

    CHECKDB found 0 allocation errors and 6 consistency errors in table 'plumdbuser.PTJOBLOGS' (object ID 1342627826).

    CHECKDB found 0 allocation errors and 11 consistency errors in database 'Copy_of_Live_Plumdb'.

    repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (Copy_of_Live_Plumdb ).

    --------------------

    This is the report by checkdb.

    Table Orig Size Orig RowCnt After Size After RowCnt

    PTJOBLOGS4055901137206740469711372066

    PTJOBOPHISTORY7658637 6718637

    Above data shows table details before and after running repair, however rowcount in PTJOBOPHISTORY table still remains the same.



    Pradeep Singh

  • The damage was in the cluster. It could be that you got lucky and it was an empty page that hadn't yet been dropped from the index.

    p.s. you need to identify the cause of this. Several DBs corrupting at the same time suggests there's a problem with the IO system.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail.

    Yes, we had discussions with the SAN team and they confirmed there is no such issue at the moment.

    I also told them to notify such issues to the database team (for servers hosting databases) asap so that we can restore the db from a clean backup(if need be). In this case, there was no option than to do away with the data.



    Pradeep Singh

  • ps (7/1/2009)


    I also told them to notify such issues to the database team (for servers hosting databases) asap so that we can restore the db from a clean backup(if need be).

    You should be the one checking for issues. If they say there was no problem this time, do you really think that they're likely to notice a problem next time whatever it is happens?

    http://www.sqlskills.com/BLOGS/PAUL/post/Importance-of-how-you-run-consistency-checks.aspx

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply