Yeah! my first real corrupt DB

  • Ninja's_RGR'us (8/18/2011)


    Only 1 problem @SqlWiki, [DateHistorique_fin] is not the leading column, it's the 2nd of 3 columns being changed from NOT NULL to NULLable.

    Well there you go, I thought I had narrowed it down to being the leading column.

    I've also tried your demo and while it errors out on the same error, the DB doesn't go into suspect mode. I give it to you, it's pretty darn close, but it's not the same issue (unless your demo just needs more data to have the same output, I have ±1K pages in my table that gets corrupted).

    Shame. Perhaps it does just need more data, as you say.

    Do you have a KB link that talks about this?

    No, I spent several hours this morning working it out for myself, starting with your database and script.

    Downloading denali atm to try my code on it. Maybe they fixed it there too!

    Don't understand. It is fixed (well my repro runs correctly, haven't tried your database on it) in Denali CTP 3. I get the bug on SQL Server 2008 R2 build 10.50.2772 - perhaps my wording was not clear before?

    P.S. Where do you Work Paul, I see your name associated with SqlCat. With all the hidden knowledge you seem to have it wouldn't surprise me the least if that were true 😀

    I don't often work, but right now I am, for a hedge fund in the US. I have no association with SQLCat, sadly, other than as a reader of their public materials.

  • Denali finally up and running. My script still fails the exact same way (as it is failing in 2k5).

    Microsoft SQL Server "Denali" (CTP3) - 11.0.1515.0 (Intel X86)

    Jul 11 2011 15:33:17

    Copyright (c) Microsoft Corporation

    Enterprise Evaluation Edition on Windows NT 6.0 <X86> (Build 6002: Service Pack 2)

    Msg 3316, Level 21, State 1, Line 135

    During undo of a logged operation in database 'Remi', an error occurred at log record ID (101945:8344:214). The row was not found. Restore the database from a full backup, or repair the database.

    Msg 3314, Level 21, State 4, Line 135

    During undoing of a logged operation in database 'Remi', an error occurred at log record ID (101945:8344:214). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.

    Msg 3314, Level 21, State 5, Line 135

    During undoing of a logged operation in database 'Remi', an error occurred at log record ID (101945:3930:1). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.

    Msg 0, Level 20, State 0, Line 0

    A severe error occurred on the current command. The results, if any, should be discarded.

    Holly crap!!!!! Just ran checkdb, allow data loss =>

    Repair: The Clustered index successfully rebuilt for the object "dbo.RPT_Historique_Couts" in database "Remi".

    Repair: Deleted record for object ID 1935462815, index ID 1, partition ID 72057599794216960, alloc unit ID 72057600371261440 (type In-row data), on page (3:8), slot 57. Indexes will be rebuilt.

    Repair: Deleted record for object ID 1935462815, index ID 1, partition ID 72057599794216960, alloc unit ID 72057600371261440 (type In-row data), on page (3:8), slot 58. Indexes will be rebuilt.

    Repair: Deleted record for object ID 1935462815, index ID 1, partition ID 72057599794216960, alloc unit ID 72057600371261440 (type In-row data), on page (3:8), slot 59. Indexes will be rebuilt.

    Repair: Deleted record for object ID 1935462815, index ID 1, partition ID 72057599794216960, alloc unit ID 72057600371261440 (type In-row data), Row error: Object ID 1935462815, index ID 1, partition ID 72057599794216960, alloc unit ID 72057600371261440 (type In-row data), page ID (3:518), row ID 53. Column 'DateHistorique_fin' was created NOT NULL, but is NULL in the row.

    The error has been repaired.

    Msg 8970, Level 16, State 1, Line 3

    Row error: Object ID 1935462815, index ID 1, partition ID 72057599794216960, alloc unit ID 72057600371261440 (type In-row data), page ID (3:518), row ID 54. Column 'DateHistorique_fin' was created NOT NULL, but is NULL in the row.

    The error has been repaired.

    Msg 8970, Level 16, State 1, Line 3CHECKDB found 0 allocation errors and 1752 consistency errors in table 'RPT_Historique_Couts' (object ID 1935462815).

    CHECKDB fixed 0 allocation errors and 1750 consistency errors in table 'RPT_Historique_Couts' (object ID 1935462815).

    CHECKDB found 0 allocation errors and 1752 consistency errors in database 'Remi'.

    CHECKDB fixed 0 allocation errors and 1750 consistency errors in database 'Remi'.

    Paul, looks like you were right with just about anything, except the part where it was fixed in CTP 3!

  • @PW. You can see the reason for my confusion => http://sqlcat.com/members/paul-white-nz/default.aspx

    😀

    Especially with all the super deep level 500 stuff you always blog about. Makes one wonder what the heck you do to know all that!

    Thanks again for giving this a go! A couple minds on this one can't be too much!

  • Retested on Denali, Snapshop isolation off db level gives the same error.

    It really looks like all that dml and DDL is messing up with something.

    The really scary part is that this is somewhat of a standard upgrade script I'd see running on 3rd party apps. All it's really doing is picking up 2 new columns and filling the blanks in a history table. Nothing realy fancy going on here!!!

    Off to bed!

  • Remi,

    You had convinced me that I was demonstrating a related problem, and not the exact one you are seeing in your database, so I created a Connect item of my own for my script:

    https://connect.microsoft.com/SQLServer/feedback/details/684732/error-3316-in-routine-xdesrmreadwrite-rollbacktolsn

    Regarding isolation settings, I have found that having either ALLOW_SNAPSHOT_ISOLATION or READ_COMMITTED_SNAPSHOT enabled is needed - though the transaction itself does not have to use a row-versioning isolation level.

  • SQLkiwi (8/18/2011)


    Remi,

    You had convinced me that I was demonstrating a related problem, and not the exact one you are seeing in your database, so I created a Connect item of my own for my script:

    https://connect.microsoft.com/SQLServer/feedback/details/684732/error-3316-in-routine-xdesrmreadwrite-rollbacktolsn

    Regarding isolation settings, I have found that having either ALLOW_SNAPSHOT_ISOLATION or READ_COMMITTED_SNAPSHOT enabled is needed - though the transaction itself does not have to use a row-versioning isolation level.

    Added my vote.

    I'll keep you guys posted once my connect item is up (should be a couple hours from now).

  • Shame. Perhaps it does just need more data, as you say.

    I've just finished that test. The ONLY way I can make the script fail is with the final insert (I've not tested what I can remove from the first queries).

    If I do select TOP 35 PERCENT => 618 rows, the insert succeeds (6 times in a row).

    If I do select TOP 40 PERCENT => 706 rows, the insert fails "as usual" (6 times in a row).

    Please note that I didn't use order by for that test.

    Moreover I think I'm real close to the tipping point because I sometimes get 3-4 errors and other times just one =>

    Msg 3316, Level 21, State 2, Line 135

    During undo of a logged operation in database 'Remi', an error occurred at log record ID (101944:25015:51). The row was not found. Restore the database from a full backup, or repair the database.

    Msg 233, Level 20, State 0, Line 0

    A transport-level error has occurred when receiving results from the server. (provider: Shared Memory Provider, error: 0 - No process is on the other end of the pipe.)

    Please note the difference with the last error which "used" to be =>

    Msg 0, Level 20, State 0, Line 0

    A severe error occurred on the current command. The results, if any, should be discarded.

  • Same behavior on 2K5 SP4 using your reproducing script:

    Physical server, local storage, 32 bit.

    select @@version

    Microsoft SQL Server 2005 - 9.00.5000.00 (Intel X86) Dec 10 2010 10:56:29 Copyright (c) 1988-2005 Microsoft Corporation Developer Edition on Windows NT 5.2 (Build 3790: Service Pack 2)

    Msg 3316, Level 21, State 2, Line 135

    During undo of a logged operation in database 'Remi', an error occurred at log record ID (101944:24996:51). The row was not found. Restore the database from a full backup, or repair the database.

    Msg 3314, Level 21, State 4, Line 135

    During undoing of a logged operation in database 'Remi', an error occurred at log record ID (101944:24996:51). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.

    Msg 9001, Level 21, State 1, Line 135

    The log for database 'Remi' is not available. Check the event log for related error messages. Resolve any errors and restart the database.

    Msg 3314, Level 21, State 5, Line 135

    During undoing of a logged operation in database 'Remi', an error occurred at log record ID (101944:22482:1). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.

    Msg 0, Level 20, State 0, Line 0

    A severe error occurred on the current command. The results, if any, should be discarded.

    -----------------------------

    Repair: The Clustered index successfully rebuilt for the object "dbo.RPT_Historique_Couts" in database "Remi".

    Repair: The Nonclustered index successfully rebuilt for the object "dbo.RPT_Historique_Couts, PK_RPT_Historique_Couts_1" in database "Remi".

    Msg 8945, Level 16, State 1, Line 1

    Table error: Object ID 1935462815, index ID 1 will be rebuilt.

    The error has been repaired.

    Msg 2511, Level 16, State 1, Line 1

    Table error: Object ID 1935462815, index ID 1, partition ID 72057599794216960, alloc unit ID 72057600371261440 (type In-row data). Keys out of order on page (3:528), slots 16 and 17.

    The error has been repaired.

    Msg 2511, Level 16, State 1, Line 1

    Table error: Object ID 1935462815, index ID 1, partition ID 72057599794216960, alloc unit ID 72057600371261440 (type In-row data). Keys out of order on page (3:528), slots 35 and 36.

    The error has been repaired.

    Msg 8945, Level 16, State 1, Line 1

    Table error: Object ID 1935462815, index ID 2 will be rebuilt.

    The error has been repaired.

    CHECKDB found 0 allocation errors and 2 consistency errors in table 'RPT_Historique_Couts' (object ID 1935462815).

    CHECKDB fixed 0 allocation errors and 2 consistency errors in table 'RPT_Historique_Couts' (object ID 1935462815).

    CHECKDB found 0 allocation errors and 2 consistency errors in database 'Remi'.

    CHECKDB fixed 0 allocation errors and 2 consistency errors in database 'Remi'.

    ----------------------------

  • Thanks a Million @skrilla99

  • Added a vote for your's and for Paul's.

    Thanks to you both for the amount of time you are putting into this.

    -Dan

  • You're <all> welcomed. But I think it would have been a real travestie, and unthinkable to not have gone through with this.

    It's not like this is giving out a sub-optimal plan. It's flat out destroying your db (well taking it out of prod for hours untill you bring it back online, hoping you didn't lose too much data... because you have no option without dataloss unless the server was in single-user mode ;-)).

  • Exactly the same results on my (test) instances (both on the same machine):

    Microsoft SQL Server 2005 - 9.00.4060.00 (Intel X86) Developer Edition, (Build 2600: Service Pack 3)

    Microsoft SQL Server 2008 R2 (RTM) - 10.50.1617.0 (Intel X86) Express Edition with Advanced Services



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • You've hit bugs in rollback - contact PSS and they'll work to get a workaround and fix. Nothing much else to do without being able to step through the code as it's rolling back the operation, which I can't do any more. And neither can Kimberly - from your email to her.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • LutzM (8/19/2011)


    Exactly the same results on my (test) instances (both on the same machine):

    Microsoft SQL Server 2005 - 9.00.4060.00 (Intel X86) Developer Edition, (Build 2600: Service Pack 3)

    Microsoft SQL Server 2008 R2 (RTM) - 10.50.1617.0 (Intel X86) Express Edition with Advanced Services

    tx.

Viewing 15 posts - 46 through 60 (of 78 total)

You must be logged in to reply to this topic. Login to reply