0 allocation, 1 consistency error -- repair_rebuild min repair level didn’t fix

  • Sorry - I meant the 13 original errors. I'm concerned you might have bad memory corrupting pages before they'e written to disk.

    The keys are out of order between slots 24 and 25. The key on slot 25 should be 2104614 to fit into the sequence on the page. As far as the binary values go:

    1000100010001100011001 actual value on page

    1000000001110100100110 what it should be (2104614)

    There's no obvious bit-flip corruption here, but that doesn't rule out unnoticed memory corruption. Hmm - the way I'd figure out what happened is find the point in the log backups right before this record was inserted, and then see what the insert operation is doing. That record was modified date_last_modified = Jan 18 2010 11:57AM and the surrounding ones weren't. Something corrupted the database around that tine I'd guess.

    Also, were any index rebuild operations performed on this table since 1/9? Do you have a multi-proc box? Can you try setting MAXDOP=1 and then rebuilding the index - see if that gets rid of it? If it does, that proves that its an index rebuild bug.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • ok - now I'm confused. Your last post says its a nonclustered primary key, which would imply the table's a heap after all, which would mean something in metadata is corrupt and telling CHECKDB that it's a clustered index.

    Can you run:

    select * from sysindexes where id=318624178

    and post the results?

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul Randal (1/24/2010)


    Can you run:

    select * from sysindexes where id=318624178

    and post the results?

    See attached ...

  • Paul Randal (1/24/2010)


    Sorry - I meant the 13 original errors ...

    .

    .

    Also, were any index rebuild operations performed on this table since 1/9? Do you have a multi-proc box? Can you try setting MAXDOP=1 and then rebuilding the index - see if that gets rid of it? If it does, that proves that its an index rebuild bug.

    OK, I was ignoring the 13 original because I was thinking the current restore level of the DB went from 1/9/2010 [clean DBCHECK] to 1/22/2010 using tlog restorations, and we're now dealing w/ a DB that "only" has the 1 error. I'll post the extensive list.

    Multiprocessor? Well, we're running VMWare on ESX's that have 8 CPU's ... but the VM SQL server has only one CPU allocated ...

    "Can you try setting MAXDOP=1 and then rebuilding the index " ... OK, this is beyond my technical understanding ... how do I do this?

    BTW: I have both a virtual and physical SQL server box. I updated the physical to SP4 yesterday and am trying the same restore process from 1/9/2010 just to see if results differ.

    Since I have a SQL VM, I could set a snapshot to do various testing ...

  • ok- the index really is a clustered index. Forget the MAXDOP stuff.

    The errors in the word doc you attached looks like the I/O subsystem has corrupted an entire extent - i.e. a 64k RAID stripe. This is a classic corruption signature and says you've got an I/O subsystem problem - not a SQL Server bug. You should still go up to SP4, but you need to run I/O subsystem diagnostics to figure out what went wrong.

    Cheers

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Thanks for all the troubleshooting help.

    ... so, there's no hope of fixing the database w/ "only" 1 consistency error?

    I understand that we need to correct the underlying cause of the error, but once we do, can the restored database be fixed and then brought back up?

    What other recourse do I have ... I've seen some postings discussing creating fresh new database and importing in all tables/records.

  • You should be able to manually rebuild the clustered index and it fixes it. Let me know if it doesn't and we can discuss (mostly unpalatable) options.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Guess we're on to the unpalatable options ...

    DBCC DBREINDEX ('inv_period_usage') resulted in:

    Server: Msg 1505, Level 16, State 1, Line 1

    CREATE UNIQUE INDEX terminated because a duplicate key was found for index ID 1. Most significant primary key is '122444'.

    The statement has been terminated.

  • That would be why the repair_rebuild failed too. You won't be able to fix this until you remove/fix the duplicate key values. Google should give you plenty of scripts to help you with that. Then the rebuild should work.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Great! I'll see what I can find.

    Thanks again for all your help! :w00t:

  • Found a single duplicate record in inv_period_usage and deleted manually from DB.

    DBCC DBREINDEX('inv_period_usage') executed without error ...

    AND, DBCC CHECKDB now comes up clean, no errors!

    Now I need to investigate the I/O Subsystem errors and also see if I can bring a database from 1/9/2010 all the way thru 1/22/2010 [end of business Friday] ... current test database that now comes up good was only thru 1/21/2010. But even if this is unsuccessful, we'd be faced only with manually re-inputting one day's transactions.

    I can't thank you enough, Paul!

  • No problem - beers are on you at PASS 🙂

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Just an addendum, now that the crisis has passed...

    From what I saw in the list of indexes, you may have a few too many indexes on that table (though too many is subjective) and, based on the names, duplicate indexes.

    It may be worth looking into when you have time.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (1/24/2010)


    Just an addendum, now that the crisis has passed...

    From what I saw in the list of indexes, you may have a few too many indexes on that table (though too many is subjective) and, based on the names, duplicate indexes.

    It may be worth looking into when you have time.

    Interesting observation ... Unfortunately, I have no input to the design of the database. The database was developed by Activant, Inc. and is an integral part of their "Prophet 21" wholesale distribution enterprise software (www.p21.com).

    The crisis for me isn't quite over, as I still need to get the I/O Subsystem issues resolved as well as upgrade to SQL 2005.

    In the meantime, I've been checking out all the great resources at http://www.sqlskills.com and just listened to Paul's interview re: database corruption. Need to get up to speed on torn page detection, review our backup/restore strategy (can definitely be simplified/improved) and some agent alerts, etc. Baby steps for now ...

    I can't adequately express my thanks for your time, expertise and willingness to help.

    Zivio!

    Karl Styrsky

Viewing 14 posts - 16 through 28 (of 28 total)

You must be logged in to reply to this topic. Login to reply