Corrupt TempDB in SQL Server 2008 even after restarting service - Event ID:824

  • Hi,

    We're about to deploy a new server and today I noticed after running one specific stored procedure that hits temp tables heavily it's throwing an Event ID 824 error (see error below). After the error occurs DBCC CHECKDB and DBCC CHECKDB with DATA_PURITY show no errors in TempDB. Restarting SQL Server drops and recreates TempDB as it should then everything works correctly again, but if I run the same stored procedure it fails at the same point.

    At first I thought it could be something in the stored procedure, which is VERY large and cranks on lots of data, but if I trim it down commenting out the line and following 100 or so lines where the error occurs, it still happens just later downstream. Also this procedure works on our Test server just fine plus it's been running fine nightly for weeks. It's like when a certain amount of data is fed to TempDB and the data file reaches alittle over 1 Gig the Event 824 error occurs.

    I've checked are drive space, which there is plenty of, the TempDB is setup to grow 10% unrestricted, and I've moved TempDB log and data files to another drive on the same server just in-case that helped, which it didn't. This server is running Windows Server 2003 R2 with SP2 which is running on VMWare 1.0.1. I'm not sure what the host OS is since the server is not located in our office.

    At this point I'm thinking it's hardware problems with either VMWare or the physical server itself. With me able to recreate this problem though I hoped someone might have a solution or some pointers on where to check now since as a DBA I've extinguished about everything I can think of.

    Thanks for any suggestions.

    Here's the error:

    Event Type:Error

    Event Source:MSSQLSERVER

    Event Category:Server

    Event ID:824

    Date:3/15/2010

    Time:8:40:50 PM

    User:Server1\App1

    Computer:MCSASR

    Description:

    SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x39bf8c3a; actual: 0xfbe98071). It occurred during a read of page (1:38760) in database ID 2 at offset 0x00000012ed0000 in file 'e:\MSSQL\Data\tempdb.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    Sam

  • 824 errors are indications that you have a problem with your IO subsystem. You need to work with your server and SAN team to identify why you are getting these errors.

    And, you need to do this as soon as possible - before you start having issues with your user databases, which you will if this is not addressed.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Remove / rename your tempdb files and let sql re-create the files.

    Cheers

  • Can you run a checkDB of model please?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (3/16/2010)


    Can you run a checkDB of model please?

    Hi,

    I just ran CheckDB on the Model database which returned zero errors. Yesterday I emailed the network team maintain this server and this morning they reported they had found hardware problems, but they didn't give any details. Hopefully I'll know more later today.

    Thanks for the input and any other suggestions you guys might have.

    Sam

  • Hey Sam,

    Did you ever get your fix sorted? Can you post the solution if you found one?

  • katen (4/28/2010)


    Hey Sam,

    Did you ever get your fix sorted? Can you post the solution if you found one?

    Hi Katen,

    Actually we never did find a solution, though we think the problem stems from the server running an older version of SuSE and VMWare which we think just doesn't mix well with the demands of SQL Server 2008. I've been able to rewrite a few queries to get around the problem, but unfortunately I don't know if the client will ever choose to upgrade the server since everything else they're running seems to be happy with this configuration.

    Sam

  • Thanks, thats pretty helpful anyway

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply