Database Failure Checklist

  • Is there an online listing of the procedures and checks you should perform after a database failure, beyond DBCC checks?

    I know this depends on the individual issue (e.g. torn pages, deadlocking, SCSI failure, etc.), but any information will be greatly appreciated. 😀

    Thanks,

  • Depends on what you mean by 'failure'.

    Database corruption is dealt with very differently from 'I accidentally dropped the database', which is dealt with very differently from 'We've just had a fatal drive failure', which is dealt with very differently from massive performance problems. I'd say the very first thing you should look at in most cases is the error log. See what's actually wrong and identify what you're dealing with. After that, it depends on what the problem is.
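    As a rough illustration of that first step, here is a minimal Python sketch that scans error log text for the "Error: N, Severity: N, State: N." header lines SQL Server writes and keeps only the serious ones. The function name, the severity cut-off of 17, and the sample lines are assumptions made up for this example; they are not taken from any real server.

```python
import re

# Matches the "Error: N, Severity: N, State: N." header that SQL Server
# writes to its error log ahead of the message text.
ERROR_HEADER = re.compile(r"Error: (\d+), Severity: (\d+), State: (\d+)")

# Assumed cut-off: severity 17 and up usually points at a resource,
# hardware or internal problem rather than a mistake in a user's query.
MIN_SEVERITY = 17


def serious_errors(log_lines, min_severity=MIN_SEVERITY):
    """Return (error_number, severity, line) for each serious error header."""
    hits = []
    for line in log_lines:
        match = ERROR_HEADER.search(line)
        if match is None:
            continue  # ordinary informational line
        number, severity = int(match.group(1)), int(match.group(2))
        if severity >= min_severity:
            hits.append((number, severity, line.strip()))
    return hits


# Made-up sample lines in the error log's layout, for illustration only.
sample = [
    "2012-03-01 02:14:07.11 spid52      Error: 823, Severity: 24, State: 2.",
    "2012-03-01 02:14:09.40 Logon       Error: 18456, Severity: 14, State: 8.",
    "2012-03-01 02:15:30.02 spid7s      Starting up database 'Sales'.",
]

for number, severity, _line in serious_errors(sample):
    print(number, severity)
```

    On a real server you would pass in the lines of the actual error log file instead of `sample`. That file is typically stored as UTF-16, so it may need to be opened with a matching encoding. Errors 823, 824 and 825 are the I/O-related ones to watch for when the storage is suspect.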

    I'm not sure I'd call a deadlock a database failure.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ok, I knew I should have been more specific. 🙂 I was trying to hurry up and blurt something out. I apologize for the extremely general question.

    I have been having issues with a production server freezing up, and after each restart I have been running DBCC checks to verify the database's integrity. Should I be checking anything else besides DBCC and the error logs (system and database)?

    Not looking for a solution, just CYA stuff. 😀

  • Is Windows freezing up on restart, or is SQL freezing up on restart?

    If Windows, leave the DB alone and look in the Windows event logs.

    If SQL, look in the SQL error log.

    Gail Shaw
  • Something to do with the file system is freezing. Our sysadmin has no idea what, but thinks it could be related to our SCSI interface. Because it involves the file system, I have been running DBCC checks just in case.

    I was mostly concerned with the effects it could have on the database, corruption in particular.

  • Then tell your sysadmin to get off his bu^H^H chair and do his job. If there are disk problems (of any form) that's the sysadmin's responsibility. Running repeated checkDBs to make sure there are no corruptions is not the answer here.

    It's like parking your car on a busy street and leaving the window open and doors unlocked, then checking on it every 15 minutes. Sure, you'll know very quickly when it's been stolen but you won't prevent it from going.

    Tell the sysadmin that if he doesn't know the cause, he should contact the storage vendor, look for third-party help, update drivers, or just replace the entire thing.

    Yes, I'm being blunt.

    Gail Shaw
  • I've done a bit of that, but it's out of my hands. If our Director allows this, then my only option is CYA at this point. I was just curious to see if there were any other measures that I could take that I wasn't already.

    Thanks, Gila!

  • If I were you, I'd write a nice, formal email detailing the risks of continuing as-is. Send it to the director, cc your immediate manager (and maybe his manager, depending on what the reporting line is), and cc the sysadmin. Then take a printout of that mail (making sure the date is on it) and file it. Do the same with any mails from the director that refer to not fixing or replacing the drive problem. Again, make sure the dates are on the printouts. Then, when something goes wrong, you have hard evidence of the history of the problem.

    Gail Shaw
