RTO and RPO are myths unless you’ve tested recovery


AI generated image: DBA crying

I’ve watched teams spend a lot of time on backup strategy. They plan out the full, differential, and log backups to ensure they can meet the recovery point objective (RPO), and they assume those backups will also let them make the recovery time objective (RTO). There’s a second assumption hiding in there, of course: that the plan really does meet RPO when it counts. So that we’re all working with the same definitions:

  • Recovery Point Objective (RPO) – How much data can you afford to lose?
  • Recovery Time Objective (RTO) – How much time do you have to get the system back online?
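
To make the two definitions concrete, here is a minimal Python sketch that turns them into arithmetic. Everything in it is an assumption for illustration: the function names, the timestamps, and the objectives are made up, and "newest usable backup" glosses over whether that backup can actually be restored.

```python
from datetime import datetime, timedelta


def data_loss(failure_time, backup_times):
    """Return how much data is lost: the gap between the newest
    backup taken at or before the failure and the failure itself."""
    usable = [t for t in backup_times if t <= failure_time]
    if not usable:
        raise ValueError("no backup predates the failure")
    return failure_time - max(usable)


def meets_objectives(loss, downtime, rpo, rto):
    """Return (rpo_met, rto_met) for a measured loss and downtime."""
    return loss <= rpo, downtime <= rto


# Hypothetical incident: log backups every 15 minutes, failure at 14:37.
failure = datetime(2024, 1, 10, 14, 37)
log_backups = [
    datetime(2024, 1, 10, 14, 0),
    datetime(2024, 1, 10, 14, 15),
    datetime(2024, 1, 10, 14, 30),
]

loss = data_loss(failure, log_backups)         # 7 minutes of data gone
downtime = timedelta(hours=3, minutes=20)      # measured in a drill, not guessed

rpo_ok, rto_ok = meets_objectives(
    loss, downtime, rpo=timedelta(minutes=15), rto=timedelta(hours=2)
)
print(f"data loss {loss}, RPO met: {rpo_ok}")
print(f"downtime {downtime}, RTO met: {rto_ok}")
```

In this made-up case the backup schedule covers RPO, but the restore blows through RTO. The only number you can't read off the schedule is the downtime, which is why it has to come from a real test.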

There are all kinds of reasons for not meeting RTO and/or RPO. Here are some of them.

  • Backup files are missing.
  • The backup files exist, but no one can get access to them.
  • The backup scheme takes too long to restore.
  • The backup strategy is invalid.
  • The backup strategy would have been valid, but something happened to break it.
  • The recovery needs some additional step beyond the standard database restore.
  • Security wasn’t properly backed up.

Sure, there are other reasons, but that just reinforces the fact that we need to test recovery. Test the most common scenarios, and test the worst one: you’ve had to rebuild a database server from scratch and then restore from backup. If you don’t have a database server standing by, how long will it take to get one up? Until you’ve tested these scenarios, you don’t know for sure that you can meet RTO/RPO. Testing is the only way to ferret out issues before the disaster or failure happens.
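
If you want a starting point for timing a recovery drill, the sketch below shows one way to do it in Python. It is only a harness under stated assumptions: `run_drill` and the step names are hypothetical, and the steps themselves are empty placeholders where your real provisioning and restore commands would go.

```python
import time
from datetime import timedelta


def run_drill(steps, rto, clock=time.monotonic):
    """Run each (name, callable) step in order and time it.

    Returns (timings, total, within_rto), where timings is a list of
    (name, timedelta) pairs and total is the end-to-end elapsed time.
    """
    timings = []
    start = clock()
    for name, step in steps:
        step_start = clock()
        step()  # a failing step raises, which ends the drill
        timings.append((name, timedelta(seconds=clock() - step_start)))
    total = timedelta(seconds=clock() - start)
    return timings, total, total <= rto


def placeholder():
    """Stand-in for a real step; replace with your own commands."""


# Hypothetical worst-case scenario: rebuild the server, then restore.
drill_steps = [
    ("provision database server", placeholder),
    ("retrieve backup files", placeholder),
    ("restore full, differential, and log backups", placeholder),
    ("restore logins and permissions", placeholder),
]

timings, total, within_rto = run_drill(drill_steps, rto=timedelta(hours=2))
for name, elapsed in timings:
    print(f"{name}: {elapsed}")
print(f"total: {total}, within RTO: {within_rto}")
```

Timing each step separately matters because the slow part is often not the restore itself. Getting a server built or getting hold of the backup files can eat most of the window.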
