Your IT infrastructure is a significant investment no matter how you look at it. With that in mind, it’s kind of baffling how many firms seem to skimp on disaster recovery. Many of the ones I’ve spoken to in the past have had their eyes glaze over when I’ve mentioned DR planning, and a troubling number seemed completely ignorant of the necessity of disaster recovery testing.
Here’s the thing about that - practice makes perfect. Even if you’ve got the most ironclad disaster recovery plan in the industry, even if you’ve got intricate, well-detailed DR processes, completely redundant infrastructure, and a great crisis communications platform...all that stuff falls short if none of your staff knows how to use any of it. That’s why drills are such an important part of crisis management.
They give you an idea of your actual preparedness for a crisis - not just what you’ve got on paper, but how your employees will conduct themselves when disaster hits.
Not only that, they’ll allow you to improve your crisis response - because there’s always something you could be doing better. A few seconds shaved off here, a more efficient communications channel there...DR drills let you narrow your focus onto bottlenecks in your response process, and give you an idea of how those bottlenecks can be eliminated.
Right. Enough rambling about the why. Let’s focus on the what. Here’s what you’ll need to take care of in order to run efficient, effective, and thorough DR drills in your data center.
An operational risk assessment. Assuming you haven’t already developed a disaster recovery plan, your first stage is to assess the level of risk your organization is subject to. Talk to various IT teams throughout your facility to determine points of failure, and compile a list of your critical assets. Collect all relevant documents such as building plans and equipment diagrams, and ensure they’re accounted for.
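If it helps to make that list concrete, here is a minimal sketch of an asset inventory that flags possible single points of failure. The asset names, the `critical` flag, and the `redundant_with` field are all hypothetical placeholders for illustration, not a standard format; your own inventory will look different.

```python
# Hypothetical inventory: every name and field here is an illustrative assumption.
EXAMPLE_ASSETS = {
    "core-switch-a": {"critical": True, "redundant_with": ["core-switch-b"]},
    "core-switch-b": {"critical": True, "redundant_with": ["core-switch-a"]},
    "backup-server": {"critical": True, "redundant_with": []},
    "ups-room-1": {"critical": True, "redundant_with": []},
    "lobby-display": {"critical": False, "redundant_with": []},
}


def single_points_of_failure(assets):
    """Return the names of critical assets that have no redundant counterpart."""
    return sorted(
        name
        for name, info in assets.items()
        if info["critical"] and not info["redundant_with"]
    )


if __name__ == "__main__":
    for name in single_points_of_failure(EXAMPLE_ASSETS):
        print("No redundancy for critical asset:", name)
```

A list like this is only a starting point for the conversations with your IT teams; it won't catch failure modes nobody thought to write down.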
A disaster recovery plan. Once your risk assessment is complete, the next stage is simple - you need to review everything and determine the best way to stay operational in a crisis. We are always surprised at how many organizations have a plan to back up their data, but no plan for how to use that data if their primary site is unreachable or not operational. Often the roadblock to staying operational is the cost of duplicate infrastructure at a remote data center. One solution we’ve found is a virtual DR capability that has compute resources available on demand, networked with the backup storage, so you can spin up workloads if the primary site is not operational. The plan should contain a “critical path” step-by-step checklist of what needs to be done to restore your operations if your primary site goes down. Who gets notified? Which applications will take the longest to restore, and which ones are the most critical to have up first? Are notices to end users or your customers necessary? What about workspace for users if your primary place of business is disabled - do you need space at the secondary data center for users or IT personnel?
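To illustrate the "which applications come up first" question, here is a rough sketch of how a restore order might be worked out. The application names, criticality tiers, and restore times are made up, and the rule used (most critical first, longest restore first within a tier, one restore at a time) is just one assumption about how you might prioritize.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    criticality: int      # 1 = must be up first; larger numbers can wait
    restore_minutes: int  # estimated restore time at the secondary site


# Hypothetical applications and estimates, for illustration only.
EXAMPLE_APPS = [
    App("intranet-wiki", 3, 30),
    App("order-database", 1, 120),
    App("email", 2, 45),
    App("customer-portal", 1, 60),
]


def restore_order(apps):
    """Most critical first; within a tier, longest restore first."""
    return sorted(apps, key=lambda a: (a.criticality, -a.restore_minutes))


def restore_timeline(apps):
    """Return (name, finish_minute) pairs, assuming strictly sequential restores."""
    elapsed = 0
    timeline = []
    for app in restore_order(apps):
        elapsed += app.restore_minutes
        timeline.append((app.name, elapsed))
    return timeline


if __name__ == "__main__":
    for name, finish in restore_timeline(EXAMPLE_APPS):
        print(f"{name}: back up by minute {finish}")
```

In practice several restores will probably run in parallel and some applications will depend on others, so treat the timeline as a worst-case conversation starter rather than a forecast.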
Assessment materials for your drill. The next step is relatively simple - you need to think up a few of the major crises your data center is most likely to face. Then, run through those scenarios on a set schedule (usually once a year, though you may wish to increase the frequency for more critical systems). Test everything - how your employees responded, how your systems handled the strain, how quickly you were able to initiate production workloads at the remote site, and so on. Document what you could have done better, and meet to discuss how you might improve.
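One simple way to document a drill is to compare how long each step actually took against the time you were aiming for. The sketch below assumes you set your own per-step targets; the step names and numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepResult:
    step: str
    target_minutes: float  # the time you hoped the step would take
    actual_minutes: float  # the time measured during the drill


# Hypothetical drill results, for illustration only.
EXAMPLE_DRILL = [
    StepResult("notify staff", 15, 12),
    StepResult("fail over network", 30, 50),
    StepResult("start production workload at remote site", 60, 75),
]


def missed_targets(results):
    """Return (step, overrun_minutes) for steps that ran long, worst first."""
    overruns = [
        (r.step, r.actual_minutes - r.target_minutes)
        for r in results
        if r.actual_minutes > r.target_minutes
    ]
    return sorted(overruns, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for step, overrun in missed_targets(EXAMPLE_DRILL):
        print(f"{step}: {overrun:g} minutes over target")
```

The steps at the top of that list are the bottlenecks worth bringing to your post-drill meeting.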
That’s pretty much all there is to it. You now have a general idea of what it takes to run a disaster recovery drill. As you proceed, be sure to refine the process to fit your unique situation and needs.
Tim Mullahy is the General Manager at Liberty Center One. Liberty Center One is a new breed of data center located in Royal Oak, MI. Liberty can host any customer solution regardless of space, power, or networking/bandwidth requirements.