Your actions as a Database Administrator can make the difference between a disaster being a minor nuisance and a major problem.
Things can and will go wrong in your environments. Even the greatest software contains bugs, and hardware fails often.
When an unforeseen issue arises, your reaction can directly influence the end result, and not necessarily in a positive way.
You must be able to keep your cool under pressure in order to achieve the most desirable outcome.
You’re Playing at the High Stakes Table
Panicking in response to a crisis will delay resolution and increase the cost of the outage to the business. Mistakes are more likely when you act on a knee-jerk reaction. In extreme cases, you could lose your job. Your company could go out of business. It happens.
As Data Professionals, we are the ones responsible for the data assets within our organisations, and those of you working alone perhaps shoulder the greatest responsibility of us all.
I’m fortunate to have first-hand experience troubleshooting critical production issues and coordinating incident response for some of the largest SQL Server environments in the world. I’m going to share with you what I have learned about keeping your cool in a crisis and what I believe is the most vital component to effective incident response.
Game Time
Any serious professional playing in a high stakes game goes into it with a plan to win. Winging it is out of the question because things are just too important to leave to chance.
It’s no different for a Database Administrator, which is why I believe that every DBA who is serious about playing to win must have an Incident Response Plan.
When problems arise, you don’t want to be spending time figuring out what your next action is. You should already know what your first steps are so that you can begin to implement them at the first sign of trouble.
How Will You Respond to Crisis?
When you’ve got an incident to respond to, you’re going to want at least some sort of plan to follow. Any documented plan, however rough, is better than having no plan at all.
- What are you going to do when the call comes in?
- What is the first thing that you are going to look at on your SQL Server instance?
- Are there any specific queries that you might want to run?
- Do you have them stored in a readily accessible place?
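One way to answer those last two questions is to keep your first-response queries in a single small file that anyone on the team can open and print. What follows is only a minimal sketch under stated assumptions: the `TriageStep` structure, the `RUNBOOK` list and the `render_checklist` helper are hypothetical names invented for illustration, and the two T-SQL queries (against the `sys.dm_exec_requests` and `sys.dm_os_wait_stats` dynamic management views) are example starting points, not a recommended or complete set of checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriageStep:
    """One entry in a first-response checklist (hypothetical structure)."""
    order: int
    question: str  # what you are trying to find out
    query: str     # T-SQL to paste into a query window


# Illustrative starting points only (an assumption, not a prescribed list).
# Replace these with the checks that matter in your own environment.
RUNBOOK = [
    TriageStep(
        order=1,
        question="Is anything blocked, and by which session?",
        query=(
            "SELECT session_id, blocking_session_id, wait_type, wait_time, status\n"
            "FROM sys.dm_exec_requests\n"
            "WHERE blocking_session_id <> 0;"
        ),
    ),
    TriageStep(
        order=2,
        question="What is the instance waiting on most?",
        query=(
            "SELECT TOP (10) wait_type, wait_time_ms, waiting_tasks_count\n"
            "FROM sys.dm_os_wait_stats\n"
            "ORDER BY wait_time_ms DESC;"
        ),
    ),
]


def render_checklist(steps):
    """Return the runbook as plain text, sorted into execution order."""
    lines = []
    for step in sorted(steps, key=lambda s: s.order):
        lines.append(f"Step {step.order}: {step.question}")
        lines.append(step.query)
        lines.append("")  # blank line between steps
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the checklist so it can be pasted into a wiki page or a terminal.
    print(render_checklist(RUNBOOK))
```

Keeping the question next to each query is a deliberate choice: under pressure, it reminds you why you are running the query, not just what to run.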
As you begin producing your own incident response plans and processes, you’ll likely want to start building a knowledge base of supporting resources, tools and documentation (maybe using a Wiki): a place where you and your team can get at the information you need to respond quickly and effectively during an incident.
Practice mock incident and disaster scenarios in order to put your plans to the test. This will not only give you confidence in the effectiveness of your plan but will also likely reveal opportunities for further improvement.
Planning for Success
Having an incident response plan produced and ready will:
- Provide a proven structure for you to work from.
- Enable you to respond swiftly and with precision.
- Give you the confidence that you’re ready and able to handle whatever challenges might present themselves.
When it’s all kicking off and your boss is hovering around your desk, having a response plan will help you keep your cool, remain focused on your task and deliver the best possible outcome.
Credit: Photo by lawrence used under Creative Commons