I've been involved with lots of crisis situations in technology. Usually it's because a server has failed or there is some problem with performance. Often the situation is high profile and I've been under pressure, along with others, to come up with a solution as quickly as possible. This has resulted in me working long hours, all night, even multiple days at times in order to get through the issues.
Recently we had a situation here at the ranch where a horse got caught in barbed wire. Luckily my wife and I were outside and saw it quickly, racing to cut the horse loose and bandage up his leg while calling the vet. Actually two of our kids helped and the horse survived to live another day. That was a true crisis, and for a while I thought he was going to die as we were standing there.
This crisis occurred late one night, about 7:45pm, and we were not ready for it. I guess you are never ready, but once the vet arrived, there were kids that needed to go to bed, I still had work to do, and so my wife and I split up. I took care of the house, settling things down while she and our boarder handled the vet and the horse. Eventually I got back out there to check on things, but we split up again when she drove our boarder home, a 45-minute or so event, while I went to sleep. Someone had to get up in 5 or so hours to get kids moving again, and it ended up being me.
When I was involved in a technology crisis in my younger days, especially if I was the expert, I wanted to be there from start to finish. I wanted to participate and be a part of the solution.
And I wanted to learn.
That last part was a big reason my skills developed as a DBA, since you learn quite a bit in times of trouble. However, I also realize now that I overworked myself at times, and some of the mistakes made as the hours wore on were my fault, due to my own fatigue.
Over the years I've had the occasion to manage a few crisis situations where I was in charge or one of the managers overseeing the issues and not a technician. One of the things I learned early was that the rest of the business must continue to function, even as we are solving an issue. There is still work to be done and there will be things that need to be handled the next day. I couldn't have everyone work all night and skip coming in the next day.
Depending on the size and scope of the issue, I would immediately start planning to split my staff, having some leave early, or as soon as possible, to prepare for a later shift or just to pick up the next day. There was a series of chronic issues that we knew would take 12, 24, or more hours to resolve, so we would immediately send someone home when one occurred, with the idea that they would come back in the middle of the night or early the next morning.
Crisis management is a skill in and of itself, and it's important to understand that all hands on deck doesn't necessarily mean all hands sitting in a circle around someone typing on a keyboard. Life goes on, even when there's a major crisis, and you have to plan for that. Spread the load around, choose different people to handle different jobs, and ensure that you think beyond just this moment.
One of my favorite quotes seems to sum this up: "Hope for the best, plan for the worst" - Jack Reacher
Steve Jones
The Voice of the DBA Podcasts
The podcast feeds are now available at sqlservercentral.mevio.com to get better bandwidth and maybe a little more exposure :). Comments are definitely appreciated and wanted, and you can get the feeds from there, or now on iTunes!
- Windows Media Podcast - 42.6MB WMV
- iPod Video Podcast - 32.6MB MP4
- MP3 Audio Podcast - 6.6MB
Today's podcast features music by Everyday Jones. No relation, but I stumbled onto them and really like the music. Support this great duo at www.everydayjones.com.
I really appreciate and value feedback on the podcasts. Let us know what you like, don't like, or even send in ideas for the show. If you'd like to comment, post something here. The boss will be sure to read it.