Blog Post

12 Days of Christmas 2013 Day 2

This is the second installment in the 12-day series of SQL tidbits for this holiday season.

Previous articles in this mini-series on quick tidbits:

  1. SQL Sat LV announcement

Recently I was able to observe an interesting exchange between a couple of key people at a client.  That exchange gave me a bit to ponder, and I wanted to recount some of it here.  All names have been, well, you know how that goes.

Accountant Joe came in early one wintry morning.  He was gung-ho and ready for the day ahead.  Joe had huge plans to finish counting all of the beans and get his task list done for the day.  You see, taskmaster Judy had been harping on him significantly over the past week to get his beans counted.

On this frosty morning, Joe was zipping along.  As more and more people filed into the office from the various departments, Joe was still contentedly counting his beans.  That only lasted for a few fleeting moments with everybody in the office though.

Suddenly Joe could no longer count the beans.  The beans Joe was counting were served up via the backend database.  And since the beans were running too slow, Joe called the helpdesk to have them fix the database.  A few moments later, Sally called the helpdesk too.  Sally, the executive secretary, was trying to open the company calendar and was complaining that things were horribly slow for her as well.

More and more calls were coming in to the helpdesk from various departments and every user-base in the company.  The helpdesk was busy fighting this fire or that fire.  Finally, news of the slowness was escalated to the DBA, Dillon, so he could investigate why the beans were so slow on this frosty day.  As Dillon investigated, he noticed that IO stalls were off the charts.  He was seeing IO stalls in the hundred-second range instead of the millisecond range like normal.
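For anyone wanting to put a number on "off the charts," here is a minimal sketch of how an average stall per IO could be derived.  It assumes the figures come from cumulative counters such as those exposed by SQL Server's sys.dm_io_virtual_file_stats (num_of_reads, io_stall_read_ms, num_of_writes, io_stall_write_ms).  The story does not say which tool Dillon actually used, and the FileIoSnapshot class and average_stall_ms function are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileIoSnapshot:
    """Cumulative IO counters for one database file (hypothetical container)."""
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def average_stall_ms(earlier: FileIoSnapshot, later: FileIoSnapshot) -> tuple[float, float]:
    """Return (avg read stall, avg write stall) in ms per IO between two samples.

    The counters are cumulative, so the interval average is the difference in
    stall time divided by the difference in IO count.
    """
    reads = later.num_of_reads - earlier.num_of_reads
    writes = later.num_of_writes - earlier.num_of_writes
    read_stall = later.io_stall_read_ms - earlier.io_stall_read_ms
    write_stall = later.io_stall_write_ms - earlier.io_stall_write_ms

    # Guard against division by zero when no IO happened in the interval.
    avg_read = read_stall / reads if reads > 0 else 0.0
    avg_write = write_stall / writes if writes > 0 else 0.0
    return avg_read, avg_write
```

Comparing two samples taken a few minutes apart, rather than reading the raw totals, keeps a long healthy uptime from hiding a sudden spike like the one in this story.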

Like a diligent DBA, Dillon immediately escalated the issue to the sysops team that was responsible for the SAN (yeah, he notified his manager too).  Bill from sysops promptly responded.  Sadly, the response was “I am too busy at the moment.”

After much pestering, Bill finally became available and was ready to help – 4 hours later.

As it turns out, the SAN that housed all company shares, applications, databases and even Exchange was down to about 30 GB of free space.  Due to the lack of free space, the SAN automatically degraded its own performance to try to keep from filling up entirely.  Bill knew about this pending failure and had ordered extra storage – which sat on his desk for 2+ weeks.
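A low-space condition like this one is easy to check for ahead of time.  The following is only a rough sketch of such a check, not anything the client had in place: the function names and the threshold value are assumptions, and a real SAN would normally be monitored through its vendor's own tooling rather than a mounted path.

```python
import shutil

GIB = 1024 ** 3  # bytes in one gibibyte


def is_low_on_space(free_bytes: int, min_free_bytes: int) -> bool:
    """Return True when free space has dropped below the chosen threshold."""
    return free_bytes < min_free_bytes


def volume_is_low(path: str, min_free_bytes: int) -> bool:
    """Check the volume holding `path` against a hypothetical free-space threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return is_low_on_space(usage.free, min_free_bytes)
```

Calling volume_is_low on a mounted path with a threshold of, say, 100 * GIB (an arbitrary example value) would flag the volume well before it reached the state described above.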

The entire company was essentially down because Bill ended up being too busy (in a meeting).  Though the issue was eventually resolved – the sting has yet to fade.

When faced with an outage situation, let this story be your gift to remind you of how not to treat the outage.
