Data centre failure

  • Has anyone ever heard of a dedicated data centre that was incapacitated?

    The context is that I have been told the likelihood is so low that my money would be better spent on more resilient hardware than on a separate backup site.

  • For one of my web sites there was an underground fire and the fire department forced my host to shut down everything.  We were out for 24 hours.

    For me, I like to balance it out and have a secondary standby, but not necessarily the best hot machines.  After all, I wouldn't expect them to be used for more than 24 hours and they won't need the "growth and demand potential" that real production requires.  At least we would be up and running. 

  • Was that underground fire the one in the British Telecom exchange in Manchester?

  • IIRC, this was in Virginia, USA.

  • About 2 weeks after we moved into our current data center, the whole data center went down for about 17 hours. A freak problem caused an arc in the power room that literally melted the main power bus and the ATS (automatic transfer switch). It's not supposed to happen, but it did (one in a million/billion?). This is a pretty substantial data center company (locations throughout the U.S.) with redundant everything, so you can bet they'd never heard of this one before either.

    Staff was professional, communicative, etc. throughout, and to the best of my knowledge the provider retained all their clients. I remember the CEO of the company calling me at 8 a.m. on a Saturday morning to personally assure me that everything that could be done was being done.

    If you have to be up 100% of the time, you need to be geographically dispersed.  The older/less capable equipment in a second site is a good option, assuming that your primary hardware/site is fully redundant already.  I would definitely address the primary site before adding a 2nd site; my personal experience is that the most likely point of failure is not the data center.  Since that one incident we've had multiple hardware failures in our own equipment but never a repeat or similar outage at the data center.

    Once upon a time, shortly after 9/11, I was discussing geographical dispersion with my 2nd level VP. We were looking into adding a 2nd data center approximately 15 miles away from our current data center location; anything capable of "taking out" both of those data centers would also get 90% of Denver, including my office and house.  Of course, he just happened to ask "well, what if something got both of them?" I laughed and replied that at that point it would be his problem, because I most likely wouldn't be around to care...

  • Actually, 15 miles probably isn't enough!

    On March 29th 2004 in Manchester, UK, an underground fire in a tunnel containing telecommunication wires took out an area that certainly extends to where I live, 20 miles south, so presumably that is the radius.

    In short, you could have two datacentres 40 miles apart but using the same telecoms network.

    In my case I am thinking of datacentres in two separate cities, although one company we talked to has one datacentre in the UK and another in Stockholm (Sweden).
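
    The shared-network point above (two sites far apart but relying on the same telecoms infrastructure) can be checked with a simple inventory comparison. This is only a rough sketch: the site names, the dependency labels and the `shared_dependencies` helper are all made up for illustration, and a real inventory would have to come from your providers.

```python
from itertools import combinations


def shared_dependencies(sites):
    """Return {(site_a, site_b): common dependencies} for each pair that shares any."""
    overlaps = {}
    # Sort by site name so the pair keys come out in a predictable order.
    for (name_a, deps_a), (name_b, deps_b) in combinations(sorted(sites.items()), 2):
        common = deps_a & deps_b
        if common:
            overlaps[(name_a, name_b)] = common
    return overlaps


# Illustrative data only; none of these names refer to real infrastructure.
sites = {
    "primary": {"telecom-tunnel-A", "power-grid-north"},
    "standby": {"telecom-tunnel-A", "power-grid-south"},
    "remote": {"telecom-carrier-B", "power-grid-east"},
}

for pair, common in shared_dependencies(sites).items():
    print(pair, "share", sorted(common))
```

    Any pair that shows up in the result is not really independent, however many miles separate the two sites.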

  • Two, although years ago.

    One had redundant external communication lines, both running under the same sidewalk.  A contractor had to dig up the sidewalk and cut both cables.

    Downtime, just over 24 hours.

    Earlier, water ran into the power panel (they were drilling for cooling tower lines on the parking garage above) and blew up the whole incoming power setup.

    So, depending on how critical your systems are, it may be a necessary expense.

    KlK

  • Have the business owner and your management sign a DRP (Disaster Recovery Plan) that includes different scenarios for different lengths of time a server or a data center is down. Then they will know what to expect, and the paper will have their signatures on it. Then keep this paper in the separate backup site.

    It is much cheaper.

    Regards, Yelena Varsha
