Does It Count?

  • Scheduled maintenance should not be counted against your SLA. In fact, scheduled maintenance should be one of the items listed and negotiated when coming to terms on an SLA.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • From a business perspective down time is down time. They may begrudgingly accept that some planned down time is necessary but it will be remembered and used when explaining to their boss why their targets were missed.

    I think it is essential to track both planned and unplanned down time plus the overall picture.

    I also think the duration of the down time is important. For planned downtime a high duration indicates that a different strategy may be needed to meet business expectations.

    Since this blog was written the world has moved on and uptime expectations are far higher. DBs such as Cassandra talk about continuous availability rather than high availability. In the RDBMS world we need to be architecting our solutions with the aspiration of continuous availability.

  • Of course perfect availability is impossible. Even if you have mirrored copies of your DB in six different data centers, and only do maintenance on one at a time, you will end up with down time. Entropy always finds a way.

    Doesn't mean we shouldn't shoot for that, but at the same time there is a cost to redundancy. And explaining that to the people holding the purse-strings is the real challenge.

  • IMHO I believe that if the database is unavailable to the users for whatever reason, it's down.

  • Fatherjack (1/28/2011)


    I wholly agree with craig 81366. So long as it is agreed, understood and documented then it doesn't really matter whether it's included or not.

    +1 to that.

    A good SLA, like anything that is well defined and documented, will set this expectation.

    I have worked places where it is both ways.

  • Of course it counts. I'm surprised you even ask the question.

    Yes, the distinction between scheduled and unscheduled downtime is critical. But uptime means "the time that the system is up and running and available for use." Any other metric needs a different name.

  • Shewmaker (6/19/2015)


    Of course it counts. I'm surprised you even ask the question.

    Yes, the distinction between scheduled and unscheduled downtime is critical. But uptime means "the time that the system is up and running and available for use." Any other metric needs a different name.

    Agreed. But the SLA is your "Service Level Agreement": it is the different name you use to define the agreed-upon level of service that will be provided. It is where you give different names to planned outages vs. surprise outages.

    This is very important if you are going to need a Vendor supported system that always needs to be available.

  • While it seems logical to say that all downtime counts, the real issue is what the user is paying for. A multi-tier, public-facing system is going to cost a whole lot more than a file server if it is going to have 99.9% availability. Maintenance is a huge part of the equation. At a previous job, I had literally dozens of support cases closed by Microsoft because we were back-leveled on a dozen or so different things. And the reason for that was not enough time in the SLA to keep up with all of that stuff.
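
    To put a figure like 99.9% in perspective, here is a minimal sketch that converts an availability target into the downtime it allows. The function name `downtime_budget` and the 30-day period are assumptions for illustration, not something taken from any particular SLA.

```python
from datetime import timedelta


def downtime_budget(availability_pct: float, period: timedelta) -> timedelta:
    """Return the total downtime an availability target allows over a period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    # The allowed downtime is the fraction of the period NOT covered by the target.
    return period * (1 - availability_pct / 100)


# Hypothetical example: a 99.9% target measured over a 30-day month.
budget = downtime_budget(99.9, timedelta(days=30))
print(f"Allowed downtime: about {budget.total_seconds() / 60:.1f} minutes")
```

    Under these assumptions the budget works out to roughly 43 minutes a month, and all patching has to fit inside it if maintenance counts against the target.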

  • No, you shouldn’t include scheduled outages in your downtime Service Level Agreement calculations; at least not all in the same ‘bucket’ of time as unscheduled events.

    If a system has a traditional downtime SLA of 1% and anticipated Maintenance Outages of 0.5%, you could interpret that as “available 98.5% of the time or more”, but you would be motivating your DBA team to skimp on maintenance any time their ‘unforeseen’ downtime was higher than expected, in order to mask the bad numbers.

    If you want to motivate your team to find efficiencies in the maintenance window, your SLA should detail both service levels separately. “Scheduled Maintenance Outages will be 0.5% or less; Unscheduled Outages will be 1% or less.” Keep the two measurements independent of each other, since they are not equivalent in disruption, cost to the organization, or mitigation.
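
    The two-bucket idea above could be sketched roughly as follows. The `Outage` class, the `sla_report` function, and the sample outage figures are all hypothetical; only the 0.5% and 1% limits come from the post.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    minutes: float
    scheduled: bool  # True for planned maintenance, False for unplanned outages


def sla_report(outages, period_minutes,
               scheduled_limit_pct=0.5, unscheduled_limit_pct=1.0):
    """Measure scheduled and unscheduled downtime against separate limits."""
    scheduled = sum(o.minutes for o in outages if o.scheduled)
    unscheduled = sum(o.minutes for o in outages if not o.scheduled)
    scheduled_pct = 100 * scheduled / period_minutes
    unscheduled_pct = 100 * unscheduled / period_minutes
    return {
        "scheduled_pct": scheduled_pct,
        "unscheduled_pct": unscheduled_pct,
        # Each bucket is judged on its own, so one cannot mask the other.
        "scheduled_ok": scheduled_pct <= scheduled_limit_pct,
        "unscheduled_ok": unscheduled_pct <= unscheduled_limit_pct,
    }


# Hypothetical 30-day month: light maintenance, but a bad unplanned outage.
month_minutes = 30 * 24 * 60
report = sla_report(
    [Outage(100, scheduled=True), Outage(500, scheduled=False)],
    month_minutes,
)
combined_pct = report["scheduled_pct"] + report["unscheduled_pct"]
print(report)
print(f"Combined downtime: {combined_pct:.2f}%")
```

    In this made-up month the combined figure stays under 1.5%, so a single-bucket SLA would look healthy, while the separate unscheduled limit of 1% is breached.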

  • Dave62 (1/28/2011)


    Should scheduled maintenance be counted against your downtime SLA?

    I try to avoid any controversy by making sure the vendor has included language in the SLA that specifies whether scheduled maintenance is counted against downtime or not. That way when I review downtime statistics I understand what's included and can easily determine the severity, whether the SLA is being upheld, and whether there are problems that need to be addressed.

    Any ambiguity on what's included makes it difficult to interpret the statistics. You could end up looking for problems that aren't there or ignoring problems that need addressing.

    When you're handed downtime statistics, you need to know what you're looking at.

    Dave

    And not to forget language around what constitutes "down". A scheduled maintenance that impedes performance to the point that the system is *de facto* unavailable even if on paper the system is up probably would still be considered down; on the other hand - if I manage my maintenance in such a way that the system is a bit slower but still responding, you might be able to argue that it's "up". Clarifying at what level of response a system is considered to be "down" can be important.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
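
    The point above about agreeing on what response level counts as "down" could be sketched like this. The function `effective_uptime_pct`, the sample response times, and both thresholds are assumptions made up for illustration.

```python
def effective_uptime_pct(samples_ms, down_threshold_ms):
    """Percentage of health-check samples that count as "up".

    A sample counts as down when the check got no response (None) or
    when the response took longer than the agreed threshold.
    """
    if not samples_ms:
        raise ValueError("at least one sample is required")
    up = sum(1 for s in samples_ms if s is not None and s <= down_threshold_ms)
    return 100 * up / len(samples_ms)


# Hypothetical response times in milliseconds; None means no response at all.
samples = [120, 150, 4000, None, 130, 9000, 140, 110, 125, 135]

strict = effective_uptime_pct(samples, down_threshold_ms=2000)
lenient = effective_uptime_pct(samples, down_threshold_ms=10000)
print(f"Threshold 2 s: {strict:.0f}% up; threshold 10 s: {lenient:.0f}% up")
```

    The same measurements give different uptime figures depending on the threshold, which is why the definition belongs in the SLA.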

  • Interesting to see this editorial repeated years after first publication. I don't know how I missed it first time round.

    To me, downtime is downtime.

    Yes, it makes sense to classify downtime into unscheduled and scheduled, and in some cases to further subclassify both the scheduled and the unscheduled stuff; but it's all downtime, and it would be crazy, completely unreasonable, to count any of it, even regular maintenance, as not downtime. If the system isn't up and delivering results, it's down.

    I have always regarded downtime (of all sorts) as critically important, whether I've been the customer or (part of) the system provider.

    At Neos, where we were managing systems on customer sites, even installing Microsoft's critical updates counted as downtime; part of my job was to evaluate each update (whether critical or not, whether Microsoft's or ours or someone else's) and determine whether avoiding some risk by installing it represented sufficient reduced downtime risk to offset the downtime cost of installing it. And upgrades to introduce new features for the customer had to minimise the downtime needed even though it was the customer who had requested the new features.

    To some extent all downtime was Neos's fault, because we had told the customer what hardware to buy and what platform and middleware software to buy. So it didn't really matter whether the cause of downtime was a failure of Neos's software, a failure of Microsoft's software, a failure of someone else's software, or a failure of server machines, client machines, or routers and switches: the downtime was Neos's unless it was directly caused by hardware or software that the customer had despite Neos not only not recommending it but actually warning against it. Of course usually the customer would blame Neos even then, and even do so when customer staff caused the problem by, for example, pulling power connections or network connections out.

    Tom

  • I guarantee 100% uptime, with the exception of the 8 hours a day I reserve for maintenance.
