August 6, 2007 at 6:05 am
I'm reading through the Microsoft Press 70-431 study exam and found an interesting comment in a real-world scenario (pages 634-635). The author describes a situation where his customer "required the database to be available more than 99.995 percent of the time as well as a guarantee of zero data loss-except in the event of a catastrophic failure such as the loss of all or part of a data center...".
He uses this scenario to describe a reason to set up database mirroring. In his discussion of discarded options, he describes clustering as "Clustering definitely could not meet the downtime requirements, although it could meet nearly all the other requirements..."
Now, at my workplace we have an active/passive cluster set up that automatically detects when the active node is down and switches over to the passive one. Our data files are on a SAN where the drives switch over to the other instance when the heartbeat "dies". We've never had more than 30-60 seconds of downtime when this happens. So I'm trying to figure out why the author of this book thinks clustering won't fulfill the downtime requirements.
Does anyone have any thoughts on this? Or experience with clustering not coming back up immediately if properly configured?
Thanks,
August 7, 2007 at 3:22 am
Hi Brandie,
99.995% availability allows about 26 minutes of downtime a year. You might think that because a cluster failover takes only 30-60 seconds you're fine, but in real life you probably won't be. Think about the restarts and failovers caused by installing hotfixes and service packs. If you're a nice MS customer and install your patches and service packs every month, you might exceed the 26 minutes of downtime without ever facing a real problem. In fact, just installing a service pack causes almost 15 minutes of downtime on a cluster, because the cluster resource is taken offline.
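The budget is plain arithmetic, and a minimal Python sketch makes it easy to check. The 60-second failover and 15-minute service pack figures below are the rough numbers quoted in this thread, used as assumptions rather than measurements, and the function name is only illustrative.

```python
# Downtime budget implied by an availability target (plain arithmetic).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year allowed by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


budget = downtime_budget_minutes(99.995)

# Assumed figures taken from the discussion, not measured values.
failover_seconds = 60        # upper end of the 30-60 second cluster failover
service_pack_minutes = 15    # rough outage for one service pack install on a cluster

failovers_that_fit = int(budget * 60 // failover_seconds)
service_packs_that_fit = int(budget // service_pack_minutes)

print(f"Budget: {budget:.1f} minutes per year")
print(f"60-second failovers that fit in the budget: {failovers_that_fit}")
print(f"15-minute service pack installs that fit: {service_packs_that_fit}")
```

Under these assumptions the budget covers a couple of dozen short failovers but only a single 15-minute install, which is why patching matters more here than the failover time itself.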
Also, the recovery time for a cluster failover can vary quite a bit, depending on the number of transactions that need to be rolled back or forward.
With synchronous mirroring the failover time will be a lot shorter and more predictable because no rollback/forward is needed. And you can install your patches on one machine, while the other partner stays available.
And don't forget, the exam (guides) often favour newer techniques over old ones, even when in the real world you might choose differently.
Markus
Markus Bohse
August 7, 2007 at 4:09 am
Markus,
Ah... I forgot about service pack installs. But won't a service pack install bring down the mirror for about the same amount of time? After all, if you have to stop the services, the mirror can't be running while you're doing the install, right?
Have you actually tested the synchronous mirroring with a service pack install? If so, how long did it take you?
And your last comment about the exam guides is well taken. It's why, when I find odd comments like that, I like to double-check with others to see if they agree or not. You know, verify the relevance or reality behind the comment. @=)
August 7, 2007 at 4:38 am
Brandie,
You can suspend the mirroring session while you install the SP on the mirror server first. Then resume the session and, once the status is back to synchronized, perform a manual failover and suspend the session again. Now install the SP on the original principal server.
So the database would be available except for a few seconds during the failover.
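The sequence above can be sketched as an ordered checklist. The Python below only builds the list of steps; it does not connect to any server. The statements are the standard `ALTER DATABASE ... SET PARTNER` options (`SUSPEND`, `RESUME`, `FAILOVER`). The database and server names are hypothetical, and the final resume step is an assumption added here to bring the session back once both partners are patched.

```python
from typing import List, Tuple


def rolling_patch_steps(db: str, principal: str, mirror: str) -> List[Tuple[str, str]]:
    """Return (server to act on, action) pairs for patching both partners in turn."""
    suspend = f"ALTER DATABASE [{db}] SET PARTNER SUSPEND;"
    resume = f"ALTER DATABASE [{db}] SET PARTNER RESUME;"
    failover = f"ALTER DATABASE [{db}] SET PARTNER FAILOVER;"
    return [
        (principal, suspend),
        (mirror, "-- install the service pack on this server"),
        (principal, resume),
        (principal, "-- wait until the mirroring state is SYNCHRONIZED"),
        (principal, failover),  # roles swap; the only client-visible interruption
        (mirror, suspend),      # the former mirror is now the principal
        (principal, "-- install the service pack on this server"),
        (mirror, resume),       # assumed final step, not spelled out in the post
    ]


# Hypothetical names, for illustration only.
steps = rolling_patch_steps("SalesDB", "SQLNODE1", "SQLNODE2")
for number, (server, action) in enumerate(steps, start=1):
    print(f"{number}. on {server}: {action}")
```

A manual failover like this needs the session to be in synchronous (high-safety) mode and fully synchronized, which is why the wait step comes before it.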
Markus
Markus Bohse
August 7, 2007 at 4:43 am
Thank you again for your reply, Markus. Now the author's statement makes complete sense. @=)