August 5, 2016 at 12:40 am
I remember our production team arguing about a split-brain situation on a mirrored server: the support engineer kept insisting it was impossible, and the team said, "It's right here in front of us!"
I worry more about data centre failovers. Database cluster/mirror failover is an out-of-the-box configuration activity. Data centre failover is subject to the skills and resources that your organisation puts behind the work to build it. Bi-directional failover of data centres is complicated. It's one thing to fail over to a data centre that might be slightly behind in syncing up transactions, but it's another thing entirely to reconcile, merge and fail back.
August 5, 2016 at 3:17 am
It seems obvious to me that most would want a configurable solution with defaults provided out of the box. Of course, we would hope for better defaults than the new database ones.
Gaz
-- Stop your grinnin' and drop your linen...they're everywhere!!!
August 7, 2016 at 10:52 pm
Steve,
Tuning a database environment to make it efficient and reliable is a balancing act and one of the great arts of a good DBA. Your commentary on "Failover" made a very good point that I believe every DBA should consider. At the end of your commentary, you compared a system that is constantly failing, and so forcing failovers, with one that does not fail over in an emergency, and suggested the former is probably worse. While a system that is constantly failing over is problematic because it causes delays for users, I believe a delay in work should have far less of an impact than a system shutting down completely because it did not fail over when needed. My question is: how does one ensure that the system will fail over when there is an emergency, and still ensure that a failover does not occur when there is a system "hiccup"?
August 8, 2016 at 9:19 am
Jeff Torres (8/7/2016)
My question is: how does one ensure that the system will fail over when there is an emergency, and still ensure that a failover does not occur when there is a system "hiccup"?
No idea. There should be ways to delay, or perhaps limit, the failovers in some way. If you're having some network issues, it might be better to have users experience some delays on the primary than to fail back and forth to a secondary.
It depends on your setup and the situation, but I've seen places where clustering is failing back and forth because of some network or other non-client issue, and this results in client downtime, and at times, someone just removing a node or two from the cluster so clients can work.
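To make the "delay or limit" idea concrete, here is a minimal sketch of a failover gate, assuming a monitor that polls the primary's health on a fixed interval. Every name in it (`FailoverGate`, `failures_needed`, `cooldown_seconds`, `observe`) is hypothetical and made up for illustration; it is not the API of SQL Server, Windows clustering, or any other product. It only fails over after several consecutive failed checks, so a single hiccup is ignored, and it refuses to fail over again inside a cooldown window, so the cluster cannot flap back and forth.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverGate:
    """Hypothetical gate that decides when a failover is allowed to start."""

    failures_needed: int = 3          # consecutive failed checks required (assumed value)
    cooldown_seconds: float = 300.0   # minimum gap between failovers (assumed value)
    consecutive_failures: int = 0
    last_failover_at: Optional[float] = None

    def observe(self, healthy: bool, now: float) -> bool:
        """Record one health check; return True if a failover should start now."""
        if healthy:
            # Any successful check resets the streak, so a brief hiccup is forgotten.
            self.consecutive_failures = 0
            return False

        self.consecutive_failures += 1
        if self.consecutive_failures < self.failures_needed:
            return False

        # Limit flapping: do not fail over again until the cooldown has passed.
        if (
            self.last_failover_at is not None
            and now - self.last_failover_at < self.cooldown_seconds
        ):
            return False

        self.last_failover_at = now
        self.consecutive_failures = 0
        return True


if __name__ == "__main__":
    gate = FailoverGate(failures_needed=3, cooldown_seconds=300.0)
    # One failed check followed by a good one: treated as a hiccup.
    print(gate.observe(False, now=0.0), gate.observe(True, now=10.0))
    # Three failed checks in a row: treated as a real outage.
    print([gate.observe(False, now=t) for t in (20.0, 30.0, 40.0)])
```

The trade-off is the one discussed above: a higher threshold and a longer cooldown mean fewer spurious failovers but a slower reaction to a genuine outage, so the right values depend on your setup.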