AlwaysOn AG for DR

  • Hi everyone. I have a design question about SQL Server 2012 AlwaysOn availability groups when used for DR purposes.

    I built a two-node cluster on Windows Server 2012 for testing, with one node in each data center. The AG is set to asynchronous commit with manual failover.

    If we lose the primary data center, we can manually bring the second node up as primary. It seems to work in testing.

    I've only seen designs and documentation that have an odd number of votes for quorum purposes, but that does not seem necessary if we are not configured for automatic failover at this time. I see this setup as similar to database mirroring without a witness.

    This setup would be an improvement over log shipping because the second replica would have transactions replicated to it, the apps can use a multi-subnet listener, we can build it using the partially contained database model, and we can grow the cluster to three or more nodes in the future.

    Is it important to set the secondary server's NodeWeight to 0? Are there any downsides to this approach that I'm missing?

    Thanks much.

  • I think you're confusing failover with quorum voting.

    The quorum setup determines how many nodes need to be up in order for the entire cluster to stay up.

    I would recommend that you put a file share witness in place.
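
    To make the voting arithmetic concrete, here is a minimal sketch of the static majority rule (strictly more than half of the configured votes must be reachable). It is an illustration only: the function name is hypothetical, and it deliberately ignores the dynamic quorum behaviour in Windows Server 2012 that comes up later in the thread.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Static majority rule: strictly more than half of all configured votes."""
    return reachable_votes > total_votes / 2


# Two nodes, one vote each, no witness: a lone survivor holds 1 of 2 votes.
two_node_survivor = has_quorum(reachable_votes=1, total_votes=2)

# Two nodes plus a file share witness: a node that still reaches the
# witness holds 2 of 3 votes.
with_witness_survivor = has_quorum(reachable_votes=2, total_votes=3)

print(two_node_survivor)      # False
print(with_witness_survivor)  # True
```

    Under this simplified model, the witness is what lets one node keep a majority when the other node drops out.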

  • SQLSACT (12/6/2013)


    I would recommend that you put a file share witness in place.

    I concur

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • SQLSACT (12/6/2013)


    I think you're confusing failover with quorum voting.

    The quorum setup determines how many nodes need to be up in order for the entire cluster to stay up.

    I would recommend that you put a file share witness in place.

    I wrote a long detailed reply but forgot to save to clipboard before hitting submit and it didn't go through. Oh well.

    I believe you are mixing up HA with DR. With a single node in each data center, a witness file server would only be of value if it was in a third location and if the AG was set to automatic failover.

    For the purpose of DR with async commit, a two node cluster will stay up and function for the purposes of cluster communication. If the primary data center is lost, the secondary can be brought up with forced failover. That's how it seems to work with Server 2012 and dynamic quorum.

    Let me know if there is a down side to this.

  • PHXHoward (12/6/2013)


    I wrote a long detailed reply but forgot to save to clipboard before hitting submit and it didn't go through. Oh well.

    Would have been interested to see that!

    PHXHoward (12/6/2013)


    With a single node in each data center, a witness file server would only be of value if it was in a third location

    That's not entirely correct. You could put the mediation (witness) vote at the primary site, making that site vote-heavy (never make the DR site vote-heavy, for obvious reasons). That way, if the DR site server or network becomes disconnected, you will still have quorum in the cluster and the primary node will stay online. Ideally the witness should be at a separate site; that much is true.

    PHXHoward (12/6/2013)


    and if the AG was set to automatic failover.

    This AG setting is not really relevant to the cluster configuration; it is more the other way around. AlwaysOn (AO) sits on top of Windows clustering, which must be designed and deployed correctly for AO to work effectively.

    PHXHoward (12/6/2013)


    For the purpose of DR with async commit, a two node cluster will stay up and function for the purposes of cluster communication. If the primary data center is lost, the secondary can be brought up with forced failover. That's how it seems to work with Server 2012 and dynamic quorum.

    Let me know if there is a down side to this.

    A two-node cluster with node majority only will stay online as long as both nodes are healthy. However, this is not a supported configuration, and the cluster and AO should be issuing warnings about it.

  • Thanks for the input Perry.

    PHXHoward (12/6/2013)


    With a single node in each data center, a witness file server would only be of value if it was in a third location

    ...

    That's not entirely correct. You could put the mediation (witness) vote at the primary site, making that site vote-heavy (never make the DR site vote-heavy, for obvious reasons). That way, if the DR site server or network becomes disconnected, you will still have quorum in the cluster and the primary node will stay online. Ideally the witness should be at a separate site; that much is true.

    The Microsoft recommendation is to set 1 vote for each node in the primary data center and 0 votes for each node at the DR sites. This makes the primary site vote heavy without building out additional resources.

    Let me know if there is a down side to this.

    ----

    A two-node cluster with node majority only will stay online as long as both nodes are healthy. However, this is not a supported configuration, and the cluster and AO should be issuing warnings about it.

    With Windows 2012, a cluster will stay online with only one node standing. It has to do with dynamic weight adjustments.

    I don't think I'd rely on this in production but it works as expected in testing.
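
    The vote assignment described above (1 vote per primary-site node, 0 votes per DR node) can be sketched the same way. This is a simplified static model, not how the cluster service is implemented; the node names and the helper function are hypothetical, and dynamic quorum is not modelled.

```python
from typing import Dict, Set


def partition_has_quorum(weights: Dict[str, int], alive: Set[str]) -> bool:
    """Return True if the surviving nodes hold a strict majority of votes."""
    total = sum(weights.values())
    held = sum(weight for name, weight in weights.items() if name in alive)
    return total > 0 and held > total / 2


# Hypothetical node names: one voting node at the primary site,
# one non-voting node at the DR site.
weights = {"PRIMARY-NODE": 1, "DR-NODE": 0}

primary_alone = partition_has_quorum(weights, {"PRIMARY-NODE"})
dr_alone = partition_has_quorum(weights, {"DR-NODE"})

print(primary_alone)  # True: the primary holds the only vote
print(dr_alone)       # False: the DR node holds no votes
```

    In this model the primary site survives the loss of the DR node or the link without any extra resources, while the DR node on its own never has quorum, which is why bringing it up after losing the primary site is a manual, forced operation.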

  • PHXHoward (12/6/2013)


    With Windows 2012, a cluster will stay online with only one node standing. It has to do with dynamic weight adjustments.

    Only if you have removed the voting from the DR nodes.

    By placing a witness at a third site you are increasing the availability of the cluster; usually a file share witness would be the preferred witness type. Loss of communication between Live and DR can be mitigated by the witness at the external site.

    With loss of comms between Live and DR but comms between Live and witness, the Live site would be available.

    With loss of comms between Live and witness but comms between Live and DR, the Live site would be available.

    There are many other scenarios to boot, and generally there would be less chance that you would need to force quorum on your cluster. Forcing quorum should be a last resort, as it has ramifications when the cluster nodes start to come back online.
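
    The link-failure scenarios above can be enumerated with a rough sketch. It assumes three voters (Live, DR, and a third-site witness) with one vote each, and counts only the votes a node can reach directly. All names are hypothetical. When both nodes can still reach the witness, only one of them can actually claim its vote; that arbitration is not modelled here.

```python
from typing import Dict, FrozenSet, Set

VOTERS = ("Live", "DR", "Witness")  # one vote each in this sketch


def link(a: str, b: str) -> FrozenSet[str]:
    """An undirected network link between two voters."""
    return frozenset((a, b))


def votes_reachable(node: str, broken: Set[FrozenSet[str]]) -> int:
    """The node's own vote plus every voter it can still reach directly."""
    others = (v for v in VOTERS if v != node)
    return 1 + sum(1 for other in others if link(node, other) not in broken)


def could_hold_quorum(node: str, broken: Set[FrozenSet[str]]) -> bool:
    """True if the node can reach a strict majority of the votes."""
    return votes_reachable(node, broken) > len(VOTERS) / 2


scenarios: Dict[str, Set[FrozenSet[str]]] = {
    "Live-DR link down": {link("Live", "DR")},
    "Live-witness link down": {link("Live", "Witness")},
    "DR-witness link down": {link("DR", "Witness")},
    "DR site isolated": {link("Live", "DR"), link("DR", "Witness")},
}

live_results = {
    name: could_hold_quorum("Live", broken)
    for name, broken in scenarios.items()
}
dr_when_isolated = could_hold_quorum("DR", scenarios["DR site isolated"])

for name, ok in live_results.items():
    print(f"{name}: Live could hold quorum = {ok}")
print(f"DR site isolated: DR could hold quorum = {dr_when_isolated}")
```

    In every scenario listed, the Live node can still reach a majority of the votes, while an isolated DR node cannot.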

  • Perry Whittle (12/11/2013)


    PHXHoward (12/6/2013)


    With Windows 2012, a cluster will stay online with only one node standing. It has to do with dynamic weight adjustments.

    Only if you have removed the voting from the DR nodes.

    By placing a witness at a third site you are increasing the availability of the cluster; usually a file share witness would be the preferred witness type. Loss of communication between Live and DR can be mitigated by the witness at the external site.

    With loss of comms between Live and DR but comms between Live and witness, the Live site would be available.

    With loss of comms between Live and witness but comms between Live and DR, the Live site would be available.

    There are many other scenarios to boot, and generally there would be less chance that you would need to force quorum on your cluster. Forcing quorum should be a last resort, as it has ramifications when the cluster nodes start to come back online.

    Hi Perry

    On this - What would happen if there was a loss of comms between DR and witness? Should be fine, right?