Availability Group Quorum Question

  • Hi all

    We currently have a geo-cluster that spans two datacentres which are close together. As we have no third location and want to use availability groups, I am unclear on which quorum configuration to use for the cluster.

    We only have one node in each datacentre, so if I use node majority with a quorum witness (probably a file share) in either datacentre, then losing the datacentre that holds the witness would break the whole cluster and we would not achieve HA.

    I was considering weighting the quorum witness so that it could not vote. Would this solve the issue, or would quorum never be achieved because I have an even number of nodes?

    Thanks for any help

  • Kwisatz78 (9/13/2012)


    Hi all

    We currently have a geo-cluster that spans two datacentres which are close together. As we have no third location and want to use availability groups, I am unclear on which quorum configuration to use for the cluster.

    We only have one node in each datacentre, so if I use node majority with a quorum witness (probably a file share) in either datacentre, then losing the datacentre that holds the witness would break the whole cluster and we would not achieve HA.

    I was considering weighting the quorum witness so that it could not vote. Would this solve the issue, or would quorum never be achieved because I have an even number of nodes?

    Thanks for any help

    You could remove the vote from one of the nodes, which would leave only one voting node, so in effect a single-node cluster as far as quorum is concerned.

    If you placed an additional vote (a witness) at your primary site, then in the event of an issue the primary site would have 2 out of 3 votes online, whereas the secondary would have only 1 out of 3.
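    The 2-out-of-3 versus 1-out-of-3 arithmetic can be sketched in a few lines. This is only an illustration of strict-majority counting, not anything the cluster service exposes; the names `NodeA`, `NodeB`, `Witness` and the site labels are hypothetical.

```python
# Hypothetical vote layout: every name and weight here is an illustrative assumption.
votes = {
    "NodeA": ("DatacentreA", 1),
    "Witness": ("DatacentreA", 1),  # the extra vote placed at the primary site
    "NodeB": ("DatacentreB", 1),
}


def has_quorum_after_losing(site, votes):
    """Return True if the voters outside `site` hold a strict majority of all votes."""
    total = sum(weight for _, weight in votes.values())
    surviving = sum(weight for s, weight in votes.values() if s != site)
    return surviving > total / 2


for site in ("DatacentreA", "DatacentreB"):
    print(site, "lost -> quorum kept:", has_quorum_after_losing(site, votes))
```

    Under these assumptions, losing the secondary site leaves the primary with quorum, while losing the primary site leaves the secondary with a single vote and no majority.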

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Hi Perry

    Sorry, I am not sure I followed what you said. If I remove voting from one of the nodes, is this not the same as removing voting from the quorum file? If so, wouldn't I have the same problem I described in my OP?

    i.e. assume the quorum file is in datacentre A, the same datacentre as the primary node, and I have removed voting from node 1 in datacentre A.

    Then if I lost datacentre A there would only be one vote, from Node 2 in Datacentre B, which would not be a majority. Hence the cluster would not automatically fail over to Datacentre B, and I would have to bring it up manually.

    Apologies if I have mis-understood

    Thanks

  • Kwisatz78 (9/13/2012)


    Hi Perry

    Sorry, I am not sure I followed what you said. If I remove voting from one of the nodes, is this not the same as removing voting from the quorum file?

    Where does the quorum file reference come from? What quorum model is the cluster currently using?

    Kwisatz78 (9/13/2012)


    i.e. assume the quorum file is in datacentre A, the same datacentre as the primary node, and I have removed voting from node 1 in datacentre A.

    Then if I lost datacentre A there would only be one vote, from Node 2 in Datacentre B, which would not be a majority. Hence the cluster would not automatically fail over to Datacentre B, and I would have to bring it up manually.

    Why would you put a voting node in centre A and then remove voting from the cluster node in centre A? That is not what I described above.

    Kwisatz78 (9/13/2012)


    Apologies if I have mis-understood

    Thanks

    I think you may have, slightly. Removing votes from cluster nodes is designed to provide a biased node vote across geo-clusters. Incidentally, this has nothing to do with SQL Server; it is the base Windows cluster configuration and should be properly designed and configured before SQL Server, or any app for that matter, is installed.

  • Well, we don't have one like this. Currently we use a geo-cluster based on EMC SRDF CE, with storage at each datacentre. The primary node contains both the data and quorum disks, and all is replicated using SRDF to the secondary datacentre with the disks mirrored exactly (including the quorum disk). During a failover the secondary disks come online with the replicated data, and vice versa for failback.

    In this case the servers do not have shared storage, only local storage, so we cannot use SRDF CE.

  • Kwisatz78 (9/13/2012)


    The primary node contains both the data and quorum disks

    So you're using Node and disk majority quorum type?

    Kwisatz78 (9/13/2012)


    all is replicated using SRDF to the secondary datacentre with the disks mirrored exactly (including the quorum disk)

    Why are you replicating the quorum disk? It should not be replicated, and it certainly shouldn't be used for a geographically dispersed cluster configuration.

    The quorum types available in Windows 2008 are:

    • Node majority - (recommended for clusters with an odd number of nodes)

      Can sustain failures of half the nodes (rounding up) minus one. For example, a seven node cluster can sustain three node failures.

    • Node and disk majority - (recommended for clusters with an even number of nodes)

      Can sustain failures of half the nodes (rounding up) if the disk witness remains online. For example, a six node cluster in which the disk witness is online could sustain three node failures.

      Can sustain failures of half the nodes (rounding up) minus one if the disk witness goes offline or fails. For example, a six node cluster with a failed disk witness could sustain two (3-1=2) node failures.

    • Node and file share majority - (for clusters with special configurations)

      Works in a similar way to Node and Disk Majority, but instead of a disk witness, this cluster uses a file share witness.

      Note that if you use Node and File Share Majority, at least one of the available cluster nodes must contain a current copy of the cluster configuration before you can start the cluster. Otherwise, you must force the starting of the cluster through a particular node. For more information, see "Additional considerations" in Start or Stop the Cluster Service on a Cluster Node.

    • No majority disk only - (not recommended)

      Can sustain failures of all nodes except one (if the disk is online). However, this configuration is not recommended because the disk might be a single point of failure.
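    The failure tolerances quoted for the majority-based models can be reproduced with simple vote counting. The sketch below is an assumption-laden illustration: it treats every node and the witness as one static vote each and requires a strict majority of configured votes; `tolerated_node_failures` is a hypothetical helper, not part of any Windows tooling.

```python
def tolerated_node_failures(nodes, witness_configured=False, witness_online=False):
    """Number of node failures after which the survivors still hold a majority.

    Assumes one static vote per node, plus one for a configured witness.
    """
    total_votes = nodes + (1 if witness_configured else 0)
    votes_needed = total_votes // 2 + 1  # strict majority of configured votes
    witness_vote = 1 if (witness_configured and witness_online) else 0
    # Surviving node votes plus the witness vote must still reach votes_needed.
    return nodes + witness_vote - votes_needed


# Node majority, seven nodes
print(tolerated_node_failures(7))
# Node and disk majority, six nodes, witness online
print(tolerated_node_failures(6, witness_configured=True, witness_online=True))
# Node and disk majority, six nodes, witness failed
print(tolerated_node_failures(6, witness_configured=True, witness_online=False))
# The two-node case discussed in this thread, with and without a working witness
print(tolerated_node_failures(2, witness_configured=True, witness_online=True))
print(tolerated_node_failures(2, witness_configured=True, witness_online=False))
```

    With these assumptions the first three calls match the seven-node and six-node examples in the list, and the two-node case shows why the witness location matters: the cluster survives the loss of one node only while the witness is still reachable.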

  • Yes, node and disk majority. The thing with SRDF CE is that it replicates every disk; I have just checked with our SAN admin and that's how it works, apparently.

  • Perry Whittle (9/13/2012)


    and it certainly shouldn't be used for a geographically dispersed cluster configuration.

    😉

  • I never said we didn't have an unusual setup :p With the quorum being a disk that is part of the SRDF CE cluster, our SAN admin says it does need replicating, but I am just caught in the middle here.

    In any case it seems I am going to have to put a file share quorum (as we have no shared storage between these particular servers) in the primary site with one of the nodes. If we lose the secondary site then there should be no disruption. However, if we lose the primary site, I assume some manual intervention would be required to bring up the SQL instance on the secondary site?

    Thanks

  • Actually I have just re-read what you originally posted.

    Suppose I use Node majority and give the secondary server a weighting of 0, so that Node A in Datacentre A has 1 vote and Node B in Datacentre B has 0 votes.

    Then if I lost Datacentre B the cluster would remain up. However, what would happen if I lost Datacentre A?
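    For what it's worth, the two cases can be worked through with the same majority counting. This is a rough sketch that assumes static node weights and a strict-majority rule; `node_weights` and `quorum_held` are hypothetical names used only for illustration.

```python
# Hypothetical weights for the layout described above: Node A votes, Node B does not.
node_weights = {"NodeA": 1, "NodeB": 0}


def quorum_held(surviving_nodes, node_weights):
    """True if the surviving nodes hold a strict majority of all configured votes."""
    total = sum(node_weights.values())
    surviving = sum(node_weights[name] for name in surviving_nodes)
    return surviving > total / 2


print("Datacentre B lost, quorum held:", quorum_held(["NodeA"], node_weights))
print("Datacentre A lost, quorum held:", quorum_held(["NodeB"], node_weights))
```

    Under these assumptions Node A keeps quorum on its own when Datacentre B is lost, whereas Node B holds none of the votes when Datacentre A is lost, so it would presumably not stay up by itself and would need manual intervention to be brought online.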

  • Kwisatz78 (9/13/2012)


    In any case it seems I am going to have to put a file share quorum (as we have no shared storage between these particular servers) in the primary site with one of the nodes.

    Then you'll no longer be using the disk-based resource.
