October 25, 2015 at 12:00 pm
Hello everyone,
We had a big issue today during maintenance work in our SQL environment. I hope you can help us figure out what we are doing wrong.
So our environment:
- 2x SQL Server 2014 Enterprise on Windows Server 2012 R2 (SRV1 and SRV2)
-- Both Hyper-V VMs on different Hosts
-- Both configured in a Windows Failover Cluster with an AlwaysOn Availability Group (AG1)
-- AG Listener: AG1_lis
-- No shared storage (each Hyper-V Host has its own local storage)
-- Asynchronous Mode
-- SRV1 is primary, SRV2 is secondary SQL node
What happened?
- We shut down Windows on SRV2 for hardware maintenance
- The cluster went offline and AG1 went offline
-- Error message: "Stopped listening on virtual network name 'AG1_lis'."
-- Error message: "The availability group database "DatabaseXY" is changing roles from "PRIMARY" to "RESOLVING" because the mirroring session or availability group failed over due to role synchronization."
Results?
- AG1_lis wasn't available to our applications, and they stopped working properly because the database connection was lost!
I think, I HOPE, this is not the normal behaviour when one node is shut down (especially the secondary node!)
I have already searched a bit and found two things that could be the problem, but I am not sure:
1. We haven't configured any quorum. I read a lot of documentation about AlwaysOn, and my conclusion was that a quorum is not necessary in our environment. Am I totally wrong on that point? Do we need a quorum in a 2-node cluster without shared storage?
EDIT: This might be the problem. I will set up a Node and File Share Majority Quorum to solve this problem...
2. I found this topic in this forum: http://www.sqlservercentral.com/Forums/Topic1465938-2799-1.aspx
I see this in our environment. Our "Current Host Server" in Windows Failover Cluster Manager is SRV2. In SQL Server, the primary node of AG1 is SRV1. Is this the problem? Why do these differ? Is it normal?
I hope you can help me with these two questions and my problem...
Best regards,
babo
October 26, 2015 at 4:26 am
You're losing quorum, and this takes the cluster service offline. When that happens, any cluster roles also go offline.
For a 2-node cluster you should employ a witness. Which witness have you used?
Please supply the output of the following PowerShell command, run on one of the cluster nodes:
Get-ClusterQuorum
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
July 25, 2019 at 8:32 pm
Correct, a quorum witness is missing in the above setup.
However, I have the same setup at my workplace. My scenarios are....
July 25, 2019 at 9:17 pm
In a 2-node cluster, when one node can't see the other, the cluster shuts itself down to prevent a split-brain situation, where you could end up with two copies of the databases online and an unpredictable number of clients working against both (changing data, etc.).
So a third voting member is needed to form a quorum with either the first node or the second one.
The node that is part of the quorum keeps its databases online; the other one (without quorum) goes down immediately.
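To make the vote arithmetic concrete, here is a minimal sketch in Python. It assumes a plain strict-majority rule and ignores dynamic quorum adjustments that newer Windows Server versions can apply; the function name `has_quorum` is made up for illustration and is not part of any cluster API.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """Strict majority: more than half of all configured votes must be reachable."""
    return votes_present > total_votes // 2


# Two nodes, no witness: the surviving node holds only 1 of 2 votes.
two_node_survivor = has_quorum(1, 2)

# Two nodes plus a witness: the surviving node and the witness hold 2 of 3 votes.
with_witness_survivor = has_quorum(2, 3)

# A node cut off from both its partner and the witness holds 1 of 3 votes.
isolated_node = has_quorum(1, 3)

print(two_node_survivor, with_witness_survivor, isolated_node)
```

Under this simplified rule, only the second case keeps quorum, which is why adding a witness lets one node go down without taking the cluster with it.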
July 26, 2019 at 1:00 am
What Windows OS version and edition are you using for these cluster nodes?
July 26, 2019 at 3:01 pm
Windows Server 2016 Standard and SQL Server 2017 EE
Regards,
Sree