June 24, 2020 at 6:03 pm
So, I have a 2 node cluster. Nodes are both up. Cluster validation is happy. Win Server 2016 DC & SQL Server 2016 SP2.
I create an Always On AG & listener, add a random db for testing. Again, success.
I can fail my new AG over (back & forth) happily using SQL via SSMS.
When I go to the Failover Cluster Manager, I try to move the Always On Role & it says 'Move Failed' & stays on the same node.
This happens regardless of which node the role is hosted on (remembering that I can failover via SSMS).
So ... any ideas why this is happening?
I can see nothing illuminating in the cluster.log, FCM, errorLog or the 2 x eventViewers. I can only guess that the FCM has an extra level of comms to do before it hits SQL. So perhaps I need to open up a different port, or something ... help, I'm so stuck.
Thank you!
June 24, 2020 at 6:05 pm
I should probably mention that this client has the most ridiculous lockdown policy so it could easily be GPO, permissions, firewall ... but my server admins are completely unhelpful until I prove it's on them so any pointers most gratefully received. I would like to be able to sock it to 'em & fix this bloody problem. Thanks.
June 24, 2020 at 8:38 pm
Failover Cluster Manager *cannot* be used to fail over an availability group. The only method that can be used is SQL - whether from SSMS or through SQL statements.
The reason for this is simple - in order to be able to fail over to the secondary, SQL Server needs to switch from asynchronous to synchronous (if not already set) and validate that all transactions have been hardened on the secondary before it can transition to the other node. If this is not done, then your fail over could (probably will) cause a loss of data - because there could be transactions that are still in the REDO queue waiting to be committed and hardened on the secondary.
Failover Cluster Manager does not have any visibility to the internals within SQL Server and has no awareness of the state of synchronization on all of the databases participating in the availability group.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
June 25, 2020 at 7:48 am
Hey Jeffrey, thanks for the reply.
So I've just checked on my estate & I am able to failover the AlwaysOnAG via the FCM for the other clusters. Right click, move the AlwaysOnAG role to another server. Check within both FCM & SSMS & the AlwaysOnAG & associated dbs have successfully moved to node 2.
Still confused and appreciative of advice …
June 25, 2020 at 2:38 pm
Hi again,
Just a quick note to say that it seems you can failover an AlwaysOnAG role via the FCM if it is set to synchronous with auto failover. You can't if it isn't. I guess this makes sense in the context of Jeffrey's response above. I'm guessing the FCM knows enough to check if failover is automatic & therefore synchronous & can proceed. If it isn't it wants nothing to do with a failover.
Not disagreeing with whether or not it should be done.
Thanks.
June 25, 2020 at 7:44 pm
Here are 2 references:
In both documents - they expressly state that you should not use FCM to manipulate availability groups - including such things as:
Regardless of the setup and configuration of the availability group - the warnings are very clear from Microsoft.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
November 15, 2023 at 5:39 pm
I agree to Jeff. Do not use FCM but SSMS/SQL for failovers.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply