October 5, 2021 at 1:22 am
Hi,
We are planning to activate our secondary site, DC2, and bring down our primary data centre, DC1.
We have a three-node cluster with availability groups (AGs) configured. Two nodes, SQL07P and SQL08P, are in the primary data centre DC1 with automatic failover (synchronous commit). The third node, SQL06P, is in the secondary data centre DC2 with manual failover (asynchronous commit), for DR purposes.
All three nodes run SQL Server 2012 SP4 Enterprise Edition:
SQL07P (primary) - Automatic failover -> Synchronous commit (DC1)
SQL08P (secondary) - Automatic failover -> Synchronous commit (DC1)
SQL06P (secondary) - Manual failover -> Asynchronous commit (DC2)
When the primary data centre DC1 goes down, including the SQL07P and SQL08P servers, how should I fail over to the SQL06P node in DC2, given that its failover mode is manual?
I think I need to perform a forced failover with some data loss, but how? Please share the steps if possible.
- Do we need to force quorum and then bring the surviving cluster nodes back online in a non-fault-tolerant configuration?
- Also, how do I bring the SQL07P and SQL08P nodes in DC1 back into a synchronized state after DC1 has crashed?
- Will they synchronize again on their own once I perform the forced failover?
Please help me out.
Regards,
Ken
October 5, 2021 at 4:48 pm
When the primary data centre DC1 goes down, including the SQL07P and SQL08P servers, how should I fail over to the SQL06P node in DC2, given that its failover mode is manual?
I think I need to perform a forced failover with some data loss, but how? Please share the steps if possible.
- Do we need to force quorum and then bring the surviving cluster nodes back online in a non-fault-tolerant configuration?
These are all related. When DC2 loses the connection to DC1, the cluster node in DC2 will lose quorum. You will need to force quorum on that node before the availability group can be brought online there. Then you can fail over the AG by logging in to SQL06P and executing ALTER AVAILABILITY GROUP <your_ag> FORCE_FAILOVER_ALLOW_DATA_LOSS; (Note: if this is a planned failover and DC1 is still online, a forced failover loses no data only if SQL06P has already received all of the log from the primary, which an asynchronous-commit replica does not guarantee. The safer route for a planned move is to switch SQL06P to synchronous commit, wait until its databases report SYNCHRONIZED, and then issue a normal ALTER AVAILABILITY GROUP <your_ag> FAILOVER; on SQL06P instead.)
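As a rough sketch of the forced-failover step on SQL06P, something like the following could be used. It assumes quorum has already been forced on the DC2 cluster node (that part is done at the Windows cluster level, not in T-SQL), and [YourAG] is only a placeholder for the real availability group name. Treat it as a starting point to rehearse on test servers, not a finished runbook.

```sql
-- Run on SQL06P (DC2), connected to the instance that hosts the secondary replica.
-- Assumption: quorum was already forced on this cluster node outside SQL Server.
-- [YourAG] is a placeholder; substitute the real availability group name.

-- 1. Check the local replica's role and state before forcing anything.
--    With DC1 gone, the role is typically RESOLVING rather than SECONDARY.
SELECT ag.name AS ag_name,
       ars.role_desc,
       ars.operational_state_desc,
       ars.connected_state_desc,
       ars.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_groups AS ag
  ON ag.group_id = ars.group_id
WHERE ars.is_local = 1;

-- 2. Force the failover. Any log that never reached SQL06P is lost.
ALTER AVAILABILITY GROUP [YourAG] FORCE_FAILOVER_ALLOW_DATA_LOSS;

-- 3. Confirm that the local replica now reports the PRIMARY role.
SELECT ag.name AS ag_name,
       ars.role_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_groups AS ag
  ON ag.group_id = ars.group_id
WHERE ars.is_local = 1;
```

Running the state check first gives you a record of what the replica looked like at the moment of failover, which helps later when estimating how much data may have been lost.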
Failing back to DC1 requires resetting quorum and resuming the DC1 secondaries (the old primary will become a secondary when it comes back). However, it looks like the servers in DC1 can establish quorum without the node in DC2. If you failed over after DC1 went down and the command didn't reach DC1, then the servers in DC1 may very well start up, establish quorum, recover databases and the AG, get rolling, and run two separate de-sync'd copies of your database.
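For the "resuming the DC1 secondaries" part, a minimal sketch might look like this. It assumes SQL07P and SQL08P have rejoined the cluster as secondaries of the new primary on SQL06P, and [YourDatabase] is a placeholder for each database in the AG. After a forced failover the secondary databases stay suspended until you resume them, so they do not resynchronize on their own.

```sql
-- Run on each DC1 node (SQL07P, SQL08P) after it has rejoined as a secondary.
-- [YourDatabase] is a placeholder; repeat the RESUME for every database in the AG.

-- 1. See which local availability databases are suspended and why.
SELECT DB_NAME(drs.database_id) AS database_name,
       drs.synchronization_state_desc,
       drs.is_suspended,
       drs.suspend_reason_desc
FROM sys.dm_hadr_database_replica_states AS drs
WHERE drs.is_local = 1;

-- 2. Resume data movement. Caution: this discards any transactions on this
--    copy that never reached SQL06P, so salvage anything you need first.
ALTER DATABASE [YourDatabase] SET HADR RESUME;

-- 3. Re-run the query from step 1 and wait for is_suspended = 0 and a
--    synchronization_state_desc of SYNCHRONIZING or SYNCHRONIZED.
```

Because resuming throws away whatever diverged on the old primary, it is worth deciding beforehand whether that data needs to be extracted before step 2 is run.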
You MUST build some test servers and perform these actions several times before disaster, or you can pretty much guarantee things will not have been prepared correctly, and wrong actions will be taken. This will either slow your recovery or completely prevent it.
Eddie Wuerch
MCM: SQL
October 6, 2021 at 3:50 am
Thank you so much for the reply. Got some clarity now :)