November 27, 2018 at 3:45 pm
I'm kicking tires and driving into ditches testing our proposed AG configuration when I come across this:
Two servers with multiple AGs. Server 1 is primary for Group A and all other AGs. Server 2 is primary for Group B. Each is secondary to the other server's primary.
One Friday I'm consolidating some AGs on Server 1. I add their databases to Group A manually using T-SQL. I plan to delete the secondary copies (now in a 'restoring' state after having been removed from a group), restore them, and run the ALTER DATABASE command to set the HADR group... but I get interrupted.
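For reference, the manual sequence described above might look roughly like the following. This is only a sketch: the database name [SalesDb], the group name [GroupA], and the backup paths are hypothetical placeholders, not values from this setup.

```sql
-- On the PRIMARY replica (Server 1): add the database to the target group.
ALTER AVAILABILITY GROUP [GroupA] ADD DATABASE [SalesDb];

-- On the SECONDARY replica (Server 2): restore a full backup and at least one
-- log backup, leaving the database in the RESTORING state.
-- The file paths below are placeholders.
RESTORE DATABASE [SalesDb]
    FROM DISK = N'\\backupshare\SalesDb_full.bak'
    WITH NORECOVERY, REPLACE;
RESTORE LOG [SalesDb]
    FROM DISK = N'\\backupshare\SalesDb_log.trn'
    WITH NORECOVERY;

-- Still on the SECONDARY replica: join the restored copy to the group.
ALTER DATABASE [SalesDb] SET HADR AVAILABILITY GROUP = [GroupA];
```

Until the last statement runs on the secondary, the database is a member of the group on the primary only, which is the 'in progress' state the story continues from.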
When I get back to it, I decide I want to move some databases from Group B into Group A. Group B is a secondary on Server 1. If I remove the databases from Group B on Server 2, that will leave them in a restoring state on Server 1 (inconvenient). So I think, Hey! I'll just fail over that ONE group to Server 1. Then I can remove the databases, add them to Group A, and do all the restores and HADR group setting at once.
I open a session on Server 1 and check the synchronization status of the databases in Group B. They are all good. Now, fail over Group B from Server 2 to Server 1. The session takes about 5 seconds, then tells me it successfully submitted the request.
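A check-then-failover of this kind could be sketched as below, run on the failover target (Server 1 here). The group name [GroupB] is a placeholder, and the queries actually used in the original session are not shown in the post, so this is an assumption about what such a check might look like.

```sql
-- Run on the target secondary. Shows the local copy of each database in the
-- group, its synchronization state, and whether the cluster considers it
-- ready for a failover without data loss.
SELECT ag.name AS ag_name,
       dcs.database_name,
       drs.synchronization_state_desc,
       drs.synchronization_health_desc,
       dcs.is_failover_ready
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_groups AS ag
    ON ag.group_id = drs.group_id
JOIN sys.dm_hadr_database_replica_cluster_states AS dcs
    ON dcs.replica_id = drs.replica_id
   AND dcs.group_database_id = drs.group_database_id
WHERE drs.is_local = 1
  AND ag.name = N'GroupB';

-- Planned manual failover, also run on the target secondary.
-- It expects a synchronous-commit replica whose databases are SYNCHRONIZED.
ALTER AVAILABILITY GROUP [GroupB] FAILOVER;
```

Note that the WHERE clause limits the check to the one group being failed over; it says nothing about the state of the other groups hosted on the same server.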
Result:
Group B goes into a Resolving state and... doesn't resolve. More perplexing, the Group A databases (which were synchronized just fine before the failover) all become unavailable and apparently begin re-seeding on Server 2 (?!?). So do the databases from ALL groups on Server 1!!! All databases in ALL groups are unavailable across both servers.
At this point, my primary concern is Group B stuck in 'Resolving'. So, I REBOOT Server 1.
Group A now fails over to Server 2... All databases unavailable. Group B shows up on the rebooted Server 1 as primary!! After a few hours, the synchronization completes. All the groups from Server 1 (except Group A) have failed over to Server 2 and are back to synchronizing. Group B is now primary on Server 1 (?? as if the original failover command was finally received when the service resumed). And Group A is pretty much hosed and unavailable, running on Server 2.
What really bothers me about this is that I was not expecting the failover of a group from Server 2 to Server 1 to affect ALL the groups on the destination server. Not that I plan to have split primary/secondary roles on the servers, but I would like to think I could.
So yeah, one of the groups on Server 1 was 'in progress'. But I wasn't failing it over; I was failing over Group B, which is only a secondary on Server 1. The damage in this case is that the databases were unavailable while seeding, until they ALL completed their seeding. (How were they seeding anyway? I thought if the database already existed on the other server this operation would fail? The wizard refuses to do it. Again, I was using T-SQL.) If this happened in our production environment, it would be bad. I won't say catastrophic, since there isn't any unexpected data loss, but the unavailability is a big hit.
As a preventative measure,
Do I need to check the health status of every database in every group on the destination server before I allow a failover of any ONE group to that server?
Do I need to check the seeding mode and make sure EVERY AG is set to MANUAL before the server becomes the target of a failover?
Anyone else experience failover snafus like this?
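The two checks asked about above can be sketched with the standard AG catalog views and DMVs. This is a minimal sketch to run on the intended failover target; whether running it would have prevented the outage described is not established in this thread.

```sql
-- 1) Every local database, in every group, that is not currently healthy.
--    An empty result means all local AG databases report HEALTHY.
SELECT ag.name AS ag_name,
       DB_NAME(drs.database_id) AS database_name,
       drs.is_primary_replica,
       drs.synchronization_state_desc,
       drs.synchronization_health_desc,
       drs.is_suspended
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_groups AS ag
    ON ag.group_id = drs.group_id
WHERE drs.is_local = 1
  AND drs.synchronization_health_desc <> N'HEALTHY';

-- 2) Seeding mode of every replica in every group (AUTOMATIC or MANUAL).
SELECT ag.name AS ag_name,
       ar.replica_server_name,
       ar.seeding_mode_desc
FROM sys.availability_replicas AS ar
JOIN sys.availability_groups AS ag
    ON ag.group_id = ar.group_id
ORDER BY ag.name, ar.replica_server_name;
```

A database that was added on the primary but never joined on the secondary would be worth looking for in the first result set before failing over anything onto that server.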
November 28, 2018 at 1:51 pm
What seeding setting do you currently have on your AGs?
I've set it to "manual" once I have my AGs in a functioning state.
Johan
November 28, 2018 at 2:01 pm
I have a mix of seeding options. I started with automatic while I was setting up, then switched to manual on a single AG to practice the manual seeding I might do in production if I were fixing a database synchronization problem and had to drop and re-add a live database. I'm thinking that once the initial setup is done, I want everything set to manual. If I decide I want to use automatic seeding, I'll flip it on, add the database(s), then flip it back.
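The flip-on, add, flip-back routine could look something like this. It is a sketch only: [GroupA], N'SERVER2', and [SalesDb] are hypothetical names, and each statement has to be run on the replica noted in its comment.

```sql
-- On the PRIMARY replica: switch the secondary's seeding mode to automatic.
ALTER AVAILABILITY GROUP [GroupA]
    MODIFY REPLICA ON N'SERVER2' WITH (SEEDING_MODE = AUTOMATIC);

-- On the SECONDARY replica: allow the group to create the seeded databases.
ALTER AVAILABILITY GROUP [GroupA] GRANT CREATE ANY DATABASE;

-- On the PRIMARY replica: add the database; seeding to SERVER2 starts on its own.
ALTER AVAILABILITY GROUP [GroupA] ADD DATABASE [SalesDb];

-- On the PRIMARY replica, once the new database reports as synchronized
-- on the secondary: switch back to manual.
ALTER AVAILABILITY GROUP [GroupA]
    MODIFY REPLICA ON N'SERVER2' WITH (SEEDING_MODE = MANUAL);
```

Waiting for seeding to finish before switching back is an assumption made here to keep the sequence simple.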