AOAG resources partially disappearing

Marko

SSC Enthusiast

Points: 188
August 14, 2023 at 4:02 pm

#4261017
Dear all,
I am posting here because I have been struggling with an issue for 3 days now and am looking for guidance on how to solve it:
I deployed a Failover Cluster composed of 3 VMs on the same VLAN
- 2 servers that will run MSSQL instances
- Each instance is on an independent VM
- A common AD service account is used to install both MSSQL engines
- 1 server hosting a File Witness share, with full access to the share granted to the cluster computer object in AD
On top of that there are 3 availability groups, all showing the same behaviour:
- Availability group setup was super smooth. All databases were created without issue by automatic seeding
- Database inclusion / removal to availability group is done without any error / warning
Still, I encounter issues on the AGs, with or without listeners.
In normal operation: the primary instance indicates everything is fine and I see no issue with replication (all changes made on instance #1 are reflected correctly on instance #2)
However, when I look at instance #2, I already have something that doesn't look the same (is that question mark expected due to the fact the second instance is "closed" and just taking replication streams?)
Now, the fun stuff happens when I attempt a manual failover of any of the AGs to the second node. The failover operation is all green :
Still the situation just gets worse until I reboot the secondary node or fail back (in all cases, it only runs well when coming back to instance #1)
One of the nodes seems to be working as expected :
However the other is completely lost and seems to only have the secondary node visible
I have dumped all instance logs & cluster logs but I don't see any of the obvious issues that I came across on these forums (no permission errors raised on cluster, DNS or even instance access). The cluster seems healthy and I re-ran all validation tests without issue (only warnings on network resilience, which isn't an issue on ESX servers).
Would anyone have a hint of where I can look to start understanding what could be causing this asymmetric behaviour between my nodes?
Thanks for your help & support
Marko
- This topic was modified 1 year, 4 months ago by Marko.


andreas.kreuzberg SSCertifiable Points: 6191 · Answer 1

Hi,

I think these small icons are just icons in SSMS; I wouldn't worry about them.

Do you start the AOAG dashboard from a single node or from the listener? If you start the AOAG dashboard from the secondary node, only the secondary node is visible in the dashboard.

If you start the AOAG dashboard on the active node, both SQL Servers are visible.
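
The same difference can be seen without the dashboard. The query below is only a sketch, assuming you run it once while connected to the primary and once while connected to the secondary; the column aliases are my own. As far as I know, sys.dm_hadr_availability_replica_states returns state rows for all replicas only on the primary, and only the local row on a secondary, so on the secondary the state columns for the other node should come back NULL.

```sql
-- Sketch: list every replica of every AG together with the state
-- that the CURRENT instance knows about it.
-- Run on the primary: state columns are filled for all replicas.
-- Run on a secondary: only the local replica has state (is_local = 1),
-- the other replicas show NULL because of the LEFT JOIN.
SELECT ag.name                        AS ag_name,
       ar.replica_server_name,
       ars.is_local,
       ars.role_desc,
       ars.connected_state_desc,
       ars.synchronization_health_desc
FROM sys.availability_groups AS ag
JOIN sys.availability_replicas AS ar
  ON ar.group_id = ag.group_id
LEFT JOIN sys.dm_hadr_availability_replica_states AS ars
  ON ars.replica_id = ar.replica_id
ORDER BY ag.name, ar.replica_server_name;
```

If the secondary shows NULL state for the other node while the primary shows both as healthy, that would match what the dashboard displays and would not by itself indicate a problem.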

Kind regards,

Andreas

Marko SSC Enthusiast Points: 188 · Answer 2

Hi Andreas

Thanks for the prompt feedback. Indeed, when connecting through the listener it works exactly the way it should. I think it's just those question marks & the fact that the primary & secondary nodes can't independently show the health of the AG that triggered my worries.

Thanks again for the info, much appreciated!