It’s surprisingly easy to set up the new AlwaysOn features. I’ve done it on VMs running on my laptop, from scratch, three times in the last few weeks. It’s easy because there is a set of validations that you run for the cluster and for the AlwaysOn setup that ensure you’re going to get a successful install… or do they?
I hit a situation where it didn’t work correctly, so I thought I’d share it in case others run into it.
The setup is straightforward. I have a network, contoso (yes, I’m using Microsoft training & documentation, it’s a beta, but you should see it available soon), with a domain controller and five servers all in a failover cluster. They passed the cluster test, so all five are hooked in. I went to use the Availability Group Wizard to set things up. I chose a database with a full backup in Full Recovery mode. I added a second server to act as the secondary in the Availability Group. I chose a share where the backup was kept and that was accessible to the other server (I even checked this). Then I ran the validations:
Which came back all green except for the warning about the listener configuration (which you don’t need in order to set up an Availability Group, only to access one seamlessly from an app).
I checked the summary and then built my group, which took WAY too long, and was presented with this:
What the heck? So I took a look at the error:
Connection not active? Yes it sure is. I went round and round through this. Seriously. First, I found out I had a database in 2008R2 compatibility mode (and how funny is that to be saying at last). Fixed it. No joy. Took a new backup (dummy). No joy. Revalidated every possible check from this server. Nothing.
In desperation I went to another server and tried setting up AlwaysOn between it and a neighbor. It worked… What. The. ****?
Then, I went back and reread the error message (always a good thing). “The CONNECTION to the primary replica is not active.” Well, that’s how I read it the second time. So I tried connecting to the primary from the secondary, just through SSMS. No connection. I went back to the primary & double-checked: yes, it could connect to all four of its brother & sister servers. Checked each of them for connectivity back… nothing. I had a network problem that I didn’t realize was there.
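If you’d rather catch this before the wizard does, a quick check along these lines could be run from each server in turn. This is only a rough Python sketch, not something from the AlwaysOn tooling. It assumes the instances listen on the default SQL Server port (1433) and that the mirroring endpoint uses the common default of 5022; your ports may differ. The function names are my own, and it only proves that a TCP connection can be opened, not that a login would succeed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_replicas(hosts, ports=(1433, 5022), timeout=3.0):
    """Check every host/port pair from the machine this script runs on.

    The default ports are assumptions: 1433 for a default SQL Server
    instance and 5022 for the database mirroring endpoint.
    """
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

Calling `check_replicas` with your own server names (for example, a hypothetical `["SQL1", "SQL2"]`) on the primary and then again on each secondary covers both directions. In my case the primary-to-secondary direction would have come back fine and the reverse direction would not.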
So, moral of the story, just because you’ve run the tests that MS provides for AlwaysOn doesn’t mean you won’t run into an issue.