January 2, 2020 at 6:35 pm
Hello folks and happy new year! 🙂 .
I have a lab here at work and i ran into this issue regarding the Listener.
This is an Always On Availability group lab i have and everything is going fine but there's an issue when connecting through the Listener.
These are 2 SQL server 2016 EE with SP2 on 2 Windows server 2016 Standard.
Have anyone been through this? i believe its got to be something with the domain, i haven't tested at home since my lab machine have some issues and i couldn't really make this at home yet.
Thanks for taking your time to read this.
Regards,
January 3, 2020 at 6:06 pm
When you change between an AD account and a local SQL instance are you using the same Windows login on the same machine? I'm guessing you already checked but I am wondering if something is changing the in DNS server settings on the client machine?
Are the two servers on the same subnet? Do they have more than one IP address, on different subnets? There are specifics to multisubnets with availability groups, I'd have to check the details if that might be the cause.
Thanks
January 7, 2020 at 12:45 pm
When you change between an AD account and a local SQL instance are you using the same Windows login on the same machine? I'm guessing you already checked but I am wondering if something is changing the in DNS server settings on the client machine?
When connecting to SQL Server using local account, yes in the machine im still connected with my user.
Are the two servers on the same subnet? Do they have more than one IP address, on different subnets? There are specifics to multisubnets with availability groups, I'd have to check the details if that might be the cause.
Yes, they are in the same subnet, they have 2 IP Addresses each, the one we use for SQL and the WSFC heartbeat.
January 7, 2020 at 1:29 pm
I am just looking at the 1st point in your original post again. I think there should be four possible combinations?:
Can you advise which combinations work reliably and which don't?
If the connections to the listener fail then can you ping the listener name or IP?
Are you able to check the Windows Clustering manager to see if it indicates any issues?
January 7, 2020 at 1:48 pm
Sure,
Most combinations work.
AD Account to Instance itself works fine
AD Account to Listener timesout or takes too long to connect, or connects.
SQL Account connecting to listener works fast and fine
SQL Account connecting to instance itself works fast and fine
I can ping the listener through IP and name, most of times response is =1ms, sometimes like 0.1% spikes to 150ms, which i believe its normal.
If i Nslookup name, it would give me the IP Adress of the listener, whenever i nslookup the ip of the listener it doesn't return the name, i believe that's not issue though.
I'll be checking the cluster to see if i can find something.
Thanks!
January 7, 2020 at 2:25 pm
Sorry I don't know the cause of this at the moment. Is there anything in the SQL server logs at the time of the failed connection?
January 7, 2020 at 2:54 pm
It's cool, you are trying to help and thats more than welcome!, also i know it's hard to help when you don't have the issue in your hands.
Actually, sql server logs doesn't show me anything, SSMS just throws the common timeout message with a little bit more information regarding the process of authentication, i'll add a screenshot.
Thanks again!
January 7, 2020 at 3:24 pm
Thanks. The screenshot says that it failed during the post login stage so I think the connection attempt started. Is there anything in the Windows Event Log?
I don't know exactly how SQL server AD authentication works but I would think that, at some point, there must be some communication between the SQL Server and the AD server. Maybe that connection is failing sometimes?
Is it possible that, because it's a lab environment, the machines haven't got much RAM allocated and one of them is using all the memory sometimes?
January 7, 2020 at 8:08 pm
Now i got to see some error of after the whole past week searching and didn't find anything on the logs..
SSPI handshake failed with error code 0x80090304, state 14 while establishing a connection with integrated security; the connection has been closed. Reason: AcceptSecurityContext failed. The Windows error code indicates the cause of failure. The Local Security Authority cannot be contacted
Now, this got me wondering, besides adding the SPN on both of the AG replicas, i should register the SPN for the Listener?
I have been trying to, but i get the error "could not locate account domain/user"
Besides registering SPNs for the replicas, is there any other service i should register SPN? AG name,Listener?
January 7, 2020 at 8:43 pm
SPN's is outside of my knowledge unfortunately. The information here might help you https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/prereqs-restrictions-recommendations-always-on-availability?view=sql-server-ver15
Where did you find the error message?
January 7, 2020 at 9:10 pm
I found it on the SQL Server error log today, earlier dates don't have the message.
Thanks for the URL.
And... effectively, you have to register SPN for the Listener, i'll keep testing the next couple of days to see if the issue have been fixed.
January 7, 2020 at 9:54 pm
The SPN needs to match the availability group name. Refer to this article for more info:
Connect to an Always On availability group listener
Sue
January 8, 2020 at 12:31 pm
Thanks Sue!, we'll be taking a look and making some tests.
oh lord if its the SPNs im gonna hang myselffffff.
January 8, 2020 at 3:40 pm
Updating you guys.
Today we've registered SPNs to the listeners.
setspn -A MSSQLSvc/AGListener.domain.com domain\serviceaccount
setspn -A MSSQLSvc/AGListener.domain.com:1433 domain\serviceaccount  (or whatever port you used for the listener)
so far.. no timeouts or slow connections, after today or 2 days i'm gonna wrap everything up and say it was SPNs.
Thanks again as1981 and Sue, any news i'll update you.
January 8, 2020 at 5:16 pm
Really appreciate it when people provide updates, which you do so thanks!
I'm guessing your fine now and that was the issue.
Sue
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply