Over a period of time, application stops connecting to AG database

  • Hello.

    SQL Server 2016 on Windows Server 2019; both are Enterprise edition.

    Over a period of time, the application stops connecting to the Always On availability group database. What could cause the application to stop working on its own? Please suggest.

    I checked the SQL Server error logs and found no connection-related errors. The Windows event viewer shows no cluster-related errors either.

    The application only starts connecting again after a manual failover to the other node.

    Thanks

  • Are you using a listener to connect?

    If not, chances are your db switched to primary on the other node.
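
    To see which node currently holds the primary role, a query along these lines could be run on any replica. It is a minimal sketch that uses only the standard AG catalog views and DMVs; the column aliases are my own.

    ```sql
    -- Sketch: list each replica with its current role and health.
    -- Run on the primary for a complete picture; a secondary only reports on itself.
    SELECT ag.name                          AS AvailabilityGroup
    , ar.replica_server_name                AS ReplicaServer
    , ars.role_desc                         AS CurrentRole
    , ars.connected_state_desc              AS ConnectedState
    , ars.synchronization_health_desc       AS SyncHealth
    FROM sys.availability_groups AS ag
    JOIN sys.availability_replicas AS ar
        ON ar.group_id = ag.group_id
    JOIN sys.dm_hadr_availability_replica_states AS ars
        ON ars.replica_id = ar.replica_id
    ORDER BY ag.name, ar.replica_server_name;
    ```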

    Do you have failed-login monitoring (xEvents) in place?

    Do you have AG monitoring (xEvents) in place?
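
    As a rough illustration of failed-login monitoring, an Extended Events session like the one below could capture every error 18456 along with whatever client details are available. The session name FailedLogins and the file name are placeholders I made up; adjust sizes and paths to your environment. For AG monitoring, the built-in AlwaysOn_health session is usually the first place to look.

    ```sql
    -- Sketch: capture login failures (error 18456) to an event file.
    -- Session name and file name are hypothetical placeholders.
    CREATE EVENT SESSION [FailedLogins] ON SERVER
    ADD EVENT sqlserver.error_reported
    (
        ACTION
        (
            sqlserver.client_app_name,
            sqlserver.client_hostname,
            sqlserver.username
        )
        WHERE ([error_number] = 18456)
    )
    ADD TARGET package0.event_file
    (
        SET filename = N'FailedLogins.xel',
            max_file_size = 10,          -- MB per file
            max_rollover_files = 5
    )
    WITH (STARTUP_STATE = ON);           -- restart the session with the instance
    GO

    ALTER EVENT SESSION [FailedLogins] ON SERVER STATE = START;
    GO
    ```

    Create the session on every replica, since the failures are logged on whichever node the client actually reached.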

    Johan

  • Hello Johan, thanks for the reply.

    Yes, the applications connect using the listener name.

    Thank you for sharing the reference link for monitoring failed logins. I will try to capture them. My AG setup is on-premises, not in any cloud.

     

  • In the error logs there are multiple login failed messages, recorded frequently:

    Error: 18456, Severity: 14, State: 8.

    Message

    Login failed for user 'Login_name'. Reason: Password did not match that for the login provided. [CLIENT: IP address]

    The messages report failed logins, but the same login name and credentials are used in the application connection string and had been working smoothly.

    After a period of time, the listener connection from the application stopped completely.

    Thanks.

  • My approach would be to try to isolate where the problem occurs.

    • Does it happen with all connections?
    • Does SSMS connect OK?
    • Is the problem connection from a process that has been running continuously for many days?
    • Does rebooting the server hosting the problem application fix the problem?
    • Does the application connect using a domain account, a computer name or a gMSA account?
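
    Some of these questions can be answered from the server side. The query below is a sketch, assuming the application uses a single SQL login (the placeholder name needs replacing); it lists that login's current sessions with the client address, the server address they connected to and how long they have been open.

    ```sql
    -- Sketch: show current sessions for one login, oldest first.
    DECLARE @name sysname = N'<web service SQL login name>';

    SELECT s.session_id
    , s.login_name
    , s.host_name
    , s.program_name
    , s.login_time
    , c.client_net_address      -- where the connection came from
    , c.local_net_address       -- server-side IP the client connected to (listener or node IP)
    , c.local_tcp_port
    , c.auth_scheme             -- SQL, NTLM or KERBEROS
    FROM sys.dm_exec_sessions AS s
    JOIN sys.dm_exec_connections AS c
        ON c.session_id = s.session_id
    WHERE s.login_name = @name
    ORDER BY s.login_time;
    ```

    Running it while the application works and again once it stops may show whether the web server still holds old sessions or has none at all.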

    If you can isolate the circumstances where the problem occurs, rather than the global 'application stops connecting', then you are most of the way to finding a solution.

     

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

  • Please find my answers below.

    Does it happen with all connections?

    1. It happens only with the web server applications (Apache web server). Other interfaces and the thick client stay connected.

    Does SSMS connect OK?

    2. Yes, SSMS connects and can access the databases.

    Is the problem connection from a process that has been running continuously for many days?

    3. Yes, the application connection drops on its own 15 to 20 days after a node restart or manual failover.

    Does rebooting the server hosting the problem application fix the problem?

    4. Yes. After a manual switchover to the other node the application starts connecting as usual.

    Does the application connect using a domain account, a computer name or a gMSA account?

    5. The application connects with SQL authentication, using a SQL login and not a domain account.

    I logged a call with Microsoft support. They simply said the password is incorrect and that is why the application stops connecting. But the applications have been running with this login and password for more than 5 years.

    The following error is reported frequently in the error logs even though the login and password are correct; I checked them manually and connected with SSMS. Why are these errors reported? The application also remains connected with the same login while they occur.

    Error: 18456, Severity: 14, State: 8.

    Message

    Login failed for user 'Login_name'. Reason: Password did not match that for the login provided. [CLIENT: IP address]
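
    State 8 is the password-mismatch state, and the [CLIENT: ...] part of the message records where each failed attempt came from. To list the failures and compare the client addresses with the web server's IP, something like the following could be used. This is only a sketch: xp_readerrorlog is undocumented, and the parameter order shown is the commonly used one rather than an official contract.

    ```sql
    -- Sketch: search the current SQL Server error log for failed logins of one login.
    -- Replace Login_name with the real login name.
    EXEC master.dbo.xp_readerrorlog
        0,                  -- 0 = current log file, 1 = previous one, and so on
        1,                  -- 1 = SQL Server error log, 2 = SQL Server Agent log
        N'Login failed',    -- first search string
        N'Login_name';      -- second search string; both must match
    ```

    If the failing client address is not the web server, some other machine or job is sending an old password for the same login.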

    Thanks

     

     

  • I am clutching at a straw here, as I have not seen this actual problem.

    I am wondering if there is some level of server-to-server authentication happening that is causing the problem. It could be that the password used by the app server to connect to the domain has been changed (this happens automatically normally every 30 days but the interval can be changed) but the app connect process is still sending the old password.

  • Since a failover fixes the issue, it may be that the SQL Login password is out of sync on one (or more) nodes.

    Change the login name and execute this with sysadmin/CONTROL SERVER permissions on each node. Check that the password hashes match exactly on every node:

    DECLARE @name sysname = N'<web service SQL login name>';

    SELECT @@SERVERNAME AS ServerName
    , @name AS LoginName
    , LOGINPROPERTY(@name, 'IsExpired') AS IsExpired
    , LOGINPROPERTY(@name, 'IsLocked') AS IsLocked
    , LOGINPROPERTY(@name, 'IsMustChange') AS IsMustChange
    , LOGINPROPERTY(@name, 'PasswordHash') AS PasswordHash
    , LOGINPROPERTY(@name, 'PasswordLastSetTime') AS PasswordLastSetTime
    ;
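
    Beyond the password hash, a few other login attributes can drift between nodes. As a sketch along the same lines (same placeholder login name), this compares the SID, the policy flags and the bad-password counters; run it on each node and compare the rows.

    ```sql
    -- Sketch: compare remaining login attributes across nodes.
    DECLARE @name sysname = N'<web service SQL login name>';

    SELECT @@SERVERNAME AS ServerName
    , sl.name AS LoginName
    , sl.sid AS LoginSid
    , sl.is_disabled
    , sl.is_policy_checked
    , sl.is_expiration_checked
    , sl.default_database_name
    , LOGINPROPERTY(sl.name, 'BadPasswordCount') AS BadPasswordCount
    , LOGINPROPERTY(sl.name, 'BadPasswordTime') AS BadPasswordTime
    , LOGINPROPERTY(sl.name, 'LockoutTime') AS LockoutTime
    FROM sys.sql_logins AS sl
    WHERE sl.name = @name
    ;
    ```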

     

  • Hi Justin, thanks for the reply.

    I checked the password hash on both nodes and the hash values are the same. Still, at some point the application can no longer connect to the database server through the listener name.

    The same connection string works after a failover to node1. Over a period of time node1 stops accepting the application connection; after another manual failover to node2 the application connects again.

    Node1:

    0x0200643735E85AF09ECBDB05FE6473A81154195628FA49855AE9CA8C186E70EA53567A0E9433237E6BBFA895644C2EF41F06DD43F6C836E7608BADFE039DDC437772DD58F6BE

    Node2:

    0x0200643735E85AF09ECBDB05FE6473A81154195628FA49855AE9CA8C186E70EA53567A0E9433237E6BBFA895644C2EF41F06DD43F6C836E7608BADFE039DDC437772DD58F6BE
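
    With matching hashes, the listener itself may be worth a look. A sketch of a query that shows the listener's DNS name, port and the state of each listener IP address, using the standard AG catalog views (the aliases are my own):

    ```sql
    -- Sketch: show listener configuration and IP address state.
    SELECT ag.name          AS AvailabilityGroup
    , agl.dns_name          AS ListenerName
    , agl.port              AS ListenerPort
    , aglip.ip_address
    , aglip.ip_subnet_mask
    , aglip.state_desc      AS IpState
    FROM sys.availability_group_listeners AS agl
    JOIN sys.availability_groups AS ag
        ON ag.group_id = agl.group_id
    JOIN sys.availability_group_listener_ip_addresses AS aglip
        ON aglip.listener_id = agl.listener_id
    ORDER BY ag.name, agl.dns_name;
    ```

    Comparing the output before and after a failover, and against the IP address the web server resolves for the listener name, could show whether the application is reaching the current primary.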
