July 31, 2008 at 10:04 am
We have a master-slave SQL Server 2005 cluster inside our network, and a web application server in the DMZ whose web applications connect to the cluster using SQL Server authentication.
Recently, we failed over to the passive node to install some updates. For the entire time we were failed over, the web applications were unable to log into the cluster (System.Data.SqlClient.SqlException: Login failed for user 'theWebAppUser'). When we failed back over to the primary node, the web applications had no issues communicating with SQL Server.
So it seems that our credentials are somehow not being mirrored between the master and the slave node in the cluster.
Any ideas why this might be the case?
Thanks
July 31, 2008 at 10:21 am
Verify the connection strings for the web application are using the SQL Server Virtual IP. My guess is that they are connecting directly to the node itself.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
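The connection-string check suggested above can be sketched roughly as follows. This is only an illustration in Python, not anything from the application itself: the function names are made up, and the parsing is deliberately naive (it does not handle quoted values that contain semicolons). It pulls the host out of the Data Source / Server keyword and compares it against the names you expect for the SQL virtual server.

```python
def parse_connection_string(conn_str):
    """Split an ADO.NET-style connection string into a dict with lowercase keys.

    Naive on purpose: quoted values containing ';' are not handled.
    """
    parts = {}
    for item in conn_str.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            parts[key.strip().lower()] = value.strip()
    return parts


def data_source_host(conn_str):
    """Return the host named by the server keyword, lowercased, or None."""
    parts = parse_connection_string(conn_str)
    # SqlClient accepts several synonyms for the server keyword.
    for key in ("data source", "server", "address", "addr", "network address"):
        if key in parts:
            target = parts[key]
            # Drop an optional "tcp:" prefix, ",port" suffix and "\instance" suffix.
            if target.lower().startswith("tcp:"):
                target = target[4:]
            target = target.split(",", 1)[0].split("\\", 1)[0]
            return target.strip().lower()
    return None


def points_at_virtual_server(conn_str, virtual_names):
    """True if the connection string targets one of the given virtual names/IPs."""
    host = data_source_host(conn_str)
    return host is not None and host in {name.lower() for name in virtual_names}
```

For example, `points_at_virtual_server("Server=SomeNodeName;Database=app", ["sqlVirtualName"])` returns False, which would be the case to worry about. Both names in that call are placeholders.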
July 31, 2008 at 2:55 pm
I checked the configuration files, and they appear to point correctly to the virtual IP for the cluster.
I found out a little additional info - the two nodes of the cluster are using shared physical storage. So, they should be accessing the same physical files. It also appears that the passive node has all SQL services shut down until the primary node fails, at which point the (normally) passive node brings up its SQL services.
July 31, 2008 at 3:44 pm
Nathan Davis (7/31/2008)
I checked the configuration files, and they appear to point correctly to the virtual IP for the cluster. I found out a little additional info - the two nodes of the cluster are using shared physical storage. So, they should be accessing the same physical files. It also appears that the passive node has all SQL services shut down until the primary node fails, at which point the (normally) passive node brings up its SQL services.
That configuration is correct - cluster resources have to be shared. The services should be set to manual on both nodes so the cluster can control the services. The cluster service needs to be able to start/stop the services when it fails over.
If the connections are using the appropriate virtual server IP, then it is possible that the cluster configuration is incorrect. You should have at least two cluster groups, and possibly three if you have separated MSDTC into its own group.
You should have a Cluster Group that contains the cluster IP address, name and quorum drive. Then you should have the SQL Group, which contains all SQL-related resources (e.g. all disk resources that SQL Server needs, SQL Virtual IP Address, SQL Virtual IP Name, SQL Server, SQL Server Agent, etc.).
Verify the cluster setup to see whether anything there is out of place.
Jeffrey Williams
July 31, 2008 at 9:38 pm
When you fail over the node, can you connect as that user from a system not in the DMZ?
K. Brian Kelley
@kbriankelley
August 1, 2008 at 9:07 am
Here is our cluster configuration:
Cluster Group
* Cluster IP Address
* Cluster Name
* Disk Q:
Group 0
* Disk D
* Disk E
* SQL IP Address
* SQL Network Name
* SQL Server
* SQL Server Agent
* SQL Server Fulltext
Group 1
* Disk F
* MSDTC
* MSDTC IP Address
* MSDTC Network Name
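As a rough cross-check of the layout just posted against the earlier advice (SQL IP address, network name, SQL Server, SQL Server Agent and at least one disk all in the same group), here is a small Python sketch. The dictionary is simply the listing above typed in by hand; the function names and the list of "required" resources are assumptions for illustration, and this does not look at resource dependencies or which nodes are allowed to own each resource.

```python
# Resources that, per the advice earlier in the thread, should sit in the SQL group.
REQUIRED_IN_SQL_GROUP = (
    "SQL IP Address",
    "SQL Network Name",
    "SQL Server",
    "SQL Server Agent",
)


def find_sql_group(groups):
    """Return the name of the group holding the 'SQL Server' resource, or None."""
    for name, resources in groups.items():
        if "SQL Server" in resources:
            return name
    return None


def check_sql_group(groups):
    """Return a list of human-readable problems (empty if none were found)."""
    sql_group = find_sql_group(groups)
    if sql_group is None:
        return ["no group contains a 'SQL Server' resource"]
    resources = groups[sql_group]
    problems = [
        f"'{needed}' is not in {sql_group}"
        for needed in REQUIRED_IN_SQL_GROUP
        if needed not in resources
    ]
    if not any(res.startswith("Disk") for res in resources):
        problems.append(f"{sql_group} has no disk resource")
    return problems


# The configuration as posted above.
groups = {
    "Cluster Group": ["Cluster IP Address", "Cluster Name", "Disk Q:"],
    "Group 0": ["Disk D", "Disk E", "SQL IP Address", "SQL Network Name",
                "SQL Server", "SQL Server Agent", "SQL Server Fulltext"],
    "Group 1": ["Disk F", "MSDTC", "MSDTC IP Address", "MSDTC Network Name"],
}

print(check_sql_group(groups) or "grouping looks as described")
```

The posted layout passes this check, so on paper the grouping matches the recommended shape; whatever is going wrong is probably not visible at this level.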
August 1, 2008 at 9:10 am
I'll check about access from a box not in the DMZ, once I am able to schedule some down time for the web applications.
August 1, 2008 at 9:16 am
Also, we do have one application that is inside the domain (not externally accessible) that uses Windows authentication, and it does not appear to be affected.
It appears that only applications using SQL Authentication (those in the DMZ) are affected when we fail over.
August 1, 2008 at 9:27 am
That may indicate a networking issue if the only systems affected during a failover (which kicks to a different physical port) are the DMZ servers. Not likely, but maybe worth investigating.
K. Brian Kelley
@kbriankelley
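One quick way to look into the networking angle is a plain TCP reachability test, run from a DMZ server while the cluster is failed over. The Python sketch below is only that: it opens and closes a TCP connection and says nothing about authentication. The function name is made up, and port 1433 is assumed to be the port the instance listens on.

```python
import socket


def can_reach(host, port=1433, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling `can_reach("<SQL virtual IP or name>")` before and during a failover should show whether the path from the DMZ changes. Note that if SQL Server itself records the failed login in its error log, the connection has already reached the instance, which would point away from a pure network problem.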
August 1, 2008 at 9:49 am
My previous post was incorrect.
I double-checked, and access to all databases failed, both for internal apps using Windows Authentication and for external apps using SQL Authentication.
All SQL services did start and run when failed over, but authentication denied all access.
August 1, 2008 at 10:54 am
Any errors in the application or system event log upon failover?
K. Brian Kelley
@kbriankelley
August 4, 2008 at 9:29 am
I'm wondering if something is messed up with the master database. We found the following in our SQL logs during the time that the failure was happening.
In particular, I noticed
07/27/2008 18:17:25,Server,Unknown,-l D:\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\mastlog.ldf,,,,
07/27/2008 18:17:25,Server,Unknown,-e D:\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\ERRORLOG,,,,
07/27/2008 18:17:25,Server,Unknown,-d D:\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\master.mdf,,,,
and
07/27/2008 18:19:44,Logon,Unknown,Login failed for user 'theWebAppUser'. [CLIENT: xyz.xy.xy.xy],,,,
07/27/2008 18:19:44,Logon,Unknown,Error: 18456 State: 10.,,,
This makes me wonder whether something strange is happening with the master database. I'm pretty new to clustering, so I'm not entirely sure what is relevant. The logs for the period of the failure are below:
07/27/2008 18:19:44,Logon,Unknown,Login failed for user 'theWebAppUser'. [CLIENT: xyz.xy.xy.xy],,,,
07/27/2008 18:19:44,Logon,Unknown,Error: 18456 State: 10.,,,,
07/27/2008 18:19:38,Logon,Unknown,Login failed for user 'theWebAppUser'. [CLIENT: xyz.xy.xy.xy],,,,
07/27/2008 18:19:38,Logon,Unknown,Error: 18456 State: 10.,,,,
07/27/2008 18:18:10,Logon,Unknown,Login failed for user 'theWebAppUser'. [CLIENT: xyz.xy.xy.xy],,,,
07/27/2008 18:18:10,Logon,Unknown,Error: 18456 State: 10.,,
07/27/2008 18:17:31,Logon,Unknown,Login failed for user 'AAAAA\BBBBBBBB'. [CLIENT: xx.xx.x.x],,,,
07/27/2008 18:17:31,Logon,Unknown,Error: 18456 State: 16.,,,,
07/27/2008 18:17:31,Logon,Unknown,SQL Server is not ready to accept new client connections. Wait a few minutes before trying again. If you have access to the error log look for the informational message that indicates that SQL Server is ready before trying to connect again. [CLIENT: xx.xx.x.x],,,,
07/27/2008 18:17:31,Logon,Unknown,Error: 17187 State: 1.,,,,
07/27/2008 18:17:31,Server,Unknown,SQL Server is now ready for client connections. This is an informational message; no user action is required.,,,,
07/27/2008 18:17:31,Server,Unknown,Server named pipe provider is ready to accept connection on [ \\.\pipe\$$\clusterVirtualName\sql\query ].,,,,
07/27/2008 18:17:31,Server,Unknown,Server local connection provider is ready to accept connection on [ \\.\pipe\SQLLocal\MSSQLSERVER ].,,,,
07/27/2008 18:17:31,Server,Unknown,Server is listening on [ 10.10.25.13 1433].,,,,
07/27/2008 18:17:31,Server,Unknown,A self-generated certificate was successfully loaded for encryption.,,,,
07/27/2008 18:17:31,spid12s,Unknown,Service Broker manager has started.,,,,
07/27/2008 18:17:31,Security,Failure Audit,Logon Failure: Source Port:-,Logon/Logoff,537,NT AUTHORITY\SYSTEM,NormallyPassiveNode
07/27/2008 18:17:31,Security,Failure Audit,Backup of data protection master key. Failure Reason:0x57,Detailed Tracking,596,AAAAA\BBBBBBBB,NormallyPassiveNode
07/27/2008 18:17:30,spid12s,Unknown,The Database Mirroring protocol transport is disabled or not configured.,,,,
07/27/2008 18:17:30,spid12s,Unknown,The Service Broker protocol transport is disabled or not configured.,,,,
07/27/2008 18:17:30,spid9s,Unknown,Starting up database 'tempdb'.,,,,
07/27/2008 18:17:30,spid9s,Unknown,Clearing tempdb database.,,,,
07/27/2008 18:17:30,spid5s,Unknown,The NETBIOS name of the local node that is running the server is 'NormallyPassiveNode'. This is an informational message only. No user action is required.,,,,
07/27/2008 18:17:30,spid5s,Unknown,Server name is 'clusterVirtualName'. This is an informational message only. No user action is required.,,,,
07/27/2008 18:17:30,spid9s,Unknown,Starting up database 'model'.,,,,
07/27/2008 18:17:30,spid5s,Unknown,The resource database build version is 9.00.3042. This is an informational message only. No user action is required.,,,,
07/27/2008 18:17:30,spid5s,Unknown,Starting up database 'mssqlsystemresource'.,,,,
07/27/2008 18:17:30,Security,Failure Audit,Backup of data protection master key. Failure Reason:0x57,Detailed Tracking,596,AAAAA\BBBBBBBB,NormallyPassiveNode
07/27/2008 18:17:30,Security,Failure Audit,Unprotection of auditable protected data. Failure Reason:0x8009000B,Detailed Tracking,599,AAAAA\BBBBBBBB,NormallyPassiveNode
07/27/2008 18:17:29,spid5s,Unknown,SQL Trace ID 1 was started by login "sa".,,,,
07/27/2008 18:17:29,spid5s,Unknown,Recovery is writing a checkpoint in database 'master' (1). This is an informational message only. No user action is required.,,,,
07/27/2008 18:17:29,spid5s,Unknown,Starting up database 'master'.,,,,
07/27/2008 18:17:28,Server,Unknown,Database mirroring has been enabled on this instance of SQL Server.,,,,
07/27/2008 18:17:28,Server,Unknown,Attempting to recover in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.,,,,
07/27/2008 18:17:26,Server,Unknown,Attempting to initialize Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.,,,,
07/27/2008 18:17:25,Server,Unknown,Using dynamic lock allocation. Initial allocation of 2500 Lock blocks and 5000 Lock Owner blocks per node. This is an informational message only. No user action is required.,,,,
07/27/2008 18:17:25,Server,Unknown,Set AWE Enabled to 1 in the configuration parameters to allow use of more memory.,,,,
07/27/2008 18:17:25,Server,Unknown,Detected 4 CPUs. This is an informational message; no user action is required.,,,,
07/27/2008 18:17:25,Server,Unknown,SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.,,,,
07/27/2008 18:17:25,Server,Unknown,-l D:\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\mastlog.ldf,,,,
07/27/2008 18:17:25,Server,Unknown,-e D:\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\ERRORLOG,,,,
07/27/2008 18:17:25,Server,Unknown,-d D:\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\master.mdf,,,,
07/27/2008 18:17:25,Server,Unknown,Registry startup parameters:,,,,
07/27/2008 18:17:25,Server,Unknown,This instance of SQL Server last reported using a process ID of 900 at 7/27/2008 6:17:18 PM (local) 7/28/2008 12:17:18 AM (UTC). This is an informational message only; no user action is required.,,,,
07/27/2008 18:17:25,Server,Unknown,Logging SQL Server messages in file 'D:\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\ERRORLOG'.,,,,
07/27/2008 18:17:25,Server,Unknown,Authentication mode is MIXED.,,,,
07/27/2008 18:17:25,Server,Unknown,Server process ID is 3784.,,,,
07/27/2008 18:17:25,Server,Unknown,All rights reserved.,,,,
07/27/2008 18:17:25,Server,Unknown,(c) 2005 Microsoft Corporation.,,,,
07/27/2008 18:17:24,Server,Unknown,Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86) Standard Edition on Windows NT 5.2 (Build 3790: Service Pack 2),,,,
07/27/2008 18:17:24,Security,Failure Audit,Unprotection of auditable protected data. Failure Reason:0x8009000B,Detailed Tracking,599,AAAAA\BBBBBBBB,NormallyPassiveNode
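To make the login failures in a log export like this easier to compare, here is a minimal Python sketch that tallies error 18456 entries per login and state. It assumes the row order shown above, where each "Login failed" row is immediately followed by its "Error: ... State: ..." row; the function name and the short sample (a few rows copied from the log) are just for illustration.

```python
import re
from collections import Counter

LOGIN_FAILED = re.compile(r"Login failed for user '([^']*)'")
ERROR_STATE = re.compile(r"Error: (\d+) State: (\d+)")


def tally_login_failures(lines):
    """Count error 18456 entries per (login name, state).

    Assumes each "Login failed" row is directly followed by its
    "Error: <number> State: <state>" row, as in the export above.
    """
    counts = Counter()
    pending_user = None
    for line in lines:
        failed = LOGIN_FAILED.search(line)
        if failed:
            pending_user = failed.group(1)
            continue
        error = ERROR_STATE.search(line)
        if error and pending_user is not None and error.group(1) == "18456":
            counts[(pending_user, int(error.group(2)))] += 1
        pending_user = None  # any other row ends the pairing
    return counts


# A few rows copied from the log above.
sample = [
    r"07/27/2008 18:18:10,Logon,Unknown,Login failed for user 'theWebAppUser'. [CLIENT: xyz.xy.xy.xy],,,,",
    r"07/27/2008 18:18:10,Logon,Unknown,Error: 18456 State: 10.,,",
    r"07/27/2008 18:17:31,Logon,Unknown,Login failed for user 'AAAAA\BBBBBBBB'. [CLIENT: xx.xx.x.x],,,,",
    r"07/27/2008 18:17:31,Logon,Unknown,Error: 18456 State: 16.,,,,",
    r"07/27/2008 18:17:31,Logon,Unknown,SQL Server is not ready to accept new client connections.,,,,",
    r"07/27/2008 18:17:31,Logon,Unknown,Error: 17187 State: 1.,,,,",
]

for (user, state), count in sorted(tally_login_failures(sample).items()):
    print(f"{user}: state {state} x {count}")
```

Run over the full log posted above, this gives three state 10 failures for theWebAppUser and one state 16 failure for AAAAA\BBBBBBBB. The SQL login and the Windows login are failing with different states, so it may be worth looking up each state separately rather than assuming a single cause.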