May 1, 2013 at 11:17 am
Last night our primary SQL Server node went down and failed over to the secondary node.
I was actually on the server at the time, having just launched a trace to troubleshoot a particular query, when I suddenly lost all connectivity to SQL Server.
Our setup is:
Microsoft SQL Server 2008 R2 (SP1) - 10.50.2796.0 (X64) 2 Node Active/Passive Cluster.
Here is what I found in the Administrative log:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
[sqsrvres] printODBCError: sqlstate = HYT00; native error = 0; message = [Microsoft][SQL Server Native Client 10.0]Query timeout expired
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Server Native Client 10.0]The connection is no longer usable because the server failed to respond to a command cancellation for a previously executed statement in a timely manner. Possible causes include application deadlocks or the server being overloaded. Open a new connection and re-try the operation.
SQL Server and SQL Server Agent are running under designated network accounts.
SQL Server Browser is running under a local account.
We have never had this issue in the 2 years we've been using the server.
The SQL Server error log did not reveal much. The very last event in the error log before the node went down is:
2013-04-30 20:06:48.970 spid133 SQL Trace ID 2 was started by login "sa".
Thank you for your help
May 1, 2013 at 3:25 pm
What is in the Windows error log?
May 1, 2013 at 3:31 pm
Administrative log was the most informative:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
[sqsrvres] printODBCError: sqlstate = HYT00; native error = 0; message = [Microsoft][SQL Server Native Client 10.0]Query timeout expired
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Server Native Client 10.0]The connection is no longer usable because the server failed to respond to a command cancellation for a previously executed statement in a timely manner. Possible causes include application deadlocks or the server being overloaded. Open a new connection and re-try the operation
System Log:
Cluster resource 'SQL Server' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.
Application log:
[sqagtres] SvcStop: service did not stop; giving up.
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = [Microsoft][SQL Server Native Client 10.0]TCP Provider: The specified network name is no longer available.
Error  4/30/2013 8:16:55 PM  MSSQLSERVER  19019  Failover
May 1, 2013 at 4:17 pm
What is in the system log before the SQL Server cluster resource became unavailable?
September 22, 2013 at 11:58 pm
Hello All,
I am also facing a similar issue.
Please let me know if the issue was resolved, and share the resolution.
Regards,
Vandy
September 24, 2013 at 10:43 am
Just to give you some background: clustering works on a heartbeat (health check), which is configured in Failover Cluster Manager for each clustered resource or cluster group.
What were you collecting in your trace, and was it run through the Profiler GUI? If it captured a large volume of events, the server or instance could have been too busy to respond to the heartbeat (health check), and as a result the failover occurred.
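To illustrate the idea, here is a toy model in Python of a health check with a timeout. It is only a sketch, not how the cluster resource DLL is actually implemented: the function names, the timeout value, and the "fail over after N consecutive misses" rule are all hypothetical and chosen just to show why a busy instance can look dead to the cluster.

```python
import concurrent.futures
import time


def health_check(probe, timeout_s):
    """Run probe() in a worker thread; return False if it does not finish in time."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(probe)
    try:
        future.result(timeout=timeout_s)
        return True
    except concurrent.futures.TimeoutError:
        return False
    finally:
        # Do not block on a probe that is still running.
        pool.shutdown(wait=False)


def make_probe(server_delay_s):
    """Hypothetical stand-in for a trivial query that takes server_delay_s to answer."""
    def probe():
        time.sleep(server_delay_s)
        return 1
    return probe


def should_fail_over(delays_s, timeout_s, max_failures):
    """Return True once max_failures consecutive checks time out (illustrative rule)."""
    consecutive = 0
    for delay in delays_s:
        if health_check(make_probe(delay), timeout_s):
            consecutive = 0
        else:
            consecutive += 1
            if consecutive >= max_failures:
                return True
    return False


if __name__ == "__main__":
    # Responsive instance: every check answers well inside the timeout.
    print(should_fail_over([0.0, 0.0], timeout_s=0.2, max_failures=2))
    # Overloaded instance: two checks in a row exceed the timeout.
    print(should_fail_over([0.0, 0.6, 0.6], timeout_s=0.2, max_failures=2))
```

In this model the instance never actually stops; it is just too slow to answer, which matches the "Query timeout expired" message in the log above.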