Cluster instance won't fail over

  • In QA we have a 2-node Windows 2008 R2 cluster with 7 instances of SQL 2012. After our systems team moved all instances to one node, they patched the other with MS patches (not SQL patches). Now they're ready to patch the remaining node, but at least one instance won't move to the patched server. In the Services window on both nodes, you now see only 3 instances on one node and 4 on the other.

    Systems thinks I need to reinstall SQL, but I can see references to all 7 instances in the registry, so I think it's a cluster problem.

    A month ago I applied SQL 2012 SP3 to two instances, one on each node, but didn't have any trouble failing instances back and forth then. Adding SP3 to the others is waiting for QA approval.

  • I'm running the SQL 2012 install "add node to cluster"; hopefully this will do it.

    Odd thing that Microsoft refers to an instance as a "node" when everywhere else you think of "NODE" as one of the physical servers in the cluster.

  • What does the Failover Cluster Manager log say? Also, for the instance that won't fail over, is there anything related in the SQL Server error log for that instance?

  • Cluster Events show entries like this, which probably occurred when trying to move this instance to the patched node. Moving it back to its original node makes everything OK.

    Cluster resource 'SQL Server (SACSQLDEVINST007)' in clustered service or application 'SQL Server (SACSQLDEVINST007)' failed.

    Not much of interest in the SQL logs, unless this relates:

    Failed to verify Authenticode signature on DLL 'c:\Program Files\Microsoft SQL Server\MSSQL11.SACSQLDEVINST004\MSSQL\Binn\msxmlsql.dll'.

    Perhaps it's not a great idea to have 5 instances on SQL SP2 and two on SP3, but QA makes us do these things gradually. Of the two on SP3, one is running fine on one cluster node and the other is running fine on the other cluster node.

  • Is it safe to say that SP3 was successfully applied to both nodes (active or passive) at one time or another?

  • Applied to two instances, but I think both were on "node" 2 when patched, so neither was patched while sitting on "NODE" 1.

    The fact that one of the patched instances was failed over to node 1 and runs fine doesn't make the problem go away. So the lesson learned is: if you have to apply service packs gradually, make sure you apply each one on both NODES, or don't expect failovers to work 100% of the time.

  • I agree with you there. We always patched the passive node first, failed the databases over to it, then patched the previously active node, then failed them back to their original primary node. There was never a time when some nodes got patched and others did not; we always made sure the patch level was the same on all nodes and SQL instances, precisely to avoid situations like the one you ran into here.
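
The "same patch level on every node" rule above can be checked mechanically before attempting a failover. Below is a minimal Python sketch, assuming you have already collected the patch level of each instance on each node by hand or by script. The function name, the instance and node names, and the "SP2"/"SP3" labels are all hypothetical placeholders, not output from any real cluster tool.

```python
def find_mismatched_instances(patch_levels):
    """Return the instances whose patch level differs between nodes.

    patch_levels maps instance name -> {node name -> patch level label}.
    The result has the same shape, restricted to the inconsistent instances.
    """
    mismatched = {}
    for instance, per_node in patch_levels.items():
        # More than one distinct level means at least one node lags behind.
        if len(set(per_node.values())) > 1:
            mismatched[instance] = dict(per_node)
    return mismatched


# Hypothetical inventory: names and levels are illustrative only.
inventory = {
    "INST_A": {"NODE1": "SP2", "NODE2": "SP2"},
    "INST_B": {"NODE1": "SP2", "NODE2": "SP3"},
}

for name, levels in find_mismatched_instances(inventory).items():
    print(f"{name}: patch levels differ across nodes: {levels}")
```

Any instance this reports would be a candidate for patching on the lagging node before it is moved there.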
