Blog Post

Cluster failover – what’s the best way to get notified?

,

G’day,

I look after several clustered instances of SQL SERVER. One of the tasks that I do every morning is to check if any of those instances have failed over to a new node. I would prefer to have this monitored automatically, and that’s what this post is about – getting feedback if possible.

My manual method of checking for failover simply involves running a query to see which node the cluster currently resides on, like so,

SELECT SERVERPROPERTY(‘COMPUTERNAMPHYSICALNETBIOS’);

As well as pulling the uptime of the SQL SERVER from a DMV.

The query will basically let you know what the machine name is of the node that is currently hosting the clustered instance. It you run it on a just a single server instance then NULL is returned.

I use a multi-server query, and basically run this query across all of my clustered instances.

Now, because I only have a small number of clusters to monitor – six – I can immediately see if any of them are on a different node than they should be. The solution is OK, but there’s a number of problems with it

  • It’s manual and rely’s on somebody (me) running the script.
  • The person running the script has to have knowledge of the clusters – i.e which node are they expecting to see them on (or at least some sort of reference document handy)

Now, a cluster failover is rare for me – and if there is one then it’s usually because of a non SQL SERVER related problem – such as a disk issue.

I am currently investigating a few options for getting notified of a cluster failover, these include,

  • Periodically polling all the cluster instances and comparing the current node with the preferred node. This can be done via linked servers or with a custom .NET app.
  • Using another monitoring tool such as SCOM.
  • Implementing an SQL startup procedure to notify me when the instance restarts.

However, recently, I have been thinking about a solution involving extended events.

I am wondering if it is possible to catch a “shut down” event and send a notification. Assuming that this was possible then if the SQL SERVER was shut down cleanly then I guess the notification would be sent.

However, I guess that a sudden shut down of the machine would cause the notification not to be sent.

On the reverse side of this, I could possibly catch a “startup” event. However, this would be of no use if the SQL SERVER – on the other node –  did not re-start at all.

So, I’d be interested in any comments / advice on how other people handle the cluster failover notification issue please.

You can leave a comment here if you like – then everybody will get to see it – or if you like drop me an email through martin at martinz.co.nz.

Cheers

Martin.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating