Rebooting clustered SQL 2014 problem

Question

Raymond van Laake

SSCarpal Tunnel

Points: 4212
April 6, 2017 at 1:58 am

#401903

Hi there,
I reboot our clustered SQL 2014 (2 nodes, in a VMware environment) regularly. The basic steps are:
1) check that everything is okay
2) reboot Secondary
3) once it is up again, and synchronized again, fail over from Primary to Secondary
4) reboot Primary
5) once it is up again, and synchronized again, fail over from Secondary to Primary
The problem I have is in steps 3) and 5): after a reboot of the server, all databases (approx. 120) have to synchronize again, and most of the time this process does not finish. My workaround is to restart the SQL Server service on the machine that was just rebooted, after which in most cases all databases synchronize (occasionally a second restart of the service is necessary).
I can't imagine that this is what MS intended as normal operations. What must I do to improve my rebooting routine?
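
The "synchronized again" check in steps 3) and 5) could be scripted along these lines. This is only a sketch and it assumes the setup is an Always On availability group with synchronous commit, which the post does not actually state. The DMVs and the is_failover_ready column are standard in SQL Server 2012 and later; [YourAgName] is a placeholder, not a name from this thread.

```sql
-- Run on the current primary before steps 3) and 5).
-- Lists every database, on any replica, that is not yet safe to fail over to.
SELECT ar.replica_server_name,
       drcs.database_name,
       drcs.is_failover_ready
FROM sys.dm_hadr_database_replica_cluster_states AS drcs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drcs.replica_id
WHERE drcs.is_failover_ready = 0
ORDER BY ar.replica_server_name, drcs.database_name;

-- Only when the query above returns no rows: connect to the secondary
-- that should become the new primary and run a planned manual failover.
-- [YourAgName] is a hypothetical placeholder.
-- ALTER AVAILABILITY GROUP [YourAgName] FAILOVER;
```

An empty result means every database reports itself as failover ready. With roughly 120 databases, a query like this is easier to trust than scanning the dashboard by eye.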

rvsc48 SSCertifiable Points: 7685 · Answer 1

Hello Raymond. When you say "synced", do you mean the databases becoming available after a failover? Or is an additional HA option involved, such as log shipping, mirroring, or replication? During a cluster failover, SQL Server restarts the SQL instance, which in turn triggers recovery of all the databases within the instance. One thing that can lengthen recovery times is the number of VLFs (virtual log files) in each database. If one or more of them have too many (more than 1000, for example), this can cause significant delays in recovery. As a check, you may want to run a query that counts the VLFs per database. The command for that is DBCC LOGINFO.
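
A per-database VLF count can be collected roughly as follows. This is a sketch, not a vetted script: DBCC LOGINFO is undocumented, the temp table layout assumes the eight-column output of SQL Server 2012 and later, and the table and variable names are made up for illustration. Run it where the databases are accessible, such as on the primary.

```sql
-- Holds the raw DBCC LOGINFO output: one row per VLF.
CREATE TABLE #vlf (
    RecoveryUnitId int, FileId int, FileSize bigint, StartOffset bigint,
    FSeqNo bigint, Status int, Parity int, CreateLSN numeric(38, 0)
);
-- Holds the result: one row per database.
CREATE TABLE #vlf_counts (database_name sysname, vlf_count int);

DECLARE @db sysname, @sql nvarchar(max);
DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.databases WHERE state_desc = N'ONLINE';

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @db;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME with a single-quote delimiter yields a safely quoted literal.
    SET @sql = N'DBCC LOGINFO(' + QUOTENAME(@db, '''') + N') WITH NO_INFOMSGS;';
    INSERT INTO #vlf EXEC (@sql);
    INSERT INTO #vlf_counts SELECT @db, COUNT(*) FROM #vlf;
    TRUNCATE TABLE #vlf;
    FETCH NEXT FROM db_cursor INTO @db;
END
CLOSE db_cursor;
DEALLOCATE db_cursor;

-- Databases with the most VLFs first; anything in the thousands deserves a look.
SELECT database_name, vlf_count FROM #vlf_counts ORDER BY vlf_count DESC;

DROP TABLE #vlf;
DROP TABLE #vlf_counts;
```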

Perry Whittle SSC Guru Points: 234013 · Answer 2

RVSC48 - Thursday, April 6, 2017 6:52 AM
SQL Server, during a cluster failover, performs a restart of the SQL instance, which in turn prompts a recovery of all the databases within the instance.

That is true for a failover cluster instance, but I believe the OP is referring to an availability group here.

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" 😉

Perry Whittle SSC Guru Points: 234013 · Answer 3

Is this an Always On availability group configuration you're referring to here?

angeloc SSC Enthusiast Points: 138 · Answer 4

Perry Whittle - Thursday, April 6, 2017 7:15 AM
Is this an Always On availability group configuration you're referring to here?

Before you fail over, check your HA redo queues. If some queues are large, that may be what is affecting your failover. I also suggest taking a log backup of all your databases on the primary before failing over.

See how that goes.
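
If this is indeed an availability group, the redo queues can be inspected with something like the query below. Treat it as a sketch under that assumption. The DMV and its columns exist in SQL Server 2012 and later; the database name and backup path in the commented BACKUP LOG line are hypothetical placeholders.

```sql
-- Run on the primary to see all replicas. Queue sizes are reported in KB.
SELECT ag.name                  AS ag_name,
       ar.replica_server_name,
       DB_NAME(drs.database_id) AS database_name,
       drs.synchronization_state_desc,
       drs.log_send_queue_size,  -- log not yet sent to the secondary
       drs.redo_queue_size,      -- log received but not yet redone
       drs.redo_rate             -- KB per second
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
JOIN sys.availability_groups AS ag
    ON ag.group_id = drs.group_id
ORDER BY drs.redo_queue_size DESC;

-- Log backup on the primary before failing over, one per database.
-- [YourDatabase] and the path are placeholders.
-- BACKUP LOG [YourDatabase] TO DISK = N'X:\Backup\YourDatabase.trn';
```

Dividing redo_queue_size by redo_rate gives a rough estimate, in seconds, of how long a secondary still needs to catch up.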