April 6, 2017 at 1:58 am
Hi there,
I regularly reboot our clustered SQL 2014 (2 nodes, in a VMware environment). The basic steps are:
1) check that everything is okay
2) reboot Secondary
3) once it is up again, and synchronized again, fail over from Primary to Secondary
4) reboot Primary
5) once it is up again, and synchronized again, fail over from Secondary to Primary
The problem I have is in steps 3) and 5): after a reboot of the server, all databases (approx. 120) have to get synced again, and most of the time this process does not finish. My workaround is to restart the SQL Server service on the machine that was just rebooted, after which in most cases all databases get synced (occasionally a second restart of the service is necessary).
I can't imagine that this is what MS intended as normal operations. What must I do to improve my rebooting routine?
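The "synchronized again" check in steps 3) and 5) could be done with a query along these lines. This is only a sketch, and it assumes the setup is an Always On availability group with synchronous-commit replicas, which the post does not state. It uses the standard HADR views and is meant to be run on the current primary, where rows for all replicas are visible.

```sql
-- Sketch, assuming an Always On availability group with synchronous-commit replicas.
-- Lists every database replica that is not yet SYNCHRONIZED.
-- An empty result suggests it is safe to proceed with a planned failover.
SELECT
    ag.name                          AS availability_group,
    ar.replica_server_name,
    DB_NAME(drs.database_id)         AS database_name,
    drs.synchronization_state_desc,
    drs.synchronization_health_desc
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
JOIN sys.availability_groups AS ag
    ON ag.group_id = drs.group_id
WHERE drs.synchronization_state_desc <> 'SYNCHRONIZED'
ORDER BY ar.replica_server_name, database_name;
```

With asynchronous-commit replicas the healthy state would be SYNCHRONIZING rather than SYNCHRONIZED, so the filter would need adjusting.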
April 6, 2017 at 6:52 am
Hello Raymond, when you say "synced", do you mean the databases becoming available after a failover? Or is there an additional HA option in play, such as log shipping, mirroring, or replication? During a cluster failover, SQL Server restarts the SQL instance, which in turn prompts a recovery of all the databases within the instance. One thing that can extend recovery times is the number of VLFs (virtual log files) in each database. If one or more of them have too many (for example, more than 1000), this can cause significant delays in recovery. As a check, you may want to run a query that counts the VLFs per database. That code is [dbcc loginfo].
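As a rough sketch of that check, the loop below runs DBCC LOGINFO in each online database and counts the rows returned (one row per VLF). DBCC LOGINFO is undocumented, so the temp table layout is an assumption that matches its output on SQL Server 2012 and 2014; other versions may return a different column list. The temp table names are made up for illustration. Run it on an instance where the databases are accessible (for an availability group, the primary).

```sql
-- Sketch: count VLFs per database via DBCC LOGINFO (SQL Server 2012/2014 column layout assumed).
CREATE TABLE #loginfo (
    RecoveryUnitId INT,
    FileId         INT,
    FileSize       BIGINT,
    StartOffset    BIGINT,
    FSeqNo         BIGINT,
    Status         INT,
    Parity         INT,
    CreateLSN      NUMERIC(38, 0)
);
CREATE TABLE #vlf_counts (database_name SYSNAME, vlf_count INT);

DECLARE @db SYSNAME, @sql NVARCHAR(MAX);

DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.databases WHERE state_desc = 'ONLINE';

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @db;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- One row per VLF is returned for the current database.
    SET @sql = N'USE ' + QUOTENAME(@db) + N'; DBCC LOGINFO;';
    INSERT INTO #loginfo EXEC (@sql);

    INSERT INTO #vlf_counts (database_name, vlf_count)
    SELECT @db, COUNT(*) FROM #loginfo;

    TRUNCATE TABLE #loginfo;
    FETCH NEXT FROM db_cursor INTO @db;
END

CLOSE db_cursor;
DEALLOCATE db_cursor;

-- Databases with the most VLFs first.
SELECT database_name, vlf_count
FROM #vlf_counts
ORDER BY vlf_count DESC;

DROP TABLE #loginfo;
DROP TABLE #vlf_counts;
```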
April 6, 2017 at 7:14 am
RVSC48 - Thursday, April 6, 2017 6:52 AM
Sql Server, during a cluster failover, performs a restart of the sql instance, which in turn prompts a recovery of all the databases within the instance.
That holds for a failover cluster instance, but I believe the OP is referring to an Availability Group here.
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
April 6, 2017 at 7:15 am
Is this an Always On availability group configuration you're referring to here?
April 6, 2017 at 6:10 pm
Before you fail over, check your HA redo queue: if some queues are large, this may be what is impacting your failover. I also suggest taking a log backup of all your DBs on the Primary before failing over.
See how that goes.
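A minimal sketch of the redo queue check, assuming this is an Always On availability group and that the query is run on the Primary. Queue sizes are reported in KB and the redo rate in KB/sec. The database name and backup path in the log backup line are placeholders, not values from this thread.

```sql
-- Sketch: show send and redo queues per database replica, largest redo queue first.
-- Rows for the primary's own databases show NULL redo_queue_size and sort last.
SELECT
    ar.replica_server_name,
    DB_NAME(drs.database_id)   AS database_name,
    drs.synchronization_state_desc,
    drs.log_send_queue_size    AS log_send_queue_kb,
    drs.redo_queue_size        AS redo_queue_kb,
    drs.redo_rate              AS redo_rate_kb_per_sec
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
ORDER BY drs.redo_queue_size DESC;

-- Log backup of one database on the Primary before failing over.
-- [YourDatabase] and the path are hypothetical placeholders.
BACKUP LOG [YourDatabase] TO DISK = N'X:\Backup\YourDatabase.trn';
```

A redo queue that stays large, or a redo rate near zero, on the node that was just rebooted would point at redo not keeping up rather than at the failover itself.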