AG-Group primary during reboot

  • New to AGs. I have a 2-node AG. We are doing maintenance tonight and plan to reboot the secondary, let it come back up, and then reboot the primary. What happens when the primary is rebooted without someone manually failing over to the secondary first? Does the listener move to the secondary and make it the new primary? My sysadmins are doing the reboots and I don't want them to have to deal with failing over and back, but I am wondering whether my secondary will become the new primary after all this. Failover is set to automatic.


    • This topic was modified 5 months, 1 week ago by  mlorek.
  • I could be mistaken, but I think it depends on the configuration. I am hoping this is your test system and not live and if so, I'd say try it out and see what happens. If this is prod, I'd be a LOT more concerned that you have things turned on and configured on prod and are doing prod testing. But maybe test system is not an option...

    My understanding is that with failover set to automatic, if the primary node goes down, the secondary will start hosting. Depending on the configuration, when the primary node comes back up, it may start hosting or it may continue to be hosted on the secondary.

    So for your first question: since you have failover set to automatic, when the primary goes down, the secondary should take over.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Thanks for the reply Brian. It is a production instance set up by someone else that I inherited. I have looked and looked for the configuration setting that controls whether the primary node takes back hosting when it comes back online, and I can't seem to find it. Any ideas where that setting would be?

  • After a quick Google search, it looks like failback isn't something that happens with AGs. You can set it up yourself, but it's not a setting in the AG. So when the primary goes down, the secondary takes over as primary. If that node later goes down, the original primary takes over again. There is no auto-failback.
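
To make the "no auto-failback" point concrete for the planned reboot order, here is a minimal toy sketch in Python. It is not how WSFC or SQL Server work internally. The class `ToyAG` and the node names `NODE1` and `NODE2` are hypothetical, and the model assumes the cluster keeps quorum during each single reboot and that automatic failover only happens to a synchronized secondary.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    role: str                   # "PRIMARY", "SECONDARY" or "OFFLINE"
    synchronized: bool = True   # toy stand-in for the SYNCHRONIZED state
    was_primary: bool = False   # role held just before going offline


class ToyAG:
    """Hypothetical two-replica model, for illustration only."""

    def __init__(self, primary, secondary, automatic_failover=True):
        self.automatic_failover = automatic_failover
        self.replicas = {
            primary: Replica(primary, "PRIMARY"),
            secondary: Replica(secondary, "SECONDARY"),
        }

    def primary(self):
        """Return the name of the current primary, or None if there is none."""
        for replica in self.replicas.values():
            if replica.role == "PRIMARY":
                return replica.name
        return None

    def go_down(self, name):
        node = self.replicas[name]
        node.was_primary = node.role == "PRIMARY"
        node.role = "OFFLINE"
        # Automatic failover: only a synchronized secondary may take over.
        if node.was_primary and self.automatic_failover:
            for other in self.replicas.values():
                if other.role == "SECONDARY" and other.synchronized:
                    other.role = "PRIMARY"
                    break

    def come_up(self, name):
        node = self.replicas[name]
        # No failback: a returning node rejoins as secondary whenever
        # another replica already holds the primary role.
        if self.primary() is None and node.was_primary:
            node.role = "PRIMARY"
        else:
            node.role = "SECONDARY"

    def reboot(self, name):
        self.go_down(name)
        self.come_up(name)


ag = ToyAG("NODE1", "NODE2")
ag.reboot("NODE2")            # reboot the secondary first
after_first = ag.primary()
ag.reboot("NODE1")            # then reboot the original primary
after_second = ag.primary()
print(after_first, after_second)
```

Under these assumptions the roles end up swapped after the second reboot, and they stay swapped until someone fails over manually.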

    We use a tool called DxEnterprise for our failover. We found it much easier to set up than AGs, and it allows failback after a failover once the primary comes back online. My memory was a bit fuzzy and I mixed up how DxEnterprise and AGs behave.

    My best guess at why you wouldn't want failback: a failover is usually quick, so downtime is minimal, which is the main point of an AG. With failback you get 2 interruptions to your system instead of just 1, and during business hours I like to keep interruptions to a minimum. That being said, there are use cases where you DO want failback. If you have multiple instances hosted across multiple servers (server A hosts instance Z and server B hosts instance Y) and one server goes down, its instance fails over to the other server, but resource constraints may mean you want it to fail back as soon as possible. So we do have some instances set to fail back: high-memory or high-CPU SQL instances fail back, while low-memory and low-CPU instances stay where they are.


  • Totally make sure the AG is fully synchronised before rebooting the second server.

    If the AG is not synchronised then you get to choose which version of the DB you want to keep and which one to trash. Lots of manual work required and not something you want to do on a production system.

    IMHO you either need to train your sysadmin in how to troubleshoot AGs so they can confirm all is OK before booting the second server, or you need to do those checks yourself and get the sysadmin to wait for your OK before doing the second reboot.
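
As a rough illustration of what "confirm all is OK" could look like, here is a minimal Python sketch. The query reads the `sys.dm_hadr_database_replica_states`, `sys.availability_replicas` and `sys.dm_hadr_availability_replica_states` views and is meant to be run on the current primary. The helper names `blocking_rows` and `safe_to_reboot`, the sample rows, and the node and database names are hypothetical, and how you run the query (SSMS, sqlcmd, a driver) is left out. It assumes synchronous-commit replicas, where every database should report SYNCHRONIZED and HEALTHY.

```python
# T-SQL to run on the current primary: one row per database per replica.
SYNC_QUERY = """
SELECT ar.replica_server_name,
       ars.role_desc,
       DB_NAME(drs.database_id) AS database_name,
       drs.synchronization_state_desc,
       drs.synchronization_health_desc
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
  ON ar.replica_id = drs.replica_id
JOIN sys.dm_hadr_availability_replica_states AS ars
  ON ars.replica_id = drs.replica_id;
"""


def blocking_rows(rows):
    """Return the rows that should hold up the next reboot."""
    return [
        row for row in rows
        if row["synchronization_state_desc"] != "SYNCHRONIZED"
        or row["synchronization_health_desc"] != "HEALTHY"
    ]


def safe_to_reboot(rows):
    """True only if there is at least one row and none of them is blocking."""
    return bool(rows) and not blocking_rows(rows)


# Hypothetical result set, shaped like the query output above.
sample = [
    {"replica_server_name": "NODE1", "role_desc": "PRIMARY",
     "database_name": "Sales",
     "synchronization_state_desc": "SYNCHRONIZED",
     "synchronization_health_desc": "HEALTHY"},
    {"replica_server_name": "NODE2", "role_desc": "SECONDARY",
     "database_name": "Sales",
     "synchronization_state_desc": "SYNCHRONIZING",
     "synchronization_health_desc": "NOT_HEALTHY"},
]

for row in blocking_rows(sample):
    print("wait for", row["replica_server_name"], row["database_name"])
print("safe to reboot:", safe_to_reboot(sample))
```

An empty result is treated as unsafe on purpose: if the query returns nothing, you have confirmed nothing.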

    At my old place we never planned to reboot both servers in a 2-node AG on the same day. The need to confirm everything was synchronised meant the start time of the second reboot became variable, and this in turn added to business risk.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara
