Availability Groups Removed & Readded Database Problem :exclamation:

  • I had a bit of a server meltdown this morning that necessitated me removing one database from my availability group (on 4 servers) so that I could fix a runaway transaction issue. As I resolved that problem, I took a FULL backup of the database and several transaction log backups over the course of several hours, with the last backup being minutes before I restored the database to all 4 servers (NO RECOVERY on the 3 secondaries). But now I cannot for the life of me get the secondaries to join up to the availability group.

    I tried joining the DB on the primary server using Full and everything errored out because the databases already existed on the secondary servers (Databases XYZ already exists on the server instance that hosts the secondary replica). I tried using Skip initial data synchronization and when I went to the secondary database, they wouldn't join because there wasn't a recent transaction log (even though I'd made the backups 5 minutes before I did the restores). I tried Join only and got the same result as I did on the secondary servers (The mirror database, XYZ, has insufficient transaction log data to preserve the log backup chain of the principal database. This may happen if a log backup from the principal database has not be taken or has not been restored on the mirror database).

    This is incredibly frustrating because I took the transaction log backups right before I did the NO RECOVERY restores... Does anyone have any advice on this one?

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Sounds strange.
    Test one server by deleting the secondary and populating it from the Primary maybe?

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    This thing is addressing problems that dont exist. Its solution-ism at its worst. We are dumbing down machines that are inherently superior. - Gilfoyle

  • Henrico Bekker - Monday, March 13, 2017 11:52 AM

    Sounds strange.
    Test one server by deleting the secondary and populating it from the Primary maybe?

    That's not an option I like. The database is huge and I don't want to bring down our production bandwidth with that kind of long-term synch.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin - Monday, March 13, 2017 12:02 PM

    Henrico Bekker - Monday, March 13, 2017 11:52 AM

    Sounds strange.
    Test one server by deleting the secondary and populating it from the Primary maybe?

    That's not an option I like. The database is huge and I don't want to bring down our production bandwidth with that kind of long-term synch.

    Sorry, I know that's not ideal.
    Been giving it some thought, would like to know what the cause is once you establish it.

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    This thing is addressing problems that dont exist. Its solution-ism at its worst. We are dumbing down machines that are inherently superior. - Gilfoyle

  • Sigh. After all the futzing, it looks like my only option is to take another FULL backup with a TLOG afterwards, then restore those and hope it works this time.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin - Monday, March 13, 2017 11:41 AM

    I had a bit of a server meltdown this morning that necessitated me removing one database from my availability group (on 4 servers) so that I could fix a runaway transaction issue. As I resolved that problem, I took a FULL backup of the database and several transaction log backups over the course of several hours, with the last backup being minutes before I restored the database to all 4 servers (NO RECOVERY on the 3 secondaries). But now I cannot for the life of me get the secondaries to join up to the availability group.

    I tried joining the DB on the primary server using Full and everything errored out because the databases already existed on the secondary servers (Databases XYZ already exists on the server instance that hosts the secondary replica). I tried using Skip initial data synchronization and when I went to the secondary database, they wouldn't join because there wasn't a recent transaction log (even though I'd made the backups 5 minutes before I did the restores). I tried Join only and got the same result as I did on the secondary servers (The mirror database, XYZ, has insufficient transaction log data to preserve the log backup chain of the principal database. This may happen if a log backup from the principal database has not be taken or has not been restored on the mirror database).

    This is incredibly frustrating because I took the transaction log backups right before I did the NO RECOVERY restores... Does anyone have any advice on this one?

    Hi Brandie,

     May i know what was the status of the databases in the secondary replica after you've joined them to AAG?

    Best regards,
    Ricky

  • Have you turned off the T-LOG backup job while trying to sync. This can cause synching to fail in my experience.
    How often is the T-log being taken.

  • Ricky Valencia - Monday, March 13, 2017 9:00 PM

    Brandie Tarvin - Monday, March 13, 2017 11:41 AM

    I had a bit of a server meltdown this morning that necessitated me removing one database from my availability group (on 4 servers) so that I could fix a runaway transaction issue. As I resolved that problem, I took a FULL backup of the database and several transaction log backups over the course of several hours, with the last backup being minutes before I restored the database to all 4 servers (NO RECOVERY on the 3 secondaries). But now I cannot for the life of me get the secondaries to join up to the availability group.

    I tried joining the DB on the primary server using Full and everything errored out because the databases already existed on the secondary servers (Databases XYZ already exists on the server instance that hosts the secondary replica). I tried using Skip initial data synchronization and when I went to the secondary database, they wouldn't join because there wasn't a recent transaction log (even though I'd made the backups 5 minutes before I did the restores). I tried Join only and got the same result as I did on the secondary servers (The mirror database, XYZ, has insufficient transaction log data to preserve the log backup chain of the principal database. This may happen if a log backup from the principal database has not be taken or has not been restored on the mirror database).

    This is incredibly frustrating because I took the transaction log backups right before I did the NO RECOVERY restores... Does anyone have any advice on this one?

    Hi Brandie,

     May i know what was the status of the databases in the secondary replica after you've joined them to AAG?

    Best regards,
    Ricky

    The problem was they weren't joining to the AG.

    I got the issue resolved by deleting the databases on the secondaries, taking another FULL backup and another TLOG backup, restoring them to the secondaries that way.

    I'm wondering if something got missed in the original restore, but given how SQL complains when a Tlog is missing during a restore, that doesn't quite make sense to me. Why would SQL let me restore a db with a FULL backup and 10 (for example) Tlog backups if one of the Tlog backups in the middle is missing? Unless NO RECOVERY is different and doesn't care about the order of the Tlog restores...

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Talib123 - Thursday, March 16, 2017 4:00 AM

    Have you turned off the T-LOG backup job while trying to sync. This can cause synching to fail in my experience.
    How often is the T-log being taken.

    I did turn off the job before I did any of this. All my log backups were taken manually so I would know what backup went where (order wise) and what I needed to restore to get everything synched again. It still failed until I did another FULL and TLog backup after the failures.

    The whole process was very frustrating.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin - Thursday, March 16, 2017 4:23 AM

    Ricky Valencia - Monday, March 13, 2017 9:00 PM

    Brandie Tarvin - Monday, March 13, 2017 11:41 AM

    I had a bit of a server meltdown this morning that necessitated me removing one database from my availability group (on 4 servers) so that I could fix a runaway transaction issue. As I resolved that problem, I took a FULL backup of the database and several transaction log backups over the course of several hours, with the last backup being minutes before I restored the database to all 4 servers (NO RECOVERY on the 3 secondaries). But now I cannot for the life of me get the secondaries to join up to the availability group.

    I tried joining the DB on the primary server using Full and everything errored out because the databases already existed on the secondary servers (Databases XYZ already exists on the server instance that hosts the secondary replica). I tried using Skip initial data synchronization and when I went to the secondary database, they wouldn't join because there wasn't a recent transaction log (even though I'd made the backups 5 minutes before I did the restores). I tried Join only and got the same result as I did on the secondary servers (The mirror database, XYZ, has insufficient transaction log data to preserve the log backup chain of the principal database. This may happen if a log backup from the principal database has not be taken or has not been restored on the mirror database).

    This is incredibly frustrating because I took the transaction log backups right before I did the NO RECOVERY restores... Does anyone have any advice on this one?

    Hi Brandie,

     May i know what was the status of the databases in the secondary replica after you've joined them to AAG?

    Best regards,
    Ricky

    The problem was they weren't joining to the AG.

    I got the issue resolved by deleting the databases on the secondaries, taking another FULL backup and another TLOG backup, restoring them to the secondaries that way.

    I'm wondering if something got missed in the original restore, but given how SQL complains when a Tlog is missing during a restore, that doesn't quite make sense to me. Why would SQL let me restore a db with a FULL backup and 10 (for example) Tlog backups if one of the Tlog backups in the middle is missing? Unless NO RECOVERY is different and doesn't care about the order of the Tlog restores...

    NORECOVERY doesn't alter the sequence requirement. They still must be restored in order. A missing tlog bak would cause the restore to fail.

    It sounds like too many more transactions occurred on the primary before the secondary was attempted to be added, or as was mentioned a log backup was taken somewhere that marked the recovery lsn to be too far off for you to join (mirroring, log shipping an unknown backup process like server snapshots or commvault running somewhere).

    Out of curiosity, did you try to take a subsequent log backup from the primary and restore it to the secondary(ies) (besides the tlog bak taken 5 minutes before full was restored)?

    A query to the msdb database on the primary to get a list of the backups taken would help solve some of these mysteries.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • We are experiencing the same exact problem. I believe this issue was fixed with KB3152965. 
    https://support.microsoft.com/en-us/help/3152965/fix-error-1478-when-you-add-a-database-back-to-the-alwayson-availabili
    The KB states that the fix is included in SP3, however we have SP4 applied to these servers already.
    Is it possible the the fix is not included in SP4? I read the KB for SP4 and it is not listed, but is it assumed that it includes everything from SP3?

    Currently running MS SQL 2012 SP4 KB8014073

    Thanks,
    Chris

  • Chris.null - Thursday, December 27, 2018 1:15 PM

    We are experiencing the same exact problem. I believe this issue was fixed with KB3152965. 
    https://support.microsoft.com/en-us/help/3152965/fix-error-1478-when-you-add-a-database-back-to-the-alwayson-availabili
    The KB states that the fix is included in SP3, however we have SP4 applied to these servers already.
    Is it possible the the fix is not included in SP4? I read the KB for SP4 and it is not listed, but is it assumed that it includes everything from SP3?

    Currently running MS SQL 2012 SP4 KB8014073

    Thanks,
    Chris

    Sounds like this fix is for a very specific issue and does not necessarily cover all possible problems. The alternative is that SP4 broke the SP3 fix. Which MS has done multiple times in multiple different software updates.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply