Error when attempting to do initial backup for log shipping

  • We have several databases that we log ship from our primary production SQL instance to a reporting instance.  We've been doing this for years now and it's been working almost flawlessly most of the time.  However, we've come across an issue.

    Over the weekend, we had a power outage in our DEV environment.  The host server did not handle it well, and a result was corruption across a couple of different databases, and log shipping getting broken on some others.  Which leads me to my problem.

    When I attempt to use the wizard to setup log shipping for a database, I tell it to create an initial backup, which it will then restore onto the secondary.  I am getting the following error immediately when I complete the wizard:

    System.Data.SqlClient.SqlError: The backup cannot be performed because 'INIT' was requested after the media was formatted with an incompatible structure. To append to this media set, either omit 'INIT' or specify 'FORMAT'. Alternatively, you can create a new media set by using WITH FORMAT in your BACKUP statement. If you use WITH FORMAT on an existing media set, all its backup sets will be overwritten. (Microsoft.SqlServer.Smo)

    The only workaround I have found is to create the database backup manually with NOINIT, then go back into the wizard and tell it to use that backup for the initial restore over to secondary.

    We recently implemented TDE in this environment, so I am not sure if that is the root of the problem, or if it has to do with the power outage.  Mainly, I am wondering is there a way to force the wizard to do the backup with NOINIT since that appears to be the fix?

  • An additional issue...

    The restore job on the secondary server is simply skipping over the logs for one of the affected databases.  After completing the wizard, I ran the backup job manually on the primary to create a new .trn and then ran the restore job on the secondary.  It reports success, but looking at the job history I see the following for each .trn in the target folder, including the new one I just created:

    Date 7/6/2021 3:43:56 PM

    Log Job History (LSRestore_myServer_CRMReports)

    Step ID 1

    Server reportSQLSvr

    Job Name LSRestore_myServer_CRMReports

    Step Name Log shipping restore log job step.

    Duration 00:00:27

    Sql Severity 0

    Sql Message ID 0

    Operator Emailed

    Operator Net sent

    Operator Paged

    Retries Attempted 0

    Message

    2021-07-06 15:44:23.07 Skipped log backup file. Secondary DB: 'CRMReports', File: '\\reportSQLSvr\f$\TransactionLogs\CRMReports\CRMReports_20210706194333.trn'

    2021-07-06 15:44:23.07 Could not find a log backup file that could be applied to secondary database 'CRMReports'.

    2021-07-06 15:44:23.07 The restore operation was successful. Secondary Database: 'CRMReports', Number of log backup files restored: 0

    2021-07-06 15:44:23.07 Deleting old log backup files. Primary Database: 'CRMReports'

    2021-07-06 15:44:23.10 The restore operation was successful. Secondary ID: '8a44b218-5931-4127-b2d9-9ba8dbc9ef65'

    2021-07-06 15:44:23.12 ----- END OF TRANSACTION LOG RESTORE -----

    Exit Status: 0 (Success)

    So basically, it sees the new .trn but for whatever reason decides that it doesn't need to restore it.

  • To me this looks like 2 different issues.  The first issue I expect is being caused by trying to take a full backup with INIT on top of an existing backup.  Something has changed at some point so the media header isn't valid for that backup with INIT, so it is expecting and requiring you to do a new backup file.  Using NOINIT, as you did, is likely not a good option as then instead of it being an empty backup file, it is appending to the end of the backup file and now you have 2 backups in that one backup file.

    This is what I expect is leading to your second problem that the TLOG backups aren't being applied.  The secondary is using an old backup file (the wrong one from the backup created with NOINIT) and the log is too late in the chain to apply.

    What I would do at the point you are at is rename the existing full backup to a new name, do a fresh backup to a brand new file, and then use that one for the wizard.

    NOTE - I have never set up or used log shipping as my environment hasn't needed it.  I am just reading the errors you got and the "workaround" you did and doing a "best guess".

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • That makes sense.

    One thing I ended up trying since I posted was to decrypting one of the databases.  I was able to use the wizard normally at that point - it created the backup, did the restore, created the jobs, and I confirmed that a new log backup got applied successfully.  Then, I re-encrypted the database, and did another log backup and restore, and that worked as well.  Then I tried the whole thing again from scratch with another database and got the same error when trying to run the wizard with the database encrypted.

    So it seems like - at least on this box - that wizard backup fails if the database is encrypted.  Whether that is due to some underlying issue caused by our power loss, or if it is some issue with the wizard, I do not know - I've never had to recreate log shipping with TDE in place.  I have a separate fresh environment that I'm going to use to test that out tomorrow.

    Basically, we lost power to the storage but not the host that holds our virtual servers; so the VMs didn't come down clean at all.  Lost the msdb and tempdb databases on the primary box, lost master on the secondary.  Fortunately I had copies of master and msdb, and tempdb I could recreate with a restart.  And one of the user databases had so much going on that CHECKDB kept crashing SSMS - I guess the result set was too big.

    Point is, I wouldn't be surprised if there is a LOT still going wrong under the hood 😀

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply