Log Shipping Sync Issue

  • Hey all,

    I've had an ongoing problem for a couple of weeks, whereby I cannot get my standby database to synchronise with the live one for log shipping.

    I took the following steps:

    1. Stopped all log shipping and backup jobs

    2. Took full backup of live database

    3. Copied to standby and restored

    4. Checked that current log backup was too old as expected;

    (The log in this backup set terminates at LSN 2827675000000263400001, which is too early to apply to the database. A more recent log backup that includes LSN 2828010000000320100001 can be restored. [SQLSTATE 42000] (Error 4326) RESTORE LOG is terminating abnormally. [SQLSTATE 42000] (Error 3013). The step failed.)

    5. Took a fresh log backup and attempted to restore it;

    (The log in this backup set begins at LSN 2829000000000422000001, which is too late to apply to the database. An earlier log backup that includes LSN 2828010000000320100001 can be restored. [SQLSTATE 42000] (Error 4305) RESTORE LOG is terminating abnormally. [SQLSTATE 42000] (Error 3013). The step failed.)

    6. Restored Headeronly on this logfile:

    (FirstLSN: 2829000000000422000001

    LastLSN: 2829001000000494700001

    CheckPointLSN: 2829001000000486600007

    DifferentialBaseLSN: 2827891000000107500021)

    7. Checked backup history of live db (msdb\backupmediafamily):

    (Showed ONLY;

    Log backup *Step 4

    Full Backup *Step 2

    Log Backup *Step 5

    ...and NO interim log backup)

    This issue began originally when a contractor was brought in to look at system backups and ended up adding db backups without consulting me and ended up knocking out my whole series of database maintenance plans and log shipping. I was assured that they were removed, but I have had these issues continuing with one single database, and there is no history of another log backup within the database. Is it possible that a program can be taking log backups without adding to the backup history of a db?

    Also where would be the best place to check for a Filepath against a particular LSN reference (i.e. a specific log in the sequence).

    Does anyone have any good queries? 🙂

    All help greatly appreciated, this one is beginning to bug me!

  • On the primary, have you moved any log backups prior to you full backup out of the way.

    On the secondary remove any log backups prior to your full backup from the log shipping share.

    then when the restore job runs it will look for the next file after the last one successfully restored, only find you post full backup log file, and correctly restore that.

    ---------------------------------------------------------------------

  • On the primary, I overwrite previous log backups, with the current, and it has been working fine for about 3 years prior to this without fail.

    No logs were moved, all backups taken by me have been backed up to the same place. I have repeated the above steps a couple of times.

    Like I say, the history in the database, ONLY shows the backups that I take, and trying to apply any of these fails, because it appears the standby is looking for an additional log backup taken in between the ones I take...?

    Again, the logs are xcopied to a share using an Agent Job (they're not manually copied) and overwrite the previous copy each time.

  • so this is not logshipping wizard (sql 2000 yes) but in-house built.

    Could be a truncation of the log which would be reported in errorlog

    'with init' hasn't got lost has it?

    this query returns backup history:

    select bs.database_name,bs.user_name,bs.first_lsn,bs.last_lsn,bs.backup_start_date,bs.type, convert(varchar(200),bmf.physical_device_name) as 'file'

    from msdb.dbo.backupset bs, msdb.dbo.backupmediafamily bmf

    where bs.database_name = 'yourdb'

    and bs.media_set_id = bmf.media_set_id

    order by bs.backup_start_date

    ---------------------------------------------------------------------

  • OK thanks for the query 🙂

    Here is the log for the latest 3 backups in the history;

    #1 Log Backup:

    First LSN: 2827651000000421700001

    Last LSN: 2827675000000263400001

    #2 DB Backup:

    First LSN: 2827890000000463800001

    Last LSN: 2828010000000320100001

    #3 Log Backup:

    First LSN: 2829000000000422000001

    Last LSN: 2829001000000494700001

    This suggests a big gap between the db backup and the log backup as I expected? Do you agree, and would this suggest a 3rd party tool backing up the log in the meantime, or should this be captured in the db backup history anyway?

  • Infact the specific lsn in the error appears to be in the full database backup file so is the issue that it not being restored to the standby correctly?

  • Most, BUT NOT ALL, third party tools write to msdb the same way as SQL native.

    You are definitely missing a log backup or the log is being truncated, possibly a step at the end or beginning of the full backup.

    ---------------------------------------------------------------------

  • The explanation for the above missing lines is that a truncate was carried out as part of a bigger process, by another DBA.

    It was a one-off so doesn't affect any of the earlier ones.

    I have just run the series of database and log backups again and found that the error continually refers to a LSN contained within the full backup's log file. It appears as if when I restore the database onto the standby server it doesn't properly apply the tail-end transactions included in the full backup series.

    Does anyone have any suggestions?

  • Just a bit of extra info.

    It is the very final LSN in the database backup which is causing a problem....

    When I restore headeronly the last lsn is "2830931000000492700001" which is thrown up in the error when I try to apply the log;

    "The log in this backup set begins at LSN 2831670000000139700001, which is too late to apply to the database. An earlier log backup that includes LSN 2830931000000492700001 can be restored. [SQLSTATE 42000] (Error 4305) RESTORE LOG is terminating abnormally. [SQLSTATE 42000] (Error 3013). The step failed."

    Don;t know if this makes a difference, but any help/suggestions would be much appreciated...

    Thanks

  • going to have to refer you to my previous answer, there is no other possible cause of this.

    ---------------------------------------------------------------------

  • I'll give you an overview of my jobs. I can;t really see a problem with them at all, and they were working fine until a few weeks ago...

    BACKUPS

    1. Full Database backup;

    BACKUP DATABASE MYDB TO DISK = 'X:\SQLServer\MSSQL\Backup\MYDB.BAK' WITH INIT

    2. Log Backup

    BACKUP LOG MYDB TO DISK = 'X:\Standby\MYDB_LOG.BAK' WITH INIT

    LOG SHIPPING

    1. Standby Server Log Shipping Job

    Step1: xcopy "\\server1\standby\MYDB_LOG.bak" "\\server2\Standby\" /Y

    Step2: C:\cmdscripts\killspids.cmd server2 MYDB **(Simple Job to kill any processes on the standby copy of the db)**

    Step 3: RESTORE LOG MYDB FROM DISK = 'X:\Standby\MYDB_LOG.BAK' WITH STANDBY = 'Y:\SQLServer\MSSQL\Data\MYDB_undo.ldf'

    **The above part was all working fine up until a couple of weeks ago.

    I then created the below job to resynchronise the standby server (copying it overnight to avoid slowing down the network as it is a 400GB file going between 2 sites)

    Step 1: Copy File

    xcopy "\\server1\Backup\MYDB.bak" "\\server2\mydbbak\" /Y

    Step 2: Kill SPID's

    C:\cmdscripts\killspids.cmd server2 MYDB (As in log shipping job)

    Step 3: Drop Database

    DROP DATABASE MYDB

    Step 4: Restore DB with Move

    RESTORE DATABASE MYDB FROM DISK = 'R:\mydbbak\MYDB.BAK'

    WITH STANDBY = 'Y:\SQLServer\MSSQL\Data\MYDB_undo.ldf',

    MOVE 'MYDBTest_data' TO 'G:\SQLServer\MSSQL\Data\MYDB_Data.mdf',

    MOVE 'MYDBTest_log' TO 'H:\SQLServer\MSSQL\Data\MYDB_log.ldf', REPLACE

    Another thing I should add is that the db's mdf file takes up 550GB on disk, while over 100GB of this is empty space, and the backup file is approximately 400GB. This is because of a system job which creates a lot of temporary data and needs to be run on a fairly regular basis. We took the decision to leave it at the maximised size, as it took approximately 24 hours longer to run the job when it had to go through growth operations. I again doubt that it will be a factor in the log shipping issue, but thought I would throw it in there, incase there is a possibility that the backup job may be truncating the end of the file automatically, with the potential to remove a portion of the log? A longshot I know, but I really can't see a problem with any of the jobs I have set up, unless as I originally thought, there is an outside source creating backups...

  • In this scenario

    1.stop the jobs

    2.check out wit restore file list only from disk=' '

    3.see wht r the files there in tht thn

    4.if data or log file got added in primary server thn

    5.write this script in secondary server

    restore log 'dbname' from disk='path'

    with move 'new file name' to 'path of new file'

    standby or norecovery='dbname'

    6.again start the jobs.

  • I did exactly that, there were no interim backups listed in SQL server, and the only ones restored by me were in logical order, but the problem still kept reoccurring.

    I never really got to the bottom of the cause of this issue, but the removal of SQL permissions for a service account which operated the third party tool coincided with the problem resolving.

    I was still assured that no jobs were set up to take backups, so it "may" be coincidental, but this was definitely a strange issue that was outwith normal backup/restore procedures.

    Thanks for replying all though, and the suggestions will certainly help for resolving similar problems in the future.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply