Second differential backup smaller than first one after full backup?

  • Hi,

    I'm testing some new backup processes here (a weekly job and a daily job).

    I ran my weekly job (these are just a subset of the steps...):

    3:00pm - Database Integrity Check of user databases
    3:30pm - Full backup of user databases
    4:00pm - Index and stats maintenance

    Total full backup size ~10gig.
    --------
    Then I manually ran my daily job:

    7:00pm - Database Integrity Check of user databases
    7:30pm - Differential backup of user databases

    Total diff backup size ~1.75gig
    -------

    Then, because I had actually scheduled the daily job to run at 11:00pm, it had run the two steps (dbcc, diff) again at that time.

    Total diff backup size 250mb

    If differential backups are cumulative based on the last full backup, why such a small diff backup size the second time around? I would've expected it to be larger if anything. Also, this is a development database and I'm fairly certain there weren't many, if any, changes to the database.

    And just to make sure I'm not missing something glaringly obvious (it is a Friday after all...), I checked Microsoft to see what they had to say about differential backups:

    "A differential backup performs the same operations as a full backup, but only contains all the data that has changed or been added since the previous full backup. A common misconception here is that differential backups are incremental. They are actually cumulative and successive differential backups after a full backup and will increase in size as more data is changed or added."

    I know asking about database backups seems so elementary but I just can't think of why the diff file size would decrease...

    Thanks in advance,

    Mike

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • Mike Scalise - Friday, August 10, 2018 6:03 AM

    Hi,

    I'm testing some new backup processes here (a weekly job and a daily job).

    I ran my weekly job (these are just a subset of the steps...):

    3:00pm - Database Integrity Check of user databases
    3:30pm - Full backup of user databases
    4:00pm - Index and stats maintenance

    Total full backup size ~10gig.
    --------
    Then I manually ran my daily job:

    7:00pm - Database Integrity Check of user databases
    7:30pm - Differential backup of user databases

    Total diff backup size ~1.75gig
    -------

    Then, because I had actually scheduled the daily job to run at 11:00pm, it had run the two steps (dbcc, diff) again at that time.

    Total diff backup size 250mb

    If differential backups are cumulative based on the last full backup, why such a small diff backup size the second time around? I would've expected it to be larger if anything. Also, this is a development database and I'm fairly certain there weren't many, if any, changes to the database.

    And just to make sure I'm not missing something glaringly obvious (it is a Friday after all...), I checked Microsoft to see what they had to say about differential backups:

    "A differential backup performs the same operations as a full backup, but only contains all the data that has changed or been added since the previous full backup. A common misconception here is that differential backups are incremental. They are actually cumulative and successive differential backups after a full backup and will increase in size as more data is changed or added."

    I know asking about database backups seems so elementary but I just can't think of why the diff file size would decrease...

    Thanks in advance,

    Mike

    In my experience differentials always get larger and to restore you only require the latest differential and the latest full backup.
    Is it possible that someone has done a full backup after the larger differential hence the next differential is smaller?

  • Jonathan AC Roberts - Friday, August 10, 2018 6:09 AM

    Mike Scalise - Friday, August 10, 2018 6:03 AM

    Hi,

    I'm testing some new backup processes here (a weekly job and a daily job).

    I ran my weekly job (these are just a subset of the steps...):

    3:00pm - Database Integrity Check of user databases
    3:30pm - Full backup of user databases
    4:00pm - Index and stats maintenance

    Total full backup size ~10gig.
    --------
    Then I manually ran my daily job:

    7:00pm - Database Integrity Check of user databases
    7:30pm - Differential backup of user databases

    Total diff backup size ~1.75gig
    -------

    Then, because I had actually scheduled the daily job to run at 11:00pm, it had run the two steps (dbcc, diff) again at that time.

    Total diff backup size 250mb

    If differential backups are cumulative based on the last full backup, why such a small diff backup size the second time around? I would've expected it to be larger if anything. Also, this is a development database and I'm fairly certain there weren't many, if any, changes to the database.

    And just to make sure I'm not missing something glaringly obvious (it is a Friday after all...), I checked Microsoft to see what they had to say about differential backups:

    "A differential backup performs the same operations as a full backup, but only contains all the data that has changed or been added since the previous full backup. A common misconception here is that differential backups are incremental. They are actually cumulative and successive differential backups after a full backup and will increase in size as more data is changed or added."

    I know asking about database backups seems so elementary but I just can't think of why the diff file size would decrease...

    Thanks in advance,

    Mike

    In my experience differentials always get larger and to restore you only require the latest differential and the latest full backup.
    Is it possible that someone has done a full backup after the larger differential hence the next differential is smaller?

    Yes, same here in my experience. It's highly unlikely someone else has performed a full backup on this database. I've also checked msdb.dbo.backupset and there was only the one backup...

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • I'm sure you've thought of this, but is one backup compressed and one not?

  • Beatrix,

    Good thought. I checked and, as expected, they're both compressed. Also, the same job is what I ran manually and what ran on a schedule, so nothing changed parameter-wise between the two runs (and it explicitly specifies backup compression).

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • The most likely explanation is the log that has to be included in database backups to allow a restore using that backup file to recover the database to a consistent state.

    I suspect that the first differential backup just had to include quite a bit more log than the second one did. You could get a rough idea of whether this is the case by checking the first_lsn and last_lsn for those backups in the backupset table. There's probably a much larger gap between those for the first differential than the second.

    Cheers!

  • Jacob Wilkins - Friday, August 10, 2018 8:47 AM

    The most likely explanation is the log that has to be included in database backups to allow a restore using that backup file to recover the database to a consistent state.

    I suspect that the first differential backup just had to include quite a bit more log than the second one did. You could get a rough idea of whether this is the case by checking the first_lsn and last_lsn for those backups in the backupset table. There's probably a much larger gap between those for the first differential than the second.

    Cheers!

    This is what I was thinking when I started reading this thread.

  • Thanks for the replies! I checked the difference of the large diff backup's first and last lsn and it's:

    999757148000000

    The second, smaller one is:

    95500000

    If I did the math correctly.......

    So if it's what you guys said about the backup needing to include the log to be able to restore the database to a consistent state, then was it just a timing thing?

    And a follow-up question to that, do I need that first (huge) differential log backup anymore? I mean, I have the second one that's cumulative, 1/7th of the size, etc.

    Thanks,

    Mike

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • When restoring using a full backup and a differential backup you only need the latest differential backup if you are restoring the latest database.

  • Lynn Pettis - Friday, August 10, 2018 11:22 AM

    When restoring using a full backup and a differential backup you only need the latest differential backup if you are restoring the latest database.

    Lynn,

    That's how I understand it too, but then it's still not 100% clear to me why the first diff backup was 7x the size of the second. Clearly, the smaller one is sufficient for me to do a restore. Was it just a timing thing that applied to the first diff backup and not the second?

    Thanks,

    Mike

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • Mike Scalise - Friday, August 10, 2018 11:28 AM

    Lynn Pettis - Friday, August 10, 2018 11:22 AM

    When restoring using a full backup and a differential backup you only need the latest differential backup if you are restoring the latest database.

    Lynn,

    That's how I understand it too, but then it's still not 100% clear to me why the first diff backup was 7x the size of the second. Clearly, the smaller one is sufficient for me to do a restore. Was it just a timing thing that applied to the first diff backup and not the second?

    Thanks,

    Mike

    No idea.  Could simply be that more of the transaction log needed to be backed up during the differential backup to ensure a consistent database on restore.

  • Check out https://technet.microsoft.com/en-us/library/2009.07.sqlbackup.aspx for an explanation of the amount of log needed by data backups.

    It could be a number of things, from the database's generating more log during the data reading portion of the first differential, or just there being an old active transaction when the first differential backup was run but not when the second was run.

    As a side note, doing math like that on those LSNs doesn't mean quite what you might think (it's really three separate numbers concatenated together).

    Cheers!

  • Jacob Wilkins - Friday, August 10, 2018 11:42 AM

    Check out https://technet.microsoft.com/en-us/library/2009.07.sqlbackup.aspx for an explanation of the amount of log needed by data backups.

    It could be a number of things, from the database's generating more log during the data reading portion of the first differential, or just there being an old active transaction when the first differential backup was run but not when the second was run.

    As a side note, doing math like that on those LSN numbers doesn't mean quite what you might think (it's really three separate numbers concatenated together).

    Cheers!

    Got it. Thank you. I'll need to do more research on the LSNs...

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • Glad to help!

    http://rusanu.com/2012/01/17/what-is-an-lsn-log-sequence-number/ is a good place to start for the LSN.

    Cheers!

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply