Backups and Transactions Log relationships

  • Hi,

    I’ve just started out in a DBA role and have picked up a few things so far, but I’m still a bit confused on how the Transaction Logs and Backups work together.

    My understanding was that in Full Recovery mode you should try and back up you log say every 30 minutes in case your log became corrupt.

    I thought the log backups would get bigger each time as the log grew throughout the day then when the database was backed up it would run a CHECKPOINT as part of the backup procedure, and commit all the transactions in the log to the database then back the database up so it’s all up to date.

    When you next backup the Transaction Log after a full backup has been done (i.e. all the transactions have been committed to the Database) it is 'truncated' at which point all completed transactions will be wiped from the log file and the space that they occupied will be made available for future transactions, the log is basically emptied.

    I now realise that this is not the case, as I’ve found out that the log is truncated every time it is backed up. So I don’t understand where these transactions go from the live log as it does not keep on getting bigger due to the truncate each time.

    Is it as simple as if you have 20 log backups done in a day which contain all the transactions (the current live log size is not as big if you add all these backups up!) these are all written to the database from the backups taken when the CHECKPOINT is run?

    If this is correct then obviously you must not get rid of any of these backups in error?

    Thanks in advance

  • Hello,

    If you wish to restore a database to a point in time you will need

    A full backup to restore from

    + A differential backup to restore from (optional, contains all changes since last full backup)

    + Chain of transactionslogbackups since last full or differential backup (whatever came last).

    (called LSN-chain)

    The backup of the transactionlog keeps the transactionlog small.

  • Bristolred (12/3/2008)


    I now realise that this is not the case, as I’ve found out that the log is truncated every time it is backed up. So I don’t understand where these transactions go from the live log as it does not keep on getting bigger due to the truncate each time.

    When you backup a log, the inactive log records get written to the backup file that you create, and then those log records are discarded. Hence each log backup only contains the log records from between that log backup and the previous one.

    Maybe read through this:

    http://www.sqlservercentral.com/articles/64582/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the replies.

    I sort of get it or maybe my initial understanding is wrong, but could you clear this up for me then please.

    If in Full Recovery Mode a backup of the log was not taken at all through the day what would be the process.

    Would all the transactions in the log be written to the database when it was next backed up and is it the checkpoint process that does this? If this is the case when the log is backed up through out the day, it backs up what is in the log at the time, and then empties it, then say 30 minutes later it backs up the log again and empties it and so on.

    When the database is then backed up are all the transactions in all these log backups then written to the database and I assume anything in the live log at the time that hasn't been backed up i.e. the time between your last log backup and your scheduled timed database backup.

    Sorry to be a pain but I really want to get my head around this.

    Thanks

  • Three steps backwards....

    When data is changed in SQL, the changes are made to the data pages in memory. The record of the change is then written to the transaction log and written to disk. Once the change is in the transaction log, the transaction is considered complete.

    At a later time, the checkpoint will write dirty data pages to disk. The checkpoint is not ties to the backups, it runs at regular intervals.

    Bristolred (12/4/2008)


    Would all the transactions in the log be written to the database when it was next backed up and is it the checkpoint process that does this?

    No. The backup has nothing to do with when the data is written to disk. See above

    If this is the case when the log is backed up through out the day, it backs up what is in the log at the time, and then empties it, then say 30 minutes later it backs up the log again and empties it and so on.

    Yup. Though 'empties' isn't quite accurate. Only inactive log records are discarded - ones that won't be needed for rollback or database recovery (in the case of a restart)

    When the database is then backed up are all the transactions in all these log backups then written to the database and I assume anything in the live log at the time that hasn't been backed up i.e. the time between your last log backup and your scheduled timed database backup.

    No. The backup has nothing to do with when the data is written to disk. See above

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • My understanding was that in Full Recovery mode you should try and back up you log say every 30 minutes in case your log became corrupt

    I guess u've not understood the fundamentals right. We take log backups to minimize the data loss due to unforseen events leading to corruption of your database.

    eg. say you take full database backup at 10 PM and take transaction log backups at 2 AM, 6 AM, 10 AM..... if your database gets corrupted at say 10.30 AM, and you want to restore the database with all records, you will need to restore the full database backup taken at 10PM, restore all the transaction logs backup files (taken after the full backup) in sequence(in this case 10 AM full backup file, tran log@ 2AM, tran log@ 6AM and tran log@10 AM). (this includes a concept of tail log backup which you can take up seperately)

    your prime objective to take log backup is to restore data, not for recovering the log file in case it gets corrupt.

    Transaction logs are incremental in nature. in above example, transaction log backup taken at 6 AM will contain complete transaction from 2 AM till 6 AM only and has nothing to do with the complete backup. when you take a transaction log backup, the entries are moved to an external file and the log file gets cleared (all inactive entries only).

    It also seems u're confused transaction log backup with differential backup concept...

    if using the same example, we're taking full backup at 10 PM and DIFFERENTIAL backups at 2 AM, 6 AM and 10 AM. the 2 AM differential backup file will contain entries from 10 PM till 2 PM; 6AM file will contain entries from 10 AM till 6AM, 10 AM file will contain entries from 10 PM till 10 AM.

    Put it simply, differential backups contain entries since the time last FULL backup was taken and bears no relation with previous differential backup.

    Transaction log backup contains entries starting from previous transaction log backup; and thereby forming a chain.

    and you are right, transaction logs are truncated everytime its backed up. Full and differential backups DO NOT truncate log files.

    Also, do not mix CHECKPOINT with backup. it relates to data being physically written to disk. Eg, users are adding/deleting records from your database, each such activity is not directly written to the disk. It remains in memory cache for a while before CHECKPOINT forces these modified pages (called dirty pages) to be written back to the disk.

    for basics on backup restore refer http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlbackuprest.mspx



    Pradeep Singh

  • Hi again,

    Thanks to you all for helping me get my head around this which I feel I am quite close to doing? (Or maybe I’m going to look a right fool for saying that?)

    I’ve typed up the following using the info you have all given me, plus other bits that I have found along the way. Can you have a read and see if I’m now on the right track to understanding it a bit better!

    How does SQL use the log?

    When changes are made to a database those changes are first written (hardened) to the log file (LDF) and the data pages are changed in memory. The data pages changed in memory will be written to disk (datafile - MDF) at a later time either by the lazy writer process or by the checkpoint process. We can then say that the transaction is now committed or it is inactive when it has been written.

    One important point to think is, all the inactive transactions in the log file (transactions which completed their tasks and have now been written to the data file) are known by SQL Server, as it is smart enough to think, inactive transactions are already in the data file and have also been backed up (by transactional log backup) hence it thinks, this is the time to get rid of this data and it removes all the inactive transactions. But for Active transactions, which are either incomplete or could not complete because of disaster (sudden power failure) will be stored in transactional log and will be called active transactions until they are in the data file.

    When a server is restarted, SQL uses the transaction log to see if, at the point the server shut down there were any transactions that had completed but whose changes may not been written to disk, or any transactions that had not completed. If there are then the modifications that may not have been written to disk are replayed (rolled forward) and any that had not completed are rolled back. This is done to ensure that the database is in a consistent state after a restart.

    Lazy Writer Process

    The lazy writer process sleeps for a specific interval of time. When it is restarted, it examines the size of the free buffer list. If the free buffer list is below a certain point, dependent on the size of the cache, the lazy writer thread (one Lazy Writer Thread per instance of SQL) scans the buffer cache to reclaim unused pages. It then writes dirty pages that have a reference count of 0. (When the reference counter goes to 0, the dirty page indicator is checked. If the page is dirty, a write is scheduled to write the modifications to disk) On the Windows 2000, Windows Server 2003, and Windows XP operating systems, most of the work populating the free buffer list and writing dirty pages is performed by the individual threads. The lazy writer thread typically has little to do.

    Checkpoint Process

    The checkpoint process also scans the buffer cache periodically and writes any dirty data pages to disk. The difference is that the checkpoint process does not put the buffer page back on the free list. The purpose of the checkpoint process is to minimize the number of dirty pages in memory to reduce the length of a recovery if the server fails. Its purpose is not to populate the free buffer list. Checkpoints typically find few dirty pages to write to disk, because most dirty pages are written to disk by the worker threads or the lazy writer thread in the period between two checkpoints

    Full Recovery Model

    In full recovery model transaction log entries are kept for both database integrity and database recovery purposes. Inactive log records are retained in the transaction log until a log backup is done which then removes them. In full recovery, all operations are fully logged, including operations that qualify as bulk operations. Full recovery can be difficult to manage as the log can grow beyond expected if transaction log backups don’t occur, or if there’s an increase in the amount of database activity that occurs between log backups. Because log records are not discarded until they have been backed up, a database in full recovery mode can be recovered to any time using a combination of full, differential and log backups.

    What happens when you back the log up in Full Recovery Mode

    When SQL Server finishes backing up the transaction log, it automatically truncates the inactive portion of the transaction log.This inactive portion contains completed transactions and is no longer used during the recovery process. Conversely, the active portion of the transaction log contains transactions that are still running and have not yet completed the process of being written to disk. SQL Server reuses this truncated, inactive space in the transaction log instead of allowing the transaction log to continue to grow and use more space.

    How to shrink a Transaction Log

    The physical size of the Transaction Log is not changed by the backup process just the amount of ‘free space’ contained within it. In order to reduce the physical size of the Transaction Log you have to 'shrink' it which would mean doing the following in this order. Back up the database, Back up the log, Shrink the log to reclaim the free space back to the OS. This answers the question to how a 5GB Transaction Log when backed up can only have a backed up size of say 500mb. (500mb of Active Transactions, the other 4.5ish GB is free empty space.

    Can I also ask another question which is this!

    In Full Mode the log is backed up frequently which keeps its size down. Is this the only reason for doing frequent backups?

    Let’s say that you don’t back the log up at all throughout the day. The usual process still carries on i.e. changes are made to a database, those changes are first written (hardened) to the log file (LDF) and the data pages are changed in memory then written to the data file by the lazy writer or checkpoint.

    The log file will continue to grow as the inactive transactions will not be truncated as no back ups are being done.

    If you then get a problem and the database gets corrupt can’t you then just take a log backup at this point, restore the database back to the last full backup and use the log backup just taken (omitting the problem) to restore? Or is it because there is a slim chance that your log may also became corrupt and possibly useless and with out the log backups taken through the day it will all be lost.

    Thanks you all for you help and I’m hoping that the above or at least 90% is now correct.

  • In Full Mode the log is backed up frequently which keeps its size down. Is this the only reason for doing frequent backups?

    You take backup of the log files to assist you in restoration during crisis and minimize your loss. When you take log backup, the committed entries are moved to the log backup file where as active transactions still remain in the log file(.LDF file). During this, committed entries are removed from the Log file and thereby reducing used space of the log files.

    Let’s say that you don’t back the log up at all throughout the day. The usual process still carries on i.e. changes are made to a database, those changes are first written (hardened) to the log file (LDF) and the data pages are changed in memory then written to the data file by the lazy writer or checkpoint.

    The log file will continue to grow as the inactive transactions will not be truncated as no back ups are being done.

    If you then get a problem and the database gets corrupt can’t you then just take a log backup at this point, restore the database back to the last full backup and use the log backup just taken (omitting the problem) to restore? Or is it because there is a slim chance that your log may also became corrupt and possibly useless and with out the log backups taken through the day it will all be lost.

    Yes if you are able to connect to the database, you can take the Log backup with norecovery option(this is called tail-log backup) and use this backup file in conjunction with the previous full backup to restore the database.

    Quoting from http://msdn.microsoft.com/en-us/library/ms179314.aspx

    If the database is damaged, for example, if the database does not start, a tail-log backup succeeds only if the log files are undamaged, the database is in a state that supports tail-log backups, and the database does not contain any bulk-logged changes.



    Pradeep Singh

  • Bristolred (12/11/2008)


    The data pages changed in memory will be written to disk (datafile - MDF) at a later time either by the lazy writer process or by the checkpoint process. We can then say that the transaction is now committed or it is inactive when it has been written.

    Almost. The transaction is committed once the COMMIT TRANSACTION is recorded in the log. The data pages may be written to disk quite a bit later. Once all of the changes to data pages are on disk, the log records for that transaction are considered inactive.

    On the Windows 2000, Windows Server 2003, and Windows XP operating systems, most of the work populating the free buffer list and writing dirty pages is performed by the individual threads. The lazy writer thread typically has little to do.

    No. Individual threads do not write to disk. In SQL only the lazy writer and the checkpoint process actually write data pages to disk. It has nothing to do with the OS, it's SQL's internal memory and IO management

    Checkpoints typically find few dirty pages to write to disk, because most dirty pages are written to disk by the worker threads or the lazy writer thread in the period between two checkpoints

    No. See above for the comment on threads writing to disk.

    In general, the checkpoint process is the one that writes most changes to disk. If the checkpoint is idle and the lazy writer is very busy, it's an indication that the server is under memory pressure.

    The physical size of the Transaction Log is not changed by the backup process just the amount of ‘free space’ contained within it. In order to reduce the physical size of the Transaction Log you have to 'shrink' it which would mean doing the following in this order. Back up the database, Back up the log, Shrink the log to reclaim the free space back to the OS. This answers the question to how a 5GB Transaction Log when backed up can only have a backed up size of say 500mb. (500mb of Active Transactions, the other 4.5ish GB is free empty space.

    Correct, though bear in mind that shrinking a transaction log is not something that should be done regularly, if at all.

    In Full Mode the log is backed up frequently which keeps its size down. Is this the only reason for doing frequent backups?

    No. The interval between log backups should be based on how much data you're willing to lose in the case of a disaster

    If you then get a problem and the database gets corrupt can’t you then just take a log backup at this point, restore the database back to the last full backup and use the log backup just taken (omitting the problem) to restore?

    Yes, providing the log's intact. If the server's failed completely or the log drive has been lost, then there's no way to backup the log.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ps (12/11/2008)


    In Full Mode the log is backed up frequently which keeps its size down. Is this the only reason for doing frequent backups?

    You take backup of the log files to assist you in restoration during crisis and minimize your loss. When you take log backup, the committed entries are moved to the log backup file where as active transactions still remain in the log file(.LDF file). During this, committed entries are removed from the Log file and thereby reducing used space of the log files.

    [

    In Full Mode the log is backed up frequently which keeps its size down. Is this the only reason for doing frequent backups?

    No. The interval between log backups should be based on how much data you're willing to lose in the case of a disaster

    If you then get a problem and the database gets corrupt can’t you then just take a log backup at this point, restore the database back to the last full backup and use the log backup just taken (omitting the problem) to restore?

    Yes, providing the log's intact. If the server's failed completely or the log drive has been lost, then there's no way to backup the log.

    Great. Thanks to you both. I think the penny is starting to drop why all this needs to be done. I've always assumed that you would be able to access the log even if the database was corrupt. Basically your saying you have to be prepared for the complete worst i.e. disaster recovery if the whole server was to pack up/catch fire or the drive the log resides on was lost in anyway.

    I'm so glad that I came across this forum. It's good to know that there are some very helpful and clued up people willing to help and reply so quickly. I'm sure I'll be in a month or so with some more questions for you.

  • Bristolred (12/11/2008)


    Basically your saying you have to be prepared for the complete worst i.e. disaster recovery if the whole server was to pack up/catch fire or the drive the log resides on was lost in anyway.

    Hope for the best, prepare for the worst and you'll never be caught off guard.

    I used to evaluate DR plans against the (extremely unlikely) event of the server room been hit by a meteor. If you can recover from that, you can probably recover from anything.

    I'm sure I'll be in a month or so with some more questions for you.

    We'll be here.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply