In this Level we'll review why and how to take log backups when working in FULL
recovery mode, and how to perform a database restore using these log backup files, in conjunction with a full database backup. FULL
recovery mode supports database restore to any point in time within an available log backup and, assuming a tail log backup can be made, right up to the time of the last committed transaction, before the failure occurred.
What gets Logged?
In FULL
recovery mode, all operations are fully logged. For INSERT
, UPDATE
and DELETE
operations, this means that for every row that is modified, there will be a log record describing the ID of the transaction that performed the statement, when that transaction started and ended, which pages were changed, the data changes that were made, and so on.
Operations that can be minimally logged SELECT
INTO
, BULK
INSERT
and CREATE
INDEX
, are still fully logged when working in FULL
recovery mode, but it is done slightly differently. The rows affected by those operations are not logged individually; instead only the database pages get logged, as they get filled. This reduces the logging overheard of such operations, while making sure that there still exists all the same information exists that is needed to perform rollback, redo and point in time restores. Kalen Delaney has published some investigations into logging for SELECT INTO
(http://sqlblog.com/blogs/kalen_delaney/archive/2011/03/15/what-gets-logged-for-select-into.aspx) and index rebuild (http://sqlblog.com/blogs/kalen_delaney/archive/2011/03/08/what-gets-logged-for-index-rebuilds.aspx) operations, both in FULL
and BULK_LOGGED
recovery modes. The differences in logging of minimally-logged operations, when working in BULK_LOGGED
mode, are discussed in more detail in Level 6 – Managing the Log in BULK LOGGED Recovery Mode.
Why Backup the Transaction Log?
In FULL
recovery mode, only a log backup can cause truncation of the log. As such, the transaction log will hold a full and complete record of the transactions performed since the last time the transaction log was backed up. Since all operations are fully logged, the log file can grow very large, very quickly, in busy systems.
Therefore, when working in FULL
recovery mode, it is vital that you perform regular transaction log backups, in addition to full backups and, optionally, differential backups. Many novice or part-time DBAs perform full backups on their databases, but they don't perform transaction log backups. As a result, the transaction log is not truncated, and it grows and grows until the drive it is on runs out of disk space, causing SQL Server to stop working.
Truncation of the log will occur as soon as the log backup is taken, assuming that a checkpoint has occurred since the previous backup and that no other factors are delaying truncation, such as a data backup or restore operation. For a full list of factors that may delay truncation of recoverable VLFs, as well as factors that keep large swathes of the log active that otherwise wouldn't need to be, such as a rogue, long-running uncommitted transaction or database mirroring or replication processes, see: http://msdn.microsoft.com/en-gb/library/ms345414.aspx.
COPY_ONLY
backups of the transaction log
COPY_ONLY
backups of the transaction log don't truncate the transaction log. A COPY_ONLY
log backup exists "independently" of the normal log backup scheme; it does not break the log backup chain.
In short, transaction log backups perform the dual purpose of allowing restore and recovery to a previous point in time, as well as controlling the size of the transaction log. Probably the most common cause of transaction log-related issues is working in FULL
recovery mode and simply not taking log backups, or not taking log backups frequently enough to control the size of the transaction log file.
If you are unsure whether or not transaction log backups are being taken on a given database, then you can simply interrogate the backupset table in the MSDB
database, using a query similar to that shown in Listing 5.1.
USE msdb ; SELECT backup_set_id , backup_start_date , backup_finish_date , backup_size , recovery_model , [type] FROM dbo.backupset WHERE database_name = 'TestDB'
In the type
column, a D
represents a database backup, L
a log backup and I
a differential backup.
Note that since the data in this backupset table could be manipulated without affecting backup and restore behavior, you might want to verify your findings from this query, by querying sys.database_recovery_status
to see the value of last_log_backup_lsn
(see Listing 3.5), or the sys.databases
table to see the value of log_reuse_wait_desc
(will return LOG_BACKUP
if a backup is required).
How to Back up the Transaction Log
As discussed previously, it is not possible to perform a transaction log backup without first taking at least one full backup. In fact, if you have a database that is in FULL
recovery mode, but has never been backed up, then it will not actually be working in FULL
recovery mode. The database will be in auto-truncate mode until the first full backup is performed.
All database backups, full, log or otherwise, are performed using the BACKUP
command. The command accepts numerous options, which are documented here: http://msdn.microsoft.com/en-us/library/ms186865.aspx. However, at its most basic, which is often how it's used, the command to perform a full backup to disk is as follows:
BACKUP DATABASE DatabaseNameTO DISK ='FileLocation\DatabaseName.bak';
If this were the first backup to be performed, the DatabaseName.bak
file would be created in the specified directory. If such a file already existed, then the default behavior is to append subsequent backups to that file. To override this behavior, and stipulate that any existing file should be overwritten, we can use the INIT
option, as follows:
BACKUP DATABASE DatabaseNameTO DISK ='FileLocation\DatabaseName.bak'WITH INIT;
Most commonly, however, each subsequent backup is given a unique name; more on this in the forthcoming section, Restore to Point of failure.
After each regular (e.g. daily) full backup, there will be frequent (e.g. hourly) log backups, the basic command for which is very similar:
BACKUP LOG DatabaseNameTO DISK ='FileLocation\DatabaseName_Log.bak';
Storing Log Backups
Clearly the backed up data and log files should not be stored on the same drive that hosts the live files. If that drive suffers hardware failure then all your copies will be lost along with the live files, and the backups will have been in vain. The files should be backed up to a separate device, or backed up to a local, mirrored drive.
Frequency of Log Backups
As noted in previous Levels, you may be taking log backups every 15 minutes, or perhaps even more frequently. In such cases, in order to avoid the need to restore a huge number of transaction log files, you may choose to adopt a backup scheme consisting of full backups interspersed with differential backups, interspersed with transaction log backups.
In reality, the backup scheme is often more of a compromise between the ideal and the practical, between an assessment of the true risk of data loss, and what it will cost the company, and the cost involved in mitigating that risk. Many very important business applications use somewhat simpler, but nevertheless rigorous, backup schemes, perhaps involving regularly nightly full backups coupled with hourly transaction log backups.
The frequency of log backups may also be dictated by the number of transactions to which the database is subject. For very busy databases, it may be necessary to backup frequently in order to control the size of the log.
There is no easy way to calculate how often to take log backups. Most DBAs will take their best estimate at how often log backups should be taken, then observe the growth characteristics of the files and then adjust the backup scheme as necessary to prevent them from getting oversized.
The Log Chain and how to break it
As noted, it is not possible to perform a transaction log backup without first taking at least one full backup. In order to recover a database to a point in time, either to the end of a particular log backup or to a point in time within a particular log backup, there must exist a full unbroken chain of log records, from the first log backup taken after a full (or differential backup), right up to the point of failure. This is known as the log chain.
There are many ways to break the log chain, and if you do it means that you will only be able to recover the database to the time of the log backup taken before the event occurred that broke the chain. In short, breaking the chain is not a good idea if you care about the ability to restore your data. Two of the most common ways to break the chain include:
- Loss or corruption of a transaction log backup file – you will only be able to recover to the last preceding good log backup. The log chain will start again at the next good full or differential backup.
- Switch to
SIMPLE
recovery mode – if you ever switch fromFULL
toSIMPLE
recovery mode, this will break the log chain as a checkpoint will be instigated and the transaction log can be immediately truncated. When and if you return toFULL
mode, you will need to take another full backup to restart the log chain. In fact, until you take that full backup, the database will remain in auto-truncate mode and you won't be able to back up the log file.
Pre-SQL Server 2008, there were a couple of command, namely BACKUP
LOG
WITH NO_LOG
or BACKUP
LOG
WITH
TRUNCATE_ONLY
(they are functionally equivalent) that, when issued, would force a log file truncation and so break the log chain. You should not issue these commands in any version of SQL Server, but I mention them here as they do still get used by the unwary, when trying to deal with a "runaway log file", without understanding the implications it has for their ability to restore their database. See Level 8 – Help, my log is full, for more details.
Tail Log Backups
As long as you have a recent full backup and a complete log chain, you can recover your database to the state in which it existed at the end of the final log backup before any failure. However, suppose that you take transaction log backups hourly, on the hour, and a failure occurs at 1:45PM. You could potentially lose 45 minutes worth of data; and indeed, if the failure is so catastrophic that the live transaction log is irretrievable, then that is the amount of data you will lose.
However, sometimes the live transaction log can still be available even if the data files are not, especially if the transaction log is contained on a separate, dedicated drive. If this is the case, you should back up the live transaction log i.e. perform a final backup of the log records generated since the last log backup. This will capture the remaining log records in the live log file, up to the point of failure. This is termed a tail log backup and is the last action that should be performed before beginning the restore and recovery operations.
Tail log backups and minimally-logged operations
If the data files are unavailable as a result of the database failure, and the tail of the log contains minimally logged operations, then it will not be possible to do a tail log backup, as this would require access to the changed data extents in the data file. This will be covered in more detail in Level 6, Managing the Transaction Log in Bulk Logged Mode.
If the database you wish to restore is online, then the tail of the log is backed up as follows:
BACKUP LOG DatabaseNameTO DISK ='FileLocation\DatabaseName_Log.bak'WITH NORECOVERY
The NORECOVERY
option puts the database in a restoring state and assumes that the next action you wish to perform is a RESTORE
. If the database is offline and won't start, you should still attempt to back up the tail of the log as just described (although the NORECOVERY
option can be omitted, since no transactions will be in progress).
If you are sure that the log file is damaged, the documentation suggests that, as a last resort, you try to do a tail log backup with:
BACKUP LOG DatabaseNameTO DISK ='FileLocation\DatabaseName_Log.bak'WITH CONTINUE_AFTER_ERROR
If the master database and data files are damaged, but the logs are available, Microsoft recommends rebuilding the master database and then backing up the last active log. However, these topics are outside the scope of this Stairway, and I refer you to the documentation for further details. See http://msdn.microsoft.com/en-us/library/ms190952.aspx.
Performing Restore and Recovery
Having performed a tail log backup, if possible, the next step is to restore the last full backup (followed by differential backup, if appropriate), then restore the complete sequence of log backup files, including the tail log backup. The basic syntax for this sequence of restore operations is as follows:
RESTORE {DATABASE | LOG} DatabaseNameFROM DISK ='FileLocation\FileName.bak'WITH NORECOVERY;
If when restoring you omit the WITH
NORECOVERY
option, then by default the RESTORE
command will proceed WITH
RECOVERY
. In other words, SQL Server will attempt to reconcile the data and log files, rolling forward completed transactions and then rolling back uncompleted transactions as necessary. By specifying WITH
NORECOVERY
, we are instructing SQL Server that we are entering a restore sequence and that more operations must be rolled forward, before any rollback can be performed. After restoring the last backup in the restore sequence, the database can then be recovered as follows:
RESTORE DATABASE DatabaseName WITH RECOVERY
A common requirement is to restore the database to a different location, in which case you can simply move the files as part of the restores process, as described here: http://msdn.microsoft.com/en-us/library/ms190255.aspx.
Restoring after Database Failure
The following examples describe how to recover a database in response to a failure, whereby the database data files are no longer accessible.
Full Restore to Point of Failure
Assuming that the "live" transaction log can be reached after a database failure, caused perhaps by a hardware failure, then in theory it should be possible to restore and recover your database right up to the point of failure, by using the following steps:
- Backup the tail of the log
- Restore the most recent full back up (plus differential, if applicable)
- Restore, in turn, each of the transaction log backups that were taken after the full (or differential) backup and completed before the time of failure
- Restore the tail log backup
- Recover the database
Many of the examples found on Books Online demonstrate restore and recovery from a "backup set", in other words a single "device" where all backups are stored. In practical terms, this means that, when backing up to disk, the backup device is a single .bak
file located somewhere on that disk.
So, for example, the simple example shown in Listing 5.2 uses a backup set consisting of one full backup and one transaction log backup, and shows how to perform a full restore. In order to run this code, you'll first need to recreate the TestDB
database and then insert a few sample rows of data (for convenience, the script to do this, CreateAndPopulateTestDB.sql, is included with the code download for this Level). You'll also need to create a "Backups" directory on the local C:
drive of your database server, or modify the file paths as appropriate.
-- Perform a full backup of the Test database -- The WITH FORMAT option starts a new backup set -- Be careful, as it will overwrite any existing sets -- The full backup becomes the first file in the set BACKUP DATABASE TestDB TO DISK = 'C:\Backups\TestDB.bak' WITH FORMAT; GO -- Perform a transaction log backup of the Test database -- This is the second file in the set BACKUP Log TestDB TO DISK = 'C:\Backups\TestDB.bak' GO -- ....<FAILURE OCCURS HERE>.... -- The RESTORE HEADERONLY command is optional. -- It simply confirms the files that comprise -- the current set RESTORE HEADERONLY FROM DISK = 'C:\Backups\TestDB.bak' GO -- Back up the tail of the log to prepare for restore -- This will become the third file of the bakup set BACKUP Log TestDB TO DISK = 'C:\Backups\TestDB.bak' WITH NORECOVERY; GO -- Restore the full backup RESTORE DATABASE TestDB FROM DISK = 'C:\Backups\TestDB.bak' WITH FILE=1, NORECOVERY; -- Apply the transaction log backup RESTORE LOG TestDB FROM DISK = 'C:\Backups\TestDB.bak' WITH FILE=2, NORECOVERY; -- Apply the tail log backup RESTORE LOG TestDB FROM DISK = 'C:\Backups\TestDB.bak' WITH FILE=3, NORECOVERY; -- Recover the database RESTORE DATABASE TestDB WITH RECOVERY; GO
However, using backup sets seems to be a relic from times when database were backed up to tape. When backing up to disk, it is a bad idea to use this scheme because, of course, the backup file will quickly grow very large.
In practice, it is far more common that each full backup and transaction log backup file will be individually named, and probably stamped with the time and date that the backup was taken. For example, most third party backup tools, popular community-generated scripts, plus the maintenance plan wizard /designer in SSMS, will all create separate date-stamped files e.g. AdventureWorks_FULL_20080904_000001.bak.
As such, a more common backup and restore scheme would use uniquely-named backups, as shown in Listing 5.3.
USE master; BACKUP DATABASE TestDB TO DISK ='C:\Backups\TestDB.bak' WITH INIT; GO -- Perform a transaction log backup of the Test database BACKUP Log TestDB TO DISK ='C:\Backups\TestDB_log.bak' WITH INIT; GO -- ....<FAILURE OCCURS HERE>.... -- Back up the tail of the log to prepare for restore BACKUP Log TestDB TO DISK ='C:\Backups\TestDB_taillog.bak' WITH NORECOVERY, INIT; GO -- Restore the full backup RESTORE DATABASE TestDB FROM DISK = 'C:\Backups\TestDB.bak' WITH NORECOVERY; -- Apply the transaction log backup RESTORE LOG TestDB FROM DISK = 'C:\Backups\TestDB_log.bak' WITH NORECOVERY; -- Apply the tail log backup RESTORE LOG TestDB FROM DISK = 'C:\Backups\TestDB_taillog.bak' WITH NORECOVERY; -- Recover the database RESTORE DATABASE TestDB WITH RECOVERY; GO
Point in time Restore to Last Good Log Backup
Sometimes, unfortunately, it may not be possible to perform a full restore; for example if the live transaction log is unavailable as a result of the failure. In this case, we will need to restore just to the end of the most recent log backup. It is the need to prepare for this eventuality i.e. a failure of the drive containing the transaction log, which dictates how often log backups are taken. If you take backups every 15 minutes, then you exposed to the risk of 15 minutes data loss.
Imagine that we had performed the sequence of backups shown in Listing 5.4. For the sake of this demo, we are overwriting previous backup files, and the backup sequence is obviously much shortened than it would be in reality.
-- FULL BACKUP at 2AM USE master ; BACKUP DATABASE TestDB TO DISK = 'C:\Backups\TestDB.bak' WITH INIT ; GO -- LOG BACKUP 1 at 2.15 AM USE master ; BACKUP LOG TestDB TO DISK = 'C:\Backups\TestDB_log.bak' WITH INIT ; GO -- LOG BACKUP 2 at 2.30 AM USE master ; BACKUP LOG TestDB TO DISK = 'C:\Backups\TestDB_log2.bak' WITH INIT ; GO
If a catastrophic failure occurred shortly after 2:30 AM, we may need to restore the database to the state it existed at the end of log backup 2, at 2:30 AM.
The restore sequence in such an example is very similar to that which we saw earlier, in Listing 5.3, but since a tail backup is not possible and we'll only be able to restore to a certain point, we need to use the STOPAT
option, as shown in Listing 5.5.
--RESTORE Full backup RESTORE DATABASE TestDB FROM DISK = 'C:\Backups\TestDB.bak' WITH NORECOVERY; --RESTORE Log file 1 RESTORE LOG TestDB FROM DISK = 'C:\Backups\TestDB_log.bak' WITH NORECOVERY, STOPAT = 'Jan 01, 2020 12:00 AM'; --RESTORE Log file 2 RESTORE LOG TestDB FROM DISK = 'C:\Backups\TestDB_Log2.bak' WITH NORECOVERY, STOPAT = 'Jan 01, 2020 12:00 AM'; --Recover the database RESTORE DATABASE TestDB WITH RECOVERY; GO
Since we've specified a STOPAT
time in the future, this code will roll forward all completed transactions up to the end of the second transaction log.
Alternatively, it's possible to specify a STOPAT
time that falls within the time range of the transactions recorded in a specific log file. In this case, the database will be restored up to the last committed transaction at the time specified. This is useful when you know what time you want to restore to, but don't know exactly what log backup contains that time.
It is also possible to restore to a particular marked transaction. This is useful when, for example, you need to restore multiple databases, accessed by a certain application, to a logically consistent point. This topic is not discussed further here, but you can find out more on Books Online (http://msdn.microsoft.com/en-us/library/ms187014.aspx), and Mladen Prajdic provides a good worked example here: http://weblogs.sqlteam.com/mladenp/archive/2010/10/20/sql-server-transaction-marks-restoring-multiple-databases-to-a-common.aspx.
Restoring after a "Bad Transaction"
Outside of the context of any database failure, it may be necessary to restore a database backup, plus transaction logs, in order to return a database to a particular point in time just before an erroneous data modification was made, such dropping or truncating a table.
Your response to such a situation would depend on the nature of the problem. If possible, you might disconnect all users from the database (after notifying them), and assess the implications of what just happened. In some cases, you might need to estimate the time the problem occurred and then do a full recovery of the database and logs using a point in time restore. Once the restore was done, you'd have to notify users that some transactions may have been lost, and ask for forgiveness.
Of course, often you will not be able to interrupt normal business operation in this manner, to fix an accidental data loss. Since the live database is still up and running and being accessed, you could try restoring a backup of the database in STANDBY
mode. This allows further log backups to be restored but unlike when using NORECOVERY
, the database is still readable. The restore scheme might look something like this:
- Restore a backup of the database, in
STANDBY
mode, alongside the live database - Roll the logs forward to the point just before the bad transaction occurred, and data was lost.
- Copy the lost data across to the live database and drop the restored copy
Of course, this process is not necessarily straightforward, and can be quite time-consuming. Unless you've purchased a specialized log reading tool, and can interrogate the log backup directly, rolling the logs forward can mean a series of painstaking steps involving restoring a log, checking the data, restoring a bit further, and so on, until you've worked out exactly where the bad transaction occurred. Step 3 can be difficult too, since you will be introducing data into the live system that is not necessarily consistent with the current state of the database, so there could be referential integrity issues.
Let's take a look at an example that implements steps 1 and 2 above. First, let's start again from scratch by running the CreateAndPopulateTestDB.sql script to recreate the TestDB
database, and insert 10 rows of test data into a new LogTest table. In Listing 5.6, we simply do a full database backup (overwriting any previous backup file). You'll need to create the "Backups" directory, if you've not done so already, or adjust the path as appropriate.
-- full backup of the database BACKUP DATABASE TestDB TO DISK ='C:\Backups\TestDB.bak' WITH INIT; GO
We then insert one new row of data into the LogTest
table.
USE TestDB GO INSERT INTO [TestDB].[dbo].[LogTest] ([SomeInt] ,[SomeLetters2]) VALUES (66666, 'ST') SELECT * FROM dbo.LogTest
So now we have a live TestDB
database with 11 rows in the LogTest table, and a backed up version with 10 rows. Let's now capture additional modification in a log backup, as shown in Listing 5.8.
USE master GO BACKUP Log TestDB TO DISK ='C:\Backups\TestDB_log.bak' WITH INIT; GO
Now, we're going to simulate an erroneous "bad transaction", simply by dropping the LogTest table, after which we do a final log backup.
USE TestDB GO DROP TABLE dbo.LogTest ; USE master GO BACKUP Log TestDB TO DISK ='C:\Backups\TestDB_log2.bak' WITH INIT; GO
In order to try to retrieve the lost data, without interrupting normal business operation, we're going to restore a copy of the TestDB
database in STANDBY
mode. The data and log files for the Standby database, called ANewTestDB, are moved to a "Standby" directory (you'll need to create this directory beforehand).
-- restore a copy of the TestDB database, called -- ANewTestDB, in STANDBY mode USE master ; GO RESTORE DATABASE ANewTestDB FROM DISK ='C:\Backups\TestDB.bak' WITH STANDBY='C:\Backups\ANEWTestDB.bak', MOVE 'TestDB_dat' TO 'C:\Standby\ANewTestDB.mdf', MOVE 'TestDB_log' TO 'C:\Standby\ANewTestDB.ldf' GO
We now have a new database, called ANewTestDB
, and it's in "Standby / Read-Only" mode, as shown in Figure 5.1.
A query against the LogTest
table in the ANewTestDB database will reveal 10 rows. However, we'd like to get the table back into the state it was in just prior to it being erroneously dropped. Therefore, the next step is to perform restore a log backup to the standby database.
USE master GO RESTORE LOG ANewTestDB FROM DISK = 'C:\Backups\TestDB_log.bak' WITH STANDBY='C:\Backups\ANewTestDB_log.bak'
At this point, a query against ANewTestDB reveals 11 rows and we can now ready to copy that data back across into the live database. If we went a step further and restored the second log backup, we'd realize we'd gone too far and the table would be missing in the standby database as well.
An alternative to doing a Standby restore is to consider use of a third party tool such as Red Gate's SQL Virtual Restore, which provides a way to mount backups as live, fully functional databases, without a physical restore.
Whether DBAs like it or not, developers often do have access to production databases to perform ad-hoc data loads and changes. It is the joint responsibility of the DBA and developer to make sure these proceed smoothly, and so not cause issues that require the sort of action just described. We return to this topic later in the Level 6 - Dealing with Bulk Operations.
Of course, the exact nature of the reparative action required depends on the nature of the bad transaction. If a table was "accidentally dropped" then it's likely you'll be heading down the RESTORE
WITH
STANDBY
route. At other times, you may get away with simply creating a script to "reverse out" the rogue modifications.
If the damage only affected a single column or a limited number of rows, then it may be possible, as an alternative, to use a tool such as SQL Data Compare, which can compare directly to backup files, and can do row-level restores.
Alternatively, if you run SQL Server 2005 (or later) Enterprise Edition, and have available a recent database snapshot, you may be able to run a query against the snapshot to retrieve the data as it looked at the time the database snapshot was taken, and then write an UPDATE
or INSERT
command to pull the data from the database snapshot into the live, source database.
Finally, as a last resort, a specialized log reader tool may help you reverse out the effects of a transaction although I'm not aware of any that work reliably in SQL Server 2005 and later.
Summary
In this Level, we've covered the basics of backing up and restoring log files for databases operating in FULL
recovery mode, which will be the norm for many production databases.
For most DBAs, the need to perform a point-in-time restore is a rare event, but it's one of those tasks where, if it is necessary, it is absolutely critical that it is done and done well; the DBA's reputation depends on it.
In the case of corruption, drive failure, and so on, point-in-time recovery might involve, if you're lucky, backing up the tail of the transaction log and restoring right to point of failure. If the transaction log is not available, or if you're restoring in order to revert to some point in time before a "bad transaction" occurred, then the situation becomes trickier, but hopefully some of the techniques covered in this step will help.