Database Snapshots
This article is a first part of the three part series. In this series of articles, I will focus on Microsoft's different offerings for SQL Server data availability. In this first part, the focus is on database snapshots, in the second part, the focus will be on snapshot backups and in the last part I will focus on Microsoft Data Protection Manager (DPM).
Recently IT Managers have started talking about the acronyms RPO and RTO. RPO stands for Recovery Point Objective and RTO stands for Recovery Time Objective. RPO can be defined as the acceptable amount of data loss and RTO can be defined as the acceptable amount of business downtime. Lowering RPO and RTO should be an important objective in your backup and recovery strategy.
Microsoft offers following data protective measures to minimize your RPO (data loss) and RTO (down time).
- Full database and differential backup
- Granular backups (file level or log level)
- Log Shipping or Database Mirroring
- Shadow copying data
- Database Snapshots
- Snapshot Backup
- Data Protection Manager
RPO = 0 (or near-zero) and RTO > 0, means some data loss and some business down time. For example, if you perform a point in time restore up to a minute before the disaster, you have achieved an RPO of 1 minute, but your RTO is equal to your recovery time. Conventional recovery strategy (bullet 1 &2) involving full base and log backups would help you in lowering RPO but not in lowering RTO.
To minimize RTO, you can add log shipping or database mirroring (supported only in SQL Server 2005) in your data protection strategy. If you have a well established standby server, during disaster (example, disk crash) you can promote your standby as production and perform a recovery of original production during off-peak hours. This way you can achieve a smaller RTO. The mirroring solution does protect your data against the physical disaster but does not protect your data against the logical disaster (example, data corruption) because the mirroring solution will simply replicate your corrupted data to a mirror server.
To achieve an infinitesimally small RPO and RTO from the logical disaster, consider using 3rd party transactional log management products such as Lumigent's Log Explorer. If you do not want to pay and maintain a 3rd party product, you can consider using SQL Server 2005 database snapshots. So, conventional backups plus mirroring plus database snapshot will protect your data against the physical and logical disasters.
Microsoft is also working towards addressing low RPO and RTO through another product called Data Protection Manager (DPM). The current DPM offerings do not support SQL Server or Exchange.
Database Snapshots
SQL Server 2005 Database snapshots use NTFS sparse files with the Alternate Data Stream (ADS) mechanism. In this scheme, before updating the original data block in a file, it will be copied to the respective sparse file; this concept is called copy-on-write. A database snapshot operates on a data file level, so for each data file of a database 'db' you have to associate a sparse file to create a successful database snapshot. The sparse file mechanism is supported in NTFS but not in FAT32. If you delete the original file, the sparse file will also be deleted. This mechanism is also used in certain DBCC commands such as DBCC CHECKDB, DBCC CHECKTABLE and DBCC CHECKALLOC
In the sparse file system, storage is not fully allocated as in the original data file. When you create a snapshot it does not take much storage, but it grows when your disk blocks change in the original file. So, the older the snapshot is, the larger the disk space requirements (assuming your database changes regularly).
SQL Server Management Studio does not support creating and managing the snapshots. So, you need to rely on T-SQL for creating, reverting and deleting the snapshots.
To create a snapshot database sample_snap1 for the sample database use the following query.
CREATE DATABASE [sample_snap1] ON ( NAME=[sample] ,FILENAME='C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\sample_snap1') AS SNAPSHOT OF [SAMPLE]
The above query takes logical name of the sample as the database's data file. To get logical name of data and log files use the following query
USE SAMPLE SELECT [name] FROM sys.sysfiles
If you try creating a snapshot, when the original database is in the middle of running a transaction, you would get the following error
Server: Msg 226, Level 16, State 5, Line 2 CREATE DATABASE statement not allowed within multi-statement transaction.
After the successful creation of a snapshot, you will see the snapshot database in the object explorer like any other SQL Server database. Also, sys.databases field source_database_id will point to original database id.
Transactional log files, offline files or files in loading state are not considered for database snapshots because a database snapshot is read-only and a snapshot must always be created in a consistent state.
You can compare the amount of disk space that is used by normal and snapshot database by using the following query.
SELECT BytesOnDisk FROM fn_virtualfilestats(DB_ID('sample_snap1') ,FILE_ID('C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\sample.snap1'))
The above query returned 192K for the snapshot database and 1.18MB for the original database. If you make some updates to original database, size of the snapshot database will grow.
For example, if you add a file to your sample database, as shown by the following query
ALTER DATABASE [sample] ADD FILE (NAME=sample1, FILENAME='C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\sample1')
Note, ALTER DATABASE command is not allowed on database snapshots. You cannot alter the snapshot; you need to newly create one, including all the file specifications as shown below.
CREATE DATABASE [sample_snap2] on (NAME=[SAMPLE1],FILENAME='C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\sample1_snap2'), (NAME=[SAMPLE],FILENAME='C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\sample_snap2') AS SNAPSHOT OF [sample]
If you do not specify a file specification, you will get the following error
Server: Msg 5127, Level 16, State 1, Line 1 All files must be specified for database snapshot creation. Missing the file "sample".
One of the key benefits of a snapshot is guarding against data corruption. If you are planning to perform a bulk copy or concerned that your data more prone to corruption, consider taking frequent snapshots. Note having too many snapshots will impact your production environment performance because snapshot access requires access to the original database.
Now, create a table called mytab in the sample database and insert some rows.
CREATE TABLE mytab(c1 int, c2 varchar(12)) INSERT INTO mytab VALUES(1,'hello')
Take a snapshot sample_snap3from sample.
If the original table is truncated, you can recover the original table data using the following query.
INSERT sample..mytab SELECT * FROM sample_snap3..mytab
If the original table data is updated as shown below,
UPDATE sample..mytab SET c2='corrupted' WHERE c1=1
You can recover from the corruption using the following query
UPDATE sample..mytab SET c2=(SELECT c2 FROM sample_snap3..mytab WHERE c1=1)
If your table is huge, the above described two methods of data recovery might take longer than recovering (reverting) the original database using one of the snapshots. Before you recover (or revert) to a snapshot, you have to delete rest of the snapshots.
So, I drop sample_snap1 & sample_snap2 snapshots
DROP DATABASE sample_snap1 DROP DATABASE sample_snap2
Then recover to sample_snap3 using the following query
RESTORE DATABASE sample from DATABASE_SNAPSHOT= 'sample_snap3'
Note you cannot drop or detach the original database unless you drop the snapshots associated with it. SQL Server uses a reference counting mechanism and prevents you from accidentally dropping the original database.
Reverting to a snapshot is not supported, if
- The source database contains any read-only or compressed files
- Any files are offline that were online during snapshot
During the revert operation, both the original and snapshot will be marked as "in restore". Before reverting, it is recommended you perform a log backup and after reverting, perform a full database backup because reverting breaks the log chain.
You cannot backup database snapshots, therefore you cannot restore them. Though a database snapshot protects your data from logical disaster, you need to note the following limitations.
- The original database performance is reduced due to increased I/O between original database and snapshots whenever a page is updated. You can minimize this by limiting the number of snapshots.
- The source database is not scalable with snapshots
- Reverting will recover your data from corruption, but you have to perform the full database backup following the revert operation.
- Snapshots will take up your disk space. So consider deleting older snapshots, if they are not needed.
Understanding these limitations and exercising caution will definitely minimize your Recovery Time Objective.