SQL Server 2008 introduced native backup compression and data compression as features that were only available in the Enterprise Edition SKU. In SQL Server 2008 R2, native backup compression was added to the Standard Edition SKU, while data compression remains an Enterprise Edition feature, although it was improved with the addition of Unicode compression.
I have been using both native backup compression and data compression with SQL Server 2008 and 2008 R2 in Production with very good results for quite a while. I thought it would be interesting to run a few somewhat formal tests to investigate how well these two features work when they are used together. I wanted to use some actual production data rather than loading up an empty table with bogus data generated with REPLICATE. I happened to have a backup of one of my smaller Production databases from a few months ago on an external USB drive, so that allowed me to do some testing with real data.
To make things relatively simple, I did a SELECT INTO of a 151 million row table into a new, empty database on the same instance on my teaching laptop. Then I created an uncompressed clustered index on the heap table that the SELECT INTO statement had created. I took a full compressed backup of that new database and restored it twice to create two new copies with different names, but the same data. I then rebuilt the clustered index on the single table in each copy of the database, using either Row compression or Page compression.

After doing this, I had a lot of empty space in the data file of the two databases that were using data compression, so I decided to risk the wrath of Paul Randal and shrink the data file on each of those databases. Shrinking data files is actually a very evil thing to do, since it severely fragments your indexes. Sure enough, after I shrank the data files on the two databases, the single clustered index in each had over 99% fragmentation. Running ALTER INDEX REORGANIZE removed that fragmentation pretty effectively, though.

After all of these gyrations on my poor Toshiba Portege R705 laptop, I had three databases that were identical except for the type of data compression used on the clustered index of the single table in each database. Here are the database sizes for the three databases:
Database Name         Data File Size   Log File Size
NoCompressionTest     18.6GB           512MB
RowCompressionTest    10.6GB           512MB
PageCompressionTest   4.1GB            512MB
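The rebuild, shrink, and reorganize steps described above look roughly like this. This is a sketch, not the exact script I used: the index and table names match the schema shown below, but the logical data file name and the shrink target size are assumptions.

```sql
-- Rebuild the clustered index with Row compression
-- (run in RowCompressionTest; use DATA_COMPRESSION = PAGE in PageCompressionTest)
ALTER INDEX [PK_OnlineSearchHistoryNonCompressed]
ON [dbo].[OnlineSearchHistoryNonCompressed]
REBUILD WITH (DATA_COMPRESSION = ROW);

-- Reclaim the empty space left behind by the rebuild
-- (evil, as noted -- it severely fragments the index)
-- Logical file name and target size in MB are assumptions
DBCC SHRINKFILE (N'RowCompressionTest', 10860);

-- Then remove the fragmentation that the shrink caused
ALTER INDEX [PK_OnlineSearchHistoryNonCompressed]
ON [dbo].[OnlineSearchHistoryNonCompressed]
REORGANIZE;
```

REORGANIZE is the gentler choice here: a REBUILD would also remove the fragmentation, but it would leave a large amount of empty space in the data file again, undoing the shrink.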
My laptop has 8GB of RAM and a 500GB 7200rpm Seagate Momentus XT hybrid hard drive. It also has a 2.27GHz Intel Core i3 350M processor with two cores plus hyper-threading. The Core i3 does NOT have Turbo Boost.
Here is what the table schema looks like:
-- Table and index schema
CREATE TABLE [dbo].[OnlineSearchHistoryNonCompressed](
	[OnlineSearchHistoryID] [int] IDENTITY(1,1) NOT NULL,
	[SearchTerm] [nvarchar](50) NULL,
	[NumItemsRequested] [int] NULL,
	[NumItemsReturned] [int] NULL,
	[SearchElapsedTime] [int] NULL,
	[SearchDateTime] [smalldatetime] NOT NULL,
	[SearchSPName] [nvarchar](50) NULL,
 CONSTRAINT [PK_OnlineSearchHistoryNonCompressed] PRIMARY KEY CLUSTERED
(
	[OnlineSearchHistoryID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
        ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON,
        DATA_COMPRESSION = NONE) ON [PRIMARY]
) ON [PRIMARY];
GO
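Before rebuilding an index with compression, you can get a rough preview of the space savings with the built-in sp_estimate_data_compression_savings procedure, which samples the data rather than compressing the whole table. A sketch, run against the table above (index_id = 1 is the clustered index):

```sql
-- Estimate Row compression savings for the clustered index
EXEC sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'OnlineSearchHistoryNonCompressed',
    @index_id         = 1,
    @partition_number = NULL,
    @data_compression = N'ROW';

-- Estimate Page compression savings for the same index
EXEC sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'OnlineSearchHistoryNonCompressed',
    @index_id         = 1,
    @partition_number = NULL,
    @data_compression = N'PAGE';
```

Because the estimate is based on a sample, the actual savings from a full rebuild can differ somewhat, but it is a cheap way to decide whether compression is worth trying on a large table.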
Backing up a “large” database from the C: drive of a laptop to another directory on the same C: drive is not a best-case scenario, but it highlights a common situation where disk I/O is a big bottleneck during the backup process. Backup compression compresses the data before it is written out to the backup file(s), so it usually reduces both the elapsed time for the database backup and the disk space required for the backup file(s), at the cost of some added CPU utilization, which is typically in the 10-15% range. Here are the summarized results from doing a full backup of each database, without and with backup compression:
Database / Backup Type                       Backup Size   Elapsed Time
NoCompressionTest w/o Backup Compression     18.7GB        725 seconds
NoCompressionTest w/Backup Compression       3.9GB         246 seconds
RowCompressionTest w/o Backup Compression    10.6GB        287 seconds
RowCompressionTest w/Backup Compression      3.2GB         165 seconds
PageCompressionTest w/o Backup Compression   4.1GB         89 seconds
PageCompressionTest w/Backup Compression     2.5GB         71 seconds
In each case, we see both a disk space and an elapsed time benefit from using data compression and backup compression together, without taking a prohibitive CPU utilization hit. Of course, the structure of this table works in our favor, with two NVARCHAR(50) columns and several INT columns. Here are the detailed results:
-- Testing Backup Compression with Data Compression
-- Glenn Berry
-- August 2010
-- http://glennberrysqlperformance.spaces.live.com/
-- Twitter: GlennAlanBerry

-- All three databases have a single table with a clustered index
-- The table has 151 million rows of actual production data

-- Test 1
-- Full backup of NoCompressionTest database without Backup compression
-- (725 seconds, 18.7GB backup file)
BACKUP DATABASE [NoCompressionTest]
TO DISK = N'C:\SQLBackups\NoCompressionTest_UncompressedBackup.bak'
WITH NOFORMAT, INIT, NAME = N'NoCompressionTest-Full Uncompressed Database Backup',
     SKIP, NOREWIND, NOUNLOAD, STATS = 1;
GO

-- Test 1 Top Waits
--wait_type             wait_time_s   pct     running_pct
--BACKUPBUFFER          611.18        44.36   44.36
--ASYNC_IO_COMPLETION   459.14        33.33   77.69
--BACKUPIO              302.12        21.93   99.62

-- Test 2
-- Full backup of NoCompressionTest database with Backup compression
-- (246 seconds, 3.9GB backup file)
BACKUP DATABASE [NoCompressionTest]
TO DISK = N'C:\SQLBackups\NoCompressionTest_CompressedBackup.bak'
WITH NOFORMAT, INIT, NAME = N'NoCompressionTest-Full Compressed Database Backup',
     SKIP, NOREWIND, NOUNLOAD, COMPRESSION, STATS = 1;
GO

-- Test 2 Top Waits
--wait_type             wait_time_s   pct     running_pct
--BACKUPBUFFER          255.81        31.09   31.09
--ASYNC_IO_COMPLETION   245.93        29.89   60.97
--BACKUPIO              232.96        28.31   89.28
--IO_COMPLETION         87.53         10.64   99.92

-- ************* End of NoCompression tests

-- Test 3
-- Full backup of RowCompressionTest database without Backup compression
-- (287 seconds, 10.6GB backup file, 4.76 compression ratio for backup)
BACKUP DATABASE [RowCompressionTest]
TO DISK = N'C:\SQLBackups\RowCompressionTest_UncompressedBackup.bak'
WITH NOFORMAT, INIT, NAME = N'RowCompressionTest-Full Uncompressed Database Backup',
     SKIP, NOREWIND, NOUNLOAD, STATS = 1;
GO

-- Test 3 Top Waits
--wait_type             wait_time_s   pct     running_pct
--BACKUPBUFFER          368.32        42.92   42.92
--ASYNC_IO_COMPLETION   285.90        33.31   76.23
--BACKUPIO              200.66        23.38   99.61

-- Test 4
-- Full backup of RowCompressionTest database with Backup compression
-- (165 seconds, 3.2GB backup file, 3.28 compression ratio for backup)
BACKUP DATABASE [RowCompressionTest]
TO DISK = N'C:\SQLBackups\RowCompressionTest_CompressedBackup.bak'
WITH NOFORMAT, INIT, NAME = N'RowCompressionTest-Full Compressed Database Backup',
     SKIP, NOREWIND, NOUNLOAD, COMPRESSION, STATS = 1;
GO

-- Test 4 Top Waits
--wait_type             wait_time_s   pct     running_pct
--BACKUPBUFFER          506.86        51.12   51.12
--ASYNC_IO_COMPLETION   324.98        32.78   83.90
--BACKUPIO              140.87        14.21   98.10
--IO_COMPLETION         17.87         1.80    99.91

-- ************* End of RowCompression tests

-- Test 5
-- Full backup of PageCompressionTest database without Backup compression
-- (89 seconds, 4.1GB backup file)
BACKUP DATABASE [PageCompressionTest]
TO DISK = N'C:\SQLBackups\PageCompressionTest_UncompressedBackup.bak'
WITH NOFORMAT, INIT, NAME = N'PageCompressionTest-Full Uncompressed Database Backup',
     SKIP, NOREWIND, NOUNLOAD, STATS = 1;
GO

-- Test 5 Top Waits
--wait_type             wait_time_s   pct     running_pct
--BACKUPBUFFER          121.53        45.03   45.03
--ASYNC_IO_COMPLETION   88.46         32.78   77.80
--BACKUPIO              54.34         20.13   97.94
--IO_COMPLETION         5.24          1.94    99.88

-- Test 6
-- Full backup of PageCompressionTest database with Backup compression
-- (71 seconds, 2.5GB backup file, 1.64 compression ratio for backup)
BACKUP DATABASE [PageCompressionTest]
TO DISK = N'C:\SQLBackups\PageCompressionTest_CompressedBackup.bak'
WITH NOFORMAT, INIT, NAME = N'PageCompressionTest-Full Compressed Database Backup',
     SKIP, NOREWIND, NOUNLOAD, COMPRESSION, STATS = 1;
GO

-- Test 6 Top Waits
--wait_type             wait_time_s   pct     running_pct
--BACKUPBUFFER          298.00        54.63   54.63
--ASYNC_IO_COMPLETION   179.12        32.84   87.47
--BACKUPIO              58.79         10.78   98.25
--IO_COMPLETION         8.44          1.55    99.80
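The "Top Waits" figures above come from querying the sys.dm_os_wait_stats DMV, and the backup compression ratios can be computed from the backup history in msdb. A simplified sketch of both queries follows; the list of benign wait types to exclude is abbreviated, and a real version would exclude quite a few more.

```sql
-- Top waits since the wait stats were last cleared, with running percentage
-- (clear before a test with: DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);)
WITH Waits AS
    (SELECT wait_type,
            wait_time_ms / 1000.0 AS wait_time_s,
            100.0 * wait_time_ms / SUM(wait_time_ms) OVER() AS pct,
            ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn
     FROM sys.dm_os_wait_stats
     WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
           N'SQLTRACE_BUFFER_FLUSH', N'BROKER_TASK_STOP',
           N'CLR_AUTO_EVENT'))  -- abbreviated exclusion list
SELECT W1.wait_type,
       CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s,
       CAST(W1.pct AS DECIMAL(12, 2)) AS pct,
       CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct
FROM Waits AS W1
INNER JOIN Waits AS W2
    ON W2.rn <= W1.rn
GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct
HAVING SUM(W2.pct) - W1.pct < 99  -- waits making up the top 99%
ORDER BY W1.rn;

-- Backup compression ratio for recent compressed backups, from msdb
SELECT database_name, backup_finish_date,
       backup_size / compressed_backup_size AS compression_ratio
FROM msdb.dbo.backupset
WHERE compressed_backup_size > 0
ORDER BY backup_finish_date DESC;
```

For an uncompressed backup, backup_size and compressed_backup_size are equal, so the second query restricts itself to rows where compression actually happened.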
Thanks to Moore’s Law, we increasingly have excess CPU capacity to burn, which can be used to reduce I/O pressure on our systems, whether that system is a single-drive laptop or a large production database server. In most cases, powerful new CPUs are much less expensive than added I/O capacity.