Did you know compression can gain you more than just space on a disk, it can allow you to fit more data into the buffer pool? That means queries have faster and more efficient reads which can help reduce I/O. It is a performance tool I highly recommend you consider for your environments. But how do you know how much compression savings you can gain? SQL Server gives you a handy system stored procedure sp_estimate_data_compression_savings that will help you calculate compression gains. It’s a great tool to use when trying to decide if compression is right for your environment and what type to use.
This procedure allows you to estimate compression gains on whole tables, indexes or partitions. You can test indexes, indexed views, heaps, and index partitions and starting with SQL Server 2019 you will be able to estimate savings on columnstore and columnstore archival compression. One of the great things you can do with it is to compare compression types to see which type you can get the biggest gain from. That’s what we will do in this blog. So, let’s see it in action!
Using AdventureWorks 2014 we will look at the table WorkOrderRouting and see which if any compressing type will give us the most gain. But first we need to get familiar with what arguments this sp_estimate_data_compression_savings procedure uses and what the result set will tell us.
Straight from MSDN here are the arguments this procedure uses.
Arguments | Definition |
[ @schema_name= ] | Is the name of the database schema that contains the table or indexed view. schema_name is sysname. If schema_name is NULL, the default schema of the current user is used. |
[ @object_name= ] | Is the name of the table or indexed view that the index is on. object_name is sysname. |
[ @index_id= ] | Is the ID of the index. index_id is int, and can be one of the following values: the ID number of an index, NULL, or 0 if object_id is a heap. To return information for all indexes for a base table or view, specify NULL. If you specify NULL, you must also specify NULL for partition_number. |
[ @partition_number= ] | Is the partition number in the object. partition_number is int, and can be one of the following values: the partition number of an index or heap, NULL or 1 for a nonpartitioned index or heap. |
[ @data_compression= ] | Is the type of compression to be evaluated. data_compression can be one of the following values: NONE, ROW, PAGE, COLUMNSTORE, or COLUMNSTORE_ARCHIVE. |
In the result sets you will see 4 important values. These are the ones we want to pay attention to, as it gives an a before and after estimation.
Column | Definition |
size_with_current_compression_setting | Size of the requested table, index, or partition as it currently exists. |
size_with_requested_compression_setting | Estimated size of the table, index, or partition that uses the requested compression setting; and, if applicable, the existing fill factor, and assuming there is no fragmentation. |
sample_size_with_current_compression_setting | Size of the sample with the current compression setting. This includes any fragmentation. |
sample_size_with_requested_compression_setting | Size of the sample that is created by using the requested compression setting; and, if applicable, the existing fill factor and no fragmentation. |
Now let’s have some fun.
Using NONE
USE AdventureWorks2014; GO EXEC sp_estimate_data_compression_savings 'Production', 'WorkOrderRouting', NULL, NULL, 'NONE' ; GO
You will note that there is a slight difference using NONE, what this actually will show you what a REBUILD of an index will reclaim. Also note that fill factor of an index can also play a part in these I personally expected the numbers to be exactly the same.
Using ROW
USE AdventureWorks2014; GO EXEC sp_estimate_data_compression_savings 'Production', 'WorkOrderRouting', NULL, NULL, 'ROW' ; GO
Here we a pretty good gain a difference of 1576KB about 27%, not too bad. Remember row compression simply converts fixed length data types into variable length types, so you don’t normally get high levels of compression.
Using PAGE
USE AdventureWorks2014; GO EXEC sp_estimate_data_compression_savings 'Production', 'WorkOrderRouting', NULL, NULL, 'PAGE' ; GO
Here we start to see some real gains at the page level we get significant compression savings 4184 kb that’s almost 75% reclaimed. That’s a HUGE savings!
Compression can be a powerful tool and a real benefit to you but there are some considerations you need to look at before deciding on which type of compression to use. While data compression was an enterprise edition feature, it is now included in all editions of SQL Server starting with SQL Server 2016 Service Pack 1. Although these estimates give you insight don’t just rely on these numbers. Take a look here first. Also keep in mind compression can cause query plans to recompile you may see a slight short-term performance hit on the newly compressed object. Since compressing a table or index will require a rebuild, this is generally a non-issue.
*Caution when running on tables that have migrated legacy LOB datatypes from one version of SQL Server to another. Please read this before running.