Blog Post

Compressing Outcomes

,

You find yourself cruising along, just doing your thing as a database administrator. Then the CIO plops down into a spare chair in your cubicle to talk about the weather.

Well, not entirely true there. The CIO dropped in because the alerts for disk space have been blowing up everybody’s inbox. He informs you that the situation isn’t going to change anytime soon because new shelves for the SAN just can’t be ordered at the moment. In the end, you are given a directive – just without so many words. Fix the problem, and do it without more disk space.

Fantastic you think to yourself. What to do now. Then you get the Gru light-bulb and it is on!

compressionEnter compression. Ok, compression isn’t something entirely new. It has been a feature shipped with SQL Server since 2008 Enterprise edition and can be just the thing to save the day (and everybody’s inbox before they start ignoring all alerts).

The decision has been made and the biggest tables in the database are now compressed using page compression. But you are concerned that it didn’t work because the space savings isn’t necessarily what you had hoped it would be. Lucky for you, you are running SQL Server 2012 and have the ability to start diving into the internals to confirm whether your worry is fact or fiction.

Using this handy dandy query that you are about to copy and paste from the internet, you can quickly investigate to see if there are pages in the tables that were unaffected by your attempts to compress the table.

Thanks to an undocumented function called dm_db_database_page_allocations, we can investigate quickly whether there are pages that failed to compress. We can also see which compression state they are in – if they are compressed at all.

USE TestDB;
GO
--ALTER INDEX PK_Address_AddressID ON Person.Address 
--REBUILD WITH (DATA_COMPRESSION = PAGE);
WITH presel AS (
SELECT DB_NAME(dpa.database_id) AS DBName
, SCHEMA_NAME(o.schema_id) AS SchName
, OBJECT_NAME(dpa.object_id) AS ObjName
, i.name AS IDXName
, allocated_page_page_id
, allocated_page_file_id
, page_free_space_percent
, page_type_desc
, is_page_compressed
, p.data_compression_desc AS RequestedCompressionState
, has_ghost_records
,ios.leaf_update_count * 100.0 /
           (ios.range_scan_count + ios.leaf_insert_count
            + ios.leaf_delete_count + ios.leaf_update_count
            + ios.leaf_page_merge_count + ios.singleton_lookup_count + 1.0
           ) AS [Percent_Update]
       ,ios.range_scan_count * 100.0 /
   (ios.range_scan_count + ios.leaf_insert_count
+ ios.leaf_delete_count + ios.leaf_update_count
+ ios.leaf_page_merge_count + ios.singleton_lookup_count + 1.0
   ) AS [Percent_Scan]
FROM sys.dm_db_database_page_allocations(
DB_ID()
, OBJECT_ID('MyPhonyTable') --object to check. Provide an invalid object to short
--circuit and check all objects !!
, NULL-- index id, null is default
, NULL-- partition id, null is default
, 'DETAILED' -- limited, sampled, detailed
) dpa
INNER JOIN sys.indexes i
ON dpa.object_id = i.object_id
AND dpa.index_id = i.index_id
INNER JOIN sys.objects o
ON dpa.object_id = o.object_id
INNER JOIN sys.partitions p
ON i.index_id = p.index_id
AND i.object_id = p.object_id
LEFT OUTER JOIN sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL) ios
ON ios.database_id = dpa.database_id
AND ios.object_id = dpa.object_id
AND ios.index_id = dpa.index_id
AND ios.partition_number = dpa.partition_id
WHERE o.is_ms_shipped = 0
)

SELECT *
, CASEWHEN ( p.Percent_Scan > p.Percent_Update
AND p.Percent_Scan > 50
)
OR p.Percent_Update < 1 THEN 'Consider Page Compression'
ELSE 'Consider Row Compression'
END AS CompressionToUse
FROM presel p
ORDER BY p.allocated_page_page_id,RequestedCompressionState DESC,p.percent_update DESC, p.percent_scan DESC;

Adding a script like this to your repository can be an easy aid in the struggle to ensure your expected results either match or don’t match. This would save a bit of time and allow you to move on to bigger and badder things – like 2048.

In addition to looking at the compression status for each page, I have thrown in a little extra. Call it the “considerations” column. Based on activity hitting the table or index, you may want to consider a different level of encryption. This additional data on the end of the output will help start you in that direction.

Original post (opens in new tab)
View comments in original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating