Blog Post

What’s the Size of that Index?

,

I have previously written about finding what the big space consumers are within the database. I even shared scripts in those articles on how to find those big consumers (read all about it here).

sizematters2That is grand when it comes to quickly finding which objects happen to be consuming the greatest amount of space within the database. It really helps to quickly assess where some extra TLC may need to be spent. But what about more specific details? What if you wished to figure out what the size was for each of the indexes within that object?

Well, the solution is pretty simple. I have another script to help determine which indexes are the big space consumers within the specified table for the given database.

 

First, let’s see what that script is.

DECLARE @objname sysname = 'LocationHistory';
SELECT o.name AS ObjName
, i.name AS IdxName
, ReservedMB = CONVERT(DECIMAL(19, 2), SUM(ps.reserved_page_count) / 128.0)
/*
-- ps.in_row_reserved_page_count + ps.lob_reserved_page_count + ps.row_overflow_reserved_page_count
*/, TotalUsedMB = CONVERT(DECIMAL(19, 2), SUM( ps.used_page_count) / 128.0)
/*
-- ps.in_row_used_page_count + ps.lob_used_page_count + ps.row_overflow_used_page_count
*/, TotalDataMB = CONVERT(DECIMAL(19, 2), SUM(ps.in_row_data_page_count
 + ps.lob_used_page_count
 + ps.row_overflow_used_page_count) / 128.0)
/*
--in_row_data_page_count = Number of pages in use for storing in-row data in this partition. If the partition is part of a heap, the value is the number of data pages in the heap. If the partition is part of an index, the value is the number of pages in the leaf level
--lob_used_page_count = Number of pages in use for storing and managing out-of-row columns within the partition. IAM pages are included.
--row_overflow_used_page_count = Number of pages in use for storing and managing row-overflow columns within the partition. IAM pages are included.
*/, UnusedMB = CONVERT(DECIMAL(19, 2), SUM(ps.reserved_page_count) / 128.0 - CONVERT(DECIMAL(19, 2), SUM( ps.used_page_count)) / 128.0)
, RowCnt = MAX(ISNULL(row_count, 0))
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.objects o
ON o.object_id = ps.object_id
INNER JOIN sys.indexes i
ON i.object_id = o.object_id
AND ps.index_id = i.index_id
WHERE o.name = @objname
AND i.index_id NOT IN ( 0, 1, 255 )
GROUP BY GROUPING SETS(o.name, i.name);

As you can see, I am parsing a few different measures on space usage from the dynamic management view sys.dm_db_partition_stats. This will tell me information such as how much space is being reserved for each index as well as how much space is actually being used on each index. Then I use a grouping set to summarize that size information so I can then compare the total index size back to the size I would have seen from the table space script (discussed lightly earlier).

Why might I want to do something like this?

That is a really good question! The first obvious answer to a question such as this is in the off chance that you discover (again, from the use of the previous script), that one of your terabyte tables is actually only 150GB of data and the rest of that space consumption is actually from all of the indexes on that table.

When I see that a table has more space allocated due to indexes than due to data, I might be curious which indexes are contributing to that consumption. In addition, I can look at numbers much faster to see if maybe there are multiple indexes that are the exact same size. If they are, I might want to evaluate and dig a little deeper into the definitions of those indexes. We would only really love to have 850GB of indexes on a table to only find that three or four of those indexes are exact duplicates.

From there, I might even decide to look at the usage stats for the indexes in question. Why go about it in this fashion? Well, on some databases it is much faster to run a few quick queries than to try and run a big query against the entire database to get all of the index usage stats. I am sure you are familiar with those times when trying to query physical and operational stats takes hours to complete.

But But But…

I can hear you thinking right now. Another question has popped into your head. What about sp_spaceused. That old proc does work just fine for this as well. But there is something about being able to get to know what contributes to the space of an index and being able to work through how to get it calculated. Additionally, with a minor tweak, this query can provide that (detailed) information very quickly for all objects in the database.

Tweaked query

DECLARE @objname sysname = NULL;
SELECT o.name AS ObjName
, i.name AS IdxName
, ReservedMB = CONVERT(DECIMAL(19, 2), SUM(ps.reserved_page_count) / 128.0)
/*
-- ps.in_row_reserved_page_count + ps.lob_reserved_page_count + ps.row_overflow_reserved_page_count
*/, TotalUsedMB = CONVERT(DECIMAL(19, 2), SUM( ps.used_page_count) / 128.0)
/*
-- ps.in_row_used_page_count + ps.lob_used_page_count + ps.row_overflow_used_page_count
*/, TotalDataMB = CONVERT(DECIMAL(19, 2), SUM(ps.in_row_data_page_count
 + ps.lob_used_page_count
 + ps.row_overflow_used_page_count) / 128.0)
/*
--in_row_data_page_count = Number of pages in use for storing in-row data in this partition. If the partition is part of a heap, the value is the number of data pages in the heap. If the partition is part of an index, the value is the number of pages in the leaf level
--lob_used_page_count = Number of pages in use for storing and managing out-of-row columns within the partition. IAM pages are included.
--row_overflow_used_page_count = Number of pages in use for storing and managing row-overflow columns within the partition. IAM pages are included.
*/, UnusedMB = CONVERT(DECIMAL(19, 2), SUM(ps.reserved_page_count) / 128.0 - CONVERT(DECIMAL(19, 2), SUM( ps.used_page_count)) / 128.0)
, RowCnt = MAX(ISNULL(row_count, 0))
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.objects o
ON o.object_id = ps.object_id
INNER JOIN sys.indexes i
ON i.object_id = o.object_id
AND ps.index_id = i.index_id
WHERE o.name = ISNULL(@objname, o.name)
AND i.index_id NOT IN ( 0, 1, 255 )
AND o.is_ms_shipped = 0
GROUP BY GROUPING SETS(ROLLUP(o.name,i.name));

 

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating