Blog Post

Estimating the storage savings by removing columns with NULL value across the table or marking them as SPARSE

,

In the previous article Find columns with NULL values across the table we discussed that storage space can be saved by removing columns with NULL value across the table or marking them as SPARSE. We also learnt about the query to find all such columns across the tables of a database.

In this article we’ll learn to estimate the storage saving by taking the necessary action on the columns with NULL value across the table, either by removing them or by marking them as SPARSE.

It becomes extremely important to be ready with the relevant data and stats when we propose anything. Similarly, when we’ve to approach the Sr. Leadership for the approvals to take any such actions on the Production database, we need to have the data supporting our claim of storage savings.

I found this query very useful. It helped me with the table wise data which we finally aggregated for the total storage savings. This query provides the following columns as the output.

  • TableName : This gives the name of the table
  • TotalColumns : This gives the count of columns in the table with NULL values across.
  • TotalRows: This gives the count of rows of the table
  • Estimated_Savings_Bytes: This gives the estimation of storage savings in bytes.

Note: You may find a table tables_with_null_values_across being referred in the query. This is the same table which was created in the previous article. This article is the continuation of Find columns with NULL values across the table.

SELECT DV.TableName
, COUNT(DISTINCT DV.ColumnName) AS TotalColumns
, DV.TotalRows
, SUM(DV.TotalRows * 
CASE
WHEN COL.DATA_TYPE IN ('CHAR', 'NCHAR')
THEN COL.CHARACTER_OCTET_LENGTH
WHEN COL.DATA_TYPE = 'TINYINT'
THEN 1
WHEN COL.DATA_TYPE = 'SMALLINT'
THEN 2
WHEN COL.DATA_TYPE = 'INT'
THEN 4
WHEN COL.DATA_TYPE = 'BIGINT'
THEN 8
WHEN COL.DATA_TYPE IN ('NUMERIC', 'DECIMAL')
THEN 9
WHEN COL.DATA_TYPE = 'FLOAT'
THEN 8
WHEN COL.DATA_TYPE = 'DATE'
THEN 3
WHEN COL.DATA_TYPE = 'TIME'
THEN 5
WHEN COL.DATA_TYPE = 'SMALLDATETIME'
THEN 4
WHEN COL.DATA_TYPE = 'DATETIME'
THEN 8
WHEN COL.DATA_TYPE = 'BIT'
THEN 1
ELSE 2
END) Estimated_Savings_Bytes
FROM tables_with_null_values_across DV WITH (NOLOCK)
INNER JOIN INFORMATION_SCHEMA.COLUMNS COL WITH (NOLOCK)
ON COL.TABLE_NAME = PARSENAME(DV.TableName, 1)
AND COL.COLUMN_NAME = PARSENAME(DV.ColumnName, 1)
GROUP BY DV.TableName
, DV.TotalRows

Original post (opens in new tab)
View comments in original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating