Determining which tables have data...

  • Is there any easy way to determine which tables in a database contain actual data?

    Also, is there an easy way to tell which tables have been updated the most recently?

    Basically, I have inherited an HP Service Manager database.

    The database itself supports the HP Service Manager application, but it has no referential integrity, relationships, etc. All of that is enforced through the application code.

    Knowing how "bad" it is to not have the relationships, I'm going to create them myself.

    But in doing that, I will have to first create a relational model of how the data flows.

    So that HP application has about 400 tables in it, and I want an easy way to determine which of those tables we are actually using and then which ones are used "frequently".

    Any ideas?

  • From what I can understand, you need a better picture of your database structure. Try out some third-party tools; I'd suggest Dezign, a database modeling tool that can also generate database reports.

    To find out which tables contain data, run this script:

    Use [Database_Name]

    -- Create the temp table for further querying
    CREATE TABLE #temp(
        rec_id int IDENTITY (1, 1),
        table_name varchar(128),
        nbr_of_rows int,
        data_space decimal(15,2),
        index_space decimal(15,2),
        total_size decimal(15,2),
        percent_of_db decimal(15,12),
        db_size decimal(15,2))

    -- Get all tables, names, and sizes
    EXEC sp_msforeachtable @command1="insert into #temp(nbr_of_rows, data_space, index_space) exec sp_mstablespace '?'",
        @command2="update #temp set table_name = '?' where rec_id = (select max(rec_id) from #temp)"

    -- Set the total_size and total database size fields
    UPDATE #temp
    SET total_size = (data_space + index_space), db_size = (SELECT SUM(data_space + index_space) FROM #temp)

    -- Set the percent of the total database size
    UPDATE #temp
    SET percent_of_db = (total_size/db_size) * 100

    -- Get the data
    SELECT *
    FROM #temp
    ORDER BY total_size DESC
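    If you only care about the tables that actually hold data, you can filter the temp table once it is populated, for example:

    -- Show just the tables that contain rows
    SELECT table_name, nbr_of_rows, total_size
    FROM #temp
    WHERE nbr_of_rows > 0
    ORDER BY nbr_of_rows DESC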

    Hope this helps... 🙂

    Sanz
  • Here's a SQL 2005 query to get the same info without the temp table and cursor (inside sp_msforeachtable). I like using CAST(... AS MONEY) in this kind of query to get large numbers formatted with commas every three digits.

    WITH tableinfo AS (
        SELECT s.[name] + '.' + o.[name] AS table_name, i.[rows], i.total_pages, i.used_pages, i.data_pages
        FROM (
            SELECT i.[object_id], MIN(i.index_id) AS index_id,
                SUM(CASE WHEN i.index_id < 2 THEN p.[rows] ELSE 0 END) AS [rows],
                SUM(a.total_pages) AS total_pages,
                SUM(a.used_pages) AS used_pages,
                SUM(CASE WHEN i.index_id > 1 THEN 0 WHEN a.[type] <> 1 THEN a.used_pages ELSE a.data_pages END) AS data_pages
            FROM sys.indexes i
            INNER JOIN sys.partitions p ON p.[object_id] = i.[object_id] AND p.index_id = i.index_id
            INNER JOIN sys.allocation_units a ON a.container_id = CASE a.[type] WHEN 2 THEN p.partition_id ELSE p.hobt_id END
            GROUP BY i.[object_id]
        ) i
        INNER JOIN sys.all_objects o ON o.[object_id] = i.[object_id]
        INNER JOIN sys.schemas s ON s.[schema_id] = o.[schema_id])
    SELECT table_name,
        REPLACE(CONVERT(VARCHAR, CAST([rows] AS MONEY), 1), '.00', '') AS [rows],
        REPLACE(CONVERT(VARCHAR, CAST(total_pages * 8 AS MONEY), 1), '.00', '') AS total_kbytes,
        REPLACE(CONVERT(VARCHAR, CAST(used_pages * 8 AS MONEY), 1), '.00', '') AS used_kbytes,
        REPLACE(CONVERT(VARCHAR, CAST(data_pages * 8 AS MONEY), 1), '.00', '') AS data_kbytes,
        REPLACE(CONVERT(VARCHAR, CAST((used_pages - data_pages) * 8 AS MONEY), 1), '.00', '') AS index_kbytes,
        CAST(total_pages * 100.0 / NULLIF(db_pages, 0) AS DECIMAL(6,2)) AS percent_of_db
    FROM tableinfo t
    CROSS JOIN (SELECT SUM(total_pages) AS db_pages FROM tableinfo) d
    ORDER BY t.total_pages DESC
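    If you just want to spot the tables that are sitting empty, you can keep the same WITH tableinfo AS (...) definition and swap the final SELECT for something simpler, for example:

    -- Alternative final SELECT: list only the tables with no rows at all
    SELECT table_name
    FROM tableinfo
    WHERE [rows] = 0
    ORDER BY table_name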

    There is no easy way to check table usage, unless the application is nice enough to maintain date-changed columns on all tables. (And if you can trust them.) You can use triggers and tracing to track this info, if you're confident that the server can handle the overhead. If the server is heavily loaded and/or this is a high volume OLTP application, this is probably not a good idea.
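    One quick way to check whether such columns exist is to scan the metadata for datetime columns with likely names. The LIKE patterns below are only guesses, so adjust them to whatever naming convention the application actually uses:

    -- Look for candidate "last modified" columns (the name patterns are assumptions)
    SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE DATA_TYPE IN ('datetime', 'smalldatetime')
      AND (COLUMN_NAME LIKE '%mod%time%' OR COLUMN_NAME LIKE '%update%' OR COLUMN_NAME LIKE '%change%')
    ORDER BY TABLE_NAME, COLUMN_NAME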

    You could track data modifications by adding a trigger to every table that logs the table name, date + time, and #rows to a new table.
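    As a rough sketch of that approach (the log table and table names below are just placeholders), the trigger only needs a few lines per table:

    -- One-time setup: a table to collect the modification log (name is a placeholder)
    CREATE TABLE dbo.table_activity_log (
        log_id int IDENTITY(1,1) PRIMARY KEY,
        table_name sysname NOT NULL,
        logged_at datetime NOT NULL DEFAULT GETDATE(),
        rows_affected int NOT NULL)
    GO

    -- Example trigger for one table; repeat (or generate) one per table you want to watch
    CREATE TRIGGER trg_SomeTable_activity ON dbo.SomeTable
    AFTER INSERT, UPDATE, DELETE
    AS
    BEGIN
        SET NOCOUNT ON;
        DECLARE @ins int, @del int;
        SELECT @ins = COUNT(*) FROM inserted;
        SELECT @del = COUNT(*) FROM deleted;
        -- inserted covers INSERT/UPDATE row counts, deleted covers DELETE
        INSERT INTO dbo.table_activity_log (table_name, rows_affected)
        VALUES ('dbo.SomeTable', CASE WHEN @ins > 0 THEN @ins ELSE @del END);
    END
    GO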

    Tracking reads is more complicated, and involves a lot more overhead. You can start a trace that gets the object_id from schema lock events to find out when tables are referenced. Be careful to limit the amount of data collected to make sure it doesn't eat all the disk space on your server. You also have to decide how long a table must go without being read to consider it unused.
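    Once a trace like that is saved to a file, you can roll it up to a last-referenced time per object. The file path below is just a placeholder, and OBJECT_NAME() assumes you run the query in the traced database:

    -- Summarize a saved trace file into "when was each object last touched"
    SELECT DatabaseID, ObjectID, OBJECT_NAME(ObjectID) AS object_name, MAX(StartTime) AS last_referenced
    FROM sys.fn_trace_gettable('C:\traces\table_usage.trc', DEFAULT)  -- placeholder path
    WHERE ObjectID IS NOT NULL
    GROUP BY DatabaseID, ObjectID
    ORDER BY last_referenced DESC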

  • Scott,

    Thanks for the script, man; it gives me just what I needed in terms of being able to tell what my primary storage concerns are. The database itself is a fairly heavily loaded OLTP setup, so the traces will most likely be too intensive. But I think your script gives me quite a bit of info anyway.

    Thanks again.
