empty table is large

  • I have a table used for staging during ETL. It is empty most of the time, but on occasion, during data loading, it can get quite large.

    Even after all records are deleted, the table is still very large & this is making table scans very slow regardless of how many rows it has.

    Table is a heap, no pk but it does have 2 foreign key's.

    Because of these, truncate cant be used when emptying the table. They were added a long time ago to prevent poor data quality causing problems.

    I can resolve this now by creating a clustered index on the table and deleting it, which rebuilds the table but i dont want to have to do this on a semi-regular basis.

    Is my only option to delete the FK's and do a truncate to prevent it getting too large ?

    What is sql servers behavior regarding deallocating space from a table after records are deleted vs truncated?

     

     

    FYI -  i am using this query to get the table size:

    SELECT 
    t.NAME AS TableName,
    s.Name AS SchemaName,
    p.rows AS RowCounts,
    SUM(a.total_pages) * 8 AS TotalSpaceKB,
    CAST(ROUND(((SUM(a.total_pages) * 8) / 1024.00), 2) AS NUMERIC(36, 2)) AS TotalSpaceMB,
    SUM(a.used_pages) * 8 AS UsedSpaceKB,
    CAST(ROUND(((SUM(a.used_pages) * 8) / 1024.00), 2) AS NUMERIC(36, 2)) AS UsedSpaceMB,
    (SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB,
    CAST(ROUND(((SUM(a.total_pages) - SUM(a.used_pages)) * 8) / 1024.00, 2) AS NUMERIC(36, 2)) AS UnusedSpaceMB
    FROM
    sys.tables t
    INNER JOIN
    sys.indexes i ON t.OBJECT_ID = i.object_id
    INNER JOIN
    sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
    INNER JOIN
    sys.allocation_units a ON p.partition_id = a.container_id
    LEFT OUTER JOIN
    sys.schemas s ON t.schema_id = s.schema_id
    WHERE
    t.NAME NOT LIKE 'dt%'
    and t.name = 'mytableName'
    AND t.is_ms_shipped = 0
    AND i.OBJECT_ID > 255
    GROUP BY
    t.Name, s.Name, p.Rows
    ORDER BY
    t.Name

     

     

    • This topic was modified 5 years, 4 months ago by  winston Smith.
    • This topic was modified 5 years, 4 months ago by  winston Smith.
  • I'm not sure I follow your point about the foreign keys and truncation. A foreign key doesn't stop you truncating a table, it's having a foreign key referencing the table that stops you (which it clearly won't, as you keep clearing the table down). As you can see in this DB<>Fiddle, the TRUNCATE on the staging table works fine.

    You can try it on your own instance too, if you like:

    USE Sandbox;
    GO

    CREATE TABLE dbo.TestTable1 (ID int PRIMARY KEY CLUSTERED);
    CREATE TABLE dbo.TestTable2 (ID int PRIMARY KEY CLUSTERED);

    CREATE TABLE dbo.StagingTable (fID1 int FOREIGN KEY REFERENCES dbo.TestTable1(ID),
    fID2 int FOREIGN KEY REFERENCES dbo.TestTable2(ID));
    GO

    INSERT INTO dbo.TestTable1 (ID)
    VALUES(1),(2);
    INSERT INTO dbo.TestTable2 (ID)
    VALUES(4),(5);
    INSERT INTO dbo.StagingTable (fID1,
    fID2)
    VALUES(1,4),
    (1,5),
    (2,4);
    GO

    TRUNCATE TABLE dbo.StagingTable;

    GO

    --This won't work
    TRUNCATE TABLE dbo.TestTable1;

    GO

    DROP TABLE dbo.StagingTable;
    DROP TABLE dbo.TestTable2;
    DROP TABLE dbo.TestTable1;

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Apologies, i got mixed up, i remembered that if a table is referenced by a FK it cant be truncated, but thats not the case here, so yes, truncate will work.

    Would appreciate any info on why the table grows and stays the same size when delete is used.

  • You could try using alter table..rebuild. With heaps, empty pages aren't automatically deallocated when doing a delete:

    SQLskills SQL101: Why does my heap have a bunch of empty pages?

    Sue

  • winston Smith wrote:

    Apologies, i got mixed up, i remembered that if a table is referenced by a FK it cant be truncated, but thats not the case here, so yes, truncate will work. Would appreciate any info on why the table grows and stays the same size when delete is used.

    It's going to be quickest to truncate the table if you are sure is no data in other tables referencing it.

    To do this you need to drop the FKs on the other tables then recreate them after the truncate. I've written a stored procedure that I use when I want to load a data warehouse from scratch  (see below). You just call it with the schema name and table name and it will drop the FK's, truncate the table, then recreate the FKs.

    -- ****************************************************************
    -- Will truncate a table that has FK's pointing at it
    -- Sample Call:
    -- EXEC dbo.INFTruncateTable 'DimComponent', 'edw'
    -- EXEC dbo.INFTruncateTable 'DimOrganisation', 'edw',1
    -- EXEC dbo.INFTruncateTable 'Orders', 'dbo',1
    -- ****************************************************************
    CREATE PROCEDURE [dbo].[INFTruncateTable]
    (
    @TableName sysname,
    @TableSchema sysname = 'dbo',
    @Debug bit = 0 -- If called with value 1 just prints information - not truncation takes place
    )
    AS
    BEGIN

    SET NOCOUNT ON

    IF OBJECT_ID('tempdb..#DropCreateFKs','U') IS NOT NULL
    DROP TABLE #DropCreateFKs

    DECLARE @NewLine nvarchar(MAX) = CHAR(13) + CHAR(10)
    DECLARE @FKCount int

    ;WITH CTE AS
    (
    SELECT ss.name AS TABLE_SCHEMA,
    OBJECT_NAME(si.[object_id]) AS TABLE_NAME,
    si.name AS CONSTRAINT_NAME
    FROM sys.indexes si
    INNER JOIN sys.objects so
    ON so.[object_id] = si.[object_id]
    INNER JOIN sys.schemas ss
    ON ss.[schema_id] = so.[schema_id]
    AND ss.name = @TableSchema
    WHERE si.is_unique=1
    AND OBJECT_NAME(si.[object_id]) = @TableName
    )
    SELECT QUOTENAME(@TableSchema) + '.' + QUOTENAME(@TableName) AS TableName,
    'ALTER TABLE ' + QUOTENAME(tc.TABLE_SCHEMA) + '.' + QUOTENAME(tc.TABLE_NAME) + @NewLine
    + ' DROP CONSTRAINT ' + QUOTENAME(tc.CONSTRAINT_NAME) + ';'+ @NewLine AS Drop_FK,

    'ALTER TABLE ' + QUOTENAME(tc.TABLE_SCHEMA) + '.' + QUOTENAME(tc.TABLE_NAME) + @NewLine
    + ' WITH NOCHECK' + @NewLine
    + ' ADD CONSTRAINT ' + QUOTENAME(tc.CONSTRAINT_NAME) + @NewLine
    + ' FOREIGN KEY (' + ColsOnFKTable.Cols + ')' + @NewLine
    + ' REFERENCES ' + QUOTENAME(ccu.TABLE_SCHEMA) + '.' + QUOTENAME(ccu.TABLE_NAME) + '(' + UniqueCols.Cols + ') ' + @NewLine
    + ' ON DELETE ' + rc.DELETE_RULE + @NewLine
    + ' ON UPDATE ' + rc.UPDATE_RULE + ';' + @NewLine AS Create_FK
    INTO #DropCreateFKs
    FROM CTE as ccu
    CROSS APPLY (SELECT STUFF((SELECT ', ' + QUOTENAME(ccu2.COLUMN_NAME)
    FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE ccu2
    WHERE ccu2.TABLE_SCHEMA = ccu.TABLE_SCHEMA
    AND ccu2.TABLE_NAME = ccu.TABLE_NAME
    AND ccu2.CONSTRAINT_NAME = ccu.CONSTRAINT_NAME
    FOR XML PATH(''),TYPE).value('.','nvarchar(MAX)'),1,2,'') AS Cols
    ) AS UniqueCols
    CROSS APPLY (SELECT rc.CONSTRAINT_CATALOG,
    rc.CONSTRAINT_NAME,
    rc.DELETE_RULE,
    rc.UPDATE_RULE
    FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS rc
    WHERE rc.UNIQUE_CONSTRAINT_NAME = ccu.CONSTRAINT_NAME
    ) AS rc
    CROSS APPLY (SELECT tc.CONSTRAINT_CATALOG,
    tc.TABLE_SCHEMA,
    tc.TABLE_NAME,
    tc.CONSTRAINT_NAME
    FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
    WHERE tc.CONSTRAINT_TYPE = 'FOREIGN KEY'
    AND tc.CONSTRAINT_NAME = rc.CONSTRAINT_NAME
    AND tc.CONSTRAINT_CATALOG = rc.CONSTRAINT_CATALOG
    ) AS tc
    CROSS APPLY (SELECT STUFF((SELECT ', ' + QUOTENAME(ccu2.COLUMN_NAME)
    FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE ccu2
    WHERE ccu2.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
    AND ccu2.CONSTRAINT_CATALOG = tc.CONSTRAINT_CATALOG
    FOR XML PATH(''),TYPE).value('.','nvarchar(MAX)'),1,2,'') AS Cols
    ) AS ColsOnFKTable

    SET @FKCount = @@ROWCOUNT

    DECLARE @sSqlDrop AS nvarchar(MAX) = ''
    DECLARE @sSqlCreate AS nvarchar(MAX) = ''
    DECLARE @sSqlTruncate AS nvarchar(MAX) = ''
    SET @sSqlTruncate = @TableSchema + '.' + @TableName

    SELECT @sSqlDrop = @sSqlDrop + x.Drop_FK,
    @sSqlCreate = @sSqlCreate + x.Create_FK
    FROM #DropCreateFKs x

    BEGIN TRY

    BEGIN TRANSACTION

    IF @Debug = 1 BEGIN
    PRINT '-- Dropping FKs on table ' + @TableSchema + '.' + @TableName
    PRINT @sSqlDrop
    END
    ELSE BEGIN
    EXEC (@sSqlDrop)
    END

    SET @sSqlTruncate='TRUNCATE TABLE ' + QUOTENAME(@TableSchema) + '.' + QUOTENAME(@TableName)

    IF @Debug = 1 BEGIN
    PRINT @sSqlTruncate
    END
    ELSE BEGIN
    EXEC (@sSqlTruncate)
    END

    PRINT '-- Creating FKs on table ' + @TableSchema + '.' + @TableName

    IF @Debug = 1 BEGIN
    PRINT @sSqlCreate
    END
    ELSE BEGIN
    EXEC (@sSqlCreate)
    END

    COMMIT

    END TRY

    BEGIN CATCH

    PRINT 'Rolling back transaction';
    IF @@TRANCOUNT > 0 BEGIN
    ROLLBACK
    END;
    THROW;

    END CATCH

    IF @Debug = 1 AND @FKCount > 0
    SELECT *
    FROM #DropCreateFKs

    DROP TABLE #DropCreateFKs

    END
  • winston Smith wrote:

    Apologies, i got mixed up, i remembered that if a table is referenced by a FK it cant be truncated, but thats not the case here, so yes, truncate will work. Would appreciate any info on why the table grows and stays the same size when delete is used.

    The reason the table continues to grow is because of the way a HEAP is structured.  Deleting data from a HEAP does not free up the previously utilized space - and you end up with unused (and unusable) space allocated to the table.  New rows are always added to the end of the table.

    ALTER TABLE ... REBUILD; should restructure and clear that space, returning it back to the database to be reallocated.  However - there are some situations where that may not free up all unused space in the table.

    You can either truncate the table or permanently add a clustered index.  If you already have a step in your load process that rebuilds those FK indexes - then including a clustered index and rebuilding all indexes shouldn't cause a huge increase in processing time.  That will all depend on how much data - and the order the data is loaded.  If you can build your clustered index to match the order of the data being inserted - it would have minimal effect (note: the clustered index does not have to be unique).

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply