Table Size Growing Over Time

  • I have a table with two columns:

    Col1 is a varchar(250)

    Col2 is a varchar(20)

    I have an SSIS package that runs a few times a day that deletes all rows from the tables, and then imports data from a flat file. The total row count is ~ 830,000

    For some reason, the table size creeps up over time, right now it is 23,264,888 kb reserved with 20,870,048 kbs in data; that is way too big for the amount of data in the table.

    For testing, I created a second table and imported the same data, and the row count is the same but the size is 37,704 kb reserved and 29,712 of data which is more in line with what I would expect.

    No indexes or primary keys on either table.

    Any reason this would grow this large over time?

  • Is the table a heap?

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • I ran this:

    select

    o.name,

    o.object_id,

    case

    when p.index_id = 0 then 'Heap'

    when p.index_id = 1 then 'Clustered Index/b-tree'

    when p.index_id > 1 then 'Non-clustered Index/b-tree'

    end as 'Type'

    from sys.objects o

    inner join sys.partitions p on p.object_id = o.object_id

    where name = 'TableName'

    And it shows as a Heap.

  • When you delete rows in a heap, it can't reallocate the space it had been using.

    One of the myriad ways heaps suck.

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • Wow, seriously? So it will just keep adding space to the table?

    If that's the case, what is the best approach to deal with this type of situation? I use the table as a "staging" table, delete, import and the update a production table with the data from it.

    Any suggestions?

    Also, what's the best way to shrink that table; drop and add back?

    Thanks!

  • josh-1127203 (11/10/2016)


    Wow, seriously? So it will just keep adding space to the table?

    If that's the case, what is the best approach to deal with this type of situation? I use the table as a "staging" table, delete, import and the update a production table with the data from it.

    Any suggestions?

    Also, what's the best way to shrink that table; drop and add back?

    Thanks!

    ALTER TABLE [i]tablename[/i] REBUILD

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • OK, thanks for the alter statement. I've been reading up on heap versus clustered and I am not sure why in this type of situation you would use a clustered index. There may or may not be a unique value to build it on so what's the point.

    If I added a unique ID, it would just grow indefinitely w/ all of the inserts/deletes without reseeding.

  • Using TRUNCATE rather than DELETE will reset the identity

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • OK, something is off here...please tell me I am not crazy.

    So, for testing, I added two tables to my db:

    TABLEC

    column ID int, identity

    column PN varchar(250)

    I added a clustered index to this table on the ID column.

    TABLEH

    column PN varchar(250)

    I added the same records into the PN column (~350,000)

    TABLEC space

    10,440 kb reserved

    10,304 kb used

    TABLEH space

    8,264 kb reserved

    8,200 kb used

    I get the size diff because of the extra column and index, but when I deleted the records from these two tables, it appeared to release the space of the Heap table, contrary to what I was thinking the original problem was. So does SQL 2012 actual remove the unused data?

    After delete:

    TABLEC space

    88 kb reserved

    8 kb used

    TABLEH space

    200 kb reserved

    144 kb used

  • With a heap, if you delete all the rows, the only time the pages are deallocated is if the delete took a table lock (by force or by choice iirc)

    Safest thing: put a clustered index on (doesn't have to be on an identity or even unique)

    Other option: truncate the table instead of deleting all rows

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • OK, so TRUNCATE TABLE TableName definitely took it to 0 KB; that seems like a good fix for this w/o adding the clustered index.

    I guess my follow up question is what is the reason to add the clustered index to this table other than it freeing the space for me since it's basically a hard coded temp table? Are there other considerations in this instance that I am overlooking?

    Also, in a different scenario, say I have a Products table, with ID (int, autonum), and ProductName (varchar). If I just leave as a heap and create a non-clustered index what is the disadvantage of taking that approach versus adding a clustered on the ID column? (And if I have a bunch of tables like that, should I go back and add a clustered or leave as a heap?)

    Thanks, the fog is almost lifting here for me.

  • I'm afraid, as ever, it depends.

    If this is a staging table where you will be running a lot of transformations on the data, then the addition of an appropriate clustered index (CI), and possibly nci's, can significantly increase the speed of your ETL processes (loading as a heap and adding the CI later is a popular pattern to increase load speed here). Or it may not.

    For transformations you'll also potentially run into this problem too http://www.sqlskills.com/blogs/paul/forwarding-and-forwarded-records-and-the-back-pointer-size/ which may have a negative impact on you.

    It all boils down to one thing. Test what's best for your use case.

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply