Table Size Growing Over Time

Question

Post reply

Table Size Growing Over Time

josh-1127203

SSC Eights!

Points: 876
More actions
November 10, 2016 at 9:12 am

#322824

I have a table with two columns:
Col1 is a varchar(250)
Col2 is a varchar(20)
I have an SSIS package that runs a few times a day that deletes all rows from the tables, and then imports data from a flat file. The total row count is ~ 830,000
For some reason, the table size creeps up over time, right now it is 23,264,888 kb reserved with 20,870,048 kbs in data; that is way too big for the amount of data in the table.
For testing, I created a second table and imported the same data, and the row count is the same but the size is 37,704 kb reserved and 29,712 of data which is more in line with what I would expect.
No indexes or primary keys on either table.
Any reason this would grow this large over time?

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

andrew gothard SSChampion Points: 12315 More actions · Answer 1

Is the table a heap?

I'm a DBA.
I'm not paid to solve problems. I'm paid to prevent them.

josh-1127203 SSC Eights! Points: 876 More actions · Answer 2

I ran this:

select

o.name,

o.object_id,

case

when p.index_id = 0 then 'Heap'

when p.index_id = 1 then 'Clustered Index/b-tree'

when p.index_id > 1 then 'Non-clustered Index/b-tree'

end as 'Type'

from sys.objects o

inner join sys.partitions p on p.object_id = o.object_id

where name = 'TableName'

And it shows as a Heap.

andrew gothard SSChampion Points: 12315 More actions · Answer 3

When you delete rows in a heap, it can't reallocate the space it had been using.

One of the myriad ways heaps suck.

I'm a DBA.
I'm not paid to solve problems. I'm paid to prevent them.

josh-1127203 SSC Eights! Points: 876 More actions · Answer 4

Wow, seriously? So it will just keep adding space to the table?

If that's the case, what is the best approach to deal with this type of situation? I use the table as a "staging" table, delete, import and the update a production table with the data from it.

Any suggestions?

Also, what's the best way to shrink that table; drop and add back?

Thanks!

Perry Whittle SSC Guru Points: 234013 More actions · Answer 5

josh-1127203 (11/10/2016)
Wow, seriously? So it will just keep adding space to the table?
If that's the case, what is the best approach to deal with this type of situation? I use the table as a "staging" table, delete, import and the update a production table with the data from it.
Any suggestions?
Also, what's the best way to shrink that table; drop and add back?
Thanks!

ALTER TABLE [i]tablename[/i] REBUILD

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" 😉

josh-1127203 SSC Eights! Points: 876 More actions · Answer 6

OK, thanks for the alter statement. I've been reading up on heap versus clustered and I am not sure why in this type of situation you would use a clustered index. There may or may not be a unique value to build it on so what's the point.

If I added a unique ID, it would just grow indefinitely w/ all of the inserts/deletes without reseeding.

andrew gothard SSChampion Points: 12315 More actions · Answer 7

Using TRUNCATE rather than DELETE will reset the identity

I'm a DBA.
I'm not paid to solve problems. I'm paid to prevent them.

josh-1127203 SSC Eights! Points: 876 More actions · Answer 8

OK, something is off here...please tell me I am not crazy.

So, for testing, I added two tables to my db:

TABLEC

column ID int, identity

column PN varchar(250)

I added a clustered index to this table on the ID column.

TABLEH

column PN varchar(250)

I added the same records into the PN column (~350,000)

TABLEC space

10,440 kb reserved

10,304 kb used

TABLEH space

8,264 kb reserved

8,200 kb used

I get the size diff because of the extra column and index, but when I deleted the records from these two tables, it appeared to release the space of the Heap table, contrary to what I was thinking the original problem was. So does SQL 2012 actual remove the unused data?

After delete:

TABLEC space

88 kb reserved

8 kb used

TABLEH space

200 kb reserved

144 kb used

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

With a heap, if you delete all the rows, the only time the pages are deallocated is if the delete took a table lock (by force or by choice iirc)

Safest thing: put a clustered index on (doesn't have to be on an identity or even unique)

Other option: truncate the table instead of deleting all rows

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

josh-1127203 SSC Eights! Points: 876 More actions · Answer 10

OK, so TRUNCATE TABLE TableName definitely took it to 0 KB; that seems like a good fix for this w/o adding the clustered index.

I guess my follow up question is what is the reason to add the clustered index to this table other than it freeing the space for me since it's basically a hard coded temp table? Are there other considerations in this instance that I am overlooking?

Also, in a different scenario, say I have a Products table, with ID (int, autonum), and ProductName (varchar). If I just leave as a heap and create a non-clustered index what is the disadvantage of taking that approach versus adding a clustered on the ID column? (And if I have a bunch of tables like that, should I go back and add a clustered or leave as a heap?)

Thanks, the fog is almost lifting here for me.

andrew gothard SSChampion Points: 12315 More actions · Answer 11

I'm afraid, as ever, it depends.

If this is a staging table where you will be running a lot of transformations on the data, then the addition of an appropriate clustered index (CI), and possibly nci's, can significantly increase the speed of your ETL processes (loading as a heap and adding the CI later is a popular pattern to increase load speed here). Or it may not.

For transformations you'll also potentially run into this problem too http://www.sqlskills.com/blogs/paul/forwarding-and-forwarded-records-and-the-back-pointer-size/ which may have a negative impact on you.

It all boils down to one thing. Test what's best for your use case.

I'm a DBA.
I'm not paid to solve problems. I'm paid to prevent them.