November 10, 2016 at 9:12 am
I have a table with two columns:
Col1 is a varchar(250)
Col2 is a varchar(20)
I have an SSIS package that runs a few times a day that deletes all rows from the tables, and then imports data from a flat file. The total row count is ~ 830,000
For some reason, the table size creeps up over time, right now it is 23,264,888 kb reserved with 20,870,048 kbs in data; that is way too big for the amount of data in the table.
For testing, I created a second table and imported the same data, and the row count is the same but the size is 37,704 kb reserved and 29,712 of data which is more in line with what I would expect.
No indexes or primary keys on either table.
Any reason this would grow this large over time?
November 10, 2016 at 9:16 am
Is the table a heap?
I'm a DBA.
I'm not paid to solve problems. I'm paid to prevent them.
November 10, 2016 at 9:19 am
I ran this:
select
o.name,
o.object_id,
case
when p.index_id = 0 then 'Heap'
when p.index_id = 1 then 'Clustered Index/b-tree'
when p.index_id > 1 then 'Non-clustered Index/b-tree'
end as 'Type'
from sys.objects o
inner join sys.partitions p on p.object_id = o.object_id
where name = 'TableName'
And it shows as a Heap.
November 10, 2016 at 9:28 am
When you delete rows in a heap, it can't reallocate the space it had been using.
One of the myriad ways heaps suck.
I'm a DBA.
I'm not paid to solve problems. I'm paid to prevent them.
November 10, 2016 at 9:35 am
Wow, seriously? So it will just keep adding space to the table?
If that's the case, what is the best approach to deal with this type of situation? I use the table as a "staging" table, delete, import and the update a production table with the data from it.
Any suggestions?
Also, what's the best way to shrink that table; drop and add back?
Thanks!
November 10, 2016 at 10:01 am
josh-1127203 (11/10/2016)
Wow, seriously? So it will just keep adding space to the table?If that's the case, what is the best approach to deal with this type of situation? I use the table as a "staging" table, delete, import and the update a production table with the data from it.
Any suggestions?
Also, what's the best way to shrink that table; drop and add back?
Thanks!
ALTER TABLE [i]tablename[/i] REBUILD
-----------------------------------------------------------------------------------------------------------
"Ya can't make an omelette without breaking just a few eggs" 😉
November 10, 2016 at 10:22 am
OK, thanks for the alter statement. I've been reading up on heap versus clustered and I am not sure why in this type of situation you would use a clustered index. There may or may not be a unique value to build it on so what's the point.
If I added a unique ID, it would just grow indefinitely w/ all of the inserts/deletes without reseeding.
November 10, 2016 at 10:33 am
Using TRUNCATE rather than DELETE will reset the identity
I'm a DBA.
I'm not paid to solve problems. I'm paid to prevent them.
November 10, 2016 at 1:08 pm
OK, something is off here...please tell me I am not crazy.
So, for testing, I added two tables to my db:
TABLEC
column ID int, identity
column PN varchar(250)
I added a clustered index to this table on the ID column.
TABLEH
column PN varchar(250)
I added the same records into the PN column (~350,000)
TABLEC space
10,440 kb reserved
10,304 kb used
TABLEH space
8,264 kb reserved
8,200 kb used
I get the size diff because of the extra column and index, but when I deleted the records from these two tables, it appeared to release the space of the Heap table, contrary to what I was thinking the original problem was. So does SQL 2012 actual remove the unused data?
After delete:
TABLEC space
88 kb reserved
8 kb used
TABLEH space
200 kb reserved
144 kb used
November 10, 2016 at 1:13 pm
With a heap, if you delete all the rows, the only time the pages are deallocated is if the delete took a table lock (by force or by choice iirc)
Safest thing: put a clustered index on (doesn't have to be on an identity or even unique)
Other option: truncate the table instead of deleting all rows
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 10, 2016 at 1:30 pm
OK, so TRUNCATE TABLE TableName definitely took it to 0 KB; that seems like a good fix for this w/o adding the clustered index.
I guess my follow up question is what is the reason to add the clustered index to this table other than it freeing the space for me since it's basically a hard coded temp table? Are there other considerations in this instance that I am overlooking?
Also, in a different scenario, say I have a Products table, with ID (int, autonum), and ProductName (varchar). If I just leave as a heap and create a non-clustered index what is the disadvantage of taking that approach versus adding a clustered on the ID column? (And if I have a bunch of tables like that, should I go back and add a clustered or leave as a heap?)
Thanks, the fog is almost lifting here for me.
November 11, 2016 at 5:23 am
I'm afraid, as ever, it depends.
If this is a staging table where you will be running a lot of transformations on the data, then the addition of an appropriate clustered index (CI), and possibly nci's, can significantly increase the speed of your ETL processes (loading as a heap and adding the CI later is a popular pattern to increase load speed here). Or it may not.
For transformations you'll also potentially run into this problem too http://www.sqlskills.com/blogs/paul/forwarding-and-forwarded-records-and-the-back-pointer-size/ which may have a negative impact on you.
It all boils down to one thing. Test what's best for your use case.
I'm a DBA.
I'm not paid to solve problems. I'm paid to prevent them.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply