July 25, 2010 at 1:47 am
I have a table of more than 1 billion rows. It does insertion and deletion (>=20,000 rows each time) many times a day.
It has one NON-UNIQUE Clustered index, and that's it. I would think that with this much of data modification on this big table every day, its fragmentation would get pretty high, but its fragmentation is only 2%. I don't understand this. Do you have any idea or what could explain its fragmentation so low? The table has not been defrag since it was created. Any inputs would be greatly appreciated.
July 25, 2010 at 2:40 am
Without seeing things like what the indexes (clustered and nonclustered) are on, what ranges of data is inserted and deleted, it's near impossible to say.
Do note that deletes do not cause fragmentation. Inserts can, depending on where the new rows are in the index, but deletes cannot.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 25, 2010 at 9:33 am
Thanks Gail. The non-unique clustered is on the companyID(integer) key, and insertion and deletion is done at each of these companyid (at the where clause). I'm glad that it has very low fragmentation, but it just puzzles me.
July 25, 2010 at 9:39 am
Int Identity? Int something else? Always deleting the same number of records per companyID that you're inserting?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 25, 2010 at 10:04 am
The int is the companyID (it's probably was an identity from the source we get the data from, but not when it is insert into our table), and the companyID would repeat for every record of thousands/hundred thousands for each insert and delete, this is why it we can't put a primary key (since PK is clustered and unique; so the companyid can't be repeated).
the table is something like this (it has more columns than the example, but just to give you an idea, and the companyID
was put in the middle (it's not the good design, but I don't think that should make a difference, do you?)).
productId saleid url companyid inputdate
1 1 http://www.somecompany.com 12345 2009-01-01
July 25, 2010 at 10:07 am
oh, and yes, it always delete the same number of records as the number were inserted. This is because once those records for that particular companyid have moved to the 2nd table, the process would delete those records from the first table. So, the number of inserts and delete for a particular company on the 1st table are always the same.
July 25, 2010 at 10:39 am
So the deletes make space and the inserts reuse that space. If the rows are all the same size, I'm not too surprised this causes minimal fragmentation.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 25, 2010 at 11:19 am
Thanks much. Your explanation makes sense. I feel better now :).
July 26, 2010 at 5:16 am
IIRC the way frag is presented/calculated on heaps is a bit different than for clustered tables. Bit early this morning so could be confuzled here. 🙂 You didn't specify what level of the index_physical_stats scan you were doing either, which can also cause reported differences.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 26, 2010 at 5:24 am
TheSQLGuru (7/26/2010)
IIRC the way frag is presented/calculated on heaps is a bit different than for clustered tables.
Yup. The avg_fragmentation_in_percent in dm_db_index_physical_stats is extent fragmentation for a heap and logical fragmentation for a cluster/nonclustered index.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 26, 2010 at 6:35 am
This table has one non-unique clustered, so it couldn't be a heap (a heap is a table has no index, correct?).
I'm using the avg_fragmentation_in_percent in dm_db_index_physical_stats. I also compared it to the old way
to do it in sql 2000, DBCC Showcontig, and the % is the same as the logical fragmentation %.
July 26, 2010 at 6:40 am
sqlblue (7/26/2010)
This table has one non-unique clustered, so it couldn't be a heap (a heap is a table has no index, correct?).
A heap is a table without a clustered index.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply