September 23, 2011 at 4:05 pm
Two excerpts from Paul Randal's Blog on removing heap fragmentation http://www.sqlskills.com/BLOGS/PAUL/post/A-SQL-Server-DBA-myth-a-day-(2930)-fixing-heap-fragmentation.aspx
Yes, you can use ALTER TABLE ... REBUILD in SQL Server 2008 to remove heap fragmentation, but that is almost as bad as creating and dropping a clustered index!
.
. {truncated}
.
If you think you can use ALTER TABLE ... REBUILD in SQL Server 2008 to fix heap fragmentation, you can, but it causes all the nonclustered indexes to be rebuilt as the heap record locations obviously change.
I looked around online for a bit and could not find a definitive answer. I am wondering if anyone knows if during a heap rebuild if the non-clustered indexes are updated with the new RIDs as the heap is being defragmented then rebuilt, or if the rebuild process is coded to skip the RID updates while the the heap is being defragmented?
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
September 25, 2011 at 5:53 am
Yeah, it would have to be. That's my understanding of how it works.
Heaps are generally weak tables. They have a place in design, but their use should be relatively exceptional. SQL Server is built around the clustered index, so should database designs be.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 25, 2011 at 7:43 am
Grant Fritchey (9/25/2011)
Yeah, it would have to be. That's my understanding of how it works.
Thanks for the response Grant.
Heaps are generally weak tables. They have a place in design, but their use should be relatively exceptional. SQL Server is built around the clustered index, so should database designs be.
In new design I default to having a clustered index, and only revert to using a heap when I am convinced it will help in some significant way. I am only having to go deep on this issue because I have a client with a third-party OLTP app where the backend database is all heaps. The third-party app did not start out using SQL Server, they ported from another database platform and did not catch on to the idea of using unique clustered indexes.
My client is migrating to 2008 and I want to know my options for database maintenance of the third-party app. On 2005 I have essentially coded the ALTER TABLE...REBUILD by hand. I want to be sure ALTER TABLE...REBUILD is not performing extra work on non-clustered indexes before I wholesale replace my routine post-migration. I am also doing a little research to push back to Ola. I am hoping he will add heap-defragmentation support to IndexOptimize sometime in the near future, even if it is only supported on 2008. The comparable routine for 2005 (which includes adding a unique clustered index then dropping it) gets quite messy in a hurry.
-Orlando
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
September 25, 2011 at 7:53 am
opc.three (9/23/2011)
I am wondering if anyone knows if during a heap rebuild if the non-clustered indexes are updated with the new RIDs as the heap is being defragmented then rebuilt, or if the rebuild process is coded to skip the RID updates while the the heap is being defragmented?
They'd have to be rebuilt with the new RID, otherwise the NC index for rowX would point somewhere completely different. With a heap, nonclustered indexes have the RID as a row pointer (where the rest of the row is). That's file, page, slot, so if those change (in a heap rebuild), all nonclustered indexes would have to be rebuilt or be incorrect and invalid
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 25, 2011 at 8:00 am
GilaMonster (9/25/2011)
opc.three (9/23/2011)
I am wondering if anyone knows if during a heap rebuild if the non-clustered indexes are updated with the new RIDs as the heap is being defragmented then rebuilt, or if the rebuild process is coded to skip the RID updates while the the heap is being defragmented?They'd have to be rebuilt with the new RID, otherwise the NC index for rowX would point somewhere completely different. With a heap, nonclustered indexes have the RID as a row pointer (where the rest of the row is). That's file, page, slot, so if those change (in a heap rebuild), all nonclustered indexes would have to be rebuilt or be incorrect and invalid
I get how the RIDs work and are used in non-clustered indexes. That is one internal design decision that makes heaps so bad. My question was more:
Does ALTER TABLE...REBUILD update all the non-clustered indexes with the new RIDs as the heap is being rebuilt followed by a rebuild of all indexes (i.e. double work) or is it intelligent enough to disable the indexes, then rebuild the heap, then rebuild the indexes from scratch at the end once the new RIDs are all assigned?
In 2005 I am doing the following:
1. DISABLE all non-clustered indexes
2. Add an identity column and make it the unique clustered index
3. Drop the identity column
4. Issue ALTER INDEX...REBUILD on all non-clustered indexes I disabled in step 1
In 2008 I am wondering, can I:
1. Issue ALTER TABLE...REBUILD
Or should I:
1. DISABLE all non-clustered indexes
2. Issue ALTER TABLE...REBUILD
3. Issue ALTER INDEX...REBUILD on all non-clustered indexes I disabled in step 1
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
September 25, 2011 at 8:20 am
Hi,
I also have to work with a third part OLTP that uses all heaps (It sounds just like yours - historically coming from C-ISAM), and was just wondering what is the benefit you gain from your process of defragmenting these heaps?
I can only think that you could gain some free space from losing forwarding entries, but other than that, what does it help with? I am not questioning your decision to do it, just interested to see if it's something I/we should be doing :w00t:
Thanks
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
September 25, 2011 at 8:21 am
PS maybe "double work" is not the right term...how about "extra work" updating presumably fragmented non-clustered indexes
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
September 25, 2011 at 8:38 am
mister.magoo (9/25/2011)
Hi,I also have to work with a third part OLTP that uses all heaps (It sounds just like yours - historically coming from C-ISAM), and was just wondering what is the benefit you gain from your process of defragmenting these heaps?
I can only think that you could gain some free space from losing forwarding entries, but other than that, what does it help with? I am not questioning your decision to do it, just interested to see if it's something I/we should be doing :w00t:
Thanks
As I understand it, forwarding entries are what cause a lot of the overhead. You cannot provide a fill factor for a heap so as soon as you start doing updates you see splits and end up with forwarding entries. The entries have to be kept up to date if that page ever splits again in the future, and there is a a back-pointer on the forwarded page to maintain as well. It gets messy in a hurry for an insert/update-intense system, like an OLTP system such as this one. A high number of forwarding entries cumulatively creates a lot more work when retrieving and updating the heap. What I have to deal with is a design that was taken to the extreme, dozens of heaps with 50+ columns and 25+ composite non-clustered indexes on them. The bad properties of heaps are emphasized by this type of design and as they added more users the problems escalated in a hurry. Defragmenting the heaps and indexes using the method I showed above reduced blocking in the database significantly. Hopefully the system your adminning is not as poorly designed. My rebuild job is scheduled to run multiple times per week just to keep blocking manageable. Luckily the data volume is somewhat manageable.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
September 25, 2011 at 8:53 am
Thanks for the info.
My understanding is that you only get forwarding entries if an update changes the size of the row such that it won't fit in the existing physical storage location - is that correct?
If so, then I am lucky because the system I deal with has fixed size rows (all chars, ints, floats and datetimes - nothing else).
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
September 25, 2011 at 9:31 am
mister.magoo (9/25/2011)
Thanks for the info.My understanding is that you only get forwarding entries if an update changes the size of the row such that it won't fit in the existing physical storage location - is that correct?
If so, then I am lucky because the system I deal with has fixed size rows (all chars, ints, floats and datetimes - nothing else).
You lucky dog. Since you have all fixed sized columns you are OK. Your ISAM comment makes prefect sense now 🙂
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
September 25, 2011 at 10:58 am
Edit: Blah.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 25, 2011 at 2:43 pm
Edit: blah
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
September 25, 2011 at 3:05 pm
Sorry, not rebuilds (it's a heap anyway), and can't find the ref I thought existed for triggers, so ignore.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 26, 2011 at 1:08 pm
@mm Something occurred to me, you may still have heavy fragmentation from the continuous addition of new rows and the deleting of existing rows over time. Here is what I use for finding heap fragmentation:
SELECT DB_NAME(database_id) AS database_name,
OBJECT_SCHEMA_NAME(object_id, database_id) AS [schema_name],
OBJECT_NAME(object_id, database_id) AS table_name,
index_type_desc,
alloc_unit_type_desc,
avg_fragmentation_in_percent,
fragment_count,
avg_fragment_size_in_pages,
page_count,
avg_page_space_used_in_percent,
record_count,
ghost_record_count,
version_ghost_record_count,
min_record_size_in_bytes,
max_record_size_in_bytes,
avg_record_size_in_bytes,
forwarded_record_count
FROM sys.dm_db_index_physical_stats(NULL, NULL, NULL, NULL, 'DETAILED')
WHERE index_type_desc = 'HEAP'
AND page_count > 1000 ;
Note this will check all DBs on the instance. I use DETAILED because LIMITED does not show you the forwarded_record_count. It takes longer but it's relevant for my system, it may not be for yours.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
September 26, 2011 at 7:20 pm
Sorry Grant and Gail if I am being dense, but I am still not sure I got an answer for my original question. Maybe I am incorrect, but it seems both of you originally answered the question as to whether the indexes would be updated with the new RIDs as the heap was rebuilt and pages were physically moved. I was never in doubt as to whether the non-clustered indexes would be updated with the new RID locations, rather I was in doubt as to the efficiency of simply calling ALTER TABLE...REBUILD versus whether I should take control of the index rebuilds that would certainly need to occur by disabling them first and then rebuilding them after the heap was rebuilt. I started creating a test-bed for this problem but found that creating fragmented heaps from scratch was more difficult than it initially seemed like it would be. Am I chasing minor bits here?
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Viewing 15 posts - 1 through 15 (of 31 total)
You must be logged in to reply to this topic. Login to reply