Fill Factor, DB REINDEX and Tables On A Heap

  • Hi

    We have a number of quite large tables (100 million rows) which are inserted into regularly and which have a non-clustered primary key and no other clustered index, ie they are on a heap.

    We have a weekly maintenance job which performs a DBCC REINDEX on all tables (including those on a heap) specifying a fill factor of 85. I have a reasonably good understanding of what happens with the clustered indexes but am not sure what is happening with the non-clustered indexes. Is there any point in specifying a fill factor for a non-clustered index or indeed doing a DBCC REINDEX on it at all?

    Thanks in advance. 

  • Here's an interesting white-paper:

    http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Fill factor seems to be one of those terms that sometimes are important and at other times is not. My mentor made sure I understood what a fill factor was and always used it when creating indexes. Now I have read in several articles that fill factors should be ignored because it is set when the index is created but never used afterward. "Fill factors are only applied when the index is created. As keys are inserted and deleted, the index will eventually stabilize at a certain density." BOL - Table Indexes.

    Maybe it is a hold over from the 6.5 days....

    Quand on parle du loup, on en voit la queue

  • That is partly right. Fill Factors are disregarded whilst performing standard insert/update/delete operations on a table (otherwise SQL Server would need to do page splits to maintain the original density which would defeat the point) but when a DBCC REINDEX command is issued against a clustered index SQL Server will attempt to get the page density back as close as possible to the original fill factor.

    My question is more related to how Fill Factors apply (if at all) to non-clustered indexes and to what happens when a DBCC REINDEX is run against a non-clustered index.

  • Data pages in Non-Clustered Indexes are just data pages like any other, they can be up to 8k in size.

    If the 8k is reached and a new record needs to insert inside the page a split will occurr.

    Fill Factor of course allows the data to be filled to a specific percentage so you have free space and can reduce the number of splits that occurr within the index.

    If a page split occurrs not only is the data moved to a new page but several underlying things are updated that you cannot see to make note of the new page and the data shift.

    To reduce splits you need to look at you data as it will occurr with the given index.

    For instance if the index is on an indentity column you know the value that inserts will always be greater than those before it so a fill factor of 100% is fine as no data will insert before the end.

    If though you have an index on a column with last names you may want to use 70-80% and monitor over a period of time the amount of change with DBCC SHOWCONTIG to see if the percentage is fine between maintainence timeframes.

    Now another factor to keep in mind is what happens with the data based on query type.

    On Select of course it is readonly and used for location.

    With Insert the data is added and when a page is already full a split will occurr.

    With Delete the data is just removed from the structure and nothing else happens (so you have a hole in that location until reindex).

    With Update keep in mind a Delete occurrs then an Insert and their related action applies.

    Now with DBCC DBREINDEX same rules as always apply to it's action with regards to non-clustered as clustered, just doesn't affect the actual data data pages in anyway.

    I hope that answers your question.

  • Okay, I think I've got my head round this at last.

    What was confusing me was that I was under the impression that Fillfactor affected data pages and the rows in data pages for a table on a heap are in random order so why would SQL Server need to perform a page split on an insert? The answer is that Fillfactor affects the leaf level of an index and not specifically the data pages. The fact that it affects the data pages for a clustered index is because the data is at the leaf level.

    Of course, the leaf level of a non-clustered index on a heap contains a key value and a pointer to the data. The point is that this data is stored in the order specified by the index and if the page into which the insert needs to be made is full then a page split will occur so as to maintain the order. 

    Thanks for the input.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply