Fill Factor clarification

  • Marios Philippopoulos (2/19/2008)


    You also may want to consider making your primary key ID column non-clustered and define your clustered index on a more suitable column, eg. a record timestamp.

    Queries involving a *range* of values on a certain column/columns, such as a datetime range, benefit greatly from a clustered index defined on that column/columns.

    Defining your PK as non-clustered means that page splits due to table UPDATES will NEVER happen - an update of a varchar column will not affect the stacking of the PK pages.

    Personally, I find that defining an identity column PK as clustered is wasting a precious resource, unless this is the ONLY index you think you will EVER need on that table.

    Must need some more coffee- but I fail to see how the choice of the clustered index is ever going to prevent page splits due to updates of varchars. If a row is in a page, and its varchar column is updated from holding 20 characters to 5000 characters, then it's going to cause a page split if the page doesn't have enough room to accomodate the extra 4980 characters.

    Also - the PK has nothing to do whatsoever with page splits. The Clustered index does on the other hand.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • If a row is in a page, and its varchar column is updated from holding 20 characters to 5000 characters, then it's going to cause a page split if the page doesn't have enough room to accomodate the extra 4980 characters.

    That's correct.

    What I'm saying is that the page split will not be on the PK, IF it is made into a non-clustered index. Page splits will potentially occur on the clustered index (or heap) of the table, whichever one that index may be.

    The original question was about setting a fill factor value for the PK identity column. If that is made into a non-clustered index, setting the FILL Factor to 100 (or 0) should be OK. I don't see a scenario of page splits for the non-clustered PK in this case, do you? Unless I am missing something.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

  • I'm being nitpicky about this - but let's avoid using the word PK in this case. Its function as Primary Key has no bearing whatsoever on the page splits. The clustered index is the only thing driving that decision. The fact that they're both on the same field is incidental only. The fact that SQL server allows us to use the same index for both is regrettable IMO, but again - that's not the question of the moment.

    If your primary goal is to prevent page splits, then actually - an identity is a GOOD choice for a clustered index, since all of the inserts will have at the "end", and not in the middle (already full) pages. Now - you don't get it to leverage the clustered index in range seeks/scans that way - but again - that's a choice based on what's most important.

    Also - Heaps don't have page splits. They're a byproduct of having a clustered index.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt, I completely agree with most of your points, but are you sure about the last one?

    Heaps don't have page splits.

    An update of a VARCHAR column in a table with no clustered index cannot cause a page split in the data pages? I think it can.

    Sorry if I'm splitting pages (... I mean hairs 🙂 ), but I want to make sure my understanding is correct here.

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

  • Marios Philippopoulos (2/19/2008)


    Matt, I completely agree with most of your points, but are you sure about the last one?

    Heaps don't have page splits.

    An update of a VARCHAR column in a table with no clustered index cannot cause a page split in the data pages? I think it can.

    Sorry if I'm splitting pages (... I mean hairs 🙂 ), but I want to make sure my understanding is correct here.

    No issue at all - it's actually not very well documented IMO. Take a read through this:

    http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/clusivsh.mspx

    The part that's relevant is of course "buried" most of the way down, but here it is anyway:

    For the table with the clustered index, the Page Splits/sec is about four times higher than the corresponding number of Page Splits/sec for the table with the nonclustered index. In addition, the ratio of Page Splits/sec to Pages Allocated/sec for the table with the clustered index is higher and increases more rapidly as the number of processes increases. This behavior is due to the fact that page allocations for a table without a clustered index (heap) never result in a page split (the data is just added to the heap), and there are fewer page splits for the nonclustered index pages because for the given table and index structure, almost twice as many rows can fit in a data page. The ratios of Page Splits/sec to Pages Allocated/sec are depicted in Figure 16.

    Meaning - when you run out of space in a heap - you just whack a pointer to the end towards some more space somewhere else and put the "extra info over there" (in a REALLY simplified framing of the behavior). As I recall - that's called a "forward pointer".

    Of course - GilaMonster has a better handle on this behavior than I, so I hope she'll stop on by and flesh out the explanation where it's lacking.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • page allocations for a table without a clustered index (heap) never result in a page split

    Sweet! Thanks for the info and link, that's great!

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
    Persisting SQL Server Index-Usage Statistics with MERGE[/url]
    Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]

  • Now, that's the way I like to see an article... "Alice's Restaurant" style... full of actual tests with "circles and arrows and a paragraph on the back".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Page splits only occur in clustered tables because a clustered table must store records in the order specified by the index. If an update enlarges a record so it will not fit on the original page, a new page is allocated and all the rows from the original page are distributed between the two pages.

    Heap tables don't have page splits because there is no ordering to preserve, but they do have forwarding pointers which in many cases can be worse. An updated row in a heap table that won't fit on the heap table will be stored somewhere else, but to avoid having to update all the indexes a pointer to the new location is left in the original location. All accesses to the table from then on must do extra IO to follow the pointers, kind of like forcing extra bookmark lookups. There are other disadvantages to heaps, but they don't need to be discussed here because they don't have any relation to the original question.

    The original question was whether a 90% fill factor makes sense for a particular table. I think the first answer is to learn how to monitor page splits and table fragmentation so you know whether page splits are an issue. If they are an issue, try lowering the fill factor and/or rebuilding the index more frequently. Also analyze what is causing the page splits. If records are inserted with a lot of NULL varchar fields that are filled in later, try adding strings of spaces as default values if the applications can handle it. Only if the fragmentation is unmanageable or if the updates are so frequent that the extra page split IO is an issue would I consider switching to a heap (no clustered index).

    The first reply from the original poster said the updates do not enlarge the records and that indexes are rebuilt daily, so page splitting is not a problem. He wanted to increase the fill factor from 75 or 80 to 90 so less space would be wasted. My answer would be (if there really is no page splitting) to increase it to 100 for that table. Nothing in his questions would lead me to suggest getting rid of the clustered index.

  • hey,

    these sounds gr8!!! thanks.....

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply