Rebuilding Indexes - Clustered v Non-Clustered

  • Hi,

    I rebuild some indexes on a table recently and noticed some unusual behavior. I was hoping someone could explain why this is the case:

    The clustered index with one column (Row_Id) on a table is about 3.9 million pages and rebuild in about 16 minutes.

    A non-clustered index with one column also on the same table is about 800,000 pages and rebuilt in over twice as long, about 36 minutes.

    Just wondering why it took over twice as long to rebuild the non-clustered index seeing as it has about 5 times less pages on disk. Both were rebuild online, with sort in TempDB set to on.

    Any input appreciated, thanks 🙂

  • The clustered index is based on RowID which, ostensibly, is an ever-increasing value. That means that new data is added to the logical end of the table (the clustered index IS the table) and the index probably didn't need much in the form of defragmentation.

    Non-clustered indexes are rarely created in the same order that data would be inserted. Especially if the Fill Factor is 100%, they can very quickly become horribly fragmented depending on the what leading column of the index is. It's takes a lot longer to fix such fragmentation even though there's less data in the index than in the table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Also remember that these processes are subject to blocking and resource contention. It could just be that there were more shared locks or something causing waits on the rebuild process for the nonclustered index.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks for your reply Jeff.

    The clustered index was %53 fragmented where the non-clustered was %41 fragmented.

    I understand what you are saying but to summarize the sort operation had a lot less work to do to arrange the clustered index in order, even though is was more fragmented. Would that be correct?

  • Thanks for you reply Grant (and thanks for SQL Server Execution Plans, very useful book).

    I did notice a lot of contention on the table during the index maintenance and the shared schema lock wait type made an appearance in the top 10 wait types (it wasn't there previously). There was no other activity on the table so all the contention came from the index maintenance process.

    I guess what I really want to know is how can I accelerate the process? Besides taking the index offline as it is a business requirement to keep it online. Do you have any suggestions?

    Thanks very much.

  • For best overall performance, review "missing index" and index usage stats from SQL Server, then determine first and by far the most important: do you have the best clustered index on the table? Hint: This is most often NOT the identity column.

    Then determine if you need nonclustered index(es) and what columns you need in them.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • If you are on Enterprise Edition, you can do online rebuilds of most indexes, depending on the data types appearing in the index.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (1/19/2015)


    For best overall performance, review "missing index" and index usage stats from SQL Server, then determine first and by far the most important: do you have the best clustered index on the table? Hint: This is most often NOT the identity column.

    Then determine if you need nonclustered index(es) and what columns you need in them.

    Whatever the case, it had better be narrow, unique, ever-increasing, non-nullable, and immutable in SQL Server.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Doing the rebuilds online can reduce contention, but it will increase the amount of time it takes. It's a tradeoff.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (1/20/2015)


    Doing the rebuilds online can reduce contention, but it will increase the amount of time it takes. It's a tradeoff.

    I've been doing some experiments concerning the differences between online and offline rebuilds. I've found that, because it has to maintain a "mapping" table for any changes to the data that may occur during the rebuild, that it uses more resources, runs more slowly, and doesn't do as good a job at defragmenting. The testing so far has only been qualitative. I'll start testing at a demonstrable, quantitative level in the next couple of days. I know I need to get a different hobby 😛 but it's very interesting.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/19/2015)


    ScottPletcher (1/19/2015)


    For best overall performance, review "missing index" and index usage stats from SQL Server, then determine first and by far the most important: do you have the best clustered index on the table? Hint: This is most often NOT the identity column.

    Then determine if you need nonclustered index(es) and what columns you need in them.

    Whatever the case, it had better be narrow, unique, ever-increasing, non-nullable, and immutable in SQL Server.

    No! It has to be the best key column(s) for that specific table's uses. It's just wrong to fixate on the single INSERT to the table vs. the thousands, or millions, of reads of the row afterward. A less narrow key that covers a natural range and/or joins naturally to another table is vastly better for overall performance than an identity in the big majority of cases.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply