How to improve index rebuild for very large table.

  • My online index rebuild job for a 200G table with 40mil rows took me up to 30hrs to complete, this is a once off activity done since we have an allocated downtime for server maintenance.

    I would want to know, is there any way for me to do it faster? I had an idea of inserting all the data from the current table to a new table.  Then truncate the old table, this will wipe clean all the data and also indexes.  Once this is done, I load back all the data from that new table to the old table, I know by doing this SQL will also rebuild all the indexes while the data is being loaded.

    Will this be faster than I just rebuild the single index?

  • A) Testing is your buddy. Don't take anyone's word for it. Test it.

    B) Probably not. In fact, that's almost the opposite of what I'd want to do. Presumably, guessing here, this is the clustered index you're rebuilding? If you create a table with the cluster, and the nonclustered indexes, then do a load to that table, not only are you constantly updating the cluster, and depending on the data, distribution, which column(s) the clustered key is on, You can, possibly end up with the clustered index being as fragmented as it gets over time naturally. PLUS, as you insert each row into the cluster, you've also got to update the nonclustered indexes... which are also splitting depending on the data load, not to mention slowing down all the inserts.

    C) If it is the clustered index, three possible ideas come to me. First, leave it alone. Most of us, most of the time, can deal with index fragmentation just fine. Sure, broad scans, table scans, will hit more pages, thus slowing down. However, point lookups aren't affected at all by a fragmented index, and, for most of us, we have a lot more point lookup queries than scans, so performance is fine. OK, can't leave it alone for whatever reason, drop all the nonclustered indexes, do the index rebuild, then put the nonclustered indexes back on. Recreating them means they won't be fragmenting as you do the data load. Or, third option, if you want to move the data between tables, sure, try that, but again, no nonclustered indexes till you're done and then rebuild them.

    Overall, I lean heavily on don't bother rebuilding the index. I know the old advice was just how important this is. However, modern testing (there's that word again) shows that for most of us, most of the time (weasel words are important), index fragmentation doesn't cause us much pain. In fact, for most people, the performance gains they saw from index rebuilds actually came from the cleaned up statistics, not the elimination of fragmentation.

    Others may have better or different suggestions. And whatever they may be, test them too.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Don't do online, do offline.  That should be ok if you have scheduled down time anyway.

    Otherwise, hard to say without more details.  As stated above, you may not really need to rebuild.  And you'd want to rest how best to handle the non-clus indexes.  Hopefully you have another machine where you can test.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • It's faster to rebuild indexes offline.

    You might also be able to get it to rebuild in parallel which will be faster: https://learn.microsoft.com/en-us/sql/relational-databases/indexes/configure-parallel-index-operations?view=sql-server-ver16

    But I think the question needs to be asked: Why do you want to rebuild the index?

  • JasonO wrote:

    My online index rebuild job for a 200G table with 40mil rows took me up to 30hrs to complete, this is a once off activity done since we have an allocated downtime for server maintenance.

    I would want to know, is there any way for me to do it faster? I had an idea of inserting all the data from the current table to a new table.  Then truncate the old table, this will wipe clean all the data and also indexes.  Once this is done, I load back all the data from that new table to the old table, I know by doing this SQL will also rebuild all the indexes while the data is being loaded.

    Will this be faster than I just rebuild the single index?

    Don't do a thing, yet...

    I rebuild a table that's roughly 1/3 of a Tera Byte with 5 NCI's that take it up to 475 GB.  It is PAGE Compressed and so the Rebuilds take roughly twice as long (1:00 hour without for the CI, and 1:48 with the compression).  It takes an additional hour for the 5 page compressed NCI's but the total of all that is way less (< 4 hours) than 30 hours and I have more that twice the size I'm working worth.  The rebuild of the CI is actually done twice (worth it to not have ANY extra unused space when I'm done) and working on making that just a one way rebuild with no extra unused space (trick with file groups).

    First, can you temporarily go to the BULK LOGGED Recovery Model?

    If so, please post the CREATE TABLE statement for this table including all of the indexes including the index for the clustered index so I can figure things out a bit for you.  It would also be helpful to know the index sizes and page densities are.

    So, I might be able to help. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply