Creating indexes in large tables

  • We have a situation where we have some tables with over a million records and we need to add some indexes.  Are we beter off unloading the data, creating the index and then reloading or adding the index to the million record tables?  Time is not a concern if it takes a week to create the index so be it!  Will EM timeout while this process is running? Do users have to be out of the tables to create the indexes.  The server is a data repository that is not really being used by anyone yet. 

    Data can be re-loaded from our Legacy system if we choose to do it that way.  I could delete all records in the SQL Data repository tables and have the legacy system re-populate the tables again.

    To create an index all users need to be out of the tables correct?

    thanks,

     

     

  • Add the indexes to the populated tables. If it takes more than a minute or two, I'd be surprised. A million records is nothing to SQL Server, as you'll find people here with well over a billion rows in a table. EM shouldn't time out, but it will lock it up while it's creating the index. I'd probably use QA personally, as I won't have to load another instance of EM while it's running, and it's also easily repeatable if I save the script.

  • Create the indexes with the data in place.  It would take longer to reload the data to the tables if you created the indexes on empty tables, as it would be building the indexes during the loads, taking more time.

  • I agree you might want to do the index changes via QA instead of EM. However keep in mind, when the clustered index is altered all indexes are forced to update. If you need to change your clustered index (or set one) do that first then your non-clustered. As for time it will take you will should get bettr performance in doing in place rather than to truncate and reload from a remote resource.

  • A million rows is very small. Nonclustered indexes should complete in under 5 min.

    While the index is being built, the table will be read-only, but there's no need to kick users out.

    The only time I've needed to create a new table and copy data for an index change was when I was changing the clutered index on a 150 million row table on a server with poor I/O performance

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • As has been hinted at in other posts the performance depends a lot on whether or not the index you are (re)creating is clustered or not. In simple terms a clustered index is actually the data itself as the last node of the index btree is the data page so rebuilding a clustered index means moving the data about so if there is a lot of data that is a massive amount of disk i/o and hence slow.

    A non-clustered index is a set of pointers to the data so you are just adjusting pointers about which are small and hence much quicker.

    This also explains why you must recreate any clustered index first - because changing the clustered index means moving all the data about and that invalidates all the pointers to data in other indexes (because the data may have moved) hence they are being adjusted all the time the clustered index is changing.

    Having non-clustered indexes in place while you rebuild a clustered index thus also makes that (even) slower because as each data page is moved as part of the clustered index build SQL Server then has to adjust the pointers in any non-clustered index that pointed to it.

    Best perfomance for a total rebuild is thus:

    1. Drop all indexes

    2. Apply the clustered index

    3. Apply other indexes

    Obviously this approach needs to be done off line (i.e. no users) - this is often not that practical so SQL server has some tricks up it's sleeve to give varying levels of rebuild with varying levels of offlining - have a look att DBCC DBREINDEX and DBCC INDEXDEFRAG in Books Online for more info

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply