Sequence Order - Index creation or Data Load

  • Our build process involves creation of database, DB objects all driven through TSql scripts.

    Some of the master tables have Seed data. Once the Database, table and procs are created, including indexes and constraints, the master seed data(insert) scripts are executed, in the sequence. So far fine.

    As part of supporting multi languages, one of the our languange attributes related data has grown 10 fold. Instead of a single large script file, the seed data script is split on per language basis, confining the script file size.

    The issue being encountered, is that there is a long delay in the execution of all the scripts.

    If we split the sequence of execution as

    create tables---load data---apply indexes and constraints,

    Would this help. If the data is pre-validate to support the index and constraint definition.

    Any inputs/suggestions appreciated.

  • Are you doing individual insert statements, or a bulk copy?

  • Index & constraint creation / enabling should be done after data load. If you are inserting data in batches, you can disable / enable the index just for the batch.

    Even if there is any time interval between data load batches, make sure your data is consistent & equally performing.

  • Dev (12/17/2011)


    Index & constraint creation / enabling should be done after data load. If you are inserting data in batches, you can disable / enable the index just for the batch.

    Even if there is any time interval between data load batches, make sure your data is consistent & equally performing.

    I wouldn't agree to use this as a general approach.

    Assuming the data can be loaded using a single process (no parallelism) in the same order as the clustered index, then I would not disable the clustered index.

    However, I agree to disable the nonclustered indexes.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • LutzM (12/17/2011)


    Assuming the data can be loaded using a single process (no parallelism) in the same order as the clustered index, then I would not disable the clustered index.

    Disabling the clustered index takes the table offline and makes it completely inaccessible, so disabling the clustered index before a load will just cause errors. Dropping it, I wouldn't recommend that regardless of the order of the data, as recreating the cluster is a huge job.

    Nonclustered indexes, probably either create after or disable before and rebuild after, but it has to be tested, there are cases where the indexes are necessary during an import (unique constraints, possibly speeding up foreign key checks).

    Test both ways, especially if it's not a simple data load.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply