performance while importing data

  • I want to import large data in one table from csv file.

    When the performance of import would be better.

    1. When I remove the index in the table and then import.

    2. Have indexes in table and then import

    Please let me know the suitable option and reason as well.

    thanks

  • ritesh.saluja (12/4/2008)


    I want to import large data in one table from csv file.

    When the performance of import would be better.

    1. When I remove the index in the table and then import.

    2. Have indexes in table and then import

    Please let me know the suitable option and reason as well.

    thanks

    It depends (don't you love that answer?).

    If you are doing a straight insert, no updates or deletes, in most circumstances, dropping all the indexes (and foreign keys), performing the inserts and then restoring the indexes (and foreign keys) works best. If you have to rely on looking up existing rows as you do when updating or deleting, you may find that leaving all or some of the indexes in place is better. Testing in your environment is the only way to be sure.

    If you have partitioned data, you may be required to leave the indexes in place in order to get the data distributed across the partition appropriately.

    Like I said, it depends.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • FROM BOL:

    http://msdn.microsoft.com/en-us/library/ms177445(SQL.90).aspx

    The relevant Part for your case:

    Importing Data from a Single Client (or Stream) into a Non-Empty Table

    You should have a rough estimate of how many rows are new and the table "guides" you on what to do:


    * Noel

  • I never import directly into the final target table... I always use a staging table. That way, I can do validity checks before the new data actually get's added.

    If you add lot's of data on a regular basis, then I also recommend that your clustered index be in some form of temporal order to keep from nuking the target table with page splits.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply