Best way to archive out old data from table into new archive database

  • HI,

    Here is the scenario - I have 6 very large 'staging' tables that hold very old and very new staging data.

    I need to create a new database on a new server that will be used to contain all the data in the staging tables. My process should leave one weeks worth of data in the live (production) staging tables and move out all the old 'archive' data into the new server/database. The structure of the tables will remain the same - just migrating out the old data.

    What is the best way to accomplish this? I need to set this up as weekly job.

  • krypto69 (11/16/2016)


    HI,

    Here is the scenario - I have 6 very large 'staging' tables that hold very old and very new staging data.

    I need to create a new database on a new server that will be used to contain all the data in the staging tables. My process should leave one weeks worth of data in the live (production) staging tables and move out all the old 'archive' data into the new server/database. The structure of the tables will remain the same - just migrating out the old data.

    What is the best way to accomplish this? I need to set this up as weekly job.

    We do a sequence of SSIS packages, one to archive off the data to Ye Olde Archive Server, and one to delete the data in small batches over a weekend so the delete doesn't lock anything up.

  • krypto69 (11/16/2016)


    HI,

    Here is the scenario - I have 6 very large 'staging' tables that hold very old and very new staging data.

    I need to create a new database on a new server that will be used to contain all the data in the staging tables. My process should leave one weeks worth of data in the live (production) staging tables and move out all the old 'archive' data into the new server/database. The structure of the tables will remain the same - just migrating out the old data.

    What is the best way to accomplish this? I need to set this up as weekly job.

    This list of steps are "one-time" steps for the original copy and clear...

    1. Use minimal logging to copy all but the last week of data to the new table on the archive database. Done correctly, the INSERT/SELECT can be minimally logged with the Clustered Index in place (Bulk Logged Recovery Model, WITH(TABLOCK) on the INSERT, an ORDER BY that's in the same order as the Clustered Index on the target table).

    2. Use minimal logging to copy the last week of data to a new table of whatever name.

    3. Lock the original table so that no one can add new data to it. Start the timer.

    4. Copy any files from the original table to the new table.

    5. Rename the original table to be appended with _Old.

    6. Rename the new table to be the same as the original table name. Stop the timer. This should have take just seconds or less and is the only "outage" the users should have seen if they see it at all.

    7. After a final check of rows in the newly renamed table, drop the original now-name _Old table.

    8. If your databases were in the FULL recovery model, make sure to return them to that state.

    For on-going...

    Set up a NIGHTLY job to copy and delete rows for just one day keeping 10 to 14 days because of weekends/holidays and people wanting to process the previous week 10 or more days after the previous week started.

    As a bit of a sidebar, the Clustered Index should be temporally based so that the copies and deletes all occur near the logical beginning of the database to help prevent interference with inserts.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Are the production staging tables to be entirely deleted (truncated), or will you delete only rows aged within a specific date range?

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Are the production staging tables to be entirely deleted (truncated), or will you delete only rows aged within a specific date range?

    The staging tables will not be deleted only reduced - deleting all rows over one week.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply