Archiving big table

  • Hey All,

    We have a very big table which stores transaction data; it is an SCD Type 2 table. Records are identified by ID and effective date columns, with the primary key on those two columns.

    The table has grown to close to 500 GB in size with 1.4 billion records. Two years ago we moved records older than 2018/03 to a backup table; at that time we had around 380 million records (active data plus two years of history). The backup was done by inserting the active and two years of historical data into a new table, swapping the tables, and creating the required indexes on the new table.

    Within two years the table has grown to over 1 billion records. Now we want to build a process to archive data to the backup table every month. Please note the backup table also has 1 billion+ records and indexes. We are also planning to move the backup table to a different DB.

    I am planning to insert all the unique records into a temp table, then insert the data into the backup table in batches, say 1 million records at a time, delete them from the original table, and commit.
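
    A minimal sketch of that batching idea, with assumed table and column names (`dbo.Transactions`, `dbo.Transactions_Backup`, `EffectiveDate` are all hypothetical). It uses `DELETE ... OUTPUT INTO` to move each batch in one atomic statement; note that the `OUTPUT INTO` target must be in the same database and cannot have enabled triggers or be part of a foreign key relationship, so if the backup table moves to a different DB you would need a separate INSERT/DELETE pair instead:

    ```sql
    -- Archive rows older than a cutoff in batches, committing after each
    -- batch so the transaction log and lock footprint stay small.
    -- Names and the cutoff date are assumptions for illustration.
    DECLARE @BatchSize INT  = 1000000,
            @Cutoff    DATE = '2021-03-01',
            @Rows      INT  = 1;

    WHILE @Rows > 0
    BEGIN
        BEGIN TRANSACTION;

        -- Move one batch: the deleted rows are inserted into the
        -- backup table by the same statement that removes them.
        DELETE TOP (@BatchSize) t
        OUTPUT deleted.*
        INTO dbo.Transactions_Backup
        FROM dbo.Transactions AS t
        WHERE t.EffectiveDate < @Cutoff;

        SET @Rows = @@ROWCOUNT;

        COMMIT TRANSACTION;
    END;
    ```

    Committing per batch is what keeps this from blowing up the log on a 500 GB table; a single 400-million-row DELETE would not.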

    Any approach to achieve our objective would help us. Thanks in advance.


  • The best way to do this is monthly partitioning, using either Partitioned TABLE or Partitioned VIEW technology. Not intending to sound the part of an iconoclast, I prefer the use of Partitioned VIEWs out of those two suggestions.
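
    For readers unfamiliar with the Partitioned View approach, here is a hedged sketch with hypothetical monthly tables. The key requirement is a trusted CHECK constraint on the partitioning column of each base table, which lets the optimizer touch only the relevant months:

    ```sql
    -- One base table per month (names and columns are assumptions).
    -- The CHECK constraint on EffectiveDate is what enables pruning.
    CREATE TABLE dbo.Trans_202403
    (
        ID            INT   NOT NULL,
        EffectiveDate DATE  NOT NULL,
        Amount        MONEY NOT NULL,
        CONSTRAINT PK_Trans_202403 PRIMARY KEY (ID, EffectiveDate),
        CONSTRAINT CK_Trans_202403 CHECK
            (EffectiveDate >= '2024-03-01' AND EffectiveDate < '2024-04-01')
    );
    -- ...one such table per month...
    GO

    -- The view UNION ALLs the monthly tables; queries filtering on
    -- EffectiveDate read only the base tables whose CHECK ranges match.
    CREATE VIEW dbo.Transactions
    AS
    SELECT ID, EffectiveDate, Amount FROM dbo.Trans_202403
    UNION ALL
    SELECT ID, EffectiveDate, Amount FROM dbo.Trans_202402;
    GO
    ```

    Archiving a month then becomes dropping that table from the view definition and moving it, rather than deleting a billion rows in place.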

    1. My first question would be: could you post the CREATE TABLE DDL for the SCD 2 table, including any indexes and constraints, please?
    2. How many base tables does the SCD 2 table receive transaction data from?
    3. If this is a "homegrown" bit of auditing, can you post the CREATE TRIGGER DDL for at least one of the tables that captures the transaction data in the SCD 2 table?

    As a bit of a sidebar, I also have such a table. It's ridiculous in nature because it contains the recordings for calls since mid 2010 and they won't let me delete a lick of it. I did (make the mistake of) converting it to a Partitioned Table on a monthly basis using one file/filegroup per month, and I make the legacy months' filegroups (all but the Current Month and the at-the-ready Next Month) READ_ONLY because they'll never change in content. That allows me to "repack" their indexes as tight and small as possible, and to save a huge amount of time by doing one final backup after they've been set to READ_ONLY and then telling my backup procedure not to back up any of the READ_ONLY filegroups.
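
    The READ_ONLY routine described above can be sketched roughly as follows (database, table, filegroup, and path names are all hypothetical):

    ```sql
    -- Once a month's data will never change again: repack its indexes
    -- as tightly as possible while the filegroup is still writable...
    ALTER INDEX ALL ON dbo.Trans_202402 REBUILD WITH (FILLFACTOR = 100);

    -- ...then mark the filegroup read-only...
    ALTER DATABASE MyDb MODIFY FILEGROUP FG_202402 READ_ONLY;

    -- ...and take one final filegroup backup. Future backup jobs can
    -- skip READ_ONLY filegroups entirely, since they cannot change.
    BACKUP DATABASE MyDb
        FILEGROUP = 'FG_202402'
        TO DISK = 'X:\Backups\MyDb_FG_202402.bak';
    ```

    Marking the filegroup READ_ONLY requires exclusive access to the database at the moment of the change, which is usually scheduled into the monthly roll-over.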

    Again, I feel like I drank the Microsoft "Purple Flavoraid" when it comes to Partitioned Tables, and I have it planned to convert it all to a Partitioned View for many reasons. One good reason is during DR restores... you can't run backups on a partially restored Partitioned Table (or at least I've not found a way). Partitioned Views also allow for non-aligned indexing, which Partitioned Tables do allow, but at the expense of losing some of the functionality that I need or jumping through hoops to drop such indexes and then having to rebuild them every month.


    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
