Archinving Data

  • Hi folks,

    I have a general design querstion that I would like to hear you ideas about.

    I currently building DWH, I have come so far that I got a working partion distributed over 24 filgroups and datafiles. The database will reach TB size once it has been up and running for a period.

    To retrie data Im currently using the "Sliding Windows" concept, I create a staging table on the same filegroup a the partion. From the staging table I then transfer the data to a archive database. After that I drop the table again. Once new data is inbound the process repeats it self.

    How ever I was wonder if it would also be and option to skip the above process and tranfser the data directly to the Archive DB and after that delete that data portion from the DWH.

    The reason why Im considering this alternative is that when ever I create a staging table, the database will grow and additional space is required, secondly and less importaint is the fact that the filegrowth will also have an impact on the performace.

    I would just like to know what you think about all this.

    Thanks

  • It's not clear from your note, but I'd stronmgly recommend youy use the partitioning feature in SQL Server 2005. This'll allow you to segment the data easily on an available column (I.e. date) then instantly swap out the data to your archive table as nw data rolls in. You can also use this same technqiue to instantly swap in new data (i.e. insert / prepare to a temp table) if high availability is a concern.

    There's no need to recreate the tables, since the data empties automatically. For the archive table, once you've transferred it to a separate database, you simplly truncate the table.

  • David,

    Thanks for your reply.

    Sorry that I was unclear.

    At the moment I have a warehouse that is filled with data for BI analysis.

    My DW contains data for the past 24 months, this data is than spread in 24 different parts(its an paritioned table).

    When all the 24 partition as filled up I would like to start filling up data in the first parition again. But before I fille up this parition with new data I would like

    to remove the old data. This is where Im not to sure what I should next. I have seen examples where old data is switched out via a staging table and then backuped.

    Another way to remove this old data would to simply transfer it away from the DW to another database(where only very old data is stored).

    So my question is actually if its bad to use the last method I mentioned(transfer the data away instead of using a staging table).

    Best regards

  • Got it (I think):

    I recommend you use SQL Server 2005, Enterprise. You'll be able to define partitions and maitenance will be a snap:

    To extend a new period:

    alter partition scheme [my_Data] next used [primary];

    alter partition function byPeriod() split range ('2007-12-29');

    To swap old data:

    alter table [my_Data] switch partition 1

    to [my_Data_temp] partition 1;

    alter partition function byPeriod() merge range ('2005-12-31')

    This is way easier than the old partitioned views, fast, and much more robust.

    Additional: you might think you need to cluster the data by period, but don't bother. Cluster it based on the best cardinality / usage senario. Typically, you're accessing all periods (for trending), so put period last.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply