Best design for Historical database

  • Hi ,

    In my project we pull in around 10 million records records on a daily basis from Oracle database and store it in SQL server 2005 database . We are maintaining this data as historical data and currently we are not using using this data for any reporting or loading purpose but in future there are requirements for monthly reporting and ETL processes which will use these tables as source tables.

    Currently the database is growing on a daily basis . Please suggest some best designs for historical database.

  • What's bad about that? What's your problem with the situation?

    Assuming you are running short of space, you can always buy a cheap 1+ TB drive and keep pilling up the data, and when you need it then plan a bigger server for the specs you need for the [reporting] project. Just make sure your boss knows you need some planning leeway in that particular case.

    Another way to save space is to maybe move the data to backups on DVDs or other external devides and remove from permanent server... again make sure the proper autorities know that the data is available but that you require some time to access it and that it'll be much slower than the prod server.

  • It depends too much on what you want to do with the data. If you're just storing it as is, no modifications, will that satisfy these future reports? Or do you need to look at generating aggregates as you go or do some other operations on the data to store it in a more efficient manner for historical reporting.

    You have options, but you need to define the needs better in order to more fully explore the options.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Appears to me this is a text book ODS table scenario - a table that will sit there until some day in the future an ETL process starts sourcing from it.

    Assuming source database - Oracle side - ensures uniqueness I would not even consider creating a PK constraint.

    I wouldn't create any indexes. If forced to do it I wouldn't create a clustered one.

    Most probably I'll partition the table in a way that when the table becomes a source for ETL partitioning strategy allows for easy purging - a.k.a. get rid of a whole partition.

    If lots of null values are expected I would take a look at "sparce" option.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply