August 11, 2006 at 4:47 pm
We are putting our first DW together. One of the consultants, not familiar with the MS SQL database, suggested we consider "parititioning" our data for ease or reloading, distaster recovery etc. His idea was that we could parition the data based on, say, year, and that way, in case of a disaster, we would only need to reload that year's worth of data from our backups. (So I guess he is talking about software partitioning, not hardware partitioning?)
I can't imagine how this approach wll work with our nightly backup strategy or the changes that need to be made to the DW import code to to handle these partitons. Furthermore, in case of a disaster how is one to even know what parititon(s) to reload. (I'd be incliined to reload the entire warehouse to make sure I got everything!). I'm lost. Any thoughts you can share on this headache will be much appreciated!
Bill
August 14, 2006 at 3:41 am
Partitioning should possibly be considered for your DW, however there is insufficient information in your post to give a definitive answer but partitioning is not really a solution for reloading or DR
I would suggest a little reading on the subject particularly Kimberly Tripps white paper on partitioned tables and indexes. This can be found on http://www.sqlskills.com and is well worth a read.
I am currently building a DW which will hold 2 years worth of data and the fact table will be in excess of 30 billion rows. Partitioning in this case is essential to allow for incremental data loads, reasonable cube build and processing times and more importantly the speed of removal of data when it is older than 2 years. The partition features allow almost instant removal of old data rather than a delete from the fact table.
August 14, 2006 at 8:17 am
Here here to what Bill said. TO follow up a little on the tech side, if you're using SQL2005 the partitioning of tables is now even easier to implement (again look up Kim Tripps whitepaper on partitioning, she has a good walkthrough). And yes, using SQL to do the partitioning is 'software' partitioning but in general, doing this without some hardware partitioning may not lead to the benefits you seek (ie having physically seperate disks for partitions can seriously improve I/O). Remember too that you can partition your cube data also which can lead to improvements in processing and query speeds.
Steve.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply