October 28, 2013 at 10:31 am
Hi,
I'm fairly certain that I know the answer to this question, but would like some validation nonetheless. A large table, with roughly 7 billion rows and 360GB of storage, receives daily updates via a MERGE statement. Some rows are updated, some are inserted, but none are currently deleted. In my mind, such a table is NOT a candidate for partition switching. Do you concur?
Thanks,
Bennett
October 28, 2013 at 3:10 pm
If you can't guarantee that the data to be merged is focused at the most recent data, you have to consider the following example:
Data2010
Data2011
Data2012
Data2013
What happens in your system if you now try to merge more 2010 data (for whatever reason)?
New rows would appear of course.
Keep in mind: these new rows would appear regardless of if the table is partitioned.
Partitioning might put them into the 2011 dataset (depending on how you set up your partition functions)
But with sizable data like you mention (360GB) - you will need a sufficient maintenance window to process logged deletes. If you don't have that window, partitioning will be your only way to "keep up" with rolling purging.
In the example above, it's really not that bad to "add" more 2010 data. It may hang around for a year, but it will eventually purge.
The merge issue you give would not stop me from partitioning this table if otherwise partitioning was a good fit.
Partitioning does add complexity to managing the system. But it's hard to beat for rolling purges of data if the table qualifies to be partitioned.
Are there other concerns?
October 28, 2013 at 8:38 pm
Hi Jim,
Thank you for your reply. I have a fairly good understanding the process of switching in and switching out. I agree that partitioning would be useful for switching out the old data. What I cannot see is any way to switch in new data since we're bringing it in via a MERGE statement rather than an INSERT statement - or am I overlooking something?
Thanks,
Bennett
October 29, 2013 at 9:46 am
Hey Bennett,
For switching-in partitions - you're dead on.
As soon as you're doing "Updates" (via the MERGE) partition switching won't help.
I you have a way to split the "new data" into two sets there may be value in...
Split 1: just yesterday's data enters the table via partition switching-in.
Split 2: data older than yesterday enters the table via MERGE (hopefully just a handful of rows)
But just doing the MERGE is a lot less complex. The added complexity of the above logic and adding partitioning would be weighted against the speed value provided. It may not be worth it.
-Jim
November 10, 2013 at 12:11 pm
SPAM reported.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply