May 28, 2011 at 3:01 am
Hi,
Not sure if this question fits into "Administering" or "Development" really - I am a developer, but the question seems to need some DBA guru-ness to get the right answer! It's a bit of a conceptual question at the moment, as development is underway but I have not yet had a full dataset available to test against for benchmarking etc. Perhaps I am asking a little prematurely, but I want to have a good design in place before the development is done!
The scenario is that we have a table of circa 50 million records. Daily (via an overnight SSIS job) we import approximately 1/100th of that volume and archive the same amount into another table - i.e. we import a day's worth of data from various sources and archive a day's worth off - so we have a rolling 100 days of live data.
The ongoing issue is balancing import efficiency with end-user query efficiency via the GUI. Currently we drop the indexes prior to the import and re-apply them at the end - I've not yet benchmarked the performance of this against leaving the indexes in place, as I don't have a full data set just yet - but I am aware that as we reach the table's full capacity the gain in doing this may be lost, and it may in fact be quicker to insert with the indexes in place.
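For reference, the drop/re-apply step is currently something along these lines (the table and index names here are made up for illustration, not the real schema):

-- Drop the nonclustered index before the overnight bulk load
DROP INDEX IX_Readings_ReadingDate ON dbo.Readings;

-- ...the SSIS bulk insert runs here...

-- Re-create the index once the load has completed
CREATE NONCLUSTERED INDEX IX_Readings_ReadingDate
    ON dbo.Readings (ReadingDate);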
However, reading around the subject, the idea of partitioning the table seems to be a possible solution. Or is it? The idea of loading each day's data into an empty table - so we get a rapid load - and then adding it to the partition scheme with indexes - so we get efficient search queries - seems to be nirvana for the task in hand!
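Something like this is what I have in mind for the daily load (again, all object names and the partition number are made up for illustration):

-- Load the day's data into an empty, index-free staging table (fast insert)
INSERT INTO dbo.Readings_Staging (ReadingDate, SourceId, Value)
SELECT ReadingDate, SourceId, Value
FROM   dbo.Readings_Import;

-- The staging table must match the partitioned table's structure and indexes,
-- live on the same filegroup as the target partition, and have a trusted CHECK
-- constraint covering that partition's date range. The switch itself is then
-- a metadata-only operation.
ALTER TABLE dbo.Readings_Staging
    SWITCH TO dbo.Readings PARTITION 100;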
However, does that approach - adding partitions on a daily basis - fit well with the notion of partitioning? Would it in turn become unmanageable or resource hungry? Would there in fact be a better partitioning scheme to use for such a problem?
One beneficial side effect of partitioning could be that the 'archiving' aspect is no longer required - all the data could be kept in one table across many partitions?
Sorry for the brain dump and the bombardment of questions - I realise this is not an exact science, but pointers or general ideas from a DBA point of view would be great.
I've not even started figuring out how the partitions should handle indexes to keep them efficient... a major search criterion for the GUI is date based, so an index on the date field would be required. However, although the import is daily, the data in each import actually spans several days (as the imports come from different sources at different times)... thus the main index would be spread across partitions. From what I've read this may be an issue... not sure?!
Many thanks in advance for any help!
Cheers!
May 30, 2011 at 3:30 am
Table partitioning is a good new feature in SQL Server. When you partition a table you need some kind of policy, for example next year's data must go into FG2012, or a total number of rows, e.g. 1,000,000 rows in FG20 - here FG means filegroup.
Partitioning a table should increase query performance - your DML response times should come down, and SELECT response times should come down as well.
When you perform DML on a partitioned table and the table's lock escalation is set to AUTO rather than TABLE, SQL Server will lock only the particular partition, so you can perform DML against the other partitions at the same time - and the same applies in the SELECT scenario.
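For reference, that lock escalation setting is per table - a one-line sketch with a placeholder table name:

ALTER TABLE dbo.Readings SET (LOCK_ESCALATION = AUTO);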
In your scenario you know which filegroup or partition the new data will land in, so you can query that particular partition directly rather than the whole table - this is where your import time will come down.
My recommendation is to go with yearly partitions, e.g. FG2011, FG2012, FG2013, etc.
Roughly how many rows per year would this table hold?
How to create partitions of a table in SQL Server
=============================
http://blogs.msdn.com/b/manisblog/archive/2009/01/18/easy-table-partitions-with-sql-server-2008.aspx
http://technet.microsoft.com/en-us/library/dd578580(SQL.100).aspx
http://msdn.microsoft.com/en-us/library/ms188730.aspx
Partition Function and Partition Scheme
=========================
http://msdn.microsoft.com/en-us/library/ms187802.aspx
http://msdn.microsoft.com/en-us/library/ms179854.aspx
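A minimal sketch of a yearly partition function and scheme along these lines (the filegroup, column and object names are placeholders, and the filegroups would need to exist first):

-- Three boundary values with RANGE RIGHT give four partitions,
-- so the scheme needs four filegroups
CREATE PARTITION FUNCTION pfReadingsByYear (datetime)
AS RANGE RIGHT FOR VALUES ('20110101', '20120101', '20130101');

CREATE PARTITION SCHEME psReadingsByYear
AS PARTITION pfReadingsByYear
TO (FG2010, FG2011, FG2012, FG2013);

-- The table (or its clustered index) is then created on the scheme,
-- partitioned by the date column
CREATE TABLE dbo.Readings
(
    ReadingDate datetime      NOT NULL,
    SourceId    int           NOT NULL,
    Value       decimal(18,4) NULL
) ON psReadingsByYear (ReadingDate);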
Regards,
Syed Jahanzaib Bin Hassan
BSCS | MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA
My Blog
www.aureus-salah.com
May 31, 2011 at 12:21 am
1) Syed - this person stated they have 100 rolling days of data (a day comes in, a day drops off). Why would you recommend YEARLY partitioning??
2) Partitioning CAN be great for efficient data loading and movement. That is really what it was created for.
3) Partitioning can ALSO be great for query performance. But you gotta do things right and you gotta have queries that can offer up partition elimination (see the sketch after this list).
4) Partitioning is an ADVANCED feature with MANY opportunities to go wrong - perhaps badly so. PLEASE consider getting a qualified professional on board for a bit to guide you along, or at least do some very rigorous study and testing before you make the switch to production.
5) Partitioning can act as an 'archive' set with all the data in the same table, just 'off to the side' in a partition constraining its data to 'old crap'. I still like true archival, and better yet data pruning! 😀
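To illustrate point 3, partition elimination relies on filtering directly on the partitioning column. A rough sketch, assuming a table partitioned on a ReadingDate column (all names are placeholders):

-- Filtering on the raw partitioning column lets the optimizer touch only
-- the partitions that can contain matching rows
SELECT SourceId, Value
FROM   dbo.Readings
WHERE  ReadingDate >= '20110501'
AND    ReadingDate <  '20110502';

-- Note: wrapping ReadingDate in expressions or functions in the WHERE clause
-- will generally stop the optimizer from eliminating partitions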
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
May 31, 2011 at 2:29 am
Hi guys - thanks for the replies. All good stuff, and it confirms to me that a) things are never quite as straightforward as we hope, and b) a partitioning solution would need a bit more thought, design and testing prior to implementation than the project will probably allow. As normal!
Regarding the rolling 100 days - that really is the current solution, put in place to avoid the perceived problem of storing and retrieving that quantity of data. So Syed is possibly right in that we could move to a yearly partition scheme (and therefore implicitly have the partitioning perform the archiving) - but of course Mr Guru is correct that that would be a change to the current scenario. I also think I agree that a separate archiving policy would be a better solution for this particular project... particularly with some associated vigorous data pruning!
I may, and hope to, get some time to revisit this sort of solution, but I think for now it may be left for the wise DBAs to suggest and manage... where's the irony smiley when you need it?
May 31, 2011 at 8:33 pm
If you don't have the Enterprise Edition, Table Partitioning isn't an option (IIRC).
You could pull off a similar trick with a "Partitioned View". Look for it in Books Online.
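Roughly speaking, it's one physical table per period, each with a CHECK constraint on the date column, UNION ALL'd together in a view. A minimal sketch with made-up names:

CREATE TABLE dbo.Readings_201104
(
    ReadingDate datetime NOT NULL
        CONSTRAINT CK_Readings_201104
        CHECK (ReadingDate >= '20110401' AND ReadingDate < '20110501'),
    SourceId    int           NOT NULL,
    Value       decimal(18,4) NULL
);
GO
-- ...one table per period, with non-overlapping CHECK constraints...

CREATE VIEW dbo.vReadings
AS
SELECT ReadingDate, SourceId, Value FROM dbo.Readings_201104
UNION ALL
SELECT ReadingDate, SourceId, Value FROM dbo.Readings_201105;
GO

-- Queries that filter on ReadingDate let the optimizer skip the member
-- tables whose CHECK constraints can't match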
--Jeff Moden
Change is inevitable... Change for the better is not.
June 1, 2011 at 11:27 am
You can use a strategy called the 'sliding window' to make the import and archiving faster and much easier to automate.
You can read about it here.
http://technet.microsoft.com/en-us/library/aa964122(SQL.90).aspx
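In outline, the daily cycle looks something like this (object names, partition numbers and boundary dates are placeholders):

-- 1. Switch the oldest partition out into an empty archive table
ALTER TABLE dbo.Readings SWITCH PARTITION 1 TO dbo.Readings_Archive;

-- 2. Remove the now-empty boundary at the old end of the partition function
ALTER PARTITION FUNCTION pfReadingsByDay() MERGE RANGE ('20110218');

-- 3. Add a new boundary at the new end, first telling the scheme which
--    filegroup the new partition should use
ALTER PARTITION SCHEME psReadingsByDay NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pfReadingsByDay() SPLIT RANGE ('20110529');

-- 4. Switch the freshly loaded staging table in as the newest partition
ALTER TABLE dbo.Readings_Staging SWITCH TO dbo.Readings PARTITION 100;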
-------------------------------------------------
Will C,
MCITP 2008 Database Admin, Developer