December 22, 2015 at 1:09 am
Hi,
I have a large table of over 400Million rows with a daily addition of 10Million rows.
Data will be added through packages and there will be no other manipulations on table.
Could you suggest me the best partition boundaries i.e., Monthly, Quarterly, HalfYear, Year etc..
Thanks,
Naren
December 22, 2015 at 1:12 am
Why are you partitioning?
If you're partitioning for data loads and data archives, then the granularity of the data imports will determine what you want to partition by.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 22, 2015 at 1:47 am
I understand.. The issue is, before we insert data into the table, we are comparing with the existing data.
Also, this table is used for reporting purposes etc..
December 22, 2015 at 1:50 am
For this reason the package is taking 4+ hours. If we estimate the table size for a couple of years, can the table be able to handle those many millions of rows
December 22, 2015 at 1:57 am
Ok, so why are you looking at partitioning?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 22, 2015 at 3:03 am
Hmm..
I believe partitioning the table will boost up the performance for reporting purposes.
Will partition gain performance for data loads..?
Also, as the table is growing bigger and bigger, am concerned about the table size.
December 22, 2015 at 3:05 am
Granularity of the data imports - - The granular level i see is day. You mean to go ahead with daily partitions.
After the partitions reaches certain limit delete the partitions and archive the data. Is that..?
December 22, 2015 at 3:42 am
Narendra-274001 (12/22/2015)
I believe partitioning the table will boost up the performance for reporting purposes.
Very unlikely. Partitioning is not for performance. It's for data management.
Will partition gain performance for data loads..?
Providing it's a straight insert of an entire partition, yes, via partition switching
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 22, 2015 at 4:46 am
GilaMonster (12/22/2015)
Narendra-274001 (12/22/2015)
I believe partitioning the table will boost up the performance for reporting purposes.Very unlikely. Partitioning is not for performance. It's for data management.
Will partition gain performance for data loads..?
Providing it's a straight insert of an entire partition, yes, via partition switching
Is there a very very narrow case somewhere when partition elimination works better than plain old vanilla proper design of table, choice of clustering and indexes?
Reason I am asking is because so many many many people think partitioning is for performance, so maybe I am missing something here. (maybe they are falling for the ad populum fallacy).
So far, it has only helped me rebuild index partitions on the last changing partition, reducing the maintainance window, giving me extreme savings in my rebuild window, but no other improvement.
December 22, 2015 at 5:27 am
MadAdmin (12/22/2015)
So far, it has only helped me rebuild index partitions on the last changing partition, reducing the maintainance window, giving me extreme savings in my rebuild window, but no other improvement.
Yup, that's what it's there for.
Unfortunately Books Online specifies that partitioning is for manageability and performance, which is probably why people think that 'partitioned table = faster', which is not the case in most circumstances.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 22, 2015 at 7:53 am
How about the - Having millions and millions of rows in a single partition
If we assume the data count for another 4 years we will be ending up with 3000 Million of rows.
We can archive the data based on business requirement.
While archiving, do you suggest Partitioning or archive data to a New table.
And yes, Here I'm speaking partitioning w.r.t Data Management. In this scenario would you suggest Month,Quarter,Twiceyearly,year.
December 22, 2015 at 8:41 am
1) Narendra, let me start by saying that I have NEVER had a client try to do partitioning with no prior experience and without getting a professional to help them have a good experience. Forums are littered with those that have tried and failed to accomplish their objectives too (and often make things WORSE than if they had just left things alone). Partitioning is a COMPLEX subsystem and it can not be slapped in successfully based on some forum Q & A. LOTS of analyses must be done on many fronts to be successful!!!
2) As Gail has said, partitioning was introduced primarily for data management and data loading on LARGE tables.
3) I have worked on systems with billions of rows in tables and they perform just fine when properly tuned, maintained and on adequate hardward.
4) Partitioning CAN make PROPERLY DESIGNED report queries go much faster when partition elimination can allow said queries to hit a small fraction of the total data. There is no guarantee that ANY of your queries can benefit from this.
Please do yourself and your company a favor and seek professional guidance on this matter. It very well may be discovered that you don't need partitioning. Or that your hardware is completely incapable of managing and serving up 3B rows in any construct. Or that your reports need to be altered to be more efficient, etc., etc.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
December 22, 2015 at 9:03 am
TheSQLGuru (12/22/2015)
Partitioning CAN make PROPERLY DESIGNED report queries go much faster when partition elimination can allow said queries to hit a small fraction of the total data. There is no guarantee that ANY of your queries can benefit from this.
To be honest, I've never seen that to be true when compared to a properly indexed monolithic table and a good query. The only place I've ever seen it help with performance is on poorly formed code, SWITCHing in and out, and index maintenance on temporally partitioned tables that suffer few, if any, updates on earlier partitions. Not saying that it couldn't happen but I've not seen it happen so far and it actually makes sense why it wouldn't (except for junk code) and so have my doubts that it ever actually helps properly.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 22, 2015 at 12:10 pm
Narendra-274001 (12/22/2015)
How about the - Having millions and millions of rows in a single partition
Not by itself an indication for partitioning. Now, if you want to use fast swithing for the archiving, that would be a reason to partition. But the raw rowcount alone is not
In this scenario would you suggest Month,Quarter,Twiceyearly,year.
The only person who can answer that is you. Would you archive monthly, quarterly, yearly? The answer to that, along with the answer to how often you load data will determine your partition boundaries.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 22, 2015 at 12:43 pm
Jeff Moden (12/22/2015)
TheSQLGuru (12/22/2015)
Partitioning CAN make PROPERLY DESIGNED report queries go much faster when partition elimination can allow said queries to hit a small fraction of the total data. There is no guarantee that ANY of your queries can benefit from this.To be honest, I've never seen that to be true when compared to a properly indexed monolithic table and a good query. The only place I've ever seen it help with performance is on poorly formed code, SWITCHing in and out, and index maintenance on temporally partitioned tables that suffer few, if any, updates on earlier partitions. Not saying that it couldn't happen but I've not seen it happen so far and it actually makes sense why it wouldn't (except for junk code) and so have my doubts that it ever actually helps properly.
There are three cases that come to mind:
A) Concurrent access where partition-level locking lets multiple queries run at the same time because they don't block other partition accesses. This is obviously in systems where NOLOCK isn't the default mode for SELECTs and some updates may occur to the data during reporting periods.
B) Tables where the number of rows and/or sizes of keys leads to exceptionally deep indexes over the entire table. The N extra page reads per seek can add up. And yes, there are access patterns on large fact tables at are most efficient using non-clustered index seeks as opposed to scans.
C) Not a report query improvement, but various maintenance activities that can be done at the partition level instead of across the entire table can win as well.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply