February 5, 2014 at 1:51 pm
Simple question.
The partitioning scheme with indices calls for about 1100 file groups per year (one file per file group), and there are 20 years of data to start. This is a rather large number of files, but there are also 100 billion rows, so the amount of data is non-trivial.
Are there any architectural or other practical limitations to the number of file groups and files?
February 5, 2014 at 2:27 pm
Oh dear.
Msg 7719, Level 16, State 1, Line 1
CREATE/ALTER partition function failed as only a maximum of 1000 partitions can be created.
I can see now that the SQL2012 docs have included the maximum of 15000, so that should do me nicely.
return.
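For anyone hitting the same wall, the Msg 7719 above can be reproduced (or, on a build with 15,000-partition support, avoided) by generating a boundary list larger than 1,000 values. This is only an illustrative sketch; the function name is invented:

```sql
-- Hypothetical repro: build a partition function with 1,100 boundary points.
-- On builds capped at 1,000 partitions this fails with Msg 7719;
-- on builds that support 15,000 partitions it succeeds.
DECLARE @boundaries nvarchar(max) = N'';
DECLARE @i int = 1;
WHILE @i <= 1100
BEGIN
    SET @boundaries += CASE WHEN @i > 1 THEN N', ' ELSE N'' END
                     + CAST(@i AS nvarchar(10));
    SET @i += 1;
END;

DECLARE @sql nvarchar(max) =
    N'CREATE PARTITION FUNCTION pf_yearday_demo (int)
      AS RANGE RIGHT FOR VALUES (' + @boundaries + N');';
EXEC sys.sp_executesql @sql;
```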
February 5, 2014 at 2:43 pm
That is an absolutely staggering amount of data to me. I work at about 1/100 that scale - 1B rows and about 80 partitions/filegroups for my largest project with a lot of fast-moving data and almost zero downtime allowance.
I've subscribed to this thread to see if someone with VLDB experience chimes in - I'm genuinely curious as to how to manage this much data.
22,000 partitions and file groups would have to be managed with automation or very well tested scripting. And so would the backups. 🙂
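For what it's worth, the filegroup/file side of that automation tends to be straightforward string generation. A minimal sketch, assuming a database named BigDB and placeholder paths and sizes (none of these names come from the thread):

```sql
-- Hypothetical sketch: one filegroup + one data file per day-partition.
-- Database name, path, and sizes are placeholders.
DECLARE @d date = '2014-01-01', @end date = '2014-12-31';
WHILE @d <= @end
BEGIN
    DECLARE @fg sysname = N'FG_' + CONVERT(nchar(8), @d, 112);  -- e.g. FG_20140101
    DECLARE @sql nvarchar(max) =
        N'ALTER DATABASE BigDB ADD FILEGROUP ' + QUOTENAME(@fg) + N';
          ALTER DATABASE BigDB ADD FILE (
              NAME = ' + QUOTENAME(@fg + N'_data', '''') + N',
              FILENAME = ''D:\Data\' + @fg + N'.ndf'',
              SIZE = 512MB, FILEGROWTH = 256MB
          ) TO FILEGROUP ' + QUOTENAME(@fg) + N';';
    EXEC sys.sp_executesql @sql;
    SET @d = DATEADD(day, 1, @d);
END;
```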
February 5, 2014 at 3:20 pm
sqlcentral2 (2/5/2014)
Simple question. The partitioning scheme with indices calls for about 1100 file groups per year (one file per file group), and there are 20 years of data to start. This is a rather large number of files, but there are also 100 billion rows, so the amount of data is non-trivial.
Are there any architectural or other practical limitations to the number of file groups and files?
Yeah... not to mention a shedload of other nuances and caveats.
First, look up "Maximum Capacity Specifications" in Books Online.
Second, ask yourself why you think you need 3 partitions per day, especially for data that's more than a year old.
To be honest, it sounds like you and your team are just now trying partitioning and that you haven't done very much actual planning. For example, have you actually planned how to rebuild the clustered index on a 100 billion row table and get it done the same month? 😉 Have you actually checked on what happens to a unique index in a partitioned table? There are dozens of limitations and restrictions. You and your team should spend the next week studying all of them and the week after that re-studying them, because there are so many.
--Jeff Moden
Change is inevitable... Change for the better is not.
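The unique-index caveat mentioned above can be illustrated in miniature. All object names below are invented; the key point is that a unique index created on a partition scheme must include the partitioning column in its key, otherwise SQL Server rejects it (Msg 1908):

```sql
-- Hypothetical illustration of the unique-index restriction on partitioned tables.
CREATE PARTITION FUNCTION pf_demo (int) AS RANGE RIGHT FOR VALUES (1, 2, 3);
CREATE PARTITION SCHEME ps_demo AS PARTITION pf_demo ALL TO ([PRIMARY]);

CREATE TABLE dbo.Events (
    EventId bigint NOT NULL,
    YearDay int    NOT NULL
) ON ps_demo (YearDay);

-- Fails with Msg 1908: the partitioning column (YearDay) must be part of
-- the key of a unique index aligned to the partition scheme.
-- CREATE UNIQUE INDEX ux_bad ON dbo.Events (EventId) ON ps_demo (YearDay);

-- Works, but uniqueness is now enforced on (EventId, YearDay),
-- not on EventId alone.
CREATE UNIQUE INDEX ux_ok ON dbo.Events (EventId, YearDay) ON ps_demo (YearDay);
```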
February 6, 2014 at 2:43 am
markjholmes (2/5/2014)
That is an absolutely staggering amount of data to me. I work at about 1/100 that scale - 1B rows and about 80 partitions/filegroups for my largest project with a lot of fast-moving data and almost zero downtime allowance. I've subscribed to this thread to see if someone with VLDB experience chimes in - I'm genuinely curious as to how to manage this much data.
22,000 partitions and file groups would have to be managed with automation or very well tested scripting. And so would the backups. 🙂
Welcome to the real world where 99% of the advice that is contained in forum posts is completely useless.
Your assumption "would have to be managed with automation" is incorrect.
While time might be "an illusion", as far as we can determine it only moves forward, in this case a useful attribute.
February 6, 2014 at 2:55 am
Jeff Moden (2/5/2014)
sqlcentral2 (2/5/2014)
Simple question. The partitioning scheme with indices calls for about 1100 file groups per year (one file per file group), and there are 20 years of data to start. This is a rather large number of files, but there are also 100 billion rows, so the amount of data is non-trivial.
Are there any architectural or other practical limitations to the number of file groups and files?
Yeah... not to mention a shedload of other nuances and caveats.
First, look up "Maximum Capacity Specifications" in Books Online.
Second, ask yourself why you think you need 3 partitions per day, especially for data that's more than a year old.
To be honest, it sounds like you and your team are just now trying partitioning and that you haven't done very much actual planning. For example, have you actually planned how to rebuild the clustered index on a 100 billion row table and get it done the same month? 😉 Have you actually checked on what happens to a unique index in a partitioned table? There are dozens of limitations and restrictions. You and your team should spend the next week studying all of them and the week after that re-studying them, because there are so many.
0: Thank you, I found the list. As I noted before, I discovered that the SQL 2012 docs do reference the maximum partition function limitation. There are no other restrictions in the list that seem relevant, other than the 32K file and file group limit, which we hopefully will not reach in my lifetime.
1: You have no idea who I am or my level of experience, so addressing my post as if I am clueless is not really appropriate. Since I currently manage a set of 24/7 databases with a data arrival volume of almost 100 million rows/day and peak arrival rates of 3,000/sec, there is a good chance I have a clue and that I might have used partitioning once or twice before.
2: The partitioning scheme is simple (an integer representing yearday) and does not have to be managed, merely created. For each day, 1 file for data and 1 for an index (perhaps 2 indices - we may avoid that yet, but that changes nothing).
3: I don't really care how long it takes to build, as long as it eventually gets built. I only have to do it once. What on earth makes you think I need it this month or even this year? If I already have tens of billions of rows, do you think they might already be usefully partitioned and in use? I am not embarking on this for health reasons; there are clear commercial and technical reasons.
4: The optimizer has a lot of issues with partitioned tables. I have dealt with the ones that are relevant so far (see 1. above). Feel free to chime in with information that might actually be useful.
If you had wanted to be helpful, you might have given a reference/link detailing these ".. dozens of limitations and restrictions", or perhaps you might have enlightened me (and our now breathless follower(s)) exactly what "...happens to a unique index in a partitioned table" that might be relevant for large numbers of partitions and/or high cardinality.
I don't claim to know everything, so feel free to share your superior wisdom.
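The yearday scheme described in point 2 above might look like this in miniature, with three days instead of ~1,100 per year. All names here are invented, and the per-day filegroups are assumed to exist already:

```sql
-- Miniature "yearday" scheme: an integer key such as 20140102,
-- one filegroup per day (FG_20140101..FG_20140103 assumed to exist).
CREATE PARTITION FUNCTION pf_yearday (int)
AS RANGE RIGHT FOR VALUES (20140102, 20140103);  -- 2 boundaries -> 3 partitions

CREATE PARTITION SCHEME ps_yearday
AS PARTITION pf_yearday
TO (FG_20140101, FG_20140102, FG_20140103);

CREATE TABLE dbo.Readings (
    YearDay int   NOT NULL,  -- e.g. 20140102
    Reading float NOT NULL
) ON ps_yearday (YearDay);
```

Once created, a scheme like this needs no per-day management: each row lands in its day's filegroup by virtue of the YearDay value alone.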
February 6, 2014 at 4:20 am
sqlcentral2 (2/5/2014)
Oh dear. Msg 7719, Level 16, State 1, Line 1
CREATE/ALTER partition function failed as only a maximum of 1000 partitions can be created.
I can see now that the SQL2012 docs have included the maximum of 15000, so that should do me nicely.
return.
The comments stand with regard to the number of partitions and whether they are actually needed.
However, the limit on the number of partitions can be raised to 15000 as of SQL Server 2008 SP2 and SQL Server 2008 R2 SP1.
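On those builds the higher limit is opt-in per database (it became the default behaviour in SQL Server 2012); the database name below is a placeholder:

```sql
-- Enable 15,000-partition support for one database.
-- SQL Server 2008 SP2 / 2008 R2 SP1 only; on by default in SQL Server 2012+.
EXEC sp_db_increased_partitions @dbname = N'BigDB',
                                @increased_partitions = 'ON';
```

Note that once enabled, the database can no longer be restored or attached on builds without 15,000-partition support, so the switch is worth testing in a non-production copy first.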
February 6, 2014 at 6:03 am
Thanks for that info - that's news. I will investigate that possibility.
The poster's comment about the number of partitions can stand forever; it will still be wrong. I have simply not made sufficient data available about the application, the use of the actual data, or the structure of the data for anyone to make an intelligent guess. The comment is the result of a "best practice" mentality without reference to anything - never likely to reach a satisfactory result.
Now, I'm off to technet for a little read.
February 6, 2014 at 6:12 am
sqlcentral2 (2/6/2014)
Thanks for that info - that's news. I will investigate that possibility. The poster's comment about the number of partitions can stand forever; it will still be wrong. I have simply not made sufficient data available about the application, the use of the actual data, or the structure of the data for anyone to make an intelligent guess. The comment is the result of a "best practice" mentality without reference to anything - never likely to reach a satisfactory result.
Now, I'm off to technet for a little read.
I think you have taken things a little too personally. The fact that you didn't know about my reference proves that we can all learn and share things among the community, and that perhaps you would benefit by accepting that you do not know everything about SQL Server.
This forum has a massive amount of accumulated knowledge and experience amongst its members.
I'm glad I have helped in some way from my many years of experience.
February 7, 2014 at 2:13 am
a) I thanked you politely for the reference.
b) I have not indexed the internet in my head and neither have you.
c) "it proves" that I didn't "google" well enough
d) Which part of "I don't claim to know everything..." didn't you understand?
It's not personal at all, I called BS on you. Get over it.
ps: Had you re-read the article you referenced, you would have found a case example of why large partition counts are needed/used.
February 7, 2014 at 9:27 am
sqlcentral2 (2/6/2014)
Jeff Moden (2/5/2014)
sqlcentral2 (2/5/2014)
Simple question. The partitioning scheme with indices calls for about 1100 file groups per year (one file per file group), and there are 20 years of data to start. This is a rather large number of files, but there are also 100 billion rows, so the amount of data is non-trivial.
Are there any architectural or other practical limitations to the number of file groups and files?
Yeah... not to mention a shedload of other nuances and caveats.
First, look up "Maximum Capacity Specifications" in Books Online.
Second, ask yourself why you think you need 3 partitions per day, especially for data that's more than a year old.
To be honest, it sounds like you and your team are just now trying partitioning and that you haven't done very much actual planning. For example, have you actually planned how to rebuild the clustered index on a 100 billion row table and get it done the same month? 😉 Have you actually checked on what happens to a unique index in a partitioned table? There are dozens of limitations and restrictions. You and your team should spend the next week studying all of them and the week after that re-studying them, because there are so many.
0: Thank you, I found the list. As I noted before, I discovered that the SQL 2012 docs do reference the maximum partition function limitation. There are no other restrictions in the list that seem relevant, other than the 32K file and file group limit, which we hopefully will not reach in my lifetime.
1: You have no idea who I am or my level of experience, so addressing my post as if I am clueless is not really appropriate. Since I currently manage a set of 24/7 databases with a data arrival volume of almost 100 million rows/day and peak arrival rates of 3,000/sec, there is a good chance I have a clue and that I might have used partitioning once or twice before.
2: The partitioning scheme is simple (an integer representing yearday) and does not have to be managed, merely created. For each day, 1 file for data and 1 for an index (perhaps 2 indices - we may avoid that yet, but that changes nothing).
3: I don't really care how long it takes to build, as long as it eventually gets built. I only have to do it once. What on earth makes you think I need it this month or even this year? If I already have tens of billions of rows, do you think they might already be usefully partitioned and in use? I am not embarking on this for health reasons; there are clear commercial and technical reasons.
4: The optimizer has a lot of issues with partitioned tables. I have dealt with the ones that are relevant so far (see 1. above). Feel free to chime in with information that might actually be useful.
If you had wanted to be helpful, you might have given a reference/link detailing these ".. dozens of limitations and restrictions", or perhaps you might have enlightened me (and our now breathless follower(s)) exactly what "...happens to a unique index in a partitioned table" that might be relevant for large numbers of partitions and/or high cardinality.
I don't claim to know everything, so feel free to share your superior wisdom.
Heh... I'm not addressing the post as though you're clueless. Because of what you posted, I'm addressing it as you not knowing much about partitioning, and trying to give you a heads-up on how complicated it can actually be. No need to be offended.
The reason I didn't post any links is because Books Online is absolutely your best bet here and there are dozens of Books Online articles that you're going to have to spend a good amount of time reading. Since you are a pro, I figured that you'd appreciate the heads up. Guess I made a mistake there.
Please drop the overly sensitive/defensive attitude. It's not helping here.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 7, 2014 at 9:58 am
sqlcentral2 (2/7/2014)
a) I thanked you politely for the reference. b) I have not indexed the internet in my head and neither have you.
c) "it proves" that I didn't "google" well enough
d) Which part of "I don't claim to know everything..." didn't you understand?
It's not personal at all, I called BS on you. Get over it.
ps: Had you re-read the article you referenced, you would have found a case example of why large partition counts are needed/used.
This attitude of yours, very professional.
You have no idea of my own experience, so to pass judgment on my decades of DB experience is rather shortsighted, and with that attitude it's likely you'll get very few replies in the future.
To answer your statement: I never said there was never a case for many partitions, just that it's worth considering whether they're worth the administrative overhead.