April 23, 2007 at 3:29 pm
Last week at the Code Camp in Colorado, Kevin Cox of Microsoft talked about some of the work his team is doing on 100+TB databases with SQL Server. He mentioned that they really need to have multiple filegroups and set some read-only to ensure that you can easily recover or restore the databases when necessary.
So when I saw this blog about too many filegroups, it got me thinking and almost makes me worry about SQL Server scalability. But if you dig in, then it appears that it's not a significant amount of time until you have thousands of filegroups. In fact, 2,000 filegroups take 29 seconds to start up and then 5,000 groups take 74 seconds. I'm not surprised that it takes longer, though I'm not sure who really needs 5,000 filegroups. If you do, then let me know.
The challenges with very large databases are many and I can understand partitioning out into dozens of filegroups along logical lines, especially if you can isolate out read-only data. I've never really had to deal with extremely large amounts of data, but I know that just thinking about being unable to easily back up or recover a database is something that worries me.
When people asked what the Code Camp speakers' favorite features of SQL Server were, my answer was the backup and restore processes. I've been using these for over a decade and they've been rock solid for me. I've depended on them and never had an issue. So I was interested to hear how you deal with very large systems. We didn't get details on all aspects, but using read-only filegroups, which you then don't have to back up, was one technique.
On the surface that sounds like a good idea, save space, reduce the backup window, etc. However it's not quite so simple. If you actually needed to restore the whole system, then you need a copy of that read only filegroup. And if all of your tapes that have been made in the last year don't have a copy, that's bad. Just the logistics of keeping copies of your data handy is important.
Especially if you have a smaller, maybe 500GB database around. If you have some amount of read-only data, even if it's 10MB, be sure you keep copies handy or even make extra copies on a regular basis.
Steve Jones
April 23, 2007 at 7:43 pm
This is very good subject that can get very complex because you have to calculate the size and growth relative to your san capacity. Paul is very good but it is not easy to create all that file groups because you have to monitor file growth for none read only data. I am assuming that is the reason Microsoft provide very informative documentation that is very light on indepth explanations of the file groups on MSDN.
And I agree Backup and Restore is the best followed closely by the Agent, I am hoping it will one day be an automation subsystem without Replication tasks.
Kind regards,
Gift Peddie
April 24, 2007 at 12:20 am
File groups are key to IO performance, having all your tables stuck in one 2.5 TeraByte file doesnt help I can tell you and trying to move tables between filegroups in a production system is a nightmare....
You may also hit SAN issues when restoring/backing up very large files (cache issues, firmware, etc. )
Most of the problems these days are associated with someone else setting up the server and creating various drive letters and then the DBA not understanding what logical drives sit behind that.
April 24, 2007 at 4:15 am
Hi Steve
Slightly off topic,
But it has interested me that in your last three posts you have had the sentence “Last week at Code Camp”. I say this because when you sent your mail on the 16th (UK date) I got the impression that you were looking forward to this event only slightly more than say…going to the dentist.
One of your closing sentences being - “I'm still struggling to really find a few good ways to get people interested”. It appears to me from the latest posts you have sent out into the SQL world, that something got you interested and maybe you have answered your own question.
Don’t get me wrong buddy, I am not having a go. Whilst I am writing I would like to throw in a bit about my relationship with this site. I work for a company in the UK who have an “off the shelf” database with an SQL back-end (Goldmine). I found this site a few years back when trying to learn some tricks on how to query a base. Three years later, I still subscribe to your site, but to be honest 99% of what you tell me is completely over my head (I like the 1% cars tho). When I come in of a morning my usual routine is to make a coffee, and then sit and read your mail, I think in three years I have maybe got three of your QOD’s right.
So although SQL and this site interest me a lot, I think it is fair to say, the old red-gate would probably be hanging on its last rusty hinge if it relied on the likes of me.
Anyhow back to my point, maybe you would like to expand on your code camp experiences and let us know what got you so enthused.
Cheers
Wayne
April 24, 2007 at 5:53 am
I know that there must be many of you that are concerned about jumongo databases. And I enjoy reading about them, but unless something changes in my world, it's not an issue.
This reminds me a little of the days when articles in the trade discussed database engines and the need for central data storage when hardly any of the companies I knew had a network:-)
And, I appreciated Wayne's code camp comments... 🙂
April 24, 2007 at 6:31 am
Another interesting problem with a 100 GB database and a 3rd party backup and restore utility. The restore took about an hour ONCE IT GOT STARTED. The backup had been copied to the local disk, and the 3rd party utility took 3 hours to find the start of the back and begin the restore.
Sure limits your testing when you expect something to take 1-2 hours, and it winds up taking 4.
April 24, 2007 at 7:17 am
Interesting points and thanks for the comments.
I know most of us live in the world of < 1TB databases and actually it's prob < 100GB for the vast majority. A survey of the 80+ people at Code Camp showed the largest db at 650GB and very few people over 100GB.
It's not so much that we need these techniques as they show that SQL Server is capable of scaling. If you start with a 50GB DB and 100 people, you shouldn't be worried about scaling to 4TB if you get there. It's mainly for the boss to get him to relax.
On the Code Camp. I wasn't looking forward to it that much, but I did enjoy it. However I didn't really answer my question. It was still hard at the Code Camp and user groups (I went to 3 meetings) to get people interested. I think the large turnouts were mainly because Kalen was speaking, but I doubt we can get her to all the Code Camps or meetings
I'll write more about that and I have a couple things in the work. Just struggling to get out what I want to say.
Steve
April 24, 2007 at 9:52 am
How does this article
http://support.microsoft.com/kb/304101
fit into the discussion for those of us with databases over 160 gig? The third party backup software is choking on that database in windows 2003/ sqlserver 2005.
Are these registry changes really necessary?
April 24, 2007 at 10:49 am
Carolyn,
That's interesting. I've only had a few datbases over 200GB, but I've never seen this issue, even on W2K. It seems like a file handle thing, which could be an issue. If it's affecting you, I don't think the registry changes are that impactful on your system, though more memory will go to OS tasks, so be sure that you watch it on your system.
May 2, 2007 at 2:14 pm
At my previous company I had a couple of databases in the 150-250 GB range, always just used SQL to back them up to disk and never had a problem. They had tried before I got there to back them up using Veritas, when I started there I quickly found out they had not had a good backup in months.
I never particularly wanted to work with a huge database. After listening to a friend that was the DBA for a company that in 2000 had a 24 TB database on SQL, I never felt a great desire to deal with it....
To me it was kind of like going to school for flying. Everyone wanted to fly the 747 when they graduated because it was the biggest thing in the air, what most people in school did not realize was that the long range large aircraft where just horribly boring. Much more fun to fly smaller aircraft Small here being DC-9s and the like.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply