February 21, 2013 at 1:45 am
Hi all,
We have a 2 node (active/passive) SQL 2008 SP1 Enterprise Cluster that hosts about 500 databases.
Now while the Cluster environment gives us the High Availability, we would also like to build some sort of Disaster Recovery solution as well.
We're considering replication, but I'm not sure how big of an additional load will replicating 500 databases to another SQL Server create on the Cluster.
Has anyone been in a similar situation?
Any thoughts/suggestions on how we handle our Disaster Recovery solution for this environment (the DR site can be a few hours "old")
Please do ask for additional info if you think something is unclear.
Thanks in advance.
February 21, 2013 at 1:56 am
Although replication may seem like a good idea, I'm not sure it was intended by Microsoft to be used as a DR solution. Have you considered mirroring or log shipping?
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
February 21, 2013 at 3:10 am
Hi, thanks for the quick response.
Of course, I see what you're saying.
Yes, mirroring/log shipping is all an option and under consideration.
That still asks the question how would it work if I enable it for 500 databases?
How much additional load would it create? How much bandwidth would it take up?
I've done tons of DR solutions in the past, but never on this scale, and I can't just start ramping up log shipping for the databases one by one just to see "how will it go".
Is there maybe a way I could test/estimate/calculate this sort of thing before I actually go and implement it for 500 dbs.
Any ideas/thoughts. Thanks
February 21, 2013 at 3:31 am
tomes12 (2/21/2013)
Hi, thanks for the quick response.Of course, I see what you're saying.
Yes, mirroring/log shipping is all an option and under consideration.
That still asks the question how would it work if I enable it for 500 databases?
How much additional load would it create? How much bandwidth would it take up?
I've done tons of DR solutions in the past, but never on this scale, and I can't just start ramping up log shipping for the databases one by one just to see "how will it go".
Is there maybe a way I could test/estimate/calculate this sort of thing before I actually go and implement it for 500 dbs.
Any ideas/thoughts. Thanks
To be honest I've never been exposed to a SQL instance with so many databases so this is new to me too.
Let's see what the other experts have to say. I'm intrigued!
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
February 21, 2013 at 8:50 pm
I am not sure that mirroring will be a viable option if you are needing to mirror 500 databases. MS publish a document on the server resources required (http://support.microsoft.com/kb/2001270). Have a read before getting too serious about mirroring.
Replication really isn't a DR type of product - it doesn't do DR all that well. You would need to deal with failover yourself. As far as I am concerned, it would be fairly easy to handle the first failure with a set of manual steps needed but going back will most likely require either another publication to the original server and/or backup restores to that server with some possible outage needed to achieve it).
Your SAN vendor may be able to help out - some have technology to sync SANs between data centres. This is not my area of expertise or experience so I can't add anything else on this.
February 21, 2013 at 9:30 pm
Hi,
Replication is more like a reporting solution, it gives you choice of tables ore even columns that needs to be replicated.
Mirroring can be considered as DR (I had a set up we implemeted Mirroring across datacenters) but it is costly either in terms of budget or performance.
(One advantage for mirroring is it give you capability for failback (role reversal))
I think best option for you is log shipping. You can build it over your existing backup plan. Considering the number of databases only thing you might need have a good plan for monitoring, failover and faiback.
In log shipping failback is possible but you need to create your on custom steps.
February 22, 2013 at 5:09 am
Hi, I love DR so I'm also intrigued as to what people think....
Firstly, how big are the Db's on the server and secondly, do you have a budget to work with for your DR solution or do you have to 'price up' various scenarios first?
If budget isn't an issue and the DB's aren't too big then I'd definitely look at mirroring with an automatic fail-over and witness server. If you're worried about overhead etc then maybe have a look at switching it to asynchronous mirroring (especially as you've indicated that the 'DR site can be a few hours old'). Depending on what transactions/usage your primary server gets and how that affects performance, you could also log ship to a 'warm standby' just to be on the safe side (overkill though maybe if mirroring is in place - depends how paranoid you are!! :-)).
Do you have a testing environment that you can use, I'm not a huge fan of installing SQL Server on a virtual server but that might be one way of creating a viable test environment to check the impact of things like mirroring/log shipping on the sort of scale you're talking about.
Do you have a plan with regard to backups etc?
email: info@weekendwebdesign.co.uk
Personal Website: http://markallen.co.uk/
Business Website: https://www.weekendwebdesign.co.uk
February 22, 2013 at 5:15 am
From what I gather and researched, I would agree that log shipping would probably be the way to go.
Currently the backup strategy is just "Daily Full", but implementing custom log shipping in a way that it would be handled by just a few jobs (instead of jobs per database) wouldn't be that difficult.
I'm still not that clear on how it would impact the server having to tlog ship 500+ databases.
Do you have any ideas how to test/calculate/analyze this before-hand?
Thanks,
February 22, 2013 at 6:10 am
tomes12 (2/22/2013)
Do you have any ideas how to test/calculate/analyze this before-hand?,
I can't think of a way (other than creating a dedicated log shipping/mirroring test environment and using copies of your production DB's within that environment) that you'd be able to test or analyse the impact of the solution.
Does anyone have any other thoughts?
email: info@weekendwebdesign.co.uk
Personal Website: http://markallen.co.uk/
Business Website: https://www.weekendwebdesign.co.uk
February 22, 2013 at 8:47 am
We've tried mirroring many DB on the same instance and we started having problem at around 50 DBs, that might have changed on SQL 2012 though but mirroring is not designed for that kind of things.
Log shipping is the only real solution (beside geo-cluster + SRDF SAN).
The initial load is pretty easy to compute, just check the size of all your full backups, copy of few of them to your DR site and find out an AVG copy speed (in your case you might want to get another network card dedicated to copying files to the DR site).
For log shipping to work all your DB need to be in full recovery, if that's not already the case you should switch all the DBs to full recovery and see how much log backup you get per hour.
The limiting factor is the network speed, doing log backup is only an issue on heavily used system.
Handling the Copy/Restore/Cleanup is just an industrialized version of any log shipping scripts, powershell would probably be the quickest way to get you started. Log shipping being so old there are probably scripts floating around resilient enough to resist anything you can throw at it.
Depending on your RTO and RPO, and the size of some DBs you can do a mix of log shipping and regular backup/restore (Full & Diff can also be mixed).
In the end it all depend on how much data can be lost. If "none" is the answer, check the price of a geocluster + admins with enough experience + 2 san box linked with SRDF + san admins, show it to the service owner and ask the question again.
February 22, 2013 at 8:56 am
Oliiii (2/22/2013)
We've tried mirroring many DB on the same instance and we started having problem at around 50 DBs, that might have changed on SQL 2012 though but mirroring is not designed for that kind of things.
@ Oliiiii - What problems did you come up against when you hit 50 DB's, I have heard that MS recommend no more than 10 databases on a 32bit system as well - is that you're experience?
email: info@weekendwebdesign.co.uk
Personal Website: http://markallen.co.uk/
Business Website: https://www.weekendwebdesign.co.uk
February 22, 2013 at 11:28 am
You need to do mirroring at the disk level, not the database level. The number of databases is then irrelevent, it's only the total volume of changes that is an issue. A good mirroring system will support a lag time when things are very busy and/or the channel to feed data to the mirror is slow/busy.
Trying to recover 500 separate databases using log shipping would be quite a task, even if automated. Make sure you can accept the length of time it would take you get 500 databases back up and running before choosing that solution.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 22, 2013 at 11:35 am
Nice to see a discussion getting started, and I really appreciate all your ideas.
Just talked to management, and before we start looking at geo-clusters, mirroring at disk level (really like that idea), as always, I need to try and figure out what is possible without any of that.
We all agree though, seeing that network bandwidth would be our bottleneck, that we should definitely invest some in that area.
Btw, db sizes range from a few MBs to a several dozen GB, with most of them being around 5GB.
I would also add that there's no considerable growth observed during the last year, so no problems there.
To add one more level of complexity, we have 2 production clusters with 500 DBs EACH, and we would like to DR that to another location but to ONE Server if possible (mind you this would really just serve the disaster scenario, so the server probably won't be as powerful as those on Production) ?
Do you think this changes anything somehow?
Thanks,
February 25, 2013 at 12:48 am
WWDMark (2/22/2013)
Oliiii (2/22/2013)
We've tried mirroring many DB on the same instance and we started having problem at around 50 DBs, that might have changed on SQL 2012 though but mirroring is not designed for that kind of things.@ Oliiiii - What problems did you come up against when you hit 50 DB's, I have heard that MS recommend no more than 10 databases on a 32bit system as well - is that you're experience?
We started to get have more and more delay in the synchro and a once any of the DB required more resources, the effect would impact all the other much quicker. In the end the DBs would fail several time a day, even with high timeout.
We don't have 32 bits server, but for 64bits our local msft contact said we should not expect it to work beyond 50 DBs mirrored (no idea if that's an official recommendation or just a friendly advice).
February 25, 2013 at 1:00 am
tomes12 (2/22/2013)
To add one more level of complexity, we have 2 production clusters with 500 DBs EACH, and we would like to DR that to another location but to ONE Server if possible (mind you this would really just serve the disaster scenario, so the server probably won't be as powerful as those on Production) ?Do you think this changes anything somehow?
Thanks,
It all come down to how much data goes into the log files, so you need to find that out first.
Unless your DR server is really underpowered, you'll have network speed issue before you have restore speed issue.
If you do log shipping with 2 source servers, you'll have to make sure you have no DB name conflict or login name conflict.
On a side note, avoid copying files around, backup directly to the DR server (trough a share) or to a network share and then restore directly from there. Copying files around in windows might bring it's own memory issues (especially if you have big files).
And don't forget to synchronize your logins and specific jobs (if any) 🙂
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply