September 28, 2004 at 6:04 am
I've been asked to look into resilience with regard to providing 24/7/365 database service for an entirely database-driven web content management system.
Virtually all my experience is as a development DBA/data analyst so I'm in over my head here.
If you have a cluster, will service continue if either one of the servers fails, or is there a primary node that can act as a single point of failure?
When would I use Active/Active and Active/Passive?
For this particular application I am thinking along the lines of having a completely separate SQL Server (cluster) looking after content maintenance, with the final data then being replicated across to a SQL Server (cluster) looking after the actual surfing of the site.
I know that RAID 5 has some issues with write performance, plus I read somewhere that when the drives begin to degrade there is a risk of RAID 5 copying the bad blocks across the array.
What I was considering was having the content maintenance database server use RAID 1+0 and the live server use RAID 5. The content server will have a lot of writing taking place as part of the content processing, whereas the front-end server will only have the results of that processing written across.
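The read/write trade-off behind that plan can be put into rough numbers. The sketch below uses the commonly quoted rule of thumb that a small random write costs about 4 disk I/Os on RAID 5 (read data, read parity, write data, write parity) and 2 on RAID 1+0 (one write to each side of the mirror). The disk count, per-disk IOPS, and workload mixes are made-up illustrative values, not measurements, and controller caching is ignored.

```python
# Rule-of-thumb back-end disk I/Os needed per small random front-end write.
WRITE_PENALTY = {"RAID 1+0": 2, "RAID 5": 4}


def usable_iops(level: str, disks: int, iops_per_disk: float, write_fraction: float) -> float:
    """Estimate the front-end IOPS an array can sustain for a read/write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw = disks * iops_per_disk
    # Each read costs 1 back-end I/O; each write costs WRITE_PENALTY[level].
    cost_per_request = (1.0 - write_fraction) + write_fraction * WRITE_PENALTY[level]
    return raw / cost_per_request


if __name__ == "__main__":
    # Hypothetical hardware: 8 disks at 150 IOPS each.
    workloads = (
        ("content maintenance (60% writes)", 0.60),
        ("live site (5% writes)", 0.05),
    )
    for label, write_fraction in workloads:
        for level in WRITE_PENALTY:
            estimate = usable_iops(level, 8, 150, write_fraction)
            print(f"{label:34s} {level:9s} {estimate:7.0f} IOPS")
```

Under these assumptions the gap between the two RAID levels is large for the write-heavy workload and small for the read-mostly one. The estimate says nothing about usable capacity or about behaviour while an array is degraded or rebuilding.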
What should I consider with regard to other points of failure?
September 28, 2004 at 7:19 am
Unfortunately, even with Active/Active (meaning that if one node fails the other will, hopefully, kick in with no intervention) there will be downtime while the system fails over...
Other points of failure are
You can make your server "farm" as robust as possible, but if 3 goes down you are offline. If 5 goes down your server will take itself down as protection (I have recently experienced this during 2 different hurricanes. Gotta love sunny Florida). If you lose power, 4 will get you.
If your network doesn't have redundancy, or your ISP doesn't, you are offline.
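To see why every link in the chain matters, here is a rough sketch of the arithmetic. It assumes the components fail independently and that the site is up only when all of them are up, so the availabilities multiply. The component names and percentages are hypothetical placeholders, not figures for any real installation.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def series_availability(components: dict) -> float:
    """Availability of a chain in which every component must be up."""
    return prod(components.values())


def redundant(availability: float, copies: int) -> float:
    """Availability of N independent copies where any one copy is enough."""
    return 1.0 - (1.0 - availability) ** copies


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    chain = {
        "database cluster": 0.9995,
        "power": 0.999,
        "internal network": 0.999,
        "ISP link": 0.995,
    }
    total = series_availability(chain)
    print(f"single ISP link: {total:.4%} up, "
          f"{downtime_hours_per_year(total):.1f} h/year down")

    # Same chain, but with two independent ISP links.
    chain["ISP link"] = redundant(0.995, 2)
    total = series_availability(chain)
    print(f"two ISP links:   {total:.4%} up, "
          f"{downtime_hours_per_year(total):.1f} h/year down")
```

The overall figure can never be better than the weakest component, which is why a robust cluster does not help much if the power or the ISP link behind it has no redundancy.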
Just some more things to make your hair turn prematurely gray...
Good Hunting!
AJ Ahrens
webmaster@kritter.net
September 29, 2004 at 2:52 am
Usually, when you are considering points of failure for a 24/7 web database, you also need to keep in mind other factors such as denial-of-service attacks and virus infections. These might sound too basic to be worth mentioning, but if they are taken lightly they could cause some very major embarrassment.
Also, a poorly written application could end up using much of your system resources and thereby increase response times, though strictly speaking this does not come under resilience.
Oh yes, another point of failure could be your RAM.
I sincerely hope you don’t encounter any of the points mentioned by us and that your servers keep running 24/7.