December 20, 2002 at 12:00 am
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/awarren/anotherdisasteralmost.asp
January 13, 2003 at 4:10 pm
All I can say is OUCH!
I'm sitting here thinking that I sure better go back over my disaster recovery plan again and make another dry run to make sure I haven't missed anything! This is the type of wake up call we all don't need to have happen but must be prepared for. Excellent job keeping your users up Andy!
Gary Johnson
Microsoft Natural Language Group
DBA, Sr. DB Engineer
This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed in this post are my own and may not reflect that of my employer.
January 13, 2003 at 4:25 pm
Thanks. We actually sat down today with our vendor to figure out what clustering will cost... another OUCH! I'll probably write up some info on that as well; it's amazing how quickly the costs add up.
Andy
January 14, 2003 at 3:18 am
Andy,
Great article. I think this topic is fantastic. Probably one of the best areas on this site, even though there are only three entries. It's a good insight into other DBAs' Disaster Recovery (DR), and it's always interesting to see what extra work you have to do on the fly when the problem falls outside the scope of the DR plan.
Fortunately (touch wood) we've not experienced anything like that... but who knows what the future holds... especially as we're moving to a new production box...
I guess you can test and test your DR plan, but computers, as wonderful as they are, are always full of surprises!
Don't wish to curse all the rest of you out there, but any further disasters (or near disasters) would make interesting reading...
Clive Strong
January 14, 2003 at 6:55 am
While the content of the article was useful, I must offer some well-intended criticism. I am not sure if English is your second language, or if you intended the article to read in the style of a personal journal. The article would have been much easier to read if you had used proper punctuation, spelling, or even close to correct grammar. I would expect higher-quality writing on a site such as this one.
January 14, 2003 at 7:01 am
Nope, English is my first and only language. Well, I speak geek a little. We're pretty informal here, though we do try to get the spelling correct. We (Steve, Brian, myself) find most technical content hard to read, so we try for a less intense approach. In this case I did write in journal fashion because I wanted to try to present as best I could what happened/how I felt, not really try to clean it up.
We've got a book project under way to compile info from the site, and the majority of readers did want us to correct typos and minor mistakes, so we'll be reviewing old content and putting more time into proofing new articles. In the interim, if you see something wrong, please do point it out and we'll try to fix it.
Andy
January 14, 2003 at 8:03 am
I had some bad memories while reading your article. Things can get ugly in a big way when dealing with SQL Server... Can you believe that SQL Server 7 was supposed to be able to run itself without needing DBAs (right)? SQL 2000 should probably fly the space shuttle or something. I have found that when SQL breaks now... it's not a small issue.
At the United Network for Organ Sharing, we had SQL 2000 running on clustered servers with the databases on a SAN. All of a sudden (I'm not kidding), a very important (1 megabyte) database file disappears off the SAN and now our 240 GB database is suspect. Now nobody in the country can match organs.
Having the cluster didn't help, since the problem was on the SAN. We pointed the app to our Hotsite, which was kept up to date via Log Shipping. It took several hours the next day to get everything back to normal, but lives were saved, thanks to Log Shipping.
January 14, 2003 at 1:56 pm
Glad I'm not Andy or the last guy.
I've been pretty lucky with very few disasters, and hopefully that will continue.
Steve Jones
January 15, 2003 at 9:49 am
Reads like a story. We know we need to restore or re-snapshot when such a thing happens. What would have added value to this article is an explanation of what originally caused the problems, how to identify the root of the problem, and how to avoid such hardware failures (checks one should do on a regular basis).
January 15, 2003 at 3:59 pm
As I mentioned above, it IS written in journal style. As for your questions, you tell me. How could I know that the container would drop? What could I do to prevent it next time? Easy to say cluster everything, keep a warm standby, etc., but sometimes companies truly cannot afford it (or you can't convince them to afford it).
What I was hoping to share was how something gets handled when, regardless of whether you tried to prevent it or could have prevented it, things go bad and you have to make decisions. If you can learn something, think of something, or avoid doing something from reading my sad story, then I've done some good.
Andy
January 16, 2003 at 7:01 am
Using replication, I had a rollover to our backup site the other day, only to find out that all the text fields were empty. It seems that if text fields are updated with WRITETEXT, then replication doesn't work. Plus, the verify program so graciously doesn't verify text fields. So it was manual restores, field by field.
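For anyone wanting to check text columns themselves rather than rely on a verify tool, here is a minimal sketch of the idea in Python. It assumes you have already pulled key/text pairs from each server into dictionaries; the function names and the sample data are hypothetical and only stand in for real query results.

```python
import hashlib


def text_digest(value):
    """Return a stable digest for a text value; None (SQL NULL) gets its own marker."""
    if value is None:
        return "NULL"
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def find_text_mismatches(primary_rows, standby_rows):
    """Compare {key: text} mappings from two servers and return the keys that differ.

    Keys that exist on only one side are reported as mismatches too.
    """
    mismatched = []
    for key in sorted(set(primary_rows) | set(standby_rows)):
        missing = key not in primary_rows or key not in standby_rows
        if missing or text_digest(primary_rows[key]) != text_digest(standby_rows[key]):
            mismatched.append(key)
    return mismatched


# Hypothetical data standing in for rows fetched from each server.
primary = {1: "order notes", 2: "long description", 3: None}
standby = {1: "order notes", 2: "", 3: None}

print(find_text_mismatches(primary, standby))  # prints [2]
```

Comparing digests instead of the full values keeps the comparison cheap if the two sides are fetched separately, and an emptied text field on the standby shows up as a mismatch instead of passing silently.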
January 16, 2003 at 7:02 am
Has anybody tried the SQL-Up program by Incepto?
January 16, 2003 at 7:31 am
FWIW - I think the journal style is fine and easy to read.
Clustering / high availability is something we too have considered. Unfortunately the cost of traditional clustering is prohibitive.
I second 'danschl's request for any real-life experience with SQL-Up. Even pricing would be helpful. I dislike the policy of some companies who do not even post the pricing on their sites.
January 16, 2003 at 10:32 am
We are looking at testing SQL-UP at SQL Server Central. We met with the vendor and do like the program. There are some limitations and issues (we need another box), but it's a neat product and I was impressed.
I won't delve into pricing (send your feedback to Incepto on that), but it's much less expensive than clustering from MS and you don't need shared disk. In fact, the two servers can be different models/brands. You do need some good bandwidth between the servers, so it's not really a WAN solution.
Steve Jones
January 16, 2003 at 11:56 am
I was at the demo with Steve and was really going in thinking I would not like it. Pleasantly surprised. Failover is quick! The downside (and keep in mind this comment is based on a 15 min demo/conversation) is that you have to create a shadow db on each server to store/reconcile the changes/manage stuff, so potentially you increase your disk usage quite a bit. Not sure by what factor.
Andy