July 26, 2005 at 4:19 pm
As I write this, our engineers are switching our servers to a new UPS. During the cutover we are exposed to outside power cuts. They say a power failure could corrupt our SQL Server databases, and have asked us to stop our instances while they cut over.
Can a power failure cause a database or an installation to be corrupted? I understand that a brown-out can cause servers to fry, but they have not asked us to bring our servers down, just SQL.
July 26, 2005 at 6:14 pm
A power failure can be trouble if the server was in the middle of a write operation. However, this doesn't just affect SQL Server. It affects the OS as well.
K. Brian Kelley
@kbriankelley
July 26, 2005 at 6:19 pm
As far as I can remember, if you don't have battery-backed disk controllers you can end up with a torn page after a power outage, in which case I think all you can do is go back to your last good backup.
This can happen because a SQL Server page is 8 KB but each disk sector is 512 bytes. Once the first sector has been written, SQL Server assumes that the rest has been written successfully, but if you suffer an outage this may not be the case.
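To make the arithmetic concrete, here is a minimal Python sketch of an interrupted page write. It is only an illustration under stated assumptions: the per-sector "version stamp" and the helper names (`make_page`, `interrupted_write`, `is_torn`) are hypothetical and do not reflect SQL Server's actual on-disk format or its torn page detection. Only the 8 KB page and 512-byte sector sizes come from the post above.

```python
PAGE_SIZE = 8192      # SQL Server page size, as stated in the post
SECTOR_SIZE = 512     # disk sector size, as stated in the post
SECTORS_PER_PAGE = PAGE_SIZE // SECTOR_SIZE  # 16 sectors per page


def make_page(version: int) -> bytes:
    """Build a page in which every sector starts with the same version byte.

    The version byte is a made-up stamp for this illustration only.
    """
    sector = bytes([version]) + bytes(SECTOR_SIZE - 1)
    return sector * SECTORS_PER_PAGE


def interrupted_write(old_page: bytes, new_page: bytes, sectors_written: int) -> bytes:
    """Return what ends up on disk if power fails after `sectors_written` sectors."""
    cut = sectors_written * SECTOR_SIZE
    # Sectors before the cut hold new data; the rest still hold the old data.
    return new_page[:cut] + old_page[cut:]


def is_torn(page: bytes) -> bool:
    """Treat a page as torn when its sectors do not all carry the same stamp."""
    stamps = {page[i * SECTOR_SIZE] for i in range(SECTORS_PER_PAGE)}
    return len(stamps) > 1


old = make_page(1)
new = make_page(2)
on_disk = interrupted_write(old, new, sectors_written=5)
print(is_torn(on_disk))
```

A write that stops after 5 of the 16 sectors leaves a page that is part new and part old, which is the situation a battery-backed controller is meant to prevent.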
hth
David
July 27, 2005 at 7:23 am
You need to assess the risk of it happening against the consequences if it does. If you want to leave the system up, take a backup immediately before the work begins. If the switch isn't going to take long then you will probably be OK, assuming they don't accidentally knock off the power when switching the UPS.
Nigel.
Nigel Moore
======================
July 27, 2005 at 7:35 am
A couple of months ago, during some construction on our building, a painter hit the emergency kill power switch for our entire data center. The phone system, the mainframe, 300+ Windows/Linux/AIX servers... Oracle, SQL Server, DB2 ALL instantly went down. We slowly brought everything back up and, amazing as it seems, nothing was corrupted. I ran integrity checks on all SQL Server databases (19 production servers total). But, as earlier posts have stated, there is always that possibility of corruption.
You really could simply take a full db backup right before they switch over and leave everything up...
July 27, 2005 at 3:47 pm
Here is a quote from "Microsoft SQL Server 2000 Administrator's Companion" by Marcilina S. (Frohock) Garcia, Jamie Reding, Edward Whalen, Steve Adrien DeLuca
Chapter "I/O Subsystem Concepts"
".........Don't use write caching without a battery backup. Most caching controllers include a battery or offer one as an option. This battery retains data in the cache in the event of a power failure. Without this battery, the data in the cache would be lost, and the database might become corrupted..........."
Regards,
Yelena Varsha
July 28, 2005 at 1:04 pm
".........Don't use write caching without a battery backup. Most caching controllers include a battery or offer one as an option. This battery retains data in the cache in the event of a power failure. Without this battery, the data in the cache would be lost, and the database might become corrupted..........."
That quote is correct as far as it goes. HOWEVER, do note that merely having a battery-backed controller is not necessarily sufficient in and of itself, as testing with SQLIOStress will often demonstrate on systems with lower-end controllers.
July 28, 2005 at 1:38 pm
Personally, I'd have small UPSs on the boxes that you are most concerned about. Usually SQL does a great job of rolling transactions back and forth and everything comes up clean, but I've always thought that paranoia was the best trait for a DBA: stay alert, trust no one, keep your laser handy. And protect everything that you are responsible for with as many layers as your boss will pay for.
We had a lovely incident when I was working for the police department. The generator was being serviced and was taken out of the circuit. Later there was a power glitch and the ENTIRE computer room went down: mainframe, two minis, a dozen or 18 file servers, two optical juke boxes, radio controllers.
Apparently when the generator was taken out of service the power distribution box wasn't switched back in. When external power was lost, the backup batteries never had a chance to switch in.
Dispatch has a separate facility with their own HP 3000, so 911 wasn't affected, though Patrol wasn't able to query criminal records for a little bit.
Fortunately everything on the micro side came up without a hitch. I think the two HP 3000s took a bit more work, and I'm not sure about the Unisys box, but those weren't my concern. All my SQL boxes were 6.5 except for one 7 box (it was a few years ago). I immediately ran DBCCs and everything was clean.
-----
Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson
July 28, 2005 at 2:21 pm
I've been through a similar experience. We've considered going to smaller UPSes but they take up rack space and add considerable weight.
K. Brian Kelley
@kbriankelley
July 28, 2005 at 3:23 pm
I wonder if you could rent a medium/big UPS, say, for a week, that would be sufficient to keep servers up for long enough to do an orderly shutdown or for the backup power system to switch in.
Tricky thing when you're working on your standby power system. We had a nasty situation where this one server kept shutting down and coming back up. It turned out that the UPS battery was failing and telling the controller that utility power was gone (initiate shut down), oops!, never mind, power is back, wait -- no it isn't....
That drove us nuts until a tech went out with a volt meter. 🙂
-----
Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson
July 29, 2005 at 10:35 am
We recently had a few total datacenter power outages: three in a two-hour period the week before last, due to a storm and the utility company, and one the week before that, due to an accident and the utility company. We have generators and UPS-filtered power. However, everything went down on each of those 4 occasions due to mechanical UPS and generator failures. 200+ servers in all were affected, along with our SAN, NAS and CAS. When things came back up each time, none of our 250+ SQL Server databases were corrupt (damn, were we lucky). I have worked at other sites and experienced power losses and equipment (disk) failures associated with the power losses (2% of the equipment was in a 'failed' state).
If I were in your shoes I'd definitely shut down your SQL Servers just to be safe. Definitely coordinate this shutdown with the electrical engineers for off hours. Having been through outages (and the associated upgrades needed afterward), I'd expect your actual downtime to be less than 15 minutes if the electrical contractors really know their stuff.
Regards
Rudy Komacsar
Senior Database Administrator
"Ave Caesar! - Morituri te salutamus."
July 29, 2005 at 1:10 pm
I would cover your back and make sure that you have a full database backup before the work begins. After all, one of the main job functions of a DBA is to always have a good backup strategy. This way you know that if anything were to happen you can always revert to your backup. Always remember: backup, backup, backup.
Conor.
July 29, 2005 at 3:29 pm
Another thing that helps is to set the less-used DBs to read-only mode; you will, at least, have no problems with those.
* Noel