May 19, 2009 at 7:44 am
Andy Hogg (5/19/2009)
This has become an interesting secondary debate worthy of a separate discussion thread. Should one follow directives from above, even when those directives at best contradict best practice, and at worst endanger data and/or system availability?
First, let's give credit to Andy for the suggestion. Now, on to the topic at hand. I had a very similar conversation this morning with a peer. I started at my current place of employment about 16 months ago. The corporate solution for backups is NetBackup, direct from SQL via a Veritas service on the servers. That's different from what I've done in the past, as I always used native SQL backups to disk and then swept them to tape. Always reliable. Anyway, I went with the flow here: let NetBackup do it, but I would continually call for a restore to prove it works. Now, for the last three days, the Network Admin has been attempting to restore a production database onto a staging server for me and CANNOT. It's some issue that he is working on with Veritas. Seems to me reliability is now out the window. If this is not resolved soon, should I take this to our CIO, basically bypassing the network manager (my boss)? My thinking is yes, because if we needed to recover and couldn't, I would essentially take the blame for not being able to do so, even though I cannot get my boss to understand my concerns. For now, I am doing separate backups to disk, but I feel like I am back-dooring my boss. Any suggestions/comments?
-- You can't be late until you show up.
May 19, 2009 at 8:31 am
If you want to bring this up without bypassing channels, there are a couple of ways to do that.
Probably the best option in most circumstances is to call a meeting, and have material put together for the meeting that covers your concerns and your proposed solution. Include your boss and the CIO in the meeting, along with other interested parties. If you put the meeting in terms of, "I'm concerned about this situation and looking for input from various viewpoints on it", that gives everyone a chance to have something to say about it. Your boss doesn't lose any face, and isn't being bypassed, and he may have data that improves your proposed solution.
Another is to put the data together into an e-mail, send it to your boss, and CC the CIO. That way, again, you're not bypassing anyone, and it can generate a discussion on the matter. The problem with this one is that long, technical e-mails often get a glance-and-ignore handling from most people.
If neither of those is a valid approach in your organization, then, yeah, go directly to the CIO, apologize up front for bypassing your boss, but make it clear that you did so reluctantly, because you are concerned that the situation isn't getting the necessary attention.
Be polite and respectful about it, make it clear that whatever solution you're using is because of your concern about the well-being of the company, and that it's not personal in any way.
If your boss or the CIO is the type who will make it political/personal no matter what you do, and some people are that way, then you're still better off bringing it up at the highest appropriate level, and then dropping it if it gets too contentious. At least you've covered yourself if it does blow up, to one extent or another.
With about 80% of humanity, you'll just need to communicate more, and the situation will resolve. Communication is the universal solvent. With the other 20%, where politics and vendetta rule over reason, you're in a no-win situation anyway, and you just deal with it as best you can.
(There is a science to dealing with the 20%, but it's way outside the scope of what I can post in a simple forum message. There are whole books on the subject.)
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
May 19, 2009 at 8:34 am
I'd say yes, you should follow directives from on high as long as those directives are given with the full understanding of the implications. It doesn't sound like this is the case in your situation. If I were told to do backups in a manner I didn't like (and I like your approach going to disk & then to tape), I'd do a full set of tests to verify backup consistency, timing, and most importantly (as your boss is finding out), recovery from the backup. Once I had all the data, I'd show everyone what the decision is costing and then say "I'll do it your way if you want me to, but you need to know how it works."
In your immediate situation, no, I wouldn't charge over your boss's head just yet, but I'd be prepared to. If he can't resolve the backup, you need to encourage him to alert his superiors. If he does resolve the backup, start another one on a different server, just to see if the problem was a one-off or part of a larger issue. If it's a one-off, document it, and send a copy to your boss. If it's part of a bigger issue, again, document it, and get that information to your boss. If he does nothing with it, or starts making noises like you're doing wrong, then go over his head.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 19, 2009 at 9:35 am
Is your boss aware of the current problem doing the restore? What does he say about it?
Last I knew, you could not use Veritas to restore to drive paths different from those of the original backup. So if the original database is on E:\MSSQL\DATA\DATA_1.MDF and F:\MSSQL\DATA\DATA_2.NDF and you need to restore to a different server to G:\MSSQL\SQL_DATA\DATA_1.MDF and H:\MSSQL\DATA\DATA_2.NDF, you're out of luck. That's a "show stopper" for me and unacceptable for disaster recovery.
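For comparison, a native SQL Server restore can relocate files with the MOVE option of RESTORE DATABASE. Below is a minimal Python sketch that only builds such a statement for the target paths mentioned above. The database name (Sales), the backup file path, and the logical file names (DATA_1, DATA_2) are hypothetical placeholders; on a real system the logical names would come from RESTORE FILELISTONLY.

```python
def build_restore_with_move(database, backup_file, moves):
    """Build a T-SQL RESTORE DATABASE statement that relocates files.

    moves maps each logical file name to its new physical path.
    """
    def quote(value):
        # Escape single quotes for a T-SQL Unicode string literal.
        return "N'" + value.replace("'", "''") + "'"

    move_clauses = [
        f"MOVE {quote(logical)} TO {quote(physical)}"
        for logical, physical in moves.items()
    ]
    options = ",\n    ".join(move_clauses + ["RECOVERY"])
    # Escape closing brackets in the bracket-delimited database name.
    safe_name = database.replace("]", "]]")
    return (
        f"RESTORE DATABASE [{safe_name}]\n"
        f"FROM DISK = {quote(backup_file)}\n"
        f"WITH {options};"
    )


# Hypothetical names and backup path, for illustration only.
statement = build_restore_with_move(
    "Sales",
    r"D:\Backups\Sales_full.bak",
    {
        "DATA_1": r"G:\MSSQL\SQL_DATA\DATA_1.MDF",
        "DATA_2": r"H:\MSSQL\DATA\DATA_2.NDF",
    },
)
print(statement)
```

The sketch generates text only; it does not connect to a server, so the statement can be reviewed before anyone runs it.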
You can also save money by not buying the Veritas SQL agent.
May 19, 2009 at 10:26 am
Yes, my boss is aware of the current issue and my displeasure. When he found out we were still taking critical backups to disk, I could see his blood boil a bit as that is not the corporate solution. However, I feel rather vulnerable and would rather have two solutions (hopefully both working) and am doing a CYA for the time being. Disk is not an issue and I'm retaining 14 days on disk.
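A 14-day retention window like the one mentioned above can be enforced with a small cleanup script. Here is a minimal sketch, assuming the backups are plain .bak files in a single folder and that file modification time is a good enough proxy for backup age; the function name and the folder layout are hypothetical.

```python
import time
from pathlib import Path


def expired_backups(folder, keep_days=14, now=None):
    """Return .bak files in folder last modified more than keep_days ago."""
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400  # 86400 seconds per day
    return sorted(
        path
        for path in Path(folder).glob("*.bak")
        if path.stat().st_mtime < cutoff
    )
```

The function only reports candidates and deletes nothing, so the list can be logged or reviewed before anything is removed.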
I'm letting our backup guy do the leg work with Veritas to figure out the issue and once that is resolved, and can be proven over a period of time, I may consider not doing the backups to disk. As stated, it has worked in the past, and we were able to restore databases to different drives than what the production DBs are on.
Thanks for the responses, folks. Kind of a rant by me this morning, and when I saw Andy's lead-in for a topic like this, I had to put this out there. I do know that if my CIO knew about the situation, he'd totally flip (another reason for seeking your comments before I did anything foolish!). He's a former SQL DBA/developer himself. Not sure what his thoughts are about Veritas as a corporate solution, though. I'll mention that to him after things get working again, just a kind of informal conversation in the hall.
-- You can't be late until you show up.
May 19, 2009 at 3:09 pm
When I first started at a position, a couple of heavy revenue-earning SQL boxes were being backed up via VERITAS to tape, the rationale being that there was not enough disk space. I brought this up in email chains to everyone concerned, saying we should go to tape first and then to disk, and lo and behold, one of the boxes died, the disk couldn't be recovered, and it turned out the databases to be backed up were hard-coded into the backup commands on VERITAS, and some were missed. Lots of downtime followed, while DBs on dev/QA boxes were salvaged, loads rerun, and DBs put into emergency mode to pipe as much data out as possible.
This caused a major outage and lost revenue, but the next quarter there was all-new hardware purchased, clustered with ample SAN disk for backups, and a Windows boss on their way out.
If people who may not know better, but who call the shots, disagree with what you recommend, document your concerns and insist on written responses. Loss of revenue usually starts getting things done correctly, and you are covered because you raised your concerns to all before something went awry.
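The failure described above, where a hard-coded database list in the backup commands left some databases unprotected, is the kind of thing a routine coverage check can catch. Here is a minimal sketch, assuming you can get both lists as plain names; the function name and the sample database names are hypothetical, and in practice the server-side list might come from the sys.databases catalog view.

```python
def databases_missing_from_backup(server_databases, backup_job_databases):
    """Return databases on the server that the backup job does not cover."""
    # tempdb is rebuilt at startup and cannot be backed up, so skip it.
    excluded = {"tempdb"}
    covered = {name.lower() for name in backup_job_databases}
    return sorted(
        name
        for name in server_databases
        if name.lower() not in covered and name.lower() not in excluded
    )


# Hypothetical example lists.
on_server = ["master", "model", "msdb", "tempdb", "Orders", "Billing"]
in_backup_job = ["master", "msdb", "Orders"]
missing = databases_missing_from_backup(on_server, in_backup_job)
print(missing)
```

Comparing names case-insensitively is a design choice here, on the assumption of a case-insensitive server collation.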
Andrew