June 19, 2014 at 5:25 am
rodjkidd (6/19/2014)
So for my current gig, I'm not a production DBA. We have a very temperamental server which is being upgraded this week.
It also has spaghetti jobs / packages. You know A must run for B & C to run, and C must run for D to run, etc...
And no, there is no job dependency / logic in the packages... A should finish by X, so B can run at X + 10 (or whatever).
Got an email this morning to say that job D produced no results. I had a look and it's because C ran and produced no results.
The server was rebooted overnight, and there are a lot of overnight jobs that run, or rather didn't last night...
Not really sure we should be finding out because a business user emails with "why are the files empty?"
Yes, the DBA's don't try to run any of the jobs that were either running at the time of the re-boot or should have started while it was down.
I should point out that in my 18 months here this is the 3rd set of Prod DBA's, and they are swamped, but still lack a certain proactive response, shall we say?
I'll say no more! :ermm:
Rodders...
I realize that I don't have all the facts about the situation, but that sounds to me like a scenario where, instead of having 18 jobs that "should finish by time X", you should instead have one job with 18 steps. Each step can then handle itself and it eliminates the guess work of whether or not the previous step finished.
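To make the "one job with 18 steps" idea concrete, here is a minimal sketch in Python rather than in SQL Agent itself. It only illustrates the control flow: each step runs only if the previous one succeeded, so a downstream step never runs against empty input. The step names and the `run_master_job` helper are hypothetical, not anything from the actual server.

```python
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_master_job(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed: List[str] = []
    for name, action in steps:
        try:
            ok = action()
        except Exception:
            ok = False  # an exception counts as a failed step
        if not ok:
            print(f"step {name} failed; skipping the remaining steps")
            break
        completed.append(name)
    return completed


# Hypothetical chain: C fails, so D is never attempted.
demo_steps: List[Step] = [
    ("A", lambda: True),
    ("B", lambda: True),
    ("C", lambda: False),
    ("D", lambda: True),
]
demo_result = run_master_job(demo_steps)
```

The point of the design is that "did the previous step finish?" is answered by the job itself instead of by a clock.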
June 19, 2014 at 6:10 am
Ed Wagner (6/19/2014)
I realize that I don't have all the facts about the situation, but that sounds to me like a scenario where, instead of having 18 jobs that "should finish by time X", you should instead have one job with 18 steps. Each step can then handle itself and it eliminates the guess work of whether or not the previous step finished.
Sounds like Rodders should suggest 'how would you fix this' and 'what priority would you give it?' as interview questions in the near future.
And maybe see if he could offer some help.
Rebooting a server in the middle of the night is a task that needs some coordination even in the best environments.
There is more happening than just DBA's being at fault.
Speaking of DBA's - sounds like a 4th set might be coming in the near future.
Swamped, and now they might be losing more ground.
June 19, 2014 at 6:37 am
Greg Edwards-268690 (6/19/2014)
Sounds like Rodders should suggest 'how would you fix this' and 'what priority would you give it?' as interview questions in the near future.
And maybe see if he could offer some help.
Rebooting a server in the middle of the night is a task that needs some coordination even in the best environments.
There is more happening than just DBA's being at fault.
Speaking of DBA's - sounds like a 4th set might be coming in the near future.
Swamped, and now they might be losing more ground.
So we started last year to document this, and sort out "a better way". Be it master jobs, re-writes, etc. But other shiny new things get in the way as far as projects. The funny thing is, I knew D depended on C, and wanted to either decouple this or at the very least build a master job as you both suggested, but it wasn't until this morning I realised it goes a lot deeper than that. And I'm guessing more than 3 or 4 jobs deep. It's been a very organic evolution of the applications, so you can imagine what that means.
It's a project on the back burner due to bigger and newer projects happening in the pipeline.
I have to say the current group are at least trying, last senior DBA wouldn't touch a thing.
So this isn't the first time this has happened with an unplanned server re-boot, although not at that time in the morning, but no email goes out, no warning on intranet, no attempt to run jobs - just oh well it will be fine tomorrow. I've worked places where the incident manager would be so on the ball this just didn't happen or else you had to face a debrief meeting!
The approach at the moment appears to be "chip away at the problem" and we will get there in the end.
I'm sure they will get there in the end, but I doubt I will be here to see it!
I'm going to speak to my manager about this as I think I can work with them and at least come up with a "play book" of what needs to run. But whether that can fit in with project work I have no idea.
Cheers both for your input
Rodders...
June 19, 2014 at 7:13 am
rodjkidd (6/19/2014)
So we started last year to document this, and sort out "a better way". Be it master jobs, re-writes, etc. But other shiny new things get in the way as far as projects. The funny thing is, I knew D depended on C, and wanted to either decouple this or at the very least build a master job as you both suggested, but it wasn't until this morning I realised it goes a lot deeper than that. And I'm guessing more than 3 or 4 jobs deep. It's been a very organic evolution of the applications, so you can imagine what that means.
It's a project on the back burner due to bigger and newer projects happening in the pipeline.
I have to say the current group are at least trying, last senior DBA wouldn't touch a thing.
So this isn't the first time this has happened with an unplanned server re-boot, although not at that time in the morning, but no email goes out, no warning on intranet, no attempt to run jobs - just oh well it will be fine tomorrow. I've worked places where the incident manager would be so on the ball this just didn't happen or else you had to face a debrief meeting!
The approach at the moment appears to be "chip away at the problem" and we will get there in the end.
I'm sure they will get there in the end, but I doubt I will be here to see it!
I'm going to speak to my manager about this as I think I can work with them and at least come up with a "play book" of what needs to run. But whether that can fit in with project work I have no idea.
Cheers both for your input
Rodders...
Maybe an approach to "chip away at the problem" is to start by creating the master job. Migrate step A and then disable job A. When you have the time, migrate step B and disable job B. Continue in perpetuity until all steps are in the master job. I know that's a simplified summary, but it's a place to start that would allow you to break up the monumental task into manageable chunks that have a prayer of ever getting done. Divide and conquer!
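One way to decide the order in which jobs get migrated into the master job is to write the known dependencies down and let a topological sort produce a valid step order. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The dependency map is a hypothetical one based only on the "A before B and C, C before D" example in this thread, and `master_step_order` is an illustrative helper name.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each job lists the jobs that must finish first.
depends_on = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"C"},
}


def master_step_order(deps):
    """Return one valid order in which every job follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


order = master_step_order(depends_on)
print(order)
```

If the documented dependencies contain a loop, `static_order()` raises `graphlib.CycleError`, which is worth finding out before the migration rather than after.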
June 19, 2014 at 7:23 am
Sounds like it will only have potential to get worse over time, until the time comes when the business says 'enough'.
Fortunately I have had the pleasure of working where 'lights out' practices were the norm in our group.
Takes a while for some businesses to realize taking the time to do it right has benefits, even if it takes a little more time.
Seems they are starting to see this, just it hasn't been painful enough.
We were one segment of a global company.
When the corporate execs couldn't see what happened yesterday, and would ask, attitudes and priorities changed quickly.
Especially when it came to light that your taking a system down had a ripple effect.
We even had alerts go out when batch processes on the AS400 ran late, and our processing had not started around the normal time.
Amazing when you investigate and call someone at 2am telling them to see what has hung on the ERP system how they start working together.
Much better for all when the response is we have identified why things were late, and are doing 'x' to fix it.
The pit crew wins a lot of races without the fastest car on the track.
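The "alert when processing has not started around the normal time" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the job names, the 15-minute grace period, and the `late_jobs` helper are hypothetical, and a real setup would read start times from the scheduler's history and send a notification instead of returning a list.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def late_jobs(
    expected_start: Dict[str, datetime],
    actual_start: Dict[str, Optional[datetime]],
    now: datetime,
    grace: timedelta = timedelta(minutes=15),
) -> List[str]:
    """Return jobs that have not started and are past expected start plus grace."""
    late = []
    for job, due in expected_start.items():
        started = actual_start.get(job)  # None means "not started yet"
        if started is None and now > due + grace:
            late.append(job)
    return sorted(late)


# Hypothetical overnight schedule: C is due at 02:00, D at 03:00.
two_am = datetime(2014, 6, 19, 2, 0)
expected = {"C": two_am, "D": two_am + timedelta(hours=1)}
started = {"C": None, "D": None}

overdue = late_jobs(expected, started, now=two_am + timedelta(minutes=30))
```

Run on a short interval, a check like this means the first person to notice a missed job is the on-call team rather than a business user.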
June 19, 2014 at 11:29 am
Well looks as though I may have "bought" myself a little side project with the Prod DBA's on this one.
And I need to have a chat with them about "critical" jobs that must run.
Rodders...
June 19, 2014 at 11:30 am
And Greg, it's been getting worse over time already... It was only just about stable when I started. Just been overtaken by more important projects and priorities... You know unless it keels over and dies. :O
Rodders...
June 19, 2014 at 1:28 pm
Just to improve my English, what is the meaning of "Rodders..."?
Members of a hotrod club
or maintenance (access?)
or the thing you use to unclog pipes (duct rodders)
June 19, 2014 at 2:14 pm
rodjkidd (6/19/2014)
And Greg, it's been getting worse over time already... It was only just about stable when I started. Just been overtaken by more important projects and priorities... You know unless it keels over and dies. :O
Rodders...
I call it the illusion of speed.
Some see flurry of activity during the crisis as everyone working and being productive.
I'd like to set them loose on a road course, show them the long straight away.
Then watch their panic trying to make the hairpin turn at the end.
Ever notice many of the best processes are planned out?
And designed so things slide into place when enhancements are made?
Many times the one who appears calm, even in a crisis, gets a lot more done.
Truly wonderful to have spent many years with a Data Warehouse Architect that understood that well.
Gave me many well needed nights of sleep.
Unfortunately, seems many shops live in crisis mode.
And have a hard time justifying fixing things unless totally broken.
Just have to wonder how many DBA's may have left having been always out prioritized by the latest project or someone else's priority?
That turnover also adds to the workload.
Hope you can at least get some better alerts in place.
I like the suggestion to start sorting things out and replacing.
Easy to show some progress and reduce the noise.
June 20, 2014 at 2:34 am
Jo Pattyn (6/19/2014)
Just to improve my English, what is the meaning of "Rodders..."?
Members of a hotrod club
or maintenance (access?)
or the thing you use to unclog pipes (duct rodders)
Jo,
Ha ha - nice.
My name is Rodney.
Rodders is one of the nicknames, I've always signed things that way. Except when at work or applying for jobs obviously! The bank wasn't too impressed when I used it there, but hey ho!
Rod (as in Rod Stewart) is also one, but other than using it in my email address, as I was searching for something unique, I don't use that one.
Also Roddy as in Hot Rod Roddy Piper, can also be one, but I think that's really for Roderick and is a very Scottish thing to say. Well, I've only been called it by Scots anyway!
It came to prominence in the UK thanks to this show;
http://www.bbc.co.uk/comedy/onlyfools/uncovered/rodney.shtml
Cheers,
Rodders...
June 20, 2014 at 3:05 am
rodjkidd (6/20/2014)
It came to prominence in the UK thanks to this show;
http://www.bbc.co.uk/comedy/onlyfools/uncovered/rodney.shtml
Cheers,
Rodders...
You are, of course, in danger of being called Dave
Bex
June 20, 2014 at 5:11 am
Bex (6/20/2014)
You are, of course, in danger of being called Dave
Bex
Not really, because as we all know..."No man, Dave's not here".
June 20, 2014 at 7:08 am
Ed Wagner (6/20/2014)
Not really, because as we all know..."No man, Dave's not here".
Really? Why?
Dave? What, the UK-based satellite station?
Dave Lister?
HAL and Dave?
Nah, you've lost me there...
Rodders...
June 20, 2014 at 7:16 am
rodjkidd (6/20/2014)
Really? Why?
Dave? What, the UK-based satellite station?
Dave Lister?
HAL and Dave?
Nah, you've lost me there...
Rodders...
It is part of a comedy routine about stoned people from the 60s or 70s by Cheech & Chong.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It's unpleasantly like being drunk.
What's so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
June 20, 2014 at 7:18 am
Stefan Krzywicki (6/20/2014)
It is part of a comedy routine about stoned people from the 60s or 70s by Cheech & Chong.
I'm glad I'm not the only one who remembers that bit of comedy from the 70s. Maybe with the Random word currently being about Monty Python, it has me thinking of that era. They don't make stuff like that any more.