So far this series has been a bit of fun. We have discussed a fair amount of useful things to date. And here we are only on the seventh day.
Today we will be reverting our attention back to the msdb database. This time not as a means of maintaining the under-carriage but rather to help us try and become better DBAs.
Sometimes to be a better DBA, we have to be a bit proactive and less reactive.
We’ll get to that shortly. As we have done just about every day so far though, we need to recap real quick what we have to date in the 12 Days of pre-Christmas.
- Maintenance Plan Gravage – Day 6
- Table Compression – Day 5
- Exercise for msdb – Day 4
- Backup, Job and Mail History Cleanup – 3rd Day
- Service Broker Out of Control – 2nd Day
- Maint Plan Logs – 1st Day
On the Seventh Day of pre-Christmas…
My DBA gave to me an early Nuclear Fallout detection and warning system. Ok, maybe not quite that extensive – but it could sure feel like it if something were to slip through the crevasses and the business management started breathing down your neck.
Have you ever had a SQL Job run longer than it should have? Has the job run long enough that it ran past the next scheduled start time? These are both symptoms that you might want to monitor for and track in your environment. If you are watching for it – you should be able to preemptively strike with warning flares sent out to the interested parties. Then you would be more of the hero than the scapegoat. And we all want to be on the winning side of that conversation without the confrontation from time to time ;).
Today, I am going to share a quick script that can be used to help monitor for jobs that have waltzed well beyond the next scheduled start time. First though, there is a script available out there for the jobs that have run beyond normal run times. You can check out this article by Thomas LaRock to get a good script to check and log long running jobs.
Though the criteria are similar – we do have two different needs that need to be reported on and monitored. This is why we have this second script. If you have a job that should run every 15 minutes but you find that the job has been running non-stop for 45 minutes, that is quite a problem. And many times we should alert on something like that.
So here is the quick script.
[codesyntax lang=”tsql”]
WITH preagg AS ( SELECT sj.name,MAX(sa.last_executed_step_date) AS last_executed_step_date ,MAX(sa.start_execution_date) AS start_execution_date ,MAX(sa.stop_execution_date) AS stop_execution_date ,MAX(sa.next_scheduled_run_date) AS next_scheduled_run_date ,MAX(sh.run_status) AS run_status ,MAX(ca.next_run_date_time) AS Next_Scheduled_RunDate ,MAX(CAST(STUFF(STUFF(REPLACE(STR(sh.run_duration, 6), ' ', '0') , 3, 0, ':') , 6, 0, ':') AS varchar)) AS Run_Duration FROM msdb.dbo.sysjobs sj INNER JOIN msdb.dbo.sysjobactivity sa ON sj.job_id = sa.job_id LEFT OUTER JOIN msdb.dbo.sysjobhistory sh ON sa.job_history_id = sh.instance_id INNER JOIN msdb.dbo.sysjobschedules sjs ON sj.job_id = sjs.job_id INNER JOIN msdb.dbo.sysschedules ss ON sjs.schedule_id = ss.schedule_id CROSS APPLY (SELECT CONVERT(DATETIME, CASE WHEN sjs.next_run_date = 0 THEN NULL ELSE RIGHT('00000000' + CONVERT(VARCHAR(8), sjs.next_run_date),8) + ' ' + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(6), sjs.next_run_time),6) , 3, 0, ':') , 6, 0, ':') END) ) ca(next_run_date_time) WHERE (sa.start_execution_date > 0 AND sa.stop_execution_date IS NULL) And (ISNULL(sa.next_scheduled_run_date,GETDATE()) <> ca.next_run_date_time OR ISNULL(sh.run_status,4) in (1,4)) /* Status of the job execution: 0 = Failed 1 = Succeeded 2 = Retry 3 = Canceled 4 = In progress */GROUP BY sj.name ) SELECT * FROM preagg WHERE next_scheduled_run_date IS NULL ORDER BY Next_Scheduled_RunDate
[/codesyntax]
From here, it is a matter of creating a SQL job to run this script. The schedule is dependent on job frequency and your expected needs to monitor. It can very easily be set to run every five minutes.
Once the job is created then it is a matter of setting up a mechanism to alert. There are any number of ways to do that as well. I leave that to any who use the script to determine how best to implement for their own environment.
Conclusion
This script will catch jobs that are running and have exceeded the next scheduled start time. This is accomplished through the filters in the where clause. With a couple of date comparisons as well as filtering on job_status we can get the desired results.