One day I became tired of doing a manual procedure to check every morning to
see if nightly backup, data transfer, and other jobs have failed. I decided to
develop code that will automatically verify if jobs have failed and email the
results to me.
One way is to accomplish this is goal is to make sure the last step of any job
is 'Check Failed Steps'. This works as long as the last step of the job is the
only one to have an on failure action of 'Quit the job reporting failure'. For
example:
Job Step Name | On Success | On Failure |
---|---|---|
Check all databases | Go to next Step | Go to next Step |
Cycle Error Log | Go to next Step | Go to next Step |
Report Folder Sizes | Go to next Step | Go to next Step |
Database Size Report | Go to next Step | Go to next Step |
Check failed steps | Quit the job reporting success | Quit the job reporting failure |
In some cases if a step fails you'll want to branch directly to the failed
step:
Job Step Name | On Success | On Failure |
---|---|---|
Extract Data | Go to next Step | Go to step : 5 |
Prepare File Data | Go to next Step | Go to step : 5 |
Create File | Go to next Step | Go to step : 5 |
Send File | Go to next Step | Go to step : 5 |
Check failed steps | Quit the job reporting success | Quit the job reporting failure |
P_JOB_STEP_FAIL_1.prc contains the
stored procedure code to check failed steps for a job. This only checks the last
(current) instance of the job to run. To run it, the syntax is:
exec dbo.P_JOB_STEP_FAIL @job_name VARCHAR(50), @recipients varchar(100)
notifies you of any failed job step within the last x hours. I typically a
separate job for this procedure and schedule it to run every 3-4 hours checking
8 hours back. Synax is as follows:
Exec dbo.P_ANY_JOB_STEP_FAIL @recipients varchar(100), @hours_back int = -12
Of course, I am using msdb system tables sysjobhistory and sysjobs. No
special permissions are required because they run as a part of a job.
Conclusion
It is may be not a big deal to check jobs daily and control them manually.
But this simple 2 way automation helps me, especially when I maintain multiple
servers and many jobs. And it gives me the pleasure of additional control. I
have been using these procedures for almost 2 years and didn’t have issues or
problems so far with proper error notification.