Do you get tired of reviewing each SQL Agent notifications nightly to determine which SQL Agent jobs failed? Is there a significant number of SQL Agent job notifications that it takes a while to review each? Here is an alternative that allows you to have a single email report of all job failures.
Why We Created A Single Job Failure Report
Over the past few years we have implemented a number of SQL Agent jobs in SQL Server. Each job uses the SQL Agent notification process to communicate success or failures to the DBAs. This communication is all done via email. Due to the number of jobs we have, we get lots of emails reporting the successes and failures of our nightly jobs. Every morning the time to review each of the email notifications to determine success or failure of each SQL Agent jobs was time consuming. Also it was easy to get in a habit of quickly deleting the emails without really noticing whether the jobs succeeded or failed. For this reason we decide to create a process to produce a single job failure report.
What We Did
We decide since the job/step failure information was stored in the MSDB database that we would write a stored procedure called “usp_failed_jobs_report (see “Listing of SP” below) to produce a single report. This stored procedure performs some simple SQL to gather information from the sysjobhistory and sysjobs tables, formats the data into a report, and then emails the report to the our DBAs.
This SP code is run every morning Monday - Friday at 6 am via a SQL Agent job. The SP determines which jobs and job steps have failed since the last scheduled running of this SP. For each failed jobs/job step this SP produces a single line in the report. The line in the report shows the name of the job and step that failed, plus the date and time of the failure.
Example of Report
Here is an example on the a report produced by this SP.
The following jobs/steps failed since Aug 23 2002 6:00AM job step_name failed datetime ------------------------------------------- --------------------------------- ------------------- ADRN1303 TRIGGER TRIGGER FILE EXIST Aug 25 2002 12:15AM LOAD EMPLOYEE TABLE LOAD TABLE Aug 25 2002 12:58AM LOAS EMPLOYEE TABLE (Job outcome) Aug 25 2002 12:58AM ADDS load for PROD1 New FTP VALIDATION 1 Aug 25 2002 2:16AM ADDS load for PROD1 New FTP VALIDATION 2 Aug 25 2002 2:16AM
This report shows that “ADRN1303 TRIGGER” job had a step failure, “LOAD EMPLOYEE TABLE” had a step and job failure (the “(job outcome)” step_name indicates a job failure), and “ADDS load for PROD1 New” had two different step failures.
Conclusion
By having a single report, the DBA's in our shop can now review a single email to determine which jobs have failed since the last business day. Having this single report allows a quick method for the DBA’s to identify all the jobs and step failures for the past day. By reviewing a single report the DBA’s don’t have to review each individual SQL Agent notification email to determine which jobs failed.
Listing of SP
CREATE procedure usp_failed_jobs_report as -- Written by: Greg Larsen -- Company: Department of Health, Washington State -- Date: January 3, 2002 -- Description: This SQL Code reports job/step failures based on a data and time range. The -- report built is emailed to the DBA distribution list. -- -- Modified 04/12/2002 - Greg Larsen - Modified to support Long running jobs that cross reporting -- periods declare @RPT_BEGIN_DATE datetime declare @NUMBER_OF_DAYS int -- Set the number of days to go back to calculate the report begin date set @NUMBER_OF_DAYS = -1 -- If the current date is Monday, then have the report start on Friday. if datepart(dw,getdate()) = 2 set @NUMBER_OF_DAYS = -3 -- Get the report begin date and time set @RPT_BEGIN_DATE = dateadd(day,@NUMBER_OF_DAYS,getdate()) -- Get todays date in YYMMDD format -- Create temporary table to hold report create table ##temp_text ( email_text char(100)) -- Generate report heading and column headers insert into ##temp_text values('The following jobs/steps failed since ' + cast(@RPT_BEGIN_DATE as char(20)) ) insert into ##temp_text values ('job step_name failed datetime ') insert into ##temp_text values ('------------------------------------------- --------------------------------- -------------------') -- Generate report detail for failed jobs/steps insert into ##temp_text (email_text) select substring(j.name,1,43)+ substring(' ', len(j.name),43) + substring(jh.step_name,1,33) + substring(' ', len(jh.step_name),33) + -- Calculate fail datetime -- Add Run Duration Seconds cast(dateadd(ss, cast(substring(cast(run_duration + 1000000 as char(7)),6,2) as int), -- Add Run Duration Minutes dateadd(mi, cast(substring(cast(run_duration + 1000000 as char(7)),4,2) as int), -- Add Run Duration Hours dateadd(hh, cast(substring(cast(run_duration + 1000000 as char(7)),2,2) as int), -- Add Start Time Seconds dateadd(ss, cast(substring(cast(run_time + 1000000 as char(7)),6,2) as int), -- Add Start Time Minutes dateadd(mi, cast(substring(cast(run_time + 1000000 as char(7)),4,2) as int), -- Add Start Time Hours dateadd(hh, cast(substring(cast(run_time + 1000000 as char(7)),2,2) as int), convert(datetime,cast (run_date as char(8)))) ))))) as char(19)) from msdb..sysjobhistory jh join msdb..sysjobs j on jh.job_id=j.job_id where (getdate() > -- Calculate fail datetime -- Add Run Duration Seconds dateadd(ss, cast(substring(cast(run_duration + 1000000 as char(7)),6,2) as int), -- Add Run Duration Minutes dateadd(mi, cast(substring(cast(run_duration + 1000000 as char(7)),4,2) as int), -- Add Run Duration Hours dateadd(hh, cast(substring(cast(run_duration + 1000000 as char(7)),2,2) as int), -- Add Start Time Seconds dateadd(ss, cast(substring(cast(run_time + 1000000 as char(7)),6,2) as int), -- Add Start Time Minutes dateadd(mi, cast(substring(cast(run_time + 1000000 as char(7)),4,2) as int), -- Add Start Time Hours dateadd(hh, cast(substring(cast(run_time + 1000000 as char(7)),2,2) as int), convert(datetime,cast (run_date as char(8)))) )))))) and (@RPT_BEGIN_DATE < -- Calculate fail datetime -- Add Run Duration Seconds dateadd(ss, cast(substring(cast(run_duration + 1000000 as char(7)),6,2) as int), -- Add Run Duration Minutes dateadd(mi, cast(substring(cast(run_duration + 1000000 as char(7)),4,2) as int), -- Add Run Duration Hours dateadd(hh, cast(substring(cast(run_duration + 1000000 as char(7)),2,2) as int), -- Add Start Time Seconds dateadd(ss, cast(substring(cast(run_time + 1000000 as char(7)),6,2) as int), -- Add Start Time Minutes dateadd(mi, cast(substring(cast(run_time + 1000000 as char(7)),4,2) as int), -- Add Start Time Hours dateadd(hh, cast(substring(cast(run_time + 1000000 as char(7)),2,2) as int), convert(datetime,cast (run_date as char(8)))) )))))) and jh.run_status = 0 -- Email report to DBA distribution list exec master.dbo.xp_sendmail @recipients='Greg.Larsen@sqlservercentral.com', @subject='Check for Failed Jobs - Contains jobs/steps that have failed.', @query='select * from ##temp_text' , @no_header='true', @width=150 -- Drop temporary table drop table ##temp_text GO SET QUOTED_IDENTIFIER OFF GO SET ANSI_NULLS ON GO