Runaway Job Management
I support quite a few third-party applications, many of which install Database Maintenance jobs of their own (e.g. purging, populating, etc.) Once in awhile these jobs interfere with day-to-day activities or my normal DBA maintenance jobs.
The repetitive task of finding the answer to "Why is XXX so slow today?", or "Why is the reindex taking so long this weekend?" usually involves checking to see if these third-party jobs are running/runaway/stuck. Normally, the job is killed by an administrator and life goes on.
Having laid that out, I wanted a way to be alerted or automatically take an action on my behalf so that I could retain my precious Corona time.
The code I provided will either send an Alert to the specified SQL Operator ('Alert'), or stop the job ('Kill') depending on what argument is set. This code can be altered to be placed directly within a "Governer" job, or you can make it into a Stored Procedure on a database of your choosing within the instance and call it for multiple jobs.
Logic can also be added to take other actions of your choosing. Hope you enjoy this as it's saved me quite a bit of time in the past year!
Mark
SET NOCOUNT ON
DECLARE
@JobName SYSNAME,
@RuntimeThresholdMins INTEGER,
@Action VARCHAR(25),
@AlertOperatorName NVARCHAR(100)
--User-defined parameters
SET @JobName = 'Job Name'
SET @RuntimeThresholdMins = 240
SET @Action = 'Alert'--'Kill
SET @AlertOperatorName = 'OperatorName'
DECLARE
@JobID UNIQUEIDENTIFIER,
@IsSysAdmin INTEGER,
@JobOwner SYSNAME
--Determine Job ID from name
SELECT @JobID = SJ.job_id FROM msdb..sysjobs SJ WHERE SJ.[name] = @JobName
SET @IsSysAdmin = ISNULL(IS_SRVROLEMEMBER(N'sysadmin'), 0)
SET @JobOwner = SUSER_SNAME()
IF @JobID IS NULL
BEGIN
RAISERROR('Job "%s" does not exist on %s.', 16, 1, @JobName, @@SERVERNAME)
RETURN
END
DECLARE @JobState TABLE
(
JobID BINARY(1024),
LastRunDate VARCHAR(8),
LastRunTime VARCHAR(6),
NextRunDate VARCHAR(8),
NextRunTime VARCHAR(6),
NextRunScheduleID INTEGER,
RequestedToRun INTEGER,
RequestSource INTEGER,
RequestSourceID VARCHAR(255),
Running INTEGER,
CurrentStep INTEGER,
CurrentRetryAttempt INTEGER,
[State] INTEGER
)
INSERT INTO @JobState
EXEC [master]..xp_sqlagent_enum_jobs @IsSysAdmin, @JobOwner, @JobID
--Normalize VARCHAR string for time formatting
UPDATE JS SET
JS.LastRunTime = REPLICATE('0',6-LEN(JS.LastRunTime)) + JS.LastRunTime,
JS.NextRunTime = REPLICATE('0',6-LEN(JS.NextRunTime)) + JS.NextRunTime
FROM @JobState JS
IF EXISTS(
SELECT 'X'
FROM @JobState JS
WHERE
--Job is currently executing
JS.[State] = 1
AND
DATEDIFF(mi,
CAST(SUBSTRING(JS.NextRunDate, 5, 2) + '/' +
SUBSTRING(JS.NextRunDate, 7, 2) + '/' +
SUBSTRING(JS.NextRunDate, 1, 4) + ' ' +
--Time
SUBSTRING(JS.NextRunTime, 1, 2) + ':' +
SUBSTRING(JS.NextRunTime, 3, 2) + ':' +
SUBSTRING(JS.NextRunTime, 5, 2) AS DATETIME), GETDATE()) >= @RuntimeThresholdMins)
BEGIN
IF @Action = 'Alert'
BEGIN
DECLARE
@Subject VARCHAR(100),
@Body VARCHAR(8000)
SET @Subject = '***Possible Runaway Job: ' + @JobName + '***'
SET @Body = 'The job "' + @JobName + '" has been running for longer than ' + CAST(@RuntimeThresholdMins AS VARCHAR(38)) + ' minutes.'
--Send alert
EXECUTE msdb..sp_notify_operator
@name = @AlertOperatorName,
@subject = @Subject,
@Body = @Body
END
ELSE IF @Action = 'Kill'
BEGIN
--Cancel job
EXEC msdb..sp_stop_job @job_id = @JobID
END
ELSE
BEGIN
RAISERROR('"%s" is an invalid action. Use "Alert" or "Kill".', 16, 1, @Action)
END
END
GO