October 22, 2014 at 4:05 pm
There is SQL Server Agent Job which hung periodically. Is there a way to send alerts or notify when the Job get hung?
Thanks
October 22, 2014 at 4:28 pm
Please define "hung". Does it last longer than you expect? It never ends?
-- Gianluca Sartori
October 22, 2014 at 4:51 pm
It never ends. We restarted the server but no luck?
October 22, 2014 at 9:40 pm
Hello,
From which login account job is executing. There may be possibility of access issue on this.
Let me know this.
Thanks,
Anuj Soni
October 23, 2014 at 2:11 am
Admingod (10/22/2014)
It never ends. We restarted the server but no luck?
Restarting the server is never a good solution.
If the job gets stuck, you have to find what it is doing at the time of blocking.
If it is a simple T-SQL job step, you can monitor the execution of the statements with the activity monitor or with the excellent stored procedure sp_WhoIsActive.
It could be blocking, or a query that got a suboptimal execution plan due to parameter sniffing or stale statistics.
Try to investigate it further and come back when you have more info.
-- Gianluca Sartori
October 23, 2014 at 9:32 am
Well I would like to find if there is a way to setup some kind of notification when the job is hung? Do we have any workaround like that?
Thanks
October 23, 2014 at 9:34 am
I have seen in the past if there are any blockings then it should time out right. However, it will show as running for days without failing and doing nothing?
Thanks
October 23, 2014 at 9:46 am
No, SQL Server waits ad infinitum for locks to be released. As Gianluca suggested, please use sp_who2 or sp_whoisactive to find out whether there is any blocking. What does the job do - is it a simple T-SQL batch, or does it do anything external, such as backups and restores, SSIS packages and so on?
John
October 23, 2014 at 11:12 am
The code below will give a list of jobs that have been running over ten minutes (can be anything you want).
Remember to exclude jobs you expect to run long (e.g. REPL jobs). You can exclude by job category
declare
@session_id int,
@minutesLmit int
set @minutesLmit = -10
select
@session_id = max(session_id)
from
[msdb].[dbo].[sysjobactivity]
select
SERVER = @@servername,
jo.name,
ja.start_execution_date
from
[msdb].[dbo].[sysjobactivity] ja
JOIN
select * from [msdb].[dbo].[sysjobs] jo ON ja.job_id = jo.job_id
where
ja.session_id = @session_id
AND start_execution_date is not null
AND start_execution_date < dateadd(mi,@minutesLmit,getdate())
AND stop_execution_date is NULL
October 23, 2014 at 12:30 pm
We have a job that runs for hours and have another Job that runs this code at a certain time to alert us that the job is running late or hung:
IF NOT EXISTS
(
SELECT
History.*
FROM
msdb.dbo.sysjobs Job
Join
msdb.dbo.sysjobhistory History
ON
Job.Job_ID = History.job_id
AND Job.name = 'PUT JOB NAME HERE'
AND History.run_date = Convert(Int, CONVERT(VarChar, GetDate(), 112))
AND step_name = 'PUT STEP NAME HERE'
AND run_status = 1
)
RaisError('JOB NAME HERE has not completed!!', 18, 1) WITH SetError
October 23, 2014 at 1:26 pm
In regards to my previous post, we have the jobs tied to a pager which the developers take turns monitoring.
So when the processing step checks fails, we are alerted and get up in the early hours to troubleshoot/send emails.
This happens when there are network issues or files not posted to the ftp site by our client.
I wanted to put how we utilize that code I posted into context.
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply