Check if a job is running or not?

  • We have a few important replication processes running and I need to be informed immediately if such jobs stops for whatever reason with whatever status. I have Googled the net and looked into the job-tables in msdb, but sofar have not been able to find out how to make an alert on such a job, or have SCOM check these jobs. It seems there is no status for the sp_help_job which tells me a job is running. The jobmonitor clearly has something to look at and tell us if a job is executing or not. There seems to be no value for 'run_status' in tablesysjobhistory which tells a job is 'Executing'.

    In sysjobactivity supposedly the attributes start_execution_date and stop_execution_date should tell you if a job is executing, but this is not reliable: A few weeks ago we were told a replication process was not running for three days already. Checks showed there was record for said job with an open stop_execution_date... Further analysis told us, that last year this server had crashed and was restarted, which seemingly resulted in an incorrect cleaning of the sysjobactivity table.

    I could include a check on how many records there are in the sysjobactivity table with an open stop_execution_date, but I would hope there is a more simple, elegant solution.

    Any ideas?

    TIA,

    FreeHansje

    Greetz,
    Hans Brouwer

  • Declare @retValue int

    EXEC @retValue = msdb.dbo.sp_start_job @job_name = 'packagename'

    If @retValue = 1

    Raiserror ('Failed to start [packagename]', 16, 1, null)

  • this might be pretty informative to see what raplication is doing. I have this being run every 5 mins and emails me when latency is above a predefined threshold.

    SELECT dl.publisher_db,

    dl.subscriber_server,

    dl.subscriber_db,

    dl.distribution_agent,

    CASE dl.runstatus

    WHEN 1 THEN 'Start'

    WHEN 2 THEN 'Succeed'

    WHEN 3 THEN 'In progress'

    WHEN 4 THEN 'Idle'

    WHEN 5 THEN 'Retry'

    WHEN 6 THEN 'Fail'

    END AS DistributionRunStatus,

    dl.distribution_latency,

    CASE rl.runstatus

    WHEN 1 THEN 'Start'

    WHEN 2 THEN 'Succeed'

    WHEN 3 THEN 'In progress'

    WHEN 4 THEN 'Idle'

    WHEN 5 THEN 'Retry'

    WHEN 6 THEN 'Fail'

    END AS ReaderRunStatus,

    rl.reader_latency,

    ( dl.distribution_latency + Isnull(rl.reader_latency, 0) ) AS total_latency

    FROM (SELECT agent_id,

    publisher_db,

    Upper(ss.datasource) AS subscriber_server,

    subscriber_db AS subscriber_db,

    [name] AS distribution_agent,

    DH.runstatus,

    CASE

    WHEN DH.runstatus = 1 THEN Datediff(SS, DH.TIME, Getdate())

    WHEN DH.runstatus = 3 THEN DH.delivery_latency

    ELSE 0

    END AS distribution_latency

    FROM (SELECT DHMAX.*

    FROM distribution.dbo.MSdistribution_history DHMAX WITH(NOLOCK)

    INNER JOIN (SELECT agent_id,

    MAX([time]) AS TIME

    FROM distribution.dbo.MSdistribution_history DH WITH(NOLOCK)

    GROUP BY agent_id) MTDH

    ON DHMAX.agent_id = MTDH.agent_id

    AND DHMAX.TIME = MTDH.TIME) DH

    JOIN distribution.dbo.MSdistribution_agents DA WITH(NOLOCK)

    ON DH.agent_id = DA.id

    JOIN MASTER.dbo.sysservers ss WITH(NOLOCK)

    ON da.subscriber_id = ss.srvid) AS DL

    LEFT OUTER JOIN (SELECT agent_id,

    la.publisher_db,

    lh.runstatus,

    CASE

    WHEN lh.runstatus = 1 THEN Datediff(SS, LH.TIME, Getdate())

    WHEN lh.runstatus = 3 THEN lh.delivery_latency

    ELSE 0

    END AS reader_latency

    FROM (SELECT LHMAX.*

    FROM distribution.dbo.MSlogreader_history LHMAX WITH(NOLOCK)

    INNER JOIN (SELECT agent_id,

    MAX([time]) AS [time]

    FROM distribution.dbo.MSlogreader_history WITH(NOLOCK)

    GROUP BY agent_id) MTLH

    ON LHMAX.agent_id = MTLH.agent_id

    AND LHMAX.TIME = MTLH.TIME) LH

    JOIN distribution.dbo.mslogreader_agents LA WITH(NOLOCK)

    ON lH.agent_id = lA.id) AS RL

    ON DL.publisher_db = RL.publisher_db

    ORDER BY dl.publisher_db,

    dl.distribution_agent

  • I use this script, think I may of got it here or just profiled it on replication monitor, can't remember, I am concerned with failures, don't like having the client tell us something is down. Collect the details via ssis punt into a table and if select count(*)>= send an email with the query to check it to the dba on call dl.

    use distribution;

    SELECT

    MSdistribution_agents.name, MSdistribution_agents.publisher_db,

    MSdistribution_agents.publication,

    MSdistribution_agents.subscriber_db,

    case when runstatus = 1 then 'Start'

    when runstatus = 2 then 'Succeed'

    when runstatus = 3 then 'In progress'

    when runstatus = 4 then 'Idle'

    when runstatus = 5 then 'Retry'

    when runstatus = 5 then 'Fail'

    end as RunStatus,

    MSdistribution_history.[time], MSdistribution_history.duration,

    MSdistribution_history.delivered_transactions, MSdistribution_history.delivered_commands,

    MSdistribution_history.error_id,

    MSdistribution_history.comments

    FROM

    MSdistribution_agents (nolock)

    INNER JOIN MSdistribution_history (nolock)

    ON MSdistribution_agents.id = MSdistribution_history.agent_id

    where

    MSdistribution_history.[time] >= dateadd(mi, -30, getdate())

    and

    MSdistribution_history.error_id > 0

    order by MSdistribution_history.[time] desc

  • Tnx for answering all.

    I am uncertain on what the value of run_status tells me. If status is 4(Idle), does this mean that the job is active, but currently there is no activity? I come to this conclusion because of the values 2 and 6(Succeed and Fail).

    And Andrew, I'm trying to understand what restrictions you use in the WHERE clause: do you only ask for history records older then 30 minutes? Is that an arbitrary number?

    Tnx again

    Greetz,
    Hans Brouwer

  • I am only interested in failures, I have a job collect any failures every 15 minutes, the where clause is so I don't miss any failures if there is a network blip when running the ssis package. We toyed with looking at latency, but it can get mangled depending on load, user or actual data loads, and the usual database maintenance. You can change the where clause to fit your needs, we just can't have our clients telling us something is not working.

    Andrew

  • here's a proc that I use to get the status of a job. I put it in the master db, that's why the naming convention

    Create procedure [sp_IsJobRunning]

    @ps_JobName nvarchar(128),

    @pn_JobState int out,

    @pn_JobOutcome int out

    as

    --XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

    --X DESCRIPTION: Procedure used to determine if a SQL Agent job is running

    --X

    --X ASSUMPTIONS: Need execute permissions on master..xp_sqlagent_enum_jobs

    --X

    --X PARAMETERS: @ps_JobName - Name of job to find running

    --X @pn_JobState - state of the job

    --X Valid Options for Job_State are

    --X 0 - Not Idle or Suspended

    --X 1 - Executing Job

    --X 2 - Waiting For Thread

    --X 3 - Between Retries

    --X 4 - Idle

    --X 5 - Suspended

    --X 6 - WaitingForStepToFinish

    --X 7 - PerformingCompletionActions

    --X @pn_JobOutcome - Outcome status of job

    --X Valid options for Last_Run_Outcome are

    --X 0 - Failed

    --X 1 - Successful

    --X 2 - Not Valid

    --X 3 - Canceled

    --X 4 - Executing

    --X 5 - Undetermined State

    --X

    --X RETURNS: @pn_JobState and @pn_JobOutcome

    --XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

    begin

    set nocount on

    Declare @vt_JobResults TABLE (

    job_id UNIQUEIDENTIFIER NOT NULL,

    last_run_date INT NOT NULL,

    last_run_time INT NOT NULL,

    next_run_date INT NOT NULL,

    next_run_time INT NOT NULL,

    next_run_schedule_id INT NOT NULL,

    requested_to_run INT NOT NULL, -- BOOL

    request_source INT NOT NULL,

    request_source_id sysname COLLATE database_default NULL,

    running INT NOT NULL, -- BOOL

    current_step INT NOT NULL,

    current_retry_attempt INT NOT NULL,

    job_state INT NOT NULL)

    declare @vn_job_id UNIQUEIDENTIFIER

    -- Get the Job ID. This is needed to see if the job is actually running

    select @vn_job_id = job_id

    from msdb..sysjobs

    where [name] = @ps_JobName

    -- Populate temp table with results of all jobs

    insert into @vt_JobResults

    exec master..xp_sqlagent_enum_jobs 1,''

    -- Check to see if the job is already running

    select @pn_jobstate = job_state, @pn_JobOutcome = last_run_outcome

    from @vt_JobResults r

    inner join msdb.dbo.sysjobservers so

    on r.job_id = so.job_id

    where r.job_id = @vn_job_id

    end

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • did you receive an answer for the runstatus = '4'? running or not? thanks

  • Nope, no answewr on that yet.

    Greetz,
    Hans Brouwer

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply