How to limit SQL Job Success events

  • Hi,

    We have a monitoring tool that reads Windows Event Log entries for Failure and Success of SQL Job executions. So, we have setup MSSQL jobs to write to the Windows Event Log when they fail or complete successfully.

    This works okay, however for jobs that run at a very high frequency I find that the Windows Event Log then receives way too many Success events and they choke up the monitoring system.

    Is there a way to get MSSQL to write a  single job Success event for the first success that immediately follows a failure condition, but then ignore subsequent Success conditions. The following explains the kind of notifications I would like to see...

    Job X success  -ignore

    Job X failure   -produce failure event

    Job X success   - produce success event

    Job X success   - ignore

    Job X success   -ignore

    ...

    Job X failure   -produce failure event

    Job X failure  -produce failure event

    Job X success   - produce success event

    Job X success   - ignore

    If it's not possible to ignore these subsequent success events, then an alternative might be to somehow flag it into the success Windows Event Log that it is somehow a recurring success (not the initial success following a failure)

    It would be more efficient to build the logic into the source (MSSQL) rather than build it into the monitoring system. But I am open to any good suggestions even if I had to use some other means other than Windows Event Log as source.

    Regards Graham

  • As far as I know there is no way to do that nicely inside SQL.

    If doing it inside SQL is a requirement, you could modify the code you are running so it writes the failure to a table along with the date and time rather than failing completely.  Then the next step in the job would look to see if the current run is a failure or the previous run is a failure and could notify as needed.  Then you change your monitoring tool so it looks at your new table that you are using for logging to determine if and when to send out notifications.  Or code it into the job to send out the notifications about failure and success (if needed).

    Alternately, is it possible to fix the failures?  If you have no failures, then this becomes a non-issue.

    Now, depending on the monitoring solution you have in place, RedGate SQL Monitor does an excellent job at handling exactly what you requested.  When a job fails, you get notified and it makes an entry and it will keep it marked (and logged) as a failure until the next success.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • We have a monitoring tool that reads Windows Event Log entries for Failure and Success of SQL Job executions.

    Hmm, maybe?!:

    When that tool reads a "job success" as the last msg for a given job, it changes the Event Log status for that job (naturally using msdb.dbo.sp_update_job) from "When job completes" to "When job fails".  Then subsequent successes would not be logged.

    When the tool reads a "job failure" as the last msg for a given job, it changes the Event Log status for that job from "When job fails" to "When job completes" so that a success will be logged.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • My suggestion is to stop monitoring successes.  Job history does that well enough.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff, we require the success to auto clear the failure events. This clear would then auto resolve the help desk auto generated tickets as well as end the SLA breaches imposed by the monitoring system.

    Regards Graham

  • This is a pretty nice solution. I will run it past the SQL DBA and check whether they would agree to this. Thanks

  • Thanks Brian, I will ask the SQL DBA whether they have RedGate. Regards

  • Graham Van der vaart wrote:

    Hi Jeff, we require the success to auto clear the failure events. This clear would then auto resolve the help desk auto generated tickets as well as end the SLA breaches imposed by the monitoring system.

    Regards Graham

    Ah... got it.  Thanks for the feedback, Graham.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply