July 30, 2021 at 1:25 pm
Hi,
We have a monitoring tool that reads Windows Event Log entries for Failure and Success of SQL Job executions. So, we have setup MSSQL jobs to write to the Windows Event Log when they fail or complete successfully.
This works okay, however for jobs that run at a very high frequency I find that the Windows Event Log then receives way too many Success events and they choke up the monitoring system.
Is there a way to get MSSQL to write a single job Success event for the first success that immediately follows a failure condition, but then ignore subsequent Success conditions. The following explains the kind of notifications I would like to see...
Job X success -ignore
Job X failure -produce failure event
Job X success - produce success event
Job X success - ignore
Job X success -ignore
...
Job X failure -produce failure event
Job X failure -produce failure event
Job X success - produce success event
Job X success - ignore
If it's not possible to ignore these subsequent success events, then an alternative might be to somehow flag it into the success Windows Event Log that it is somehow a recurring success (not the initial success following a failure)
It would be more efficient to build the logic into the source (MSSQL) rather than build it into the monitoring system. But I am open to any good suggestions even if I had to use some other means other than Windows Event Log as source.
Regards Graham
July 30, 2021 at 3:42 pm
As far as I know there is no way to do that nicely inside SQL.
If doing it inside SQL is a requirement, you could modify the code you are running so it writes the failure to a table along with the date and time rather than failing completely. Then the next step in the job would look to see if the current run is a failure or the previous run is a failure and could notify as needed. Then you change your monitoring tool so it looks at your new table that you are using for logging to determine if and when to send out notifications. Or code it into the job to send out the notifications about failure and success (if needed).
Alternately, is it possible to fix the failures? If you have no failures, then this becomes a non-issue.
Now, depending on the monitoring solution you have in place, RedGate SQL Monitor does an excellent job at handling exactly what you requested. When a job fails, you get notified and it makes an entry and it will keep it marked (and logged) as a failure until the next success.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
July 30, 2021 at 6:36 pm
We have a monitoring tool that reads Windows Event Log entries for Failure and Success of SQL Job executions.
Hmm, maybe?!:
When that tool reads a "job success" as the last msg for a given job, it changes the Event Log status for that job (naturally using msdb.dbo.sp_update_job) from "When job completes" to "When job fails". Then subsequent successes would not be logged.
When the tool reads a "job failure" as the last msg for a given job, it changes the Event Log status for that job from "When job fails" to "When job completes" so that a success will be logged.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
August 2, 2021 at 12:11 am
My suggestion is to stop monitoring successes. Job history does that well enough.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 2, 2021 at 12:16 pm
Hi Jeff, we require the success to auto clear the failure events. This clear would then auto resolve the help desk auto generated tickets as well as end the SLA breaches imposed by the monitoring system.
Regards Graham
August 2, 2021 at 12:19 pm
This is a pretty nice solution. I will run it past the SQL DBA and check whether they would agree to this. Thanks
August 2, 2021 at 12:20 pm
Thanks Brian, I will ask the SQL DBA whether they have RedGate. Regards
August 2, 2021 at 12:24 pm
Hi Jeff, we require the success to auto clear the failure events. This clear would then auto resolve the help desk auto generated tickets as well as end the SLA breaches imposed by the monitoring system.
Regards Graham
Ah... got it. Thanks for the feedback, Graham.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply