SQL AGENT: Do we care if it is down?

  • Hi,

    Obviously I know we do care about the Agent being up or down, but for the purposes of monitoring if a server is up and the services are running, should we actively check to see if the Agent is up?

    Here is what I am doing:

    - Because of a power outage last week that affected several of our boxes and our Exchange server in Corporate, we did not receive the normal SQL Alerts indicating the servers were going down.

    - As a makeshift monitoring process, I created a Job that runs every 1 minute and:

    -- Uses a Cursor to loop through and grab dB instance names from a table

    -- Uses xp_cmdshell to ping the server

    -- If the server cannot be reached, use DB Mail to send out an Alert

    -- If the server can be reached, then use OPENROWSET to do a simple query (i.e. return the servername)

    -- If there is no value returned, then we know the Server is up, but the SQL Service is either having trouble or is down, and it sends an email.

    At this point, we know the Server status, as well as the SQL Service status. I also know that I can modify my code to look in sysprocesses to see if the AGENT is running, but my question is, do we even care at that point? Would there ever be an occasion where for some reason the AGENT is suddenly not running, but the Service is?
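    The tiered check described above can be sketched outside T-SQL as a small decision function. This is only a minimal illustration in Python, not the job itself: the three probe callables (`ping`, `query_servername`, `agent_connected`) are hypothetical stand-ins for the xp_cmdshell ping, the OPENROWSET query, and the sysprocesses lookup, and the `Status` names are invented for the example.

```python
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    # Hypothetical status names, one per tier of the check.
    OK = "ok"
    HOST_DOWN = "host unreachable"
    SQL_DOWN = "host up, SQL Server not answering"
    AGENT_DOWN = "SQL Server up, Agent not connected"


def classify(
    ping: Callable[[], bool],
    query_servername: Callable[[], Optional[str]],
    agent_connected: Callable[[], bool],
) -> Status:
    """Run the probes in order and stop at the first one that fails."""
    if not ping():
        return Status.HOST_DOWN
    if not query_servername():  # None or empty string counts as "no value"
        return Status.SQL_DOWN
    if not agent_connected():
        return Status.AGENT_DOWN
    return Status.OK


def needs_alert(status: Status) -> bool:
    """Anything other than OK would trigger an email."""
    return status is not Status.OK


if __name__ == "__main__":
    # Simulated probes: host answers, SQL answers, Agent is missing.
    result = classify(lambda: True, lambda: "SERVER01", lambda: False)
    print(result.value, needs_alert(result))
```

    Ordering the probes this way means each alert names the lowest layer that failed, so an Agent-only failure is not reported as a dead server.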

    I am probably going to include this check anyhow, but was just curious as to what others thought about it and if anyone else has any other ideas on how to monitor Server uptime/status/etc. without using any 3rd party tools.

    Edit: This job is going to run on a server in Location A, which will monitor the servers in Location B, and vice versa. That way, if power goes down in Location A, the monitoring process set up in Location B will notice it and start sending emails every 1 minute until the issue is resolved.

  • Whether you care about SQL Agent being up or not depends on what you have it doing, and how important those jobs are to you.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

    Also, to add onto that: if you want to use alerts, or are already using them, you need Agent running.

    I agree with G, too... if you're running any agent jobs, you need to monitor it.

    When I used to administer a large number of instances, we would see agents not running from time to time when the DB engine was. Usually it was agent not starting up correctly after a patch outage and not just 'dying' out of the blue.

    You can monitor Agent with PowerShell + WMI pretty easily.

    Well, the problem is that some of our instances of 2008 have been experiencing an issue where WMI just stops working correctly for some reason, and the only thing that fixes it is rebooting, which we try not to do unless we really have to. That is why I am looking for alternative ways to monitor the services that don't use 3rd party tools or PShell.
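    One possible alternative that goes through the Service Control Manager rather than WMI is the built-in `sc` command (`sc \\SERVER query SQLSERVERAGENT`). The sketch below, in Python purely for illustration, parses the STATE line of that output. It assumes English-language `sc` output and a default instance (a named instance's Agent service is `SQLAgent$InstanceName`); the function names are made up for the example, and the same command could presumably be run from xp_cmdshell if the service account has the rights.

```python
import re
import subprocess
from typing import Optional

# Matches a line such as "        STATE              : 4  RUNNING"
_STATE_RE = re.compile(r"^\s*STATE\s*:\s*\d+\s+(\w+)", re.MULTILINE)


def parse_service_state(sc_output: str) -> Optional[str]:
    """Return the state word (e.g. RUNNING, STOPPED) or None if absent."""
    match = _STATE_RE.search(sc_output)
    return match.group(1) if match else None


def query_service_state(server: str, service: str = "SQLSERVERAGENT") -> Optional[str]:
    """Ask the remote Service Control Manager for the service state.

    Only works on Windows, where sc.exe is available on the PATH.
    """
    result = subprocess.run(
        ["sc", "\\\\" + server, "query", service],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return parse_service_state(result.stdout)


if __name__ == "__main__":
    sample = (
        "SERVICE_NAME: SQLSERVERAGENT\n"
        "        TYPE               : 10  WIN32_OWN_PROCESS\n"
        "        STATE              : 4  RUNNING\n"
    )
    print(parse_service_state(sample))
```

    A result of None covers both "service not found" and "server did not answer", so it would be treated as an alert condition in the same way as STOPPED.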

    We have all our services set to Manual, so in theory, if the server is up, the services should be running as well.

    So that's why my code basically Pings the Server, uses an OPENROWSET to return some value to see if SQL Service is running, then checks to see if a record exists in sysprocesses table for the AGENT. Set this to run every 1 minute or so on a remote box and if I don't receive any emails, I am a happy camper.

    On a side note: Has anyone ever seen services set to manual that automatically start themselves on reboot for some reason? This is happening on several of our servers and we can't figure out why...but that is a story for another day I guess.

  • There is a hotfix for the WMI problem, if it's the one I'm thinking of...

    What's the OS platform?

    I think the problem I know about only occurs on Windows Server 2008 R2 x64.

    http://support.microsoft.com/kb/981314

    You can also just restart the WMI service, if that's the case.

    This fix is also included in Windows Server 2008 R2 SP1.

    Services that get started by other services are often set to manual.

  • upstart (7/15/2011)


    We have all our services set to Manual, so in theory, if the server is up, the services should be running as well.

    So that's why my code basically Pings the Server, uses an OPENROWSET to return some value to see if SQL Service is running, then checks to see if a record exists in sysprocesses table for the AGENT. Set this to run every 1 minute or so on a remote box and if I don't receive any emails, I am a happy camper.

    On a side note: Has anyone ever seen services set to manual that automatically start themselves on reboot for some reason? This is happening on several of our servers and we can't figure out why...but that is a story for another day I guess.

    First, setting the services to Manual does not mean they will be up and available if the server is up. I think you meant Automatic, but if not, then those services will not start up when the system is restarted.

    Some services are set to manual and started by another service/process as needed. So, yeah - I have seen quite a few services that are set to manual that are actually running on the servers. In fact, I see quite a few services on my laptop that are started and set to manual.
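    To see which startup type a service really has, the built-in `sc \\SERVER qc SQLSERVERAGENT` command prints a START_TYPE line. The following is a minimal sketch in Python that reads that line; it assumes English-language `sc qc` output, and the function names are hypothetical.

```python
import re
from typing import Optional

# Matches a line such as "        START_TYPE         : 3   DEMAND_START"
_START_TYPE_RE = re.compile(r"^\s*START_TYPE\s*:\s*\d+\s+(\w+)", re.MULTILINE)


def parse_start_type(sc_qc_output: str) -> Optional[str]:
    """Return AUTO_START, DEMAND_START (Manual), DISABLED, or None."""
    match = _START_TYPE_RE.search(sc_qc_output)
    return match.group(1) if match else None


def starts_at_boot(start_type: Optional[str]) -> bool:
    """Only Automatic services are started by Windows itself at boot.

    Driver start types (BOOT_START, SYSTEM_START) are deliberately not
    handled here, since the thread is about ordinary services.
    """
    return start_type == "AUTO_START"


if __name__ == "__main__":
    sample = (
        "SERVICE_NAME: SQLSERVERAGENT\n"
        "        TYPE               : 10  WIN32_OWN_PROCESS\n"
        "        START_TYPE         : 3   DEMAND_START\n"
    )
    start_type = parse_start_type(sample)
    print(start_type, starts_at_boot(start_type))
```

    A service that reports DEMAND_START yet is found running after a reboot was started by something else, which matches the behaviour described above.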

    If you have a lot of servers to monitor, it would be quite cost-effective to purchase a monitoring suite. Not only would you get service state, but you would also be able to monitor CPU, memory, storage and other performance counters and alert on those.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

