Need to setup a sql alert on “waiting tasks” - similar to “waiting tasks” in Activity Monitor

  • My manager has asked me to work upon a request. I am not sure if it is reasonable or weird but he wants me to setup an alert based on the "waiting task" that we see in the activity monitor. Basically, if the value in the "waiting task" goes above 10 then we need to be alerted upon. I am not sure as to how I would replicate the same information that we see in Activity monitor as this seems to be live data. Closest DMV and the column I could think of is "waiting_tasks_count" in sys.dm_os_wait_stats. However, this does not seem to be in line with what he is looking for and the column consists of cumulative values over a period of time since last time SQL was restarted. Can someone shed some light on where I can find a query or a counter that I can alert upon?

  • ffarouqi - Monday, July 31, 2017 11:40 PM

    My manager has asked me to work upon a request. I am not sure if it is reasonable or weird but he wants me to setup an alert based on the "waiting task" that we see in the activity monitor. Basically, if the value in the "waiting task" goes above 10 then we need to be alerted upon. I am not sure as to how I would replicate the same information that we see in Activity monitor as this seems to be live data. Closest DMV and the column I could think of is "waiting_tasks_count" in sys.dm_os_wait_stats. However, this does not seem to be in line with what he is looking for and the column consists of cumulative values over a period of time since last time SQL was restarted. Can someone shed some light on where I can find a query or a counter that I can alert upon?

    Not sure what he is specifically asking but I would guess Average Waiter Count. Activity monitor doesn't get this by querying one DMV and if you were to use dm_os_wait_stats, it's cummulative since the last restart so you would have to ask him which waits, what time frame and write a process that captures the waits to a table, capture again for whatever time frame and do a comparison.

    You can find what Activity Monitor is using if you trace while opening Activity Monitor.
    Run profiler and then open activity monitor and then go back to profiler and stop the trace but keep the trace window open.
    You can find the queries that Activity Monitory is using in that trace The entries will be with your Login and the app will be Management Studio.
    For how it is building the temp tables to collect from and the logic, search the TextData for:
    CREATE PROC #am_generate_waitstats
    That's where things are created. The other one used is which you can find by searching TextData is:
    CREATE PROC #am_get_querystats

    But just capturing waits and sending a notification based on "going above 10" - whatever that means - may likely not be of a lot of value and could generate a lot of unnecessary noise. I'd probably try to pin him down on what exactly he is trying to alert to. It sounds like he just looks at Activity Monitor, makes some guesses on what could be an issue but doesn't really understand the details of what he is looking at in terms of waits. So what is the issue he is trying to address? 

    Sue

  • Sue_h - Here are the notes from my manager. I am not sure what is he trying to achieve.

    I've been using the Activity Monitor (I attached a screenshot). Prior to our upgrade, I used to see the 4 counters constantly in motion; there were ALWAYS waiting tasks, or it seemed that way. So when performance issues happened, it was always very difficult to know what was causing them. 

    Last week, a user kicked off a delete in our payroll_weekly table. I happened to be in the office and immediately saw the WAITING TASKS spike. I also heard users at their desks say "The system is frozen!" I was able to log on and find the offending job (the delete). So my though was: If I can get an alert when those jobs spike, I can at least have some time to jump on and see what's happening, then work to stop whatever it is in the future. 

    Regards,
    Faisal

  • Okay...he doesn't know what he is looking at. There will always be some waits so that means nothing. Just seeing a wait spike doesn't mean anything either. Not sure why he isn't more worried about stopping some large delete in the future. Is it really a job? Or is that just what he calls it? Seems the problem is more with a user being able to do some large deletes from the payroll_weekly table. If the user has direct access to delete, they should not. If it's an application that allows this, it should not. And if he gets a list of waits at any given time it's not going to tell him much, especially when he doesn't know what they mean. Locks need to be taken for a delete and other processes trying to access the table will have to wait. I would guess it's a large number of records though based on the description so it isn't anything that should run during business hours anyway. So it seems there is a lot wrong with the process of how this database is used. That is really the issue.

    But he will likely want you to do something. It would be good if you took the time to understand the blocked process report and how setting something up with this can probably help. There is a good article up on this site to get started:
    Using the Blocked Process Report in SQL Server 2005/2008
    And here is another one -
    How to Identify Blocking Problems with SQL Profiler

    Read up as much as you can on the blocked process report and locking in general, lock escalation.
    And just for your purposes, for monitoring wait stats, start by reading this:
    Monitoring Wait Stats
    And  to understand them more, try this article:
    Wait Statistics

    And back to your task, to create an alert for blocked processes:
    To to SQL Server Agent, right click on the Alerts folder and select New Alert
    Change the type to SQL Server performance condition alert
    Under object select SQL Server:General Statistics
    Under Counter select Processes blocked.
    The counter field to defines how many blocked processes are required to trigger the alert. You can set it to rises above 0 if he wants a notification on the first one. Ask him - I'm sure he'll say yes.
    Under the Response tab, set up what you want the alert to do when it is triggered. Probably create an operator for him, use his email and notify by email.

    Sue

  • Thanks

    Sue_H - Tuesday, August 1, 2017 11:15 AM

    Okay...he doesn't know what he is looking at. There will always be some waits so that means nothing. Just seeing a wait spike doesn't mean anything either. Not sure why he isn't more worried about stopping some large delete in the future. Is it really a job? Or is that just what he calls it? Seems the problem is more with a user being able to do some large deletes from the payroll_weekly table. If the user has direct access to delete, they should not. If it's an application that allows this, it should not. And if he gets a list of waits at any given time it's not going to tell him much, especially when he doesn't know what they mean. Locks need to be taken for a delete and other processes trying to access the table will have to wait. I would guess it's a large number of records though based on the description so it isn't anything that should run during business hours anyway. So it seems there is a lot wrong with the process of how this database is used. That is really the issue.

    But he will likely want you to do something. It would be good if you took the time to understand the blocked process report and how setting something up with this can probably help. There is a good article up on this site to get started:
    Using the Blocked Process Report in SQL Server 2005/2008
    And here is another one -
    How to Identify Blocking Problems with SQL Profiler

    Read up as much as you can on the blocked process report and locking in general, lock escalation.
    And just for your purposes, for monitoring wait stats, start by reading this:
    Monitoring Wait Stats
    And  to understand them more, try this article:
    Wait Statistics

    And back to your task, to create an alert for blocked processes:
    To to SQL Server Agent, right click on the Alerts folder and select New Alert
    Change the type to SQL Server performance condition alert
    Under object select SQL Server:General Statistics
    Under Counter select Processes blocked.
    The counter field to defines how many blocked processes are required to trigger the alert. You can set it to rises above 0 if he wants a notification on the first one. Ask him - I'm sure he'll say yes.
    Under the Response tab, set up what you want the alert to do when it is triggered. Probably create an operator for him, use his email and notify by email.

    Sue

    Thanks Sue. I already explained him about the waits and how they work in SQL server and it doesn't make any sense to me about what he expects. Anyways, we do have Solarwinds configured that alerts upon if we have blocking going on, so I wouldn't want to really duplicate the same. I totally get that "Monitoring wait stats thing" but from his email he is not sure if he wants to look at what wait type is prevalent on the server rather than just jumping on the high and low waiting task count in Activity monitor.

  • ffarouqi - Tuesday, August 1, 2017 11:27 AM

    Thanks

    Thanks Sue. I already explained him about the waits and how they work in SQL server and it doesn't make any sense to me about what he expects. Anyways, we do have Solarwinds configured that alerts upon if we have blocking going on, so I wouldn't want to really duplicate the same. I totally get that "Monitoring wait stats thing" but from his email he is not sure if he wants to look at what wait type is prevalent on the server rather than just jumping on the high and low waiting task count in Activity monitor.

    What's weird is that if he really knew things, he could just check Solarwinds.
    If you had to set something up with wait stats, you need to just dump them into a table and after a period of time, capture them again. Then you compare the current with the previous to find the difference. If you check with a stored procedure, create whatever condition you want and send him an email if the waiting task count exceeds a certain number from the last one. If he thinks 10, then send him an email every time any wait count is 10 greater than it was on your previous check. wait exceeds. I would guess he would have you get rid of that alert fairly quickly. Sometimes you have to give them what they ask for to get them to see it's not what they want. If you look at that third link called Monitoring Wait Stats, it will set you in the direction for capturing and comparing the differences. You'd just need to add a part for checking for magic number 10 on NumberOfWaits she uses in the script. And then send an email.

    Sue

  • Sue_H - Tuesday, August 1, 2017 11:44 AM

    ffarouqi - Tuesday, August 1, 2017 11:27 AM

    Thanks

    Thanks Sue. I already explained him about the waits and how they work in SQL server and it doesn't make any sense to me about what he expects. Anyways, we do have Solarwinds configured that alerts upon if we have blocking going on, so I wouldn't want to really duplicate the same. I totally get that "Monitoring wait stats thing" but from his email he is not sure if he wants to look at what wait type is prevalent on the server rather than just jumping on the high and low waiting task count in Activity monitor.

    What's weird is that if he really knew things, he could just check Solarwinds.
    If you had to set something up with wait stats, you need to just dump them into a table and after a period of time, capture them again. Then you compare the current with the previous to find the difference. If you check with a stored procedure, create whatever condition you want and send him an email if the waiting task count exceeds a certain number from the last one. If he thinks 10, then send him an email every time any wait count is 10 greater than it was on your previous check. wait exceeds. I would guess he would have you get rid of that alert fairly quickly. Sometimes you have to give them what they ask for to get them to see it's not what they want. If you look at that third link called Monitoring Wait Stats, it will set you in the direction for capturing and comparing the differences. You'd just need to add a part for checking for magic number 10 on NumberOfWaits she uses in the script. And then send an email.

    Sue

    Thanks! Sue. Do you have a way or a script for sending the alert by encapsulating the NumberofWaits to give me an accurate difference by comparing the current wait count and the last wait count. I am not even sure if the NumberofWaits is a cumulative wait and how would I send an alert because the dmv that it is getting out the data from has that value been growing over time so not sure how I would alert if it is more than 10 as most of the time it would always be greater than 10 as the waits will stack up and it doesn't make sense to keep on alerting when there isn't really any problem...totally confused by his requirement.

  • ffarouqi - Tuesday, August 1, 2017 1:17 PM

    Thanks! Sue. Do you have a way or a script for sending the alert by encapsulating the NumberofWaits to give me an accurate difference by comparing the current wait count and the last wait count. I am not even sure if the NumberofWaits is a cumulative wait and how would I send an alert because the dmv that it is getting out the data from has that value been growing over time so not sure how I would alert if it is more than 10 as most of the time it would always be greater than 10 as the waits will stack up and it doesn't make sense to keep on alerting when there isn't really any problem...totally confused by his requirement.

    Yes, as I mentioned before,  the values are cumulative since SQL Server restart. That's why you would want to do captures and then use a CTE to compare the two different values. If you go to that link it shows you how to capture and also shows you how to do the comparison with the CTE:
    Monitoring wait stats

    Those are all things you will want to know how to do. There is nothing built in to do this. Using the above link really isn't too hard to modify or use.
    You can also refer to this article:
    Triage Wait Stats in SQL Server

    The best thing to do is create a database for things like this - most people have them on each instance. Call it DBA, call it Performance, call it Maintenance, call it whatever you want. But you will want a place for DBA activities like this - tables, stored procedures, etc.
    After you create the database, start playing with those examples. If I write all of it for you then I have to support it and you will need to give me your paycheck. So it's probably better if you do it. And it won't all work right away, you'll hit errors here or there but that is how you will learn and how you will get to where you know it. And that is how you earn your paycheck 🙂

    So create the table, capture the statistics to that table, capture a second time so that you can compare the two rows.
    In  your  case you will always want to maintain the two and just keep checking. So use the sample CTEs where you compare the two rows. And the before you insert the next set (which would be the third), use that CTE with row number and delete those with interval 1. That will leave you with one set. Then insert the next capture and run the CTE to compare. And just work at getting that process working.
    Once you get that working, you can just check the values for the number of waits between the two captures, if it's greater than 10, send the email.
    All of that will take awhile to get working and figure out but it's something you will use for other things - maybe not with the email. But it's a process you would use when monitoring virtual file stats as well. Both are important and both need this kind of a process.

    Sue

  • For current waits you can query those from sysprocesses for example, alert on current writelog waits:

    DECLARE @EmailRecipient VARCHAR(1000), @SubjectText VARCHAR(1000), @ProfileName VARCHAR(1000), @Body VARCHAR(4000), @waits INT

    SELECT @waits = count(*) from sysprocesses where waittime <> 0 and spid > 49
    and
    lastwaittype = 'WRITELOG'

        -- If we still have them, continue
        IF @waits > 50 BEGIN

           /****************************************************************
           * Store details
           ****************************************************************/
           EXEC SysAdmin..usp_CallWhoIsActive

           /****************************************************************
           * Send Email
           ****************************************************************/
           SELECT TOP 1 @EmailRecipient = 'blah@blah.com'
           SELECT @SubjectText = @@SERVERNAME + ' Write Log Waits!!!'
           SELECT TOP 1 @ProfileName = name FROM msdb.dbo.sysmail_profile
             
    SET @Body =

      N'<H3 style="color:red; font-family:verdana">' + CAST(@@SERVERNAME AS VARCHAR) + ' has ' + CAST(@waits AS NVARCHAR) + ' WRITELOG Waits. Please run below query to check the waits.</H3>' + N'<H5 style="color:black; font-family:verdana;font =1">' + 'select count(*) from sysprocesses where waittime <> 0 and spid > 49 and lastwaittype = ''WRITELOG'' </H5>'

           EXEC msdb.dbo.sp_send_dbmail
              @profile_name = @ProfileName,
              @recipients = @EmailRecipient,
              @subject = @SubjectText,
              @body = @Body,
              @body_format = 'HTML' ;
        END

    But I would concentrate more on blocking in your situation for example something like this based on the adam mechanic whoisactive procedure:

    IF NOT EXISTS(SELECT * FROM sys.tables WHERE name = 'BlockInfo') BEGIN
        CREATE TABLE BlockInfo (rundate DATETIME PRIMARY KEY, blocks INT)
    END

    INSERT INTO BlockInfo
    SELECT GETDATE(), COUNT(*)
    FROM master..sysprocesses WITH (NOLOCK)
    WHERE blocked <> 0 AND waittime >= 5000 -- 5 secs

    IF (SELECT TOP 1 blocks FROM BlockInfo ORDER BY rundate DESC) <> 0 BEGIN
        -- Wait a short time and test again
        WAITFOR DELAY '00:00:15'

        INSERT INTO BlockInfo
        SELECT GETDATE(), COUNT(*)
        FROM master..sysprocesses WITH (NOLOCK)
        WHERE blocked <> 0 AND waittime >= 5000 -- 5 secs
    END

    /***************************************************************************************
    * Send Email
    ****************************************************************************************/
    DECLARE @EmailRecipient VARCHAR(1000), @SubjectText VARCHAR(1000), @ProfileName VARCHAR(1000), @Body VARCHAR(MAX), @cBlocks VARCHAR(20)

    SELECT TOP 1 @cBlocks = CONVERT(VARCHAR(20), CAST(blocks AS MONEY), 1)
    FROM BlockInfo WITH (NOLOCK)
    ORDER BY rundate DESC

    SELECT @cBlocks = LEFT(@cBlocks, CHARINDEX('.', @cBlocks)-1)

    IF @cBlocks <> '0' BEGIN

        exec sysadmin..usp_callwhoisactive

        SELECT TOP 1 @EmailRecipient = 'blah@blah.com'
        SELECT @SubjectText = @@SERVERNAME + ' Blocking found !!!'
        SELECT TOP 1 @ProfileName = name FROM msdb.dbo.sysmail_profile

        declare @AllSpid table (spid int,     [wait_info] [varchar](4000) NULL, blocked int)
        insert into @AllSpid
        select SESSION_ID, [wait_info], ISNULL(BLOCKING_SESSION_ID, 0)
        FROM SysAdmin..WhoIsActive
      WHERE collection_time =
        (SELECT MAX(collection_time) FROM SysAdmin..WhoIsActive)

    select top 1 * from sysadmin..whoisactive

    SET @Body =

      N'<H3 style="color:red; font-family:verdana">' + CAST(@@SERVERNAME AS VARCHAR) + ' blocking details</H3>' +
      N'<table border="3"; style="font-family:verdana; text-align:left; font-size:8pt">' +

      N'<tr><th>Lead blocking SPID</th><th>Host</th><th>Wait Info</th><th>sql_command</th>' +
       CAST ( ( SELECT
        td = session_id, '',
        td = host_name, '',
          td = ISNULL(CAST(wait_info AS VARCHAR(30) ), '.'), '',
          td = CAST(sql_command AS VARCHAR(MAX))
         FROM SysAdmin..WhoIsActive
         WHERE collection_time =
                    (SELECT MAX(collection_time) FROM SysAdmin..WhoIsActive)
                    and session_id in
                    (select distinct spid as [head of blocking chain session_id]
                    from @AllSpid
                    where (blocked = 0 and spid in
                    (select blocked from @AllSpid where blocked > 0)) or spid = blocked)
                    FOR XML PATH('tr'), TYPE
      ) AS NVARCHAR(MAX) ) +
      N'</table>' ;

        EXEC msdb.dbo.sp_send_dbmail
            @profile_name = @ProfileName,
            @recipients = @EmailRecipient,
            @body_format = 'HTML',
            @body = @Body,
            @subject = @SubjectText;
    END

    MCITP SQL 2005, MCSA SQL 2012

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply