Blocking happens when one connection from an application holds a lock and a second connection requires a conflicting lock type. This forces the second connection to wait, blocked on the first. One connection can block another connection, regardless of whether they emanate from the same application or separate applications on different client computers. Some of the actions needing locking protection may not be obvious, for example, locks on system catalog tables and indexes. Most blocking problems happen because a single process holds locks for an extended period of time, causing a chain of blocked processes, all waiting on other processes for locks. There are many blocking scenarios such as submitting queries with long execution times, applications that are not processing all results to completion, etc. The application code itself must be designed appropriately and monitored for efficiency. In fact, most experts agree that as much as 70 to 80 percent of performance and blocking problems are caused by improperly coded database applications. SQL is the primary culprit. And this is why one of the common tasks in multi-user multi-database environment is to monitor the processes that are blocked by another process for more than X number of seconds.
Most of the time, blocks don't cause problems unless it is long time blocking. Locking problems arise only when a process holds a lock longer than necessary and blocks other users' processes
The following categories show the division of application blocking processing based on our DBA department decision:
- Short-time Blocking - average duration less than three seconds. They do not create problems.
- Mid-time Blocking - average duration ranges from 3 seconds to a maximum of 10 seconds. Usually do not create problems
- Long-time Blocking - average duration more than 10 seconds. Require attention.
Deadlocks - SQL Server automatically terminates one of the blocked processes.
It is doesn’t mean that division for every company or scenario may not vary. But our application’s connections timeout is 15 seconds and with long-time blocking situation some application’s connections start timeout.
In the environment, where 200+ databases and new databases created on the weekly/daily bases, the monitoring task is required partial or complete automation and a notification when blocking is occur. The purpose of this article is not discussing a strategy for the resolution of SQL Server blocking problems but to show how to automate the task of monitoring and notification if blocking situation happen and lasts for more that X number of seconds. It is understandable, that this task should be scheduled.
As always, I am not pretending to find the perfect or the best solution. The point many of my articles is to present a task, describe the ideas of how to solve the task and show one or few various (may not be the best but) way(s) of solution implementation. For sake of clarity, all stored procedures will not include an error handler, most parts for the variables verification, and some other advanced features.
Sysprocesses is the system table that can reveal the most clues about why a process is blocking others. The sysprocesses table holds information about processes running on Microsoft® SQL Server™. These processes can be client processes or system processes. Sysprocesses is stored only in the master database. Let’s see the source code of developed procedure.
Developed stored procedure has 3 parameters:
@blocker_threshold shows how many blocking processes must exists before send an email,
@email_list is the list of the recipients to whom send an email, @wait_time_threshold define a threshold for the blocking.
create proc dbo.P_blockerS @blocker_threshold smallint = 1, @email_list varchar(255) = 'sqldbagroup' , @wait_time_threshold int as -- This procedure will assist in identifying SQL Blockers -- If blocking exceeds a threshold supplied as a parameter it executes xp_sendmail BEGIN set nocount on Declare @total_blockers smallint, @waittime int , @maxid int, @minid int declare @blocker_spid smallint, @blocker_spid_info varchar(500) , @blocker_spid_msg varchar(7000) declare @tbl_blockers table (blocker_spid int,tid int identity(1,1)) select @total_blockers = count(*) from master..sysprocesses where blocked > 0 and waittime > @wait_time_threshold set @blocker_spid = 0 set @blocker_spid_info = '' set @blocker_spid_msg = CAST(@total_blockers as char(3)) + process(es) are blocked. Blocking info: ' IF (@total_blockers > 0 ) -- if there are blockers BEGIN insert into @tbl_blockers (blocker_spid) select spid from master..sysprocesses where blocked > 0 and waittime > @wait_time_threshold union select blocked from master..sysprocesses where blocked > 0 and waittime > @wait_time_threshold select @minid = min(tid), @maxid = max(tid) from @tbl_blockers WHILE ( @minid <= @maxid ) BEGIN -- begin while loop if ( @total_blockers >= @blocker_threshold ) BEGIN select @blocker_spid = blocker_spid from @tbl_blockers where tid = @minid select @blocker_spid_info = 'Block on ' + rtrim(@@servername) + ', SPID='+ rtrim(convert(varchar(6),p.spid))+', bloked by '+ isNull(cast(blocked as varchar(9)), ' ') + ', '+ rtrim(convert(varchar(15),IsNull(p.status, ' ')))+' , '+ rtrim(convert(varchar(25),IsNull(p.loginame, ' ')))+' , '+ rtrim(convert(varchar(15),IsNull(p.hostname, ' ')))+' , '+ rtrim(convert(varchar(30),IsNull(p.program_name, ' ')))+' , '+ rtrim(convert(varchar(25),IsNull(p.cmd, ' ')))+', login_time='+ rtrim(convert(varchar(19),IsNull(p.login_time,'1900-01-01'),121))+', last_batch='+ rtrim(convert(varchar(19),IsNull(p.last_batch,'1900-01-01'),121)) + ' ' from master..sysprocesses p where p.spid = @blocker_spid end set @blocker_spid_msg = @blocker_spid_msg + @blocker_spid_info set @minid = @minid + 1 END -- end while loop exec master..xp_sendmail @recipients = @email_list , @subject = 'Blocking Process' , @message = @blocker_spid_msg END -- if blockers END
Another developed procedure p_blocking_notify gives an extended example for the process of monitoring blocking connections. This procedure assists in identifying SQL Blockers. And, in the case of existence of a blocking process, procedure will not only send an email but provide an ability to kill the blocker connection based on application name and number of seconds’ another process is blocked. The choice of killing (disconnecting) the blocking connection is controlled by the parameter @kill_flag. If the value is ‘Y’ blocking connection will be terminated. If blocking exceeds a threshold supplied as a parameters @blocker_threshold and @waittime_threshold then it executes an raiserror and xp_sendmail.
Usually I am creating a job that runs ones in 1-2 minutes or based on blocking prediction time the schedule may vary (for example, every 2-5 minutes). In addition, stored procedure can be modified to insert results into a history table that may give the ability for the better analysis over the time. I am utilizing this procedure when I need to analyze an intermediate blocking process that may happen rarely. In this case SQL Profiler is useless because excessive blocking situation may not appear within days or even weeks.
create proc dbo.p_blocking_notify @blocker_threshold smallint = 1, @email_list varchar(255) = 'emailgroup@abc.com' , @program_name_par1 varchar(255), @waittime_threshold int, @num_of_blocked_processes int, @kill_flag char(1) = 'N' as BEGIN set nocount on Declare @total_blockers smallint, @cnt int, @waittime int , @program_name varchar(50), @APP_total_blockers int declare @blocker_spid smallint, @blocker_spid_info varchar(250) , @below_thres_msg varchar(30) declare @blocker_spid_msg varchar(5000), @blocker_spid_msg1 varchar(250) , @above_thres_msg varchar(255) select @total_blockers = count(*) from master..sysprocesses where blocked > 0 select @APP_total_blockers = count(*), @waittime = max(waittime) from master..sysprocesses where blocked > 0 and program_name = @program_name_par1 set @cnt = 1 WHILE ( @APP_total_blockers > 0 and @waittime >= @waittime_threshold and @cnt < @num_of_blocked_processes) BEGIN -- begin while loop set @cnt = @cnt + 1 -- The following statements can be modified to insert results into a history -- table. If run from SQL Executive a blocker count will appear in history if (@total_blockers < @blocker_threshold) begin set @below_thres_msg = 'Total Blockers at '+convert(char(3),@total_blockers) print ' ' print @below_thres_msg end else BEGIN set @above_thres_msg = ' Blocking Threshold Exceeded, Threshold at ' + convert(char(3),@blocker_threshold) + ' Total Blockers at ' + convert(char(3),@total_blockers) print ' ' print @above_thres_msg end -- Send a Message to the log. Can utilize message for SQL Alert RAISERROR (@above_thres_msg,16,1) with log set rowcount 1 select @blocker_spid = (select TOP 1 spid from master..sysprocesses where blocked = 0 and spid in ( select blocked from master..sysprocesses where blocked > 0 ) ) select @total_blockers 'Total Blocked Processes' select @blocker_spid 'This is the Culprit' select @blocker_spid_info = (select 'Block on ' + rtrim(@@servername) + ', SPID=' + rtrim(convert(char(6),p.spid))+', '+ rtrim(convert(char(15),IsNull(p.status, ' ')))+' ,'+ rtrim(convert(char(25),IsNull(p.loginame, ' ')))+' ,'+ rtrim(convert(char(15),IsNull(p.hostname, ' ')))+' ,'+ rtrim(convert(char(30),IsNull(p.program_name, ' ')))+' ,'+ rtrim(convert(char(25),IsNull(p.cmd, ' ')))+', login_time='+ rtrim(convert(char(19),IsNull(p.login_time,'1900-01-01'),121))+',last_batch='+ rtrim(convert(char(19),IsNull(p.last_batch,'1900-01-01'),121)) from master..sysprocesses p where p.spid = @blocker_spid) select @program_name = RTrim(Ltrim(p.program_name)) from master..sysprocesses p where p.spid = @blocker_spid set rowcount 0 -- kill blocking process IF ( @program_name = @program_name_par1 and @waittime > @waittime_threshold and @kill_flag = 'Y') begin select @blocker_spid_msg = 'kill '+ rtrim(convert(varchar(4),@blocker_spid)) exec (@blocker_spid_msg ) end select @total_blockers = count(*) from master..sysprocesses where blocked > 0 select @APP_total_blockers = count(*) , @waittime = max(waittime) from master..sysprocesses where blocked > 0 and program_name = @program_name_par1 END -- end while loop set @blocker_spid_info = CAST(@cnt as char(2)) + ' process(es) killed. Last blocker id info: ' + @blocker_spid_info set @blocker_spid_msg = 'xp_sendmail @recipients = ''' + @email_list + ''' , @subject = ''Blocking Process'', @message = ''' + @blocker_spid_info + '''' exec (@blocker_spid_msg ) END
Caution: Using the kill command is a very dangerous approach to clearing the blocking process because it could potentially corrupt the SQL database if the blocking process is performing a database update. The kill command may also take quite a while to execute if the blocking process has performed a lot of work that the kill process must undo (rollback). For details on the kill command, view SQL Books Online.
In a lot of cases, there are some consequences of blocking that are not so obvious. DBA can spend large amount of time detecting, diagnosing and resolving blocking problems. Appropriate tools and processes can minimize the time required to manage the problems. Company can waste money by buying additional hardware instead of addressing the underlying (blocking) problem.
That’s why it is very important not only to know the standard monitoring connections data (that can be done by using system stored procedures sp_who and sp_who2) but to find out the actual SQL command the process was running at the time. For example, get information about running processes and the last/current SQL command they running if any of them are running more than X number of seconds while usually are short. Connection is identified by the host computer running the process, but theoretically, definition criteria may vary based on the specific conditions, environment, running applications, and so on. My task was to find out what processes are running on server and posting significant load on server. The main problem that it was an intermediate issue and there was no an ability to say when it happened. It may happen 2-3 times a day or may happen 1-2 times a week.
I decided to utilize my standard process which is inserting a snapshot of every process every X number of minutes. And if any defined process is running longer than usually – gather the last command data and in addition activate the job to get a snapshot of all running processes/connections.
There are a lot of articles how to monitor SQL server connections and users. The simplest way is to create a table and store data based on sp_who (or sp_who2) stored procedure or select information from system table sysprocesses. I created table T_SP_WHO_TABLE
CREATE TABLE [dbo].[T_SP_WHO_TABLE] ( [PROCESSES_IDENT_ID] [int] IDENTITY (1, 1) NOT NULL , [PROCESS_ID] [int] NOT NULL , [DATABASE_NAME] [varchar] (25) NOT NULL , [CURRENT_COMMAND] [varchar] (50) NULL , [PROCESS_ID_STATUS] [varchar] (25) NULL , [BLOCKING_PROCESS_ID] [int] NULL , [CPU_TIME] [int] NULL , [DISK_IO] [int] NULL , [LAST_BATCH] [datetime] NULL , [LASTWAITTYPE] [varchar] (64) NULL , [WAITTIME] [int] NULL , [WAITRESOURCE] [varchar] (64) NULL , [HOSTNAME] [varchar] (128) NULL , [PROGRAM_NAME] [varchar] (128) NULL , [NT_USERNAME] [varchar] (128) NULL , [SQL_LOGIN] [varchar] (128) NULL , [OPEN_TRAN] [int] NULL , [LAST_UPDATED_DT] [datetime] NOT NULL DEFAULT getdate() )
Stored procedure P_SP_WHO is running by SQL Server Agent every 5 minutes and insert snapshot of the monitoring data into the table. It gives me some information about connected users and processes. Another procedure P_T_SP_WHO_TABLE_CLEANUP running with daily (or weekly) maintenance job and keeping manageable number of rows in table T_SP_WHO_TABLE by deleting records that are older then X number of days based on the parameter @DAYSBACK.
create procedure P_SP_WHO as BEGIN INSERT INTO T_SP_WHO_TABLE (PROCESS_ID , DATABASE_NAME , CURRENT_COMMAND , PROCESS_ID_STATUS , BLOCKING_PROCESS_ID , CPU_TIME , DISK_IO, LAST_BATCH , LASTWAITTYPE , WAITTIME , WAITRESOURCE , HOSTNAME , PROGRAM_NAME , NT_USERNAME , SQL_LOGIN , OPEN_TRAN ) select spid , convert(varchar(25), case dbid when 0 then 'no database context' else db_name(dbid) end) , cmd , convert(varchar(20), status) , blocked , cpu , physical_io , last_batch , lastwaittype , waittime , convert(varchar(30), waitresource) , convert(varchar(15), hostname) , convert(varchar(30), program_name) , convert(varchar(20), nt_username) , convert(varchar(30), loginame) , open_tran from master..sysprocesses -- where conditions may vary or not exists at all where not ( ltrim(rtrim(hostname)) = '' and status = 'background' ) END create procedure P_T_SP_WHO_TABLE_CLEANUP @DAYSBACK INT as DELETE FROM T_SP_WHO_TABLE WHERE DATEDIFF(DD,LAST_UPDATED_DT, GETDATE()) > @DAYSBACK
To be able storing the command with the last statement I created table H_LAST_COMMAND.
Stored procedure P_MON_CONNECTIONS provide an ability to verify business criteria for the process time delay, store the last command information in case of the reaching a threshold and activate the job to store the snapshot of the processes.
CREATE TABLE [dbo].[H_LAST_COMMAND] ( [LAST_COMMAND_ID] [int] IDENTITY (1, 1) NOT NULL primary key CLUSTERED , [SPID] [int] NOT NULL , [EVENTTYPE] [varchar] (60) NOT NULL , [PARAMETERS] [INT] NULL , [EVENTINFO] [varchar] (600) NULL , [LAST_UPDATED_DT] [datetime] NOT NULL default getdate() ) CREATE procedure dbo.P_MON_CONNECTIONS @SEC_BACK INT = -18 , @hostname varchar(255) , @job_nm varchar(255) as begin declare @maxid int , @cmd1 varchar(8000) , @spid int SET NOCOUNT ON DECLARE @tmp1 table (spid INT, tcmd1 varchar(100), tid int identity(1,1)) create table #lastcommand(eventtype varchar(60) , Parameters int , EventInfo varchar(600) ) IF ( ( select count(*) from master..sysprocesses where hostname like @hostname and open_tran > 0 and login_time < DATEadd(ss, @SEC_BACK, getdate() ) and last_batch < DATEadd(ss, @SEC_BACK, getdate()) ) = 0 ) begin return 0 end insert into @tmp1(spid, tcmd1) select spid , 'dbcc inputbuffer(' + cast(spid as varchar) + ')' from master..sysprocesses /* where condition can vary*/ where status <> 'background' and IsNull(ltrim(hostname),'') <> '' select @maxid = max(tid) from @tmp1 while @maxid > 0 begin select @cmd1 = tcmd1, @spid = spid from @tmp1 where tid = @maxid truncate table #lastcommand insert into #lastcommand(eventtype , Parameters , EventInfo ) exec (@cmd1) insert into h_last_command (spid , eventtype , Parameters, EventInfo ) select @spid, eventtype , Parameters, EventInfo from #lastcommand select @maxid = @maxid - 1 end EXEC msdb.dbo.sp_start_job @job_name = @job_nm SET NOCOUNT OFF return 0 end
To be able to utilize the procedure an another job with one step was created. The job is running endlessly until stopped manually. The step source code is follow:
Declare @delay_in_seconds int Set @delay_in_seconds = 10 WHILE 1=1 BEGIN exec P_MON_CONNECTIONS @SEC_BACK = -15 , @hostname = ‘PROCESS_COMPUTER’ , @job_nm WAITFOR DELAY @delay_in_seconds END
It is possible to wrap the job step into a stored procedure and develop the more sophisticated code with a parameter to define how long job will be running. In addition, for the time delay I created the standard stored procedure, which allows me easier utilize the time delay in many other cases. For example for a data transfer processes.
CREATE procedure dbo.P_TIME_DELAY @delay_in_seconds int -- delay in seconds, max for 1 hour as begin declare @DELAY_TIME varchar(10) , @delay_time_num numeric(6,2) -- check if provided number bigger than 3599 (1 hour) IF ( @delay_in_seconds > 3599 or @delay_in_seconds < 0) begin select raiserror('Error: @delay_in_seconds more than 3599 or less than 0. Value can be between 0 and 3599', 16, 1) return end -- convert seconds set @delay_time_num = @delay_in_seconds set @DELAY_TIME = '00:' + right('0' + cast(FLOOR(@delay_time_num/60) as varchar(2)), 2) + ':' + right('0' + cast( cast( (@delay_time_num - (FLOOR(@delay_time_num/60) * 60) ) as int) as varchar(2)), 2) WAITFOR DELAY @DELAY_TIME End
Most of stored procedures, user defined functions, and tables that are used only by administrative processes (including the ones described above) reside in a separate database created by DBA for system administrator’s needs. This means that there are no users without administrative rights have an ability to execute DBA specifically created stored procedures or functions as well as view or change gathered data.
Conclusion
This article, while is not pretending to deliver the best possible solutions, shows few DBA specific tasks and outline some solutions to solve the blocking problems. I am not discussing the optimization or optimal source code but presented source code is fully functional. The source code can be modified; an error handler and some advanced features can be added based on the specific conditions.