Introduction
Your SQL Agent Jobs may be silently failing if you are using Linked Servers to make remote procedure calls. On occasion these jobs will show a false success message in the job history like the following:
The step above actually failed since the remote procedure call timed out. This issue has existed in the SQL Server platform for several versions, including SQL 2016 RC3. It has been reported to Microsoft as early as 4/5/2008 in this issue report. The Linked Server Provider treats the Query timeout as a warning and does not raise an error. The issue report does recommend a workaround which can easily be overlooked without a full example. A full example is provided in this article.
Reproducing the Query Timeout
To reproduce this issue, we will establish a Linked Server and change the timeout from the default of 600 seconds to a few seconds.
1. Establish a Linked Server, we’ll use localhost for simplicity.
EXEC master.dbo.sp_addlinkedserver @server = N'LOCALHOST', @srvproduct=N'SQL Server' EXEC master.dbo.sp_addlinkedsrvlogin @rmtsrvname=N'LOCALHOST',@useself=N'True',@locallogin=NULL,@rmtuser=NULL,@rmtpassword=NULL EXEC master.dbo.sp_serveroption @server=N'LOCALHOST', @optname=N'rpc', @optvalue=N'true' EXEC master.dbo.sp_serveroption @server=N'LOCALHOST', @optname=N'rpc out', @optvalue=N'true'
2. Next we’ll adjust the timeout to a few seconds.
exec sp_configure'remote query timeout', 5 reconfigure
3. Now executing the following will simulate the query timeout.
DECLARE @returnCode int = 0 DECLARE @sql nvarchar(max) = 'WAITFOR DELAY ''00:00:06''; exec sp_who2' EXEC @returnCode = [localhost].master.dbo.sp_executesql @sql
The Message dialogue shows a warning message that the query timed out. No results were returned and no error was raised.
Raising an Error Instead
To change the warning behavior we can implement a workaround by checking the return code of the sp_executesql procedure we are calling. This older style error checking can be used within a try catch block so we can continue using the catch block for all of our error handling and add logging or other notifications to the catch block.
BEGIN TRY DECLARE @returnCode int = 0 DECLARE @linkedServer sysname = 'localhost' DECLARE @databaseName sysname = 'master' DECLARE @procErrorMsg nvarchar(255) = N'Remote procedure %s on database %s failed using linked server %s.' DECLARE @procName sysname DECLARE @sql nvarchar(max) SET @procName = 'sp_who' SET @sql = 'WAITFOR DELAY ''00:00:06''; exec ' + @procName EXEC @returnCode = [localhost].master.dbo.sp_executesql @sql IF @returnCode IS NULL BEGIN RAISERROR (@procErrorMsg, 16, 1, @procName, @databaseName, @linkedServer) END SET @procName = 'sp_who2' SET @sql = 'WAITFOR DELAY ''00:00:06''; exec ' + @procName EXEC @returnCode = [localhost].master.dbo.sp_executesql @sql IF @returnCode IS NULL BEGIN RAISERROR (@procErrorMsg, 16, 1, @procName, @databaseName, @linkedServer) END END TRY BEGIN CATCH DECLARE @errorMessage nvarchar(max) DECLARE @errorSeverity tinyint DECLARE @errorState tinyint SELECT @errorMessage = ERROR_MESSAGE(), @errorSeverity = ERROR_SEVERITY(), @errorState = ERROR_STATE() RAISERROR (@errorMessage, @errorSeverity, @errorState) END CATCH
Conclusion
Using the above pattern for your procedure calls within the Job Step will cause the job to fail as expected instead of silently failing.
--
Chad Feghali is a senior database administrator and developer with Rackspace.