Blog Post

Hotfix for SQL Server 2008/2008 R2 Periodically Does Not Accept Connections Bug

,

Microsoft has released a hotfix for a very frustrating issue that plagued me for a couple of years at NewsGator.  I recently became reacquainted with this old problem with Error: 18056, Severity: 20, State: 29. Basically what happens is that a SQL Server 2008 or 2008 R2 database server that is under  absolutely no CPU or memory stress suddenly stops accepting connections from your application and web servers. You will get a number of errors in the SQL Server error log, like you see below:

Date  11/18/2011 8:42:40 PM
Log  SQL Server (Current – 11/18/2011 9:00:00 PM)

Source  spid81

Message
Error: 18056, Severity: 20, State: 29.

Date  11/18/2011 8:42:40 PM
Log  SQL Server (Current – 11/18/2011 9:00:00 PM)

Source  spid81

Message
The client was unable to reuse a session with SPID 81, which had been reset for connection pooling. The failure ID is 29. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.

While this is going on, no middle-tier servers can connect to the SQL Server instance in question. Quite often, you as the DBA will not be able to make a new connection to the server in question either, using SSMS.  Your CPU utilization will go down to zero, with seemingly no activity happening, and the SQL Server Service will continue to run just fine, with no cluster failover or database mirroring failover being triggered. Essentially, your database instance seems to be pouting and not talking to anyone, like an unruly two-year old…  Usually, this problem clears itself up with no intervention within anywhere from one to ten minutes, but occasionally it requires restarting the SQL Server Service to resolve.

We used to see this issue periodically at NewsGator, starting back in 2009 on SQL Server 2008. There was no change or improvement as we moved to SQL Server 2008 R2. I had previously filed a couple of Connect items about it, opened a CSS case, etc., with no final resolution from Microsoft.  Here are some Connect items that describe the issue in more detail:

SQL Server 2008 Periodically Does Not Accept Connections

SQL Server 2008 SP1 CU6 Periodically Does Not Accept Connections  

SQL Server 2008 R2 Does Not Accept Connections

Increasing the default MaxWorkerThreads instance configuration setting seemed to mitigate the issue somewhat, when I first started seeing the issue.  Another change that seemed to help reduce the frequency of the issue was lowering the MaxServerMemory instance configuration setting by a few GB lower than you would otherwise have it set at. Lots of people I know and respect in the SQL Server community have also run into this over the last couple of years. Bob Dorr talked about this back in August 2010 and  Microsoft Escalation Engineer named Tejas Shah talked about it in May 2010.

After this background and history, it seems that the hotfix that I linked to in the first sentence of this blog post corrects the issue. I have not been able to deploy the fix yet on my particular production server where I recently saw the problem, but a good friend of mine from Microsoft has told me that one of his largest customers recently deployed the fix, and they have not seen the issue reoccur since then. According to the KB article for the fix, it is included in these Cumulative Updates:

SQL Server 2008 R2 RTM CU9 (10.50.1804)

SQL Server 2008 R2 SP1 CU2  (10.50.2772)

SQL Server 2008 SP2 CU5  (10.00.4316)

SQL Server 2008 SP3 CU1  (10.00.5766)

The relevant fix does show up in the fix-list for the SQL Server 2008 R2 RTM CU9 Cumulative Update, but not in the fix-list for SQL Server 2008 R2 SP1 CU2 or for SQL Server 2008 R2 SP1 CU3. I have been assured by someone else at Microsoft (who is in a position to know), that the fix is in the SQL Server 2008 SP1 branch, even though it is not explicitly listed in the fix-list.  If you have been running into this issue, I would suggest that you make plans to get a Cumulative Update that is new enough to include the fix as soon as possible. If you are at an organization that does not believe in deploying Cumulative Updates, you might have to make an exception in this case. I would love to hear from anyone else who has been seeing this problem themselves. Thanks for reading!

Filed under: SQL Server 2008, SQL Server 2008 R2 Tagged: Cumulative Updates

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating