October 8, 2014 at 9:41 am
Background : We are currently running on SQL Server 2012 with Always on, we run the backup script on the READ-ONLY Secondary node, We have log backup running for every 15 min, this is a high user activity game database
Problem : Right After Backup VERIFY-ONLY for every 15 min, we could see high drop in User Connections, No of Batch Requests/Sec would drop significantly, this happens during our peak traffic
Causes: I checked the logs, First we would get a Logon Failed Error Message followed by Client was unable to reuse a session, which had been reset for connection pooling, this floods the Connection pools after wards it loses the contact with Secondary Node
Logs : First received this message Logon Failed for User, Error:18456, Severity : 14, State:46
followed by Error 18056,Severity 20 State 46
Client was unable to reuse a session with SPID XXX which had been reset for connection pooling
Any direction about fixing this issue would be greatly appreciated
February 2, 2015 at 11:21 am
Did you get anywhere on this?
We're now seeing the same thing (SQL 2012 + Always on) and a drop off of clients.
March 16, 2017 at 9:11 pm
I received the same problem, does anyone know what is the caused?
March 16, 2017 at 11:17 pm
Hi All,
i had to do couple of things to fix this issue
1) i removed fragmentation on my most active table. i rebuild the indexes with FILL Factor of 80 on most active tables
2) we run log backups for every 15min. we have some of the SSIS Packages running at the beginning of the hour pulling the data. Typically it takes 2-3 min for our log backup. so i scheduled them right after that Ex : 7:00 AM Log backup. SSIS package is scheduled at 7:04 AM
3) i looked at the TOP 10 queries with the highest logical reads and writes and added included columns for few of them for maximum index selectivity
SELECT TOP 10 SUBSTRING(qt.TEXT, (qs.statement_start_offset/2)+1,
((CASE qs.statement_end_offset
WHEN -1 THEN DATALENGTH(qt.TEXT)
ELSE qs.statement_end_offset
END - qs.statement_start_offset)/2)+1),
qs.execution_count,
qs.total_logical_reads, qs.last_logical_reads,
qs.total_logical_writes, qs.last_logical_writes,
qs.total_worker_time,
qs.last_worker_time,
qs.total_elapsed_time/1000 total_elapsed_time_in_mS,
qs.last_elapsed_time/1000 last_elapsed_time_in_mS,
qs.last_execution_time,
qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
ORDER BY qs.total_logical_reads DESC -- logical reads
-- ORDER BY qs.total_logical_writes DESC -- logical writes
-- ORDER BY qs.total_worker_time DESC -- CPU time
4) During this process Brent Ozar Sp_BlitzIndex sproc helped me to identify the lookups vs scans at the table level
My Experience with this issue is with IO contention, During the Log backup when the Check Point is issued, SQL Server is busy processing the incoming requests and these requests are taking longer time, so it has to drop the connections
March 20, 2017 at 12:46 am
This was removed by the editor as SPAM
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply