May 15, 2018 at 6:24 pm
Appteam complains about Timeouts connecting to sql server. No errors- eitherEventViewer or Sql server error log. Checked most "popular"waits (used query from Paul Randal site) - got this:
WaitType
| Wait_S
| WaitCount
| Percent
| AvgWait_S
|
PREEMPTIVE_OS_AUTHENTICATIONOPS
| 705730.94
| 21,999,839
| 63.3
| 0.0321
|
BACKUPIO
| 118039.45
| 13,748,352
| 10.6
| 0.0086
|
ASYNC_IO_COMPLETION
| 117950.15
| 18,258
| 10.6
| 6.4602
|
BACKUPBUFFER
| 108957.83
| 8,039,889
| 9.8
| 0.0136
|
MSQL_XP
| 13037.04
| 1,735,074
| 1.2
| 0.0075
|
Trying to identify- where Waitscoming from?
Or can we correlate Waits and Applications or Users app running under?
Or there is no way we can distribute total Waits by ... something?
Thanks
May 15, 2018 at 7:58 pm
Yuri55 - Tuesday, May 15, 2018 6:24 PM
Appteam complains about Timeouts connecting to sql server. No errors- eitherEventViewer or Sql server error log. Checked most "popular"waits (used query from Paul Randal site) - got this:
WaitType
Wait_S
WaitCount
Percent
AvgWait_S
PREEMPTIVE_OS_AUTHENTICATIONOPS
705730.94
21,999,839
63.3
0.0321
BACKUPIO
118039.45
13,748,352
10.6
0.0086
ASYNC_IO_COMPLETION
117950.15
18,258
10.6
6.4602
BACKUPBUFFER
108957.83
8,039,889
9.8
0.0136
MSQL_XP
13037.04
1,735,074
1.2
0.0075
Trying to identify- where Waitscoming from?
Or can we correlate Waits and Applications or Users app running under?
Or there is no way we can distribute total Waits by ... something?
Thanks
Do the connection strings enable "Multiple Active Result Sets"? If they do, or they have nothing about M.A.R.S. in them, change your connection strings to explicitly turn it off. We went through this a year ago and it was a humdinger of a problem. Not only did it start to cause rampant timeouts, but it also causes KILL/ROLLBACKs in SQL Server which seriously exacerbate the problem. It was causing 36 CPUs to jump to 80% CPU along with massive blocking where no one could login in via any means (which is one of the many reasons why I always keep and RDC session open on my production boxes) that would last anywhere from 30 to 40 minutes and, sometimes, would be followed in just another couple of minutes by another such episode.
Even if M.A.R.S. enable connections don't turn out to be your problem, turn it off to be sure for the future. We operated for about 3 years with no noticeable issues and then, one day, it reached a tipping point and crushed us.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 16, 2018 at 10:18 am
Thanks Jeff- will double check with Dev team. Good to know about your real life example-
kind of "Curious case of MARS enabled" 🙂
Do you have any comments regarding my Waits related question- can we correlate sql server Waits and ...something like app, user, dates, etc or it's server general info and no distribution is available?
Thanks
May 16, 2018 at 10:40 am
Yuri55 - Wednesday, May 16, 2018 10:18 AMThanks Jeff- will double check with Dev team. Good to know about your real life example-
kind of "Curious case of MARS enabled" 🙂
Do you have any comments regarding my Waits related question- can we correlate sql server Waits and ...something like app, user, dates, etc or it's server general info and no distribution is available?
Thanks
One really decent method is to download "sp_WhoIsActive" by Adam Machanic. You'll have to search for it because I wrote my own and so don't use his but it will help you find the code causing the waits. And, yeah... the M.A.R.S. problem caused a lot of waits even if it wasn't in a KILLED/ROLLBACK state but other code in need of a bit of tuning is also a major problem. On reasonable quality servers, it's usually not a hardware problem and a whole lot of people end up chasing ghosts. We went from a 16 cpu box with 128 GB of RAM to a 32 CPU box with 256 GB of RAM and then later upped it to 48 CPUs and 384 GB of RAM. The first upgrade only helped a couple of large batch jobs run only about 2 times as fast and did nothing for most of the code. The second upgrade did nothing for any of it.
As I've been known to say, "Performance is in the code" and Adam's sp_WhoIsActive will help you find which code to bring guns to bear on. A lot of it will be very low effort high ROI fixes and fixing the top 10 will normally make the code 10 to 1000 times faster and less resource intensive. Don't stop there though because the 10 after that are just waiting to become performance problems as the data continues to grow.
Again, make sure you keep your stats up to date and make sure you keep up with CUs/SPs on your boxes.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 16, 2018 at 11:09 am
Thanks again, Jeff for your time and tips. This "chasing ghosts" is what I try to avoid therefore trying to understand what/who contribute to this not very common wait type "PREEMPTIVE_OS_AUTHENTICATIONOPS" (as per info- can be Controller/DNS/Active Directory). Anyhow I have Adam script handy and will use it.
Best, Yuri
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply