March 21, 2014 at 9:26 am
Monitoring Tools Used - Quest Spotlight, SQLPingv14
Problem - between :01 and :06 of every hour we briefly get response time readings of 5k to 35k on all of our monitored instances (some clustered, some standalone, some VM, some physical).
Apparently we have been getting these warnings for a long time. But they were classified as low, cleared almost instantly, and the collection ran only every 5 minutes, so we could go hours without seeing a message and never noticed a pattern or anything of concern. Recently we rolled out a new application that does its own response time check; if the check failed (I think the threshold was 5 seconds by default) it reset its connection pool, knocking everyone off the system. We have since raised the timeout to 30 seconds to keep users logged into that app, but that is what alerted us to this problem.
The networking guys turned on sniffers and saw no bump in network traffic, either between that app server and the database or between the monitoring tool and the database. Our storage guys saw no spike in I/O at that specific time either. We thought the monitoring tool might be doing something hourly (flooding the network, even though the sniffer showed nothing), so we disabled that service and ran SQLPing against a sampling of instances. With that tool we saw occasional spikes in response time on a couple of different instances, but they didn't seem as drastic as what the monitoring tool was reporting.
The monitoring tool runs a SELECT 1 against the instance to measure the round trip.
SQLPing creates a temp table in tempdb and queries that table for its response time check.
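For anyone wanting to cross-check the two tools with an independent probe, here is a minimal sketch in Python of the same idea: time a single round trip and flag slow samples that fall in the :01 to :06 window. It is not how either tool works internally. The names `time_probe`, `in_spike_window` and `slow_samples` are made up for illustration, the `probe` callable is assumed to wrap a `SELECT 1` issued through whatever SQL Server driver you already use, and the 5000 threshold assumes the readings are in milliseconds.

```python
import time
from datetime import datetime
from typing import Callable, List, NamedTuple


class Sample(NamedTuple):
    taken_at: datetime   # wall-clock time when the probe started
    elapsed_ms: float    # measured round trip in milliseconds


def time_probe(probe: Callable[[], object]) -> Sample:
    """Run one probe (e.g. a function that executes SELECT 1) and time it."""
    taken_at = datetime.now()
    t0 = time.perf_counter()
    probe()  # hypothetical: your own function that runs the query
    elapsed_ms = (time.perf_counter() - t0) * 1000.0
    return Sample(taken_at, elapsed_ms)


def in_spike_window(sample: Sample, first_minute: int = 1, last_minute: int = 6) -> bool:
    """True if the sample was taken between :01 and :06 past the hour."""
    return first_minute <= sample.taken_at.minute <= last_minute


def slow_samples(samples: List[Sample], threshold_ms: float = 5000.0) -> List[Sample]:
    """Keep only samples at or above the threshold (assumed to be ms)."""
    return [s for s in samples if s.elapsed_ms >= threshold_ms]
```

Running this every few seconds from more than one machine, and comparing where the slow samples land, may help show whether the delay follows the server or the client doing the measuring.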
At this point I am at a loss as to what to check next or how to continue troubleshooting this issue. It seems to me SOMETHING is going on at the top of the hour, and it would have to be the network, the disk, or the monitoring tool.
Just hoping someone has some recommendations on where to look next.
Thanks!
March 21, 2014 at 9:36 am
Is there a transaction log backup or differential backup running hourly? Perhaps there is also some IT-related process outside the server that's running on a schedule...
What jobs run during the time frame you notice this?
______________________________________________________________________________
Never argue with an idiot; they'll drag you down to their level and beat you with experience.
March 21, 2014 at 9:48 am
We do have transaction log backups that happen at the top of the hour, but some run every 2 hours, others start on the :10s or :20s, and they all go to different backup locations on different disks on the SAN. We make an effort to ensure that no TLOG backups start at approximately the same time (unless the db is extremely small). To be clear, we have 25 or so instances that are all experiencing the same behavior, but for the most part these are small databases with relatively low usage compared to most large shops. We have no jobs that run on all of these db servers scheduled for the same time.
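One way to double-check the schedule claim is to list every job whose run time could overlap the :01 to :06 window on each instance. The sketch below is only an illustration under assumptions: the `Job` record, its fields, and the idea of exporting start minute and typical duration per job are hypothetical, not something taken from any of the tools mentioned here.

```python
from typing import Dict, List, NamedTuple


class Job(NamedTuple):
    instance: str          # hypothetical instance name
    name: str              # hypothetical job name
    start_minute: int      # minute past the hour at which the job starts
    duration_minutes: int  # typical run time, rounded up


def overlaps_window(job: Job, first_minute: int = 1, last_minute: int = 6) -> bool:
    """True if the job's run interval touches the :01 to :06 window.

    The start is also tried shifted back one hour so that a job starting
    late in the previous hour (e.g. :58) and running past :00 is caught.
    """
    for start in (job.start_minute, job.start_minute - 60):
        end = start + job.duration_minutes
        if start <= last_minute and end >= first_minute:
            return True
    return False


def suspects_by_instance(jobs: List[Job]) -> Dict[str, List[str]]:
    """Group the names of overlapping jobs by instance."""
    result: Dict[str, List[str]] = {}
    for job in jobs:
        if overlaps_window(job):
            result.setdefault(job.instance, []).append(job.name)
    return result
```

If instances that show the spike come back with no overlapping job at all, that would point away from scheduled SQL jobs and toward something outside the database servers.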
March 21, 2014 at 9:49 am
And that was part of my question as well: there might be another IT function outside my view. I was hoping someone might have something to point me in that direction somehow.