I/O requests taking longer than 15 seconds to complete

  • Markus (11/7/2012)


    scogeb (11/7/2012)


    Markus (11/7/2012)


    Very difficult to tell you... it fluctuates.....

    The left side of the graph, from 0 to 100, goes up and down, but if I highlight one of the lines below, the Last and Maximum values do not agree with where the bar is on the 0 to 100 scale either. The bar graphs for all of the counters are all over the place.

    When I see the slow I/Os, Avg. Disk sec/Transfer is pegged at 100 on the graph, and for Last it reports 12.5. Does that mean 12.5 milliseconds?

    Ignore the left side of the graph; that's just a scale. If you highlight the counter at the bottom and Last says 12.5, that's 12.5 seconds, which is not good.
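To make the units concrete, here is a minimal Python sketch of the conversion. It assumes the counter in question is PerfMon's Avg. Disk sec/Transfer, which reports seconds, and it takes the 15-second figure from the warning in the thread title. The function names are made up for illustration.

```python
# Minimal sketch: interpreting an Avg. Disk sec/Transfer reading.
# Assumption: the raw "Last" value is in seconds, as the counter name implies.

SLOW_IO_WARNING_SECONDS = 15.0  # threshold named in the thread title


def seconds_to_ms(value_seconds: float) -> float:
    """Convert a counter reading in seconds to milliseconds."""
    return value_seconds * 1000.0


def exceeds_slow_io_warning(value_seconds: float) -> bool:
    """True if a single I/O of this duration would cross the 15-second mark."""
    return value_seconds > SLOW_IO_WARNING_SECONDS


if __name__ == "__main__":
    reading = 12.5  # the "Last" value quoted in the thread
    print(f"{reading} s is {seconds_to_ms(reading):.0f} ms")
    print("over the 15 s warning mark:", exceeds_slow_io_warning(reading))
```

A reading of 12.5 milliseconds would show up in the counter as 0.0125, not 12.5. Because the counter is an average, individual requests can still exceed 15 seconds while the average stays below it.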

    Is it possible tempdb is trying to grow but can't grow fast enough? What size is your tempdb, and how much is in use? Otherwise, it looks like it could be a connection issue. Do your SAN guys monitor your server HBA too, and everything in between? I would think they can only see the SAN performance. Is this fiber or iSCSI? Any switches in between? Run the counters, try a manual copy of a large file to that drive, and see what the numbers are then.

    TEMPDB is 32 gig in size and there are two .mdf files. TEMPDB mdf and ldf are on their own drive letter and this is the drive that is getting the slow I/Os. It is not growing.

    I have no idea what they are monitoring but I will find out. Don't know if there are any switches in between either. I will try running the counters and doing a large file copy and see what happens.

    Thanks gang!

    Hmmmm, is tempdb trying to grow? Check your SQL logs. What is tempdb set to grow at? 10%? That would be 3.2 GB and would take a long time. Grasping at straws here though. More likely a connection issue.
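The growth arithmetic above can be sketched in a few lines of Python. This is only an illustration: the helper name is hypothetical, "32 gig" is treated as 32 GB, and 1 GB is taken as 1024 MB.

```python
# Minimal sketch: size of one autogrowth step for a database file.
# Assumptions: file size given in GB, 1 GB = 1024 MB.
from typing import Optional


def growth_increment_gb(file_size_gb: float,
                        percent: Optional[float] = None,
                        fixed_mb: Optional[float] = None) -> float:
    """Return the size in GB of a single growth step.

    Pass exactly one of `percent` (percentage growth) or
    `fixed_mb` (fixed-size growth in MB).
    """
    if (percent is None) == (fixed_mb is None):
        raise ValueError("pass exactly one of percent or fixed_mb")
    if percent is not None:
        return file_size_gb * percent / 100.0
    return fixed_mb / 1024.0


if __name__ == "__main__":
    # 10% growth on a 32 GB file, as asked in the post above
    print(growth_increment_gb(32, percent=10))
    # a fixed 500 MB step, for comparison
    print(growth_increment_gb(32, fixed_mb=500))
```

A percentage setting makes each step larger as the file grows, while a fixed setting keeps every step the same size.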

  • No. TEMPDB is not trying to grow. I have it set to grow at 500MB. I have run the UPDATE Stats and DBCC CHECKDB many, many times, and there is virtually nothing else going on on this SQL Server yet. It is production, but the other dbs on it are very small online stuff that doesn't do much other than an insert or lookup here and there. I sized TEMPDB to 32 gig from the get-go so it wouldn't try to autogrow, to keep fragmentation down. After I did that I ran the Windows defrag on that drive as well, so it was in good shape before it went live in production. The only thing on this drive letter is the TEMPDB files.

  • Markus (11/7/2012)


    Don't know if there are any switches in between either.

    A storage area network will have switches and lots of them 😉

    Can you please supply throughput details from the counters I suggested?

    It's very doubtful your SAN guys will deal with server HBA config. Check the queue depth setting in use and verify the HBA driver versions with the vendor.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • If we had problems with the SAN or connectivity, many other larger systems not on SQL Server would be complaining up and down, and there has been none of that.

    Unfortunately a quote like this is probably going to lead to the problem.

    Sales people tell you that if you have a latency issue with a SAN, you should throw more spindles at it. Unfortunately, as you throw more spindles at it, this creates more space, and systems people like to fill the space.

    You need to understand everything which is going on with the SAN. Yes, the SAN people will carve out disks for you to use. Unfortunately, in most cases these disks are spread across physical spindles which are being used for other things as well. Additionally, many SANs have a configuration which moves the most frequently used files to the most optimal locations on the drives. There is no distinction on file size.

    Now, are your tempdb files on the SAN? If so, that is bad. Tempdb is the most disk IO intense database you have. Are your transaction logs on the SAN? If so, could they and the tempdb be moved to local drives? These moves will cause your IO on the SAN to drop drastically, allowing your database files to work better. Are the databases set up to move automatically if it is determined that they should be located in a different position on the SAN? If so, get that feature turned off. The SAN does not always move just the busy blocks, but the whole file. In the case of large dbs, that will cause a lot of overhead. Having tempdb and t-logs moved off the SAN cleared up a lot of issues for us.

    Steve Jimmo
    Sr DBA
    “If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • I can't put the TEMPDB files on a local drive because this is a cluster.

    I think I found a smoking gun....

    I copied a 32 gig file from another server to the G drive on the SAN, took timings, and looked at MB throughput. Then I copied that file from the G drive to the H drive, both on the SAN. Interestingly, the file copy from G to H took three times as long, and its MB per second was three times slower than the copy from another server. A file copy from another server should be slower than a local copy.
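For anyone wanting to repeat this kind of comparison, a timed copy can be sketched in Python as below. This is not how the numbers above were taken; it is an assumed approach using the standard library, and the function names are made up for illustration. OS file caching can flatter the results, so a large file gives a more honest figure.

```python
# Minimal sketch: time a file copy and report throughput in MB/s.
# Assumption: 1 MB = 1024 * 1024 bytes; caching effects are not controlled.
import os
import shutil
import tempfile
import time


def timed_copy(src: str, dst: str):
    """Copy src to dst; return (elapsed_seconds, mb_per_second)."""
    size_mb = os.path.getsize(src) / (1024 * 1024)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    mb_per_second = size_mb / elapsed if elapsed > 0 else float("inf")
    return elapsed, mb_per_second


def demo_copy(size_mb: int = 8):
    """Self-contained demo: copy a scratch file inside a temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "source.bin")
        dst = os.path.join(tmp, "copy.bin")
        with open(src, "wb") as f:
            f.write(b"\0" * (size_mb * 1024 * 1024))
        result = timed_copy(src, dst)
        # Sanity check: the copy has the same size as the source
        assert os.path.getsize(dst) == os.path.getsize(src)
        return result


if __name__ == "__main__":
    elapsed, rate = demo_copy()
    print(f"copied in {elapsed:.3f} s at {rate:.1f} MB/s")
```

To compare the two cases from the post, one could call `timed_copy` once with a source on the other server and once with a source on G:, both writing to the SAN drive under test; those paths are site-specific and are left out here.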

  • A file copy from another server should be slower than a local copy.

    :hehe:

    Steve Jimmo
    Sr DBA
    “If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • Markus (11/7/2012)


    I can't put the TEMPDB files on a local drive because this is a cluster.

    I think I found a smoking gun....

    I copied a 32 gig file from another server to the G drive on the SAN, took timings, and looked at MB throughput. Then I copied that file from the G drive to the H drive, both on the SAN. Interestingly, the file copy from G to H took three times as long, and its MB per second was three times slower than the copy from another server. A file copy from another server should be slower than a local copy.

    You're on the right track.

    While performing the file copy, get the SAN and OS guys to monitor the systems to identify where the bottleneck occurs.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • When you copied the file from the G to H drive, the data was likely going back and forth on the same physical connection, hence the 3x slower result. Are there multiple HBAs or paths set up? There typically are. This could be similar to two database requests being interspersed with each other. Hopefully your server and SAN support can shed some light on which path(s) is/are being used and why.
