I/O warnings in SQL Server 2005

  • Hi friends

    Recently the error log shows I/O stall warnings in my production server. This strangely started only Three days before. I have more than 50 warnings in my error log in 3 days. Mostly from tempdb mdf and ldf. The error log says

    SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file XXXXXXXXXXXXXXXX\tempdb.mdf] in database [tempdb] (2). The OS file handle is 0x0000000000000868. The offset of the latest long I/O is: 0x000000241a0000

    I collected the performance counters for Physical disk average sec/write and average sec/read since we have SAN.

    My total average is good (.001 Sec) but at times the sec/write spike went very high ( 14 sec).

    I read in some posts by Gail that the these values should be under 10 Milli sec and spikes under 50 Milli sec. Is there any tech net article about these values? Do I have a performance problem?

    I am particularly worried since this is the first time I see this in my Prod server.

    Thanx in advance

    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Roshan Joe

    Jeff Moden -Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Anyone?

    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Roshan Joe

    Jeff Moden -Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Were your physical disk counters for each individual drive?

    Spurious IO delays during high activity can be expected and really aren't an issue as long as they are very limited and don't cause problematic timeouts or other issues with your database applications.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks a lot.

    Sorry for late reply

    Were your physical disk counters for each individual drive?

    Yes. My counters where for each individual drives. I noticed that the spikes where more frequent to drive containing tempdb. However the io warnings are not seen now after fail over to secondary node.

    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Roshan Joe

    Jeff Moden -Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • What type of hardware are you running ? What type of SAN are you running ?

    There are a few combinations that are prone to this.

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • Is the TempdB on its own dedicated drive?

    How you run Profiler to determine long running queries? Some of these queries may be using the tempDb especially if temp tables are used or there are lots of sorts.

  • Thank you all for your help.

    But now the main cause of concern is no more there. I no longer get the IO warnings after a fail over. I think I need to monitor it a little more

    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Roshan Joe

    Jeff Moden -Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • This message means what it says. Some IO took a long time. It doesn't usually indicate a hardware issue in my experience, although it may do sometimes. If it does you'd expect errors in the system event log too.

    More usually when it occurs on tempdb it's because an auto expand is happening. You can make this less likely by decreasing the amount tempdb expands by each time. This should always be set in megabytes rather than as a percentage so you know exactly what it will expand by and how long it might take.

    If this is what happened the impact will have been that whatever process needed the extra space will have had to wait. Potentially other processes may have waited too.

    Providing nothing timed out, the processes will have all completed once the tempdb space became available.

    Tim

    .

  • Thanks Tim.

    I was wondering how it just stopped after restart.

    My tempdb is pre - sized. I think the messages might have started after it filled the initial size. I will better monitor growth in my tempdb.

    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Roshan Joe

    Jeff Moden -Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • If by pre-sized you mean it doesn't grow I agree with your assessment.

    It sounds as though it got full, one or more processes waited, space became available, they carried on.

    I also agree with your conclusion, monitor it more regularly and make it bigger if necessary.

    Tim

    .

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply