Error

  • I get this error in my SQL Server error log at the time my LiteSpeed full backup job runs at night. There is no heavy activity at that time, so why would there be a timeout error? It's a SQL 2000 cluster environment with a database of over 300 GB.

    It's been happening for a week now, and no changes were made that would explain it.

    "SQL Server has encountered 655 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [M:\SQLData\Derivative_Prod_5_Data.ndf] in database [Derivative_Prod] (5). The OS file handle is 0x00000660. The offset of the latest long IO is: 0x00000b217d0000"

    Thanks in Advance!!!

    The_SQL_DBA
    MCTS

    "Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives."

  • Do you have other jobs running against this server?

    Are you backing up your data to a tape drive? How many jobs are running against this server?

    If the answer is yes, the timeout is to be expected. The solution is to reduce the number of jobs running at the same time.

  • There are many jobs running on this server, but they've been there for quite some time now. I also noticed that the hourly transaction log backup ran for 29 hours without completing, and the nightly full backup ran for 14 hours without completing. One reason could be that the backup jobs have a step named 'wait for other process', so they may have been waiting for each other to end. Only after I cancelled the full backup did the log backup succeed. So let me try cutting down on the number of jobs.

    Thanks!!

    The_SQL_DBA
    MCTS

    "Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives."

  • A single IO (64 KB) taking longer than 15 seconds is never acceptable, no matter what may be running. What that warning means is that in the course of 2 minutes (I think), SQL issued 655 IO requests for the secondary data file to the operating system that had not completed 15 seconds later.

    Generally this points to an IO bottleneck, though it can also be caused by problems in the IO stack (such as antivirus software).
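
    If you want to track these warnings over time, here is a minimal Python sketch that pulls the numbers out of the message text quoted at the top of the thread. The regular expression is written against that one message only, and `parse_long_io` is a hypothetical helper name, not part of any SQL Server tooling. The page number assumes the offset is a byte offset into the file and that data pages are 8 KB.

```python
import re

# Pattern written against the warning text quoted in this thread (an assumption:
# other builds or languages of SQL Server may word the message differently).
LONG_IO = re.compile(
    r"encountered (?P<count>\d+) occurrence\(s\) of IO requests taking longer "
    r"than (?P<seconds>\d+) seconds to complete on file \[(?P<file>[^\]]+)\] "
    r"in database \[(?P<database>[^\]]+)\] \((?P<dbid>\d+)\)\..*?"
    r"latest long IO is: (?P<offset>0x[0-9a-fA-F]+)",
    re.DOTALL,
)

PAGE_SIZE = 8192  # SQL Server data pages are 8 KB


def parse_long_io(message):
    """Return the fields of a long-IO warning as a dict, or None if no match."""
    m = LONG_IO.search(message)
    if m is None:
        return None
    offset = int(m.group("offset"), 16)  # hex byte offset into the data file
    return {
        "count": int(m.group("count")),
        "seconds": int(m.group("seconds")),
        "file": m.group("file"),
        "database": m.group("database"),
        "dbid": int(m.group("dbid")),
        "offset_bytes": offset,
        "page": offset // PAGE_SIZE,  # assumed: offset / 8 KB = page in file
    }


# The message exactly as posted above.
SAMPLE = (
    r"SQL Server has encountered 655 occurrence(s) of IO requests taking "
    r"longer than 15 seconds to complete on file "
    r"[M:\SQLData\Derivative_Prod_5_Data.ndf] in database [Derivative_Prod] (5). "
    r"The OS file handle is 0x00000660. "
    r"The offset of the latest long IO is: 0x00000b217d0000"
)

if __name__ == "__main__":
    print(parse_long_io(SAMPLE))
```

    Running it over each warning in the error log would show whether the slow IOs always hit the same file and the same region of it, or are spread across the drive.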

    Please run a perfmon trace for a few hours and capture the following counters.

    PhysicalDisk: Avg. Disk sec/Read

    PhysicalDisk: Avg. Disk sec/Write

    PhysicalDisk: % Idle Time

    PhysicalDisk: Avg. Disk Queue Length

    What are the min, max and avg values that you see?
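
    If the perfmon log is saved as CSV, a short Python sketch like the one below could produce the min, max and avg for each counter. It assumes the first column is the timestamp and every other column is one counter. The header names and values in `SAMPLE_CSV` are made up for illustration (a real export uses full counter paths), and `summarize_counters` is a hypothetical helper name.

```python
import csv
import io


def summarize_counters(csv_text):
    """Return {counter_name: (min, max, avg)} for every counter column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    names = header[1:]  # assumed: column 0 is the sample timestamp
    samples = {name: [] for name in names}
    for row in reader:
        for name, cell in zip(names, row[1:]):
            try:
                samples[name].append(float(cell))
            except ValueError:
                continue  # skip blank or non-numeric samples
    return {
        name: (min(values), max(values), sum(values) / len(values))
        for name, values in samples.items()
        if values
    }


# Made-up sample data, NOT measurements from the server in this thread.
SAMPLE_CSV = """timestamp,Avg. Disk sec/Read,Avg. Disk Queue Length
01:00:00,0.5,1.0
01:00:15,0.25,3.0
01:00:30,0.75,2.0
"""

if __name__ == "__main__":
    for counter, (lo, hi, avg) in summarize_counters(SAMPLE_CSV).items():
        print(counter, "min:", lo, "max:", hi, "avg:", avg)
```

    For a real log, read the file contents into a string and pass that to `summarize_counters` in place of `SAMPLE_CSV`.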

    What's your disk setup? SAN, or direct-attached RAID? What RAID levels, and which files are on which drives/arrays?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • MS SQL Server, and I believe the same is true of other database servers, has a limit on how many jobs SQL Agent can run at once. If we schedule many jobs to start at the same time, we may get intermittent failures.

    Just my experience.
