Avg. Disk Queue Length from 0.025 to 310.8 to 0.025

  • Hi everybody,

    First of all, Happy Holidays! I am new to Server Central, and I am in need of help.

    Something is causing my SQL2000 sever to stall for a few seconds. Here is the scenareo:

    There is only one app issuing a SP (hundreds of times). It usually starts at a nice rate of of about 200 item/sec, but suddenly it drops at 1 item/sec.

    I have verified:

    No locks of any kind.

    No file growth during this task for either the Transaction Log or the DB File

    (both have plenty of space available)

    No task scheduled for SQLAgent to perform.

    No Alerts present (including Demos).

    Only one user is connected to the DB.

    Is a dedicated machine, no other processes are running. Is has 1.5Gbs RAM and total consumption is of 710MB at all time (including OS) (does not change event during the stall)

    I have checked a number of counters using the System Manager (the most mentioned ones I saw online) and the only one that had values beyond those "recomended" was the {Avg. Disk Queue Length}. Its values raged from 0.025 to 310.8 and then back to 0.025. The pick only happens for a few seconds (10 to 25), but its impact on the server is nasty! Everything stalls. The performance drop never happens at the same "moment" during the execution of the task.

    The SP does the following:

    Begin Tran

    Inserts one row only in table1

    Updates one row only in table2 (using params and the @@Idetity generated by the Insert)

    Updates one row only in table3

    Commit Tran

    The three tables involved have indexes. I tried running the task without the indexes and it works at steady 2 items/sec.

    And finally, I do not know what other piece of information to provide.

    Can enybody help.

    Thanks a lot for your time gentlemen.

  • What kind of RAID array does your server have? Have you verified all disks in array are good?

  • You've done a lot of what I'd suggest. I would look at the disk system, I wonder if you've got hardware issues.

    The other thing I'd check is do you have a clustered index on any of the tables and does your insert/update result in a page split? You can check page splits/sec, and see if when the disk IO jumps, do the page splits jump. That could result in nasty IO issues.

    And it wouldn't be consistent. You could lower your fillfactor as well and see if that helps.

  • Sounds to me like SQL Server is performing a CHECKPOINT (flushing dirty pages to Disk) when you get your disk queue spike.. You can use Perfmon to verify by checking the "SQLServer:Buffer Manager\Checkpoint pages/sec" counter.

    As above I'd recommend a RAID Array that is optimal for writes (if its a write heavy database) e.g. Raid 1+0, 0+1..

  • sounds indeed like a CHECKPOINT (sudden spike)

  • Will look into it! thanks.

  • Make sure DMA is enabled on the drive, if it's an IDE/EIDE drive. I used to check by timing a copy of a large file; on modern (say 80GB and up) drives, it should go at least 8 to 10 MB per second; if it's under 5MB per second, then I suspect DMA isn't enabled. Control Panel, System, Device Manager, should tell you, properties of the drive. Also, if your log is not on a separate disk, I think that could be a reason. In that case, I see stalled IO warnings in the SQL logfile (IO writes that take greater than 15 seconds).

  • Good call on the CHECKPOINT. Didn't think of that.

    Please let us know if you find anything.

  • Will do gentlemen. Thank everybody. going away for a few days. Happy holidays!

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply