Interpreting high-read Perfmon results

  • Hello

    I am trying to understand where is the leak in the bucket, so to speak in one of our system. It is a heavily read application. The memory seems to be ok (Available MBytes above 900).

    Could you please give a high-level analysis of these numbers, and see if you see any obvious issues with the system. I have other parameters too but tried to get the main memory and disk numbers.

    this was the average values from a 2 hours peak hour sampling this morning.

    Thanks for your help

    Dan.

    ---------------------

    Committed Bytes In Use 59

    Available MBytes933

    Page Faults/sec667

    Page Reads/sec1.32

    Page Writes/sec0.75

    Pages/sec16 (Pages Input/sec=4;Pages Output/sec=12)

    Page Splits/sec4.66

    Buffer cache hit ratio99

    Free Pages2457

    Lazy writes/sec2.86

    Page life expectancy120

    Page lookups/sec40326

    Page reads/sec11403

    Page writes/sec42

    Stolen pages86586

    Average Latch Wait Time (ms)12.53

    Latch Waits/sec146

    pagefile.sys % Usage Peak3.04

    % Disk Read Time(Data)3656

    % Disk Write Time (Data)5

    Avg. Disk Queue Length(Data)37

    Avg. Disk Read Queue Length(Data) 36.8

    Avg. Disk sec/Read0.06

    Avg. Disk sec/Transfer0.49

  • The sec/transfer looks a little high. Don't see any other problems at a quick glance

    Do you have a performance problem or are you setting up a baseline for the system?

    High level comment - you're monitoring too many counters. Information overload. Watch the important ones.

    Download this podcast and listen to it. http://runasradio.com/default.aspx?showNum=81

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • whats your disk layout? The disk queue legnth may/may not be an issue depending on how many disks you have

  • That appears to be a pretty high disk queue length to me. πŸ™‚

    David

    @SQLTentmaker

    β€œHe is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • David Benoit (10/27/2009)


    That appears to be a pretty high disk queue length to me. πŸ™‚

    <shrug> Two points

    SQL is designed to drive the disk queue up. It's not unusual to have a high disk queue on SQL and, providing it isn't sustained, it's not a problem

    If the underlying storage is a SAN, Disk queue is near-impossible to interpret meaningfully. There's just too much between SQL and the actual disks to say if the queue length it 'too high' or not.

    Bottom line, the 2-per-spindle 'rule' just doesn't hold any more. Again, listen to that podcast.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Gail

    This server was not monitored before; hence I am trying to create a baseline, at the same time, the users are complaining of slow reads in the reporting.

    What majorly concerns me is the Disk Queue Length and also reads were taking close to 45ms last week.

    Thanks for the link to the audio. I had seen you recommending in another post, and tried several ways to search for it on SSC, but could not find the right keywords to find it; glad you posted it here. I will have ability to listen to it in 6 hours.

    'Animal Magic', the disk I mentioned to you has the data alone and is on the SAN. There are separate disks for log and indexes and tempdb.

    Dan

  • Gail

    I just saw your post to David. So I will coast on the 'Disk Length' till after I hear the audio.

    Dan

  • GilaMonster (10/27/2009)


    David Benoit (10/27/2009)


    That appears to be a pretty high disk queue length to me. πŸ™‚

    <shrug> Two points

    SQL is designed to drive the disk queue up. It's not unusual to have a high disk queue on SQL and, providing it isn't sustained, it's not a problem

    If the underlying storage is a SAN, Disk queue is near-impossible to interpret meaningfully. There's just too much between SQL and the actual disks to say if the queue length it 'too high' or not.

    Bottom line, the 2-per-spindle 'rule' just doesn't hold any more. Again, listen to that podcast.

    Thanks Gail - I have the podcast ready to listen to shortly. Greatly appreciate you pointing it out. πŸ™‚

    The reason for my response was that his average was over an extended period of time. I know my servers would puke with a avg queue that high but then again we have a LOT of IO.

    David

    @SQLTentmaker

    β€œHe is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • im just up to the point in the audio which talks about the average queue legnth πŸ™‚

  • David Benoit (10/27/2009)


    GilaMonster (10/27/2009)


    David Benoit (10/27/2009)


    That appears to be a pretty high disk queue length to me. πŸ™‚

    <shrug> Two points

    SQL is designed to drive the disk queue up. It's not unusual to have a high disk queue on SQL and, providing it isn't sustained, it's not a problem

    If the underlying storage is a SAN, Disk queue is near-impossible to interpret meaningfully. There's just too much between SQL and the actual disks to say if the queue length it 'too high' or not.

    Bottom line, the 2-per-spindle 'rule' just doesn't hold any more. Again, listen to that podcast.

    Thanks Gail - I have the podcast ready to listen to shortly. Greatly appreciate you pointing it out. πŸ™‚

    The reason for my response was that his average was over an extended period of time. I know my servers would puke with a avg queue that high but then again we have a LOT of IO.

    Precisely the reason i asked the question what the disk subsystem was. however now we know its a SAN, as Gail (and the podcast) will tell you, its almost impossible to correctly evaluate the counter as you dont know how many disks are being used for the SQL processing. thats what it seems to be saying, which makes sense.

  • David Benoit (10/27/2009)


    The reason for my response was that his average was over an extended period of time.

    And it could well have been that there were a couple of spikes to a few thousand and the rest of the time the queue length was under 5. We don't know.

    I know my servers would puke with a avg queue that high but then again we have a LOT of IO.

    It's not the queue length, it's how long it takes to service the queue. Take a supermarket analogy.

    Two queues, one has 7 people in, one has 25. Which would you prefer to stand in?

    Let's add some info and say that the queue with 7 people in is taking 5 minutes per person. The queue with 25 people is taking under 1 minute. Now which queue would you prefer to stand in?

    Going back to a server, if the queue length is always 50+ but all IO requests are returned in under 5 ms, there's no problem.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • repent_kog_is_near (10/27/2009)


    the users are complaining of slow reads in the reporting.

    Then consider trying to optimise the offending queries, rather than looking at the hardware. If the queries are non-optimal, it's a waste of time trying to tune/replace hardware.

    Start with profiler, find the worst performing queries, fix them, repeat until performance is acceptable or until you reach the limits of the hardware.

    http://www.brentozar.com/archive/2009/10/bottlenecks-and-bank-balances/

    http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-1/

    http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-2/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Very true and great analogy. Thanks for the clarification.

    David

    @SQLTentmaker

    β€œHe is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • Gail,

    Thanks for the audio link. You would recommend to start with these basic counters for any server, too?

    I asked my SysAdmin and she says that the SAN Storage for the server comes from a storage pool of 96 of 15k drives, utilizing a total of 94 spindles. Does that help to understand the (Disk Queue) counter value better?

    Dan

  • repent_kog_is_near (10/28/2009)


    Thanks for the audio link. You would recommend to start with these basic counters for any server, too?

    Absolutely

    I asked my SysAdmin and she says that the SAN Storage for the server comes from a storage pool of 96 of 15k drives, utilizing a total of 94 spindles. Does that help to understand the (Disk Queue) counter value better?

    Not really. There's the fibre switch to consider, the SAN cache, the SAN controller, there's the difference between dedicated and shared drives (I'm betting those are shared), etc. Disk queue length cannot be meaningfully interpreted on a SAN, even if you do know the number of spindles overall. Stick with Avg sec/read and Avg sec/write as they tell you the actual latency of IOs

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply