Interpreting high-read Perfmon results

Question

Post reply

Interpreting high-read Perfmon results

repent_kog_is_near

SSCertifiable

Points: 7278
More actions
October 27, 2009 at 10:00 am

#385657

Hello
I am trying to understand where is the leak in the bucket, so to speak in one of our system. It is a heavily read application. The memory seems to be ok (Available MBytes above 900).
Could you please give a high-level analysis of these numbers, and see if you see any obvious issues with the system. I have other parameters too but tried to get the main memory and disk numbers.
this was the average values from a 2 hours peak hour sampling this morning.
Thanks for your help
Dan.
---------------------
Committed Bytes In Use 59
Available MBytes933
Page Faults/sec667
Page Reads/sec1.32
Page Writes/sec0.75
Pages/sec16 (Pages Input/sec=4;Pages Output/sec=12)
Page Splits/sec4.66
Buffer cache hit ratio99
Free Pages2457
Lazy writes/sec2.86
Page life expectancy120
Page lookups/sec40326
Page reads/sec11403
Page writes/sec42
Stolen pages86586
Average Latch Wait Time (ms)12.53
Latch Waits/sec146
pagefile.sys % Usage Peak3.04
% Disk Read Time(Data)3656
% Disk Write Time (Data)5
Avg. Disk Queue Length(Data)37
Avg. Disk Read Queue Length(Data) 36.8
Avg. Disk sec/Read0.06
Avg. Disk sec/Transfer0.49

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

The sec/transfer looks a little high. Don't see any other problems at a quick glance

Do you have a performance problem or are you setting up a baseline for the system?

High level comment - you're monitoring too many counters. Information overload. Watch the important ones.

Download this podcast and listen to it. http://runasradio.com/default.aspx?showNum=81

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Animal Magic SSChampion Points: 13964 More actions · Answer 2

whats your disk layout? The disk queue legnth may/may not be an issue depending on how many disks you have

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 3

That appears to be a pretty high disk queue length to me. 🙂

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 4

David Benoit (10/27/2009)
That appears to be a pretty high disk queue length to me. 🙂

<shrug> Two points

SQL is designed to drive the disk queue up. It's not unusual to have a high disk queue on SQL and, providing it isn't sustained, it's not a problem

If the underlying storage is a SAN, Disk queue is near-impossible to interpret meaningfully. There's just too much between SQL and the actual disks to say if the queue length it 'too high' or not.

Bottom line, the 2-per-spindle 'rule' just doesn't hold any more. Again, listen to that podcast.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

repent_kog_is_near SSCertifiable Points: 7278 More actions · Answer 5

Gail

This server was not monitored before; hence I am trying to create a baseline, at the same time, the users are complaining of slow reads in the reporting.

What majorly concerns me is the Disk Queue Length and also reads were taking close to 45ms last week.

Thanks for the link to the audio. I had seen you recommending in another post, and tried several ways to search for it on SSC, but could not find the right keywords to find it; glad you posted it here. I will have ability to listen to it in 6 hours.

'Animal Magic', the disk I mentioned to you has the data alone and is on the SAN. There are separate disks for log and indexes and tempdb.

Dan

repent_kog_is_near SSCertifiable Points: 7278 More actions · Answer 6

Gail

I just saw your post to David. So I will coast on the 'Disk Length' till after I hear the audio.

Dan

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 7

GilaMonster (10/27/2009)
David Benoit (10/27/2009)
That appears to be a pretty high disk queue length to me. 🙂
<shrug> Two points
SQL is designed to drive the disk queue up. It's not unusual to have a high disk queue on SQL and, providing it isn't sustained, it's not a problem
If the underlying storage is a SAN, Disk queue is near-impossible to interpret meaningfully. There's just too much between SQL and the actual disks to say if the queue length it 'too high' or not.
Bottom line, the 2-per-spindle 'rule' just doesn't hold any more. Again, listen to that podcast.

Thanks Gail - I have the podcast ready to listen to shortly. Greatly appreciate you pointing it out. 🙂

The reason for my response was that his average was over an extended period of time. I know my servers would puke with a avg queue that high but then again we have a LOT of IO.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

Animal Magic SSChampion Points: 13964 More actions · Answer 8

im just up to the point in the audio which talks about the average queue legnth 🙂

Animal Magic SSChampion Points: 13964 More actions · Answer 9

David Benoit (10/27/2009)
GilaMonster (10/27/2009)
David Benoit (10/27/2009)
That appears to be a pretty high disk queue length to me. 🙂
<shrug> Two points
SQL is designed to drive the disk queue up. It's not unusual to have a high disk queue on SQL and, providing it isn't sustained, it's not a problem
If the underlying storage is a SAN, Disk queue is near-impossible to interpret meaningfully. There's just too much between SQL and the actual disks to say if the queue length it 'too high' or not.
Bottom line, the 2-per-spindle 'rule' just doesn't hold any more. Again, listen to that podcast.
Thanks Gail - I have the podcast ready to listen to shortly. Greatly appreciate you pointing it out. 🙂
The reason for my response was that his average was over an extended period of time. I know my servers would puke with a avg queue that high but then again we have a LOT of IO.

Precisely the reason i asked the question what the disk subsystem was. however now we know its a SAN, as Gail (and the podcast) will tell you, its almost impossible to correctly evaluate the counter as you dont know how many disks are being used for the SQL processing. thats what it seems to be saying, which makes sense.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 10

David Benoit (10/27/2009)
The reason for my response was that his average was over an extended period of time.

And it could well have been that there were a couple of spikes to a few thousand and the rest of the time the queue length was under 5. We don't know.

I know my servers would puke with a avg queue that high but then again we have a LOT of IO.

It's not the queue length, it's how long it takes to service the queue. Take a supermarket analogy.

Two queues, one has 7 people in, one has 25. Which would you prefer to stand in?

Let's add some info and say that the queue with 7 people in is taking 5 minutes per person. The queue with 25 people is taking under 1 minute. Now which queue would you prefer to stand in?

Going back to a server, if the queue length is always 50+ but all IO requests are returned in under 5 ms, there's no problem.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 11

repent_kog_is_near (10/27/2009)
the users are complaining of slow reads in the reporting.

Then consider trying to optimise the offending queries, rather than looking at the hardware. If the queries are non-optimal, it's a waste of time trying to tune/replace hardware.

Start with profiler, find the worst performing queries, fix them, repeat until performance is acceptable or until you reach the limits of the hardware.

http://www.brentozar.com/archive/2009/10/bottlenecks-and-bank-balances/

http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-1/

http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-2/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 12

Very true and great analogy. Thanks for the clarification.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

repent_kog_is_near SSCertifiable Points: 7278 More actions · Answer 13

Gail,

Thanks for the audio link. You would recommend to start with these basic counters for any server, too?

I asked my SysAdmin and she says that the SAN Storage for the server comes from a storage pool of 96 of 15k drives, utilizing a total of 94 spindles. Does that help to understand the (Disk Queue) counter value better?

Dan

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 14

repent_kog_is_near (10/28/2009)
Thanks for the audio link. You would recommend to start with these basic counters for any server, too?

Absolutely

I asked my SysAdmin and she says that the SAN Storage for the server comes from a storage pool of 96 of 15k drives, utilizing a total of 94 spindles. Does that help to understand the (Disk Queue) counter value better?

Not really. There's the fibre switch to consider, the SAN cache, the SAN controller, there's the difference between dedicated and shared drives (I'm betting those are shared), etc. Disk queue length cannot be meaningfully interpreted on a SAN, even if you do know the number of spindles overall. Stick with Avg sec/read and Avg sec/write as they tell you the actual latency of IOs

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass