August 20, 2010 at 9:00 pm
Hello Gurus,
In our environment we have a new server with 24 processors and 12 GB of memory. It is a Windows 2008 server running Operations Manager 2007 R2 and SQL Server 2008. The other day I was gathering performance metrics on this server and observed Committed Bytes close to 14 GB, so I think I have a memory bottleneck. However, when I gathered some more metrics, to my surprise context switches and system calls were very high :w00t:. The TechNet documentation says context switches should not exceed 10k/sec and system calls should not exceed 20k/sec, and looking at the numbers below, they are obviously high on my box. Also, surprisingly, processor time is constantly logging very low, below 18%. I am not sure where to start. I have already requested extra memory, but I highly doubt this weird context-switch behavior is caused by memory pressure. Or maybe I am wrong.
Please suggest where I should start with performance tuning.
Memory
% Committed Bytes In Use 56.857
Available MBytes 192.000
Cache Bytes 420,339,712.000
Cache Faults/sec 574.442
Committed Bytes 14,736,429,056.000
Page Faults/sec 29,769.090
Page Reads/sec 12.987
Page Writes/sec 0.000
Pages/sec 82.919
Pool Nonpaged Bytes 150,499,328.000
Processor _Total
% Processor Time 18.116
System
Context Switches/sec 64,350.506
Processor Queue Length 0.000
System Calls/sec 316,580.555
Threads 2,129.000
Quick replies to this thread are highly appreciated.
August 24, 2010 at 12:36 am
Hi,
The numbers by themselves tell you nothing; it all depends on how the system is used. To be honest, I never worry about context switches.
You should look at your buffer cache hit ratio instead. If it doesn't drop below 95-98% for any extended period (more than 10 minutes in a 24-hour cycle), your SQL Server should not be suffering from a memory shortage.
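If you want to check this from inside SQL Server rather than Perfmon, here is a minimal sketch against the sys.dm_os_performance_counters DMV (the ratio is exposed as two raw counters that you divide yourself):
-- Current buffer cache hit ratio, computed from the raw counter and its base
SELECT 100.0 * a.cntr_value / NULLIF(b.cntr_value, 0) AS buffer_cache_hit_ratio
FROM sys.dm_os_performance_counters AS a
JOIN sys.dm_os_performance_counters AS b
    ON a.[object_name] = b.[object_name]
WHERE a.counter_name = 'Buffer cache hit ratio'
  AND b.counter_name = 'Buffer cache hit ratio base'
  AND a.[object_name] LIKE '%Buffer Manager%';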
regards
Andreas
August 24, 2010 at 8:35 am
Please read the following on cache hit ratio. Generally I don't look here too often, but it is good to understand it!
Also take a look at Page Life Expectancy. This should be a minimum of 300, but it is safer to be above 1,000. It tells you how long each page stays in the buffer pool; 300 seconds is about 5 minutes.
If your numbers look good here, my guess is you are good to go.
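Page Life Expectancy can be read from the same DMV; a minimal sketch:
-- Seconds SQL Server expects a page to stay in the buffer pool
SELECT cntr_value AS page_life_expectancy_sec
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND [object_name] LIKE '%Buffer Manager%';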
August 24, 2010 at 1:12 pm
Hi... Thank you for your replies 🙂 . My buffer cache hit ratio and page life expectancy are above the threshold limits. SQL Server is limited to 8 GB on this box. Over a 24-hour observation, total server memory stays below target, the buffer cache hit ratio is constantly above 95%, and page life expectancy never drops below 1,000, which reassures me that there are no issues with SQL Server. My only remaining concern is the box itself.
As a DBA, I know this is not my area of expertise, but I was trying to understand why this would happen.
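In case it helps, the 8 GB cap is just the instance's max server memory setting; something like this sketch sets it and then confirms it (8192 MB matching the cap I mentioned):
-- Cap the instance's memory at 8 GB (advanced option, so enable those first)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 8192;
RECONFIGURE;
-- Show the configured and running values to confirm
EXEC sp_configure 'max server memory (MB)';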
Again thank you very much for your replies.
Ali:-)
August 24, 2010 at 2:32 pm
Not sure of your question. Do you still have a question about context switches?
The guideline is actually about 5,000 per CPU, so on your 24-processor box your 64,350/sec is comfortably under that. It seems that you are good.
http://technet.microsoft.com/en-us/magazine/2007.10.sqlcpu.aspx
August 24, 2010 at 10:00 pm
I suspect the answer may be in your I/O subsystem: various people here have commented that the CPU is not over-stretched, and you are reporting a processor queue length of 0.
Ultimately, adding more memory to the system will only help if
a) SQL can use that memory
b) ALL the relevant data is cacheable
If you can achieve both of those, you will see remarkable performance as everything will be in memory!
Otherwise, review your disk subsystem! With the kind of performance numbers you are showing, RAID 5 is your enemy, and RAID 10 (Note - 10 not 01 - there is a difference!) is your friend despite the burden it places on capacity.
August 25, 2010 at 7:06 am
The most important things to do are file IO stall analysis and wait stats analysis. That will tell you what SQL Server is having an issue with, and that is all that really matters.
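As a sketch of both checks using the standard DMVs (the wait types worth excluding vary by workload; this exclusion list is only an example):
-- Top waits accumulated since the last service restart
SELECT TOP (10) wait_type, wait_time_ms, waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
                        N'SQLTRACE_BUFFER_FLUSH', N'WAITFOR',
                        N'REQUEST_FOR_DEADLOCK_SEARCH', N'BROKER_TASK_STOP')
ORDER BY wait_time_ms DESC;

-- File I/O stalls per database file, worst first
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms,
       vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON  vfs.database_id = mf.database_id
    AND vfs.file_id = mf.file_id
ORDER BY vfs.io_stall_read_ms + vfs.io_stall_write_ms DESC;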
I highly recommend you get a performance tuning professional in to give your system a review and mentor you in how to do the same.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
August 25, 2010 at 1:23 pm
@Toby Harman:
Ultimately, adding more memory to the system will only help if
a) SQL can use that memory
b) ALL the relevant data is cacheable
If you can achieve both of those, you will see remarkable performance as everything will be in memory!
Otherwise, review your disk subsystem! With the kind of performance numbers you are showing, RAID 5 is your enemy, and RAID 10 (Note - 10 not 01 - there is a difference!) is your friend despite the burden it places on capacity.
My total server memory is cruising around 7 GB and my target server memory is 8 GB. Especially given that the buffer cache hit ratio is above 97% and page life expectancy is between roughly 1,000 and 1,500, does this mean SQL Server is doing fine?
Below are the counters. As I mentioned in my earlier post, this box runs SQL Server 2008, Operations Manager 2007 R2, and Reporting Services on Windows 2008.
LogicalDisk _Total C: D: E: F: G:
Avg. Disk Queue Length 0.000 0.000 0.000 0.000 0.000 0.000
Current Disk Queue Length 0.000
Disk Bytes/sec 0.000
Disk Read Bytes/sec 0.000
Disk Reads/sec 0.000
Disk Write Bytes/sec 0.000
Disk Writes/sec 0.000
PhysicalDisk _Total 0 C: D: 0 C: D: #1 0 C: D: #2 0 C: D: #3 0 C: D: #4 1 E: F: G:
Avg. Disk Queue Length 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Current Disk Queue Length 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Disk Read Bytes/sec 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Disk Reads/sec 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Disk Write Bytes/sec 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Disk Writes/sec 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Processor _Total
% Processor Time 1.298
ReportServer:Service
Bytes Received Total 340,338.000
Bytes Sent Total 2,755,802.000
Errors Total 363.000
Memory Pressure State 1.000
Requests Disconnected 29.000
Requests Executing 0.000
Requests Rejected 0.000
Requests Total 533.000
Requests/sec 0.000
Tasks Queued 0.000
SQL Server:Buffer Manager
Buffer cache hit ratio 100.000
Lazy writes/sec 0.000
Page life expectancy 1,130.000
Page lookups/sec 18,308.528
Target pages 1,024,000.000
Total pages 1,024,000.000
System
Context Switches/sec 15,925.063
August 25, 2010 at 1:49 pm
No worries... Total Server Memory is the amount SQL Server is actually using; Target Server Memory is the amount it would like to have.
If Total climbs up to meet Target you will see page life expectancy drop, and then you know you are under memory pressure.
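Both counters are also visible from T-SQL; a minimal sketch:
-- Total vs. Target Server Memory, converted from KB to MB
SELECT counter_name, cntr_value / 1024 AS value_mb
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Total Server Memory (KB)',
                       N'Target Server Memory (KB)');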
Kevin is correct: if you really want to check performance, start with wait stats. There are lots of DMV queries out on the web for that.
I would load the SQL Server Performance Dashboard reports. There you will see waits; just click the link and it runs the DMVs for you in the background to show your largest waits. Lots of other good things there as well. My guess is you do not have anything out of the ordinary: since your PLE is high, it is unlikely you have buffer or I/O waits, and CPU is around 18%, which tells me it is unlikely you have high waits for CPU.
August 25, 2010 at 6:09 pm
A 97% buffer cache hit rate is decent: 3 reads out of every 100 come from disk.
In a large table with an index depth of 3, there can be up to 4 reads per record (one for each level of the index and one for the record itself).
So for every 25 record reads (100 actual page reads), three come from disk.
Increasing that buffer hit rate by 1% means only 2 reads out of every 100 come from disk.
So by raising the buffer hit rate one percentage point, you cut your physical disk reads by a third (from 3 to 2 per 100). Scary math! :w00t:
In a high-performing transactional environment I would expect to see buffer hit rates in the 98-99.99% range. With the proliferation of data that is hard to achieve, but if it were easy, they'd let anyone play!
August 25, 2010 at 7:54 pm
Toby,
Please read the following... you are misunderstanding the buffer cache numbers.
August 25, 2010 at 8:09 pm
So you are saying that buffer cache numbers should not be read in isolation? I am shocked! :ermm:
Yes, page life expectancy is a factor, and that had already been covered by other posters, so I wasn't trying to elaborate on it. I was illustrating a particular point: a small change in the buffer cache percentage has a significant impact on overall physical reads.
Ultimately, the time to live for a buffered page is a function of the total number of reads and the buffer hit rate, the same as most other cache strategies that use a Least Recently Used chain.
I still stand by my original assessment that the I/O subsystem needs a check and that adding memory will give you amazing results but is (ultimately) a game that we, as DBAs, can't win because there is always more data!