September 22, 2009 at 8:43 am
We're bringing a new system on line, and some users are complaining of some slowness. This is on SQL 2005, using a Citrix front end. Server is Windows 2003 on an ESX VM, has 4 GB memory, a 3Ghz CPU. Data files are on a fiber SAN, data on one LUN and Log files on another, both raid 5. This is an OLTP system.
During testing yesterday we ramped it up to ~ 200 connections. The CPU averaged roughly 45% (we're adding one more today), the buffer cache hit ratio stayed pretty much above 99%.
The disk/sec read was averaging ~ .019 and the disk/sec write was ~ .003 for the data drive. The disk queue length for the data drive was averaging around 1.5.
The Full Scans/sec were running about 40 average and page splits/sec about 50. The VM admin noted that there seemed to be a constant, large number of reading going on with the D: (Data) drive.
So my question: What is the best way to narrow down what is generating those reads? Should I use profiler for that or would one of the dmv's be more appropriate?
Thanks
September 22, 2009 at 11:44 am
2 things.
1. what is your page life expectancy, should be around 300 plus. meaning a page is staying in cache for longer then 5 min.the higher the better, that number and hit ratio that you have will help
2. if you have the dash board loaded object execution statistics..this will give you a quick idea of object is is doing all your reads.
3. last question is who big is DB? if you are seeing lots of reads and you page life exp. is high then those reads may not be coming from the DB.
the rest of numbers look pretty well. it can get a little trick running of VM and sql if your database it really large...
September 29, 2009 at 10:23 am
SSC,
Thanks for the suggestions. We've since moved the instance to another VM with Windows Server Enterprise, 8 GB of memory. The avg page live expectancy went from 30 to a rather ridiculously high number, which is good.
The database is 13 GB, not earth-shattering. The I/O seems ok from a point of responsiveness, the response time is fast (they are SAN RAID 5 using fibre) and the drive queue is low.
Some of the load may be coming from the Crystal Reports that are being run, I'm tracking those via Profiler. Some of the queries are a bit nasty..i.e. 8 outer joins. Part of the issue too, though, is that I think the database is in serious need of normalization, which is a whole other issue to deal with the developer on.
Our main issue right now is we're getting occasional blocking on one particular table. It happens normally under full load (around 11:30 am to 5 pm) once or twice a day. I've enabled the 1222 traceflag to get better info and work with the developer on that.
In general, responsiveness is now much better (we also added additional citrix boxes and add'l processors to those).
I'll take a look at your suggestion re: the dashboard.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply