February 4, 2009 at 5:47 am
First our configuration:
Windows server 2003 R2 fully updated (64-bit) 32 GB of RAM and 4 Intel Xeon MP 3.1 GHZ processors. (Cluster set up .. 2 nodes, equal machines) (lots of disk space on a high-speed SAN)
SQL 2000 (32-bit) with SP4
Recently due to hardware limitations, we had to move from a 32-bit OS and 32-bit SQL 2000 to our new cluster (mentioned above) ... now performance is terribly slow and the server's CPU's are constantly pegged at 90%.
We've failed over the cluster nodes to attempt to see if it was the node we were on failing, but that didn't help. We've updated statistics and watched the current activiy but we're at a loss.
I'm a realtively newbie DBA and hope someone has an idea on where we can look next.
Thanks.
February 4, 2009 at 6:01 am
February 4, 2009 at 6:15 am
we did a detach, copy and re-attach essentially.
we have updated statistics and re-indexed.
I am not opposed to doing it again, but my question is can that be done without locking out the user? To do it now would take a VERY long time and I can't shut the server down for any amount of time (we run 3 shifts against it).
February 4, 2009 at 6:23 am
Is it truly SQL Server that's pegging the CPU's or some other process?
Have you identified certain queries that may be taking a long time with a trace or profiler?
How does your IO look? Just cause you have a "High Speed SAN" doesn't mean your SAN Admin set it up properly.
-Luke.
February 4, 2009 at 6:30 am
We have about 36 databases on that server (and thats all that runs on it) and all of them are slow. Also I don't have the numbers in front of me but I do know the disk I/O is low.
I've seen other posts about Hyperthreading and parrallelism, but the old server this was running on was hyperthreaded (albeit 32-bit) as well with no issue.
February 4, 2009 at 6:38 am
If there's even 1 query that's being repeated often enough which sucks up all the resources available to it, It will take all it can thus affecting your other databases as well. The only exception would be that you have them all separated into different instances with all kinds of resource governors in place.
Heh.. timely delivery of the simple-talk newsletter... Check out this article on SQL profiler and finding slow queries...
http://www.simple-talk.com/sql/performance/how-to-identify-slow-running-queries-with-sql-profiler/
-Luke.
February 5, 2009 at 8:06 am
Thanks all for the replies. So far I've been going through every possible setting and what not.
Right now here is where I am at:
Set the parallelism to 4 (default was 0 .. which used the 8 processors)
Had one query that was not used much with 4 million rows that wasn't using any indexes get corrected and this morning my CPU usage is averaging (per perfmon) 46% and the average latch watis/sec are averaging right around 300 (down from 3000) but for what seems are shorter periods of time.
My disk I/0 is good and the average lock timeouts/sec is right around 3 for an average.
I haven't heard anything from my end users about changes in performance yet .. but based off of the information from perfmon ... are those averages in "acceptable" ranges?
February 5, 2009 at 9:00 am
My understanding is that average lock timeouts/sec should be 0, and if it's not that means you have queries that are timeing out while waiting for locks... These should be evident in your trace as exceptions...
Have a gander at http://www.sqlservercentral.com/articles/Administering/performancemonitoringbasiccounters/1348/. It's mostly for SQL 2000 but includes links to other Monitoring articles that may help put some perspective on certain things...
Mostly though they all talk about counters being something that should be in a general range, and that you need to have a baseline first so that you have something to compare it to.
-Luke.
February 5, 2009 at 9:54 am
Thanks for the info ... and actually this is on SQL 2000 so that article is appropriate.
I would have thought the locks should be at or close to zero .. and I'm fairly certain we have queries that just aren't right ... but the hardware it is on I would think should be adequate to handle those .. but not sure what I really should be seeing with lock/waits.
February 5, 2009 at 10:41 am
latch waits are fine and to be expected because of the transactional nature of your or really any database. But you need to have a base line for which to compare them to. If they are normally at 300/sec and all of a sudden jump to 30,000/sec for an extended period of time then there's a problem somewhere.
Do you have performance metrics from your old server prior to the migration? You might want to look at how your current numbers compare to those?
What does your buffer hit ratio look like, are you seeing a lot of recompiles etc?
February 5, 2009 at 10:46 am
I don't have any benchmarks from the old server however, I had a query adjusted last night tht dropped the latch waits from 3000 to 160 today. But there are still performance issues going on.
My buffer hit ration right now is holding rather steady at 99.8
I am seeing a lot of recompiles .. is there something that can be done with that?
February 5, 2009 at 11:22 am
February 5, 2009 at 11:26 am
Hmmm .. I'm not certain ... (I wasn't the one who set this up) ......
I'll have to see if I can find this.
February 5, 2009 at 11:40 am
here's the KB Article...
February 5, 2009 at 11:44 am
I just checked and that hotfix is on.
Right now I'm at a loss .. I just don't see any reason why the performance should be so bad.
Viewing 15 posts - 1 through 15 (of 23 total)
You must be logged in to reply to this topic. Login to reply