May 16, 2008 at 10:00 am
Hello,
I'd like some general guidelines for troubleshooting a slow server. We have an HP DL380 G5 running Windows Server 2003 x64 Standard and SQL Server 2005 Standard 64-bit (9.0.3054), with two 2,333 MHz HT processors and 4 GB of RAM. There are six 142 GB internal drives split in half, giving us two RAID 5 partitions, one for data and one for logs. Call it server A.
We have another very similar server; call it server B. The differences: three of its internal drives are 72 GB, used for logs (RAID 5), and its processors are 3,000 MHz.
The Lock Pages in Memory privilege has been granted to the account running SQL Server on both machines. As far as I'm aware, no other configuration has been changed from the defaults on either server. The OS on each is mirrored on separate drives.
In general, queries running on server A are slow. For example, with nothing else running on the server, I can run a simple SELECT COUNT(*) FROM tbTable WITH (NOLOCK). A table with 3,869,319 records took 2:23 on server A, while a table with 4,116,259 records on server B took 1:08. The query plans for this are straightforward and look the same.
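One way to make that comparison repeatable on both boxes (a sketch, reusing the tbTable name from above) is to capture the timings and I/O on the server itself rather than with a stopwatch:

```sql
-- Report CPU time, elapsed time, and logical/physical reads
-- for the test query (table name taken from the post above).
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT COUNT(*) FROM tbTable WITH (NOLOCK);

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing physical reads between the two servers will also show whether one is answering from cache while the other is going to disk.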
The network admins ran some health checks and found that one drive on server A was reporting errors. It has been replaced, but we get the same results.
Any help would be appreciated. Thank you.
May 16, 2008 at 10:34 am
I have always found that having plenty of RAM is the first key. With only 4 GB of RAM, your SQL Server may be suffocating the OS. I would start by adding more RAM, and make sure the max memory setting is configured so SQL Server cannot consume all of the memory in the box.
May 16, 2008 at 11:07 am
It may be that the pages for the table on server B are already in RAM but must be pulled from disk on server A.
To make it a fair test on both machines, run the following before the query:
CHECKPOINT
GO
DBCC DROPCLEANBUFFERS
GO
This flushes dirty pages to disk and then drops all data pages from the buffer cache, so both servers start the test cold. That will give you a valid comparison.
The other thing to look at is whether the two tables have the same definition. Clustered index existence and row sizes can make a HUGE difference, and rows per page matters too: if table A fits 100 rows per page and table B fits only 40, table A could have twice as many rows and still load more quickly.
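You can check the rows-per-page density and fragmentation directly with the SQL Server 2005 DMV (a sketch; 'MyDb' and 'dbo.tbTable' are placeholders, substitute your own names):

```sql
-- Page density and fragmentation for every index on the table.
-- 'MyDb' and 'dbo.tbTable' are placeholders -- use your own names.
SELECT index_id,
       index_type_desc,
       avg_page_space_used_in_percent,   -- how full each page is
       avg_fragmentation_in_percent,     -- logical fragmentation
       page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID('MyDb'),
         OBJECT_ID('dbo.tbTable'),
         NULL, NULL, 'SAMPLED');
```

Run it against the equivalent table on each server; a big gap in avg_page_space_used_in_percent or page_count would explain the timing difference.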
Kyle
May 16, 2008 at 12:57 pm
Have you tried looking at your perfmon counters to see where your bottleneck might be? That's always the first place to start.
Watch my free SQL Server Tutorials at:
http://MidnightDBA.com
Blog Author of:
DBA Rant – http://www.MidnightDBA.com/DBARant
May 16, 2008 at 1:29 pm
We just added 2 GB of memory, but I haven't had a chance to configure max memory for SQL Server yet. Is leaving 1 to 1.5 GB enough for the OS?
I'm also copying some tables to server B to set up a fairer test.
I've always had trouble interpreting perfmon data, so it's hard for me to pick the correct counters. It does appear memory is a bottleneck. At times, while a couple of other jobs are running, I'll see the CPU pretty well maxed out. These jobs run wget to import files, unzip them, and pull them into SQL Server. After that, updates and such run against that data.
May 16, 2008 at 1:38 pm
Perfmon data can be difficult.
Look at:
Memory\Available MBytes to tell you how much memory you have left.
PhysicalDisk\Avg. Disk Queue Length to tell you if your disks are overworked.
System\Processor Queue Length to tell you if the CPU is overworked...
there are many others too, but let's start there with a broad sword and work our way down.
that'll show you the basics of where to focus your energies.
And yes, I'd leave 1.5 GB for the OS... 1 GB is OK, but 1.5 is better, or you'll risk starving the OS.
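Something like this would cap SQL Server's buffer pool and leave that headroom for the OS (a sketch; 4608 MB assumes the 6 GB box described above, i.e. 6144 minus 1536 -- adjust for your own RAM):

```sql
-- 'max server memory (MB)' is an advanced option, so expose it first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- 6144 MB total - 1536 MB reserved for the OS = 4608 MB for SQL Server.
EXEC sp_configure 'max server memory (MB)', 4608;
RECONFIGURE;
```

The change takes effect immediately; no restart is needed.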
May 16, 2008 at 7:42 pm
Returning results to the screen is the worst way to test this. The video subsystems on the two machines may be very different, etc, etc. Find a better way to measure the "speed" of the server.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 16, 2008 at 9:53 pm
If you use PerfMon (and I do recommend it), then be sure to include these in addition to those already mentioned:
PhysicalDisk: % Idle Time and Disk Transfers/sec for each visible volume
Memory: Page Faults/Sec, Page Reads/sec
Processor: %Processor Time
-- RBarryYoung, (302) 375-0451, blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc. "Performance is our middle name."
May 19, 2008 at 1:46 pm
Here are the Perfmon results for the two servers. I also ran with SET STATISTICS IO ON and pasted those results below each. The graphical execution plans were still identical. For this run, server A took 4:14 and server B took 3:02. This was the third execution on each server. I'm not sure about running DBCC DROPCLEANBUFFERS; these are both production servers.
Pasting the results didn't go so well. I hope you don't mind looking at the attachment.
Thanks.
May 19, 2008 at 1:51 pm
Geez dude... table1 looks like it's severely fragmented, or severely under-indexed. To have I/O stats that far out of whack, there's something going on there. That's what I would check first: fragmentation, or indexes.
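If fragmentation does turn out to be the culprit, rebuilding the indexes is the usual fix (a sketch; 'table1' is the name from the attachment, and the offline form is shown since ONLINE rebuilds require Enterprise Edition):

```sql
-- Rebuild every index on the table to remove fragmentation.
-- On a production server, schedule this off-hours: the offline
-- rebuild holds locks on the table for the duration.
ALTER INDEX ALL ON dbo.table1 REBUILD;

-- Refresh statistics afterward so the optimizer sees the new layout.
UPDATE STATISTICS dbo.table1 WITH FULLSCAN;
```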
May 19, 2008 at 2:00 pm
It appears that your collection times were significantly longer than your run times. Since you posted many of the statistics as rates, we will need the lengths of the two collections.