September 7, 2007 at 12:50 pm
We just built a new database server to replace an existing server. The configurations are below. When I restore a 2GB database on the old server it completes in about 1 minutes and 28 seconds. When I perform the same restore on our new server it completes in about 6 minutes and 28 seconds. If I perform a large file copy from one server to the new server and to the existing server the copy times are nearly identical. I've spent a day troubleshooting this and need some fresh suggestions on how to identify the problem. Any suggestions?
Old Server
Windows Server 2003 Standard Edition, SP1 - running SQL Server 2000 SP3a with Hot Fix - MS03-031
(4) 3GHz CPUs with 4 GB Memory utilizing /3GB switch
Data Drive is RAID 5 on 15k SCSI drives with a dedicated controller, having 576MB of write cache
Log Drive is on a shared controller with the C drive having 192MB of write cache. Both Drives are RAID 1 on 15k SCSI..
Backup Drive is on a shard controller with the tempdb drive, having 576MB of write cache. The backup drive is RAID 0 and tempdb is RAID 1, with both on 15k SCSI.
New Server
Windows Server 2003 Enterprise Edition, SP2 - running SQL Server 2000 SP3a with Hot Fix - MS03-031
(4) Dual Core 3GHz CPUs with 14 GB Memory utilizing PAE and AWE (No /3GB switch)
Data Drive is RAID 10 on 15k SAS drives with a dedicated controller, having 512MB of write cache
Log Drive is on a shared controller with the C drive having 256MB of write cache. Both Drives are RAID 1 on 15k SAS.
Backup Drive is on a shard controller with the tempdb drive, having 512MB of write cache. Both drives are RAID 1 on 15k SAS.
We just built a new database server to replace an existing server. The configurations are below. When I restore a 2GB database on the old server it completes in about 1 minutes and 28 seconds. When I perform the same restore on our new server it completes in about 6 minutes and 28 seconds. If I perform a large file copy from one server to the new server and to the existing server the copy times are nearly identical. I've spent a day troubleshooting this and need some fresh suggestions on how to identify the problem. Any suggestions?
Thanks, Dave
September 7, 2007 at 3:19 pm
My first attempt would be to run SQLIO or SQLIOSim on both to verify throuput of the drives... I have had controllers "surprises" many times.
Secondly you mention "RAID5" vs "RAID10" but how many disks ?
* Noel
September 10, 2007 at 7:56 am
I'll read about running SQLIO or SQLIOSim. I've never run them before.
The data drive on the new server is RAID 10 with 10 15k 36GB SAS drives. The old server's data drive is RAID 5 and consists of 3 15k 73GB SCSI drives. I've also tried copying the backup file to other drives such as the log file drive and the C drive and then restored to a database also on those drives. The performance was still bad.
Dave
September 10, 2007 at 8:20 am
Are you restoring a new db on both servers? Or an existing one on the old server? Also, are these from local disk for the backup file?
Does seem a little strange to me, but not sure what's wrong. Once it's restored, does it query quicker?
September 10, 2007 at 9:46 am
I took a nightly backup from disk on the production server and copied it to the new server. For comparison I restored the backup on the production server to a new database and did likewise on the new server. I performed the same test on a third server that also utilizes SAS drives and found that server experienced better restore times then on production. This was what I expected to see on our new server. Very odd.
October 3, 2007 at 11:46 am
Not that I know much about drives...however I experienced something similar to this once and it was due to the fact that when the new server was created and the disc was allocated it was formatted with a "quick format" vs. a "full format"
From what I was told by our infrastructure team was that could cause a performance impact on disc activities until the disc was fully utilized...something about pointers to the data.
It was correct though because once we utilized all the space on the drive the performance increased and came inline to what we were previously expecting.
Just a thought...hope this isn't too confusing...
October 3, 2007 at 12:06 pm
I did not know that. That's something I will keep in mind. One of our server guys fixed the problem by updating various drivers/firmware. He also addressed an issue reported by the NIC, which someone on another forum suggested may be the cause of the problem. We're not sure which fix corrected the problem, but the performance is much better. The only thing I notice now is that using AWE appears to create a bit of a performance hit while data is initially cached. Once cached performance is very good. If I restore the same database 6 times in a row, the first restore takes between 33 and 37 seconds. Restores 2 - 6 take 26 to 27 seconds, with an average of 26 seconds. If I remove AWE the first restore takes 28 - 33 seconds and restores 2 - 6 take 25 to 27 seconds, with an average of 26 seconds. I was a bit surpised to see AWE responding like this.
Thanks, Dave
October 4, 2007 at 6:12 am
Hi, just out of interest, why haven't you set the 3gb switch on the new server?
October 4, 2007 at 6:42 am
I tried to, but the OS would not load when the /3GB and PAE were set at the same time. We called Microsoft and was told by an OS engineer and a SQL Server engineer to avoid using /3GB and /PAE together. I told them Microsoft has a few articles indicating this was an acceptable practice, but both noted problems they have seen with /3GB and /PAE used together and said they have internal documentation indicating this combination should be avoided. The SQL engineer said the benefits received by the /3GB do no outweigh the overhead it places on the OS due to the reduced amount of memory allocated to kernel processing. I've wondered about this in the past, but this was the first time someone from Microsoft confirmed that it could present more problems then it solves. As soon as we removed the /3GB switch the OS loaded and everything looked good. I also had someone on another forum indicate they've experienced problems when using the /3GB on a server with 12GB of memory.
Dave
October 4, 2007 at 7:06 am
Thats interesting, I have both set on about a dozen servers and all are fine with it. I didn't think the OS would allow anything else running on the server ie SQL to obtain more than 2gb of memory without the 3gb switch regardless of the use of PAE and\or AWE. So if you dont have this set is SQL using more than 2gb memory?
October 4, 2007 at 7:15 am
/3GB deals with virtual address space, which is different then PAE, which allows you to use more then 4GB of physical memory. Using /3GB allocates 3GB of virtual address space to user processes and 1GB to kernel processes. You can reduce the amount of virtual address space allocated to user processes by adding /USERVA and setting it to a value under 3GB.
In my case I made certain SQL Server was the only software running by disabling software like Trend, Cisco Security Agent and Altiris. I even ran filemon to verify if anything was running. For whatever reason the OS did not want to load when /3GB was used. It may have been a driver/firmware issue with the DL580 server, but it wasn't worth researching any further after we spoke with Microsoft. We did confirm our drivers/firmware was up-to-date.
Dave
October 5, 2007 at 11:20 am
Is your new server running SQL Server under a domain account or under LOCAL SYSTEM? Is the user System Admin?
Maybe you are not taking advantage of Instant File Initialization. You can only do that if your user has the rights to Preform Volume Maintainance Tasks (A domain account would not have that by default unleess the user is and Admin on the local machine. Not recommended best practice).
October 5, 2007 at 12:30 pm
The SQL Server service account is a domain account with local administrative permission.
October 6, 2007 at 11:20 am
Ok, then you will have no problem there. But you should think about removing the domain user from the local administrators group..
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply