May 3, 2011 at 6:46 am
Here is the situation. I have a SQL Server 2000 production database running on a Windows 2003 cluster. Every night we back up the database to another server, which has an identical configuration except that it is not in a cluster. We then restore the database there to use as a reporting copy. The database is about 300GB and we use LiteSpeed as our backup/restore tool. Until recently the backup took about 30 minutes and the restore took about 30 minutes. About a week ago the restore suddenly began taking over 2 hours to run. Since this reporting copy is a source for our Data Warehouse ETL processes, we have a very tight window for getting the restore done.
More data:
The increased restore time only occurs during the normally scheduled window, when the backup starts at midnight. If I run the entire process at another time, say 10PM, the restore again finishes in around 30 minutes. The only thing running during the post-midnight restore is the restore itself. I have confirmed this by checking sysprocesses: the only connections are from the restore. There are no other server processes (at least none that I can detect) running during the slow restore. I even ran SQLIOSim during the slow restore, using test files in the same location as the target database, and got excellent IO durations.
Any thoughts, suggestions, ideas? Anything will help. I can't continue staying up until 3AM every night babysitting this process.
Gordon Pollokoff
"Wile E. is my reality, Bugs Bunny is my goal" - Chuck Jones
May 3, 2011 at 6:56 am
Either your disks are getting thrashed by something else, or maybe the network has dropped to 100 Mbps rather than 1 Gbps for some weird reason (assuming you restore from a network drive).
Other odd reasons I've heard over the years:
AV
Screensaver (was taking 100% cpu)
Anything else you can think of that could be overly using the disks.
May 3, 2011 at 6:59 am
1. To clarify some points, the restore is from the local server drive. The drive is SAN storage connected via a 2GB fiber optic connection.
2. I had AV turned off for the most recent execution and saw the same behavior.
3. The only things on the disks are database files, and I could not detect any other processes accessing any of the files.
Gordon Pollokoff
May 3, 2011 at 7:05 am
2 GB connection... yes, but what I've seen before was the Windows "network" connection to the SAN dropping from 1 Gb to 100 or 10 Mb, so the data just couldn't flow at full speed.
One more thing I have in mind: the amount of data may not have changed, but if the file sizes changed a lot, the SAN may have to work overtime to reallocate space for the restore. That should only happen once, though, not every time.
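A quick back-of-the-envelope check can show whether a link-speed drop would even fit the numbers in this thread. The sketch below is only arithmetic, in Python. It assumes the full 300GB has to move during the restore (LiteSpeed compression would reduce what is actually read from the backup file) and counts 1 GB as 1024 MB. The function names are made up for illustration.

```python
def throughput_mb_per_s(size_gb, minutes):
    """Average throughput needed to move size_gb in the given minutes."""
    return size_gb * 1024 / (minutes * 60)


def hours_at_link_speed(size_gb, link_mbps):
    """Hours needed to move size_gb over a link rated in megabits/second.

    Ignores protocol overhead, so real transfers would be slower.
    """
    mb_per_s = link_mbps / 8  # 8 bits per byte
    return size_gb * 1024 / mb_per_s / 3600


DB_SIZE_GB = 300  # database size quoted in the first post

print("30 min restore: %.1f MB/s" % throughput_mb_per_s(DB_SIZE_GB, 30))
print("2 hour restore: %.1f MB/s" % throughput_mb_per_s(DB_SIZE_GB, 120))
print("At 100 Mbps: %.1f hours" % hours_at_link_speed(DB_SIZE_GB, 100))
```

Under those assumptions the normal restore averages roughly 171 MB/s and the slow one roughly 43 MB/s, while a link stuck at 100 Mbps would need close to 7 hours. So a 2 hour restore looks more like a path running at about a quarter of its usual speed than a drop all the way to 100 Mbps.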
May 3, 2011 at 7:07 am
Gordon Pollokoff (5/3/2011)
1. To clarify some points, the restore is from the local server drive. The drive is SAN storage connected via a 2GB fiber optic connection.
2. I had AV turned off for the most recent execution and saw the same behavior.
3. The only thing on the disks are database files and could not detect any other processes accessing any of the files.
Ok then I'm officially out of ideas.
My first instinct would still be to check whether the SAN is working correctly. Other than that, you can watch the waits on the server, but I have no idea if that will be useful at this point. It's more of a shot in the dark.
http://www.sqlskills.com/BLOGS/PAUL/post/Wait-statistics-or-please-tell-me-where-it-hurts.aspx
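If you do go the wait-stats route, the counters are cumulative, so the useful number is the difference between a snapshot taken just before the restore and one taken just after. On SQL Server 2000 the snapshots would come from DBCC SQLPERF(WAITSTATS). The Python sketch below only does the diff: it assumes you have already loaded each snapshot into a dict of wait type to total wait time in milliseconds, and the function name is hypothetical.

```python
def wait_deltas(before, after):
    """Return (wait_type, delta_ms) pairs, largest increase first.

    before / after: dicts mapping wait type -> cumulative wait time (ms),
    captured before and after the restore window.
    """
    deltas = []
    for wait_type, after_ms in after.items():
        # A wait type missing from the first snapshot counts as zero.
        delta = after_ms - before.get(wait_type, 0)
        if delta > 0:
            deltas.append((wait_type, delta))
    deltas.sort(key=lambda pair: pair[1], reverse=True)
    return deltas
```

Doing the same diff for a fast 10PM run and a slow midnight run, then comparing the top few wait types, would show whether the extra time is going to IO, network, or something else.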