Back-up jobs taking forever to run

  • Hi,

    We are experiencing an issue with backup jobs taking an excessively long amount of time to run, and I was hoping someone herre might have some insight as to where the problem is.

    The environment: 4 x SQL instances (1 x 2005, 3 x 2008) running on one server, all performing file-based backups via the network to disks located in a different server.

    Each instance (and individual databases) have different backup schedules, but a general rule-of-thumb is that each databases does a complete backup once a week, differential each night, and transaction log every 15-30 minutes. All backup jobs were created using Maintenance Plans (wizard), and each job consists of the backup step followed by a cleanup step which deletes back-up files older than x weeks.

    The problem is that every back-up job executed by 3 of the instances takes a looong time to complete. Testing has shown the actual back-up performs at a "normal" pace, but the job then sits for many minutes processing the clean-up step (deleting old backup files). As soon as I remove the clean-up step, or temporarily move most/all of the old backup files to another folder (so there are far less files to process in the clean-up step), the cleanup step & hence job completes quickly.

    The more files to process, the slower it gets - this makes sense, but at the moment we're talking about < 500 files to process for each back-up job, which I wouldn't think is excessive?

    Another oddity is that ONE of our 2008 instances is unaffected - it's back-up jobs (with the same clean-up steps) execute in seconds. If anything this instance should take longer - it has the most frequent back-up jobs (so more backup files to process during clean-up step) and most data to backup (it's databases are far more active than the other instances). The only difference between this instance and the others is that it is the default instance, rather than named.

    To give an example:

    Default instance, backing up ~30MB of TLOG, with approx 1,000 TRN files in backup folder to process in clean-up step - takes about 10 seconds.

    Other instances, backing up ~500K of TLOG, with approx 500 TRN files in backup folder to process in clean-up step - takes about 25 MINUTES.

    Can anyone think of why our clean-up steps are taking so long to run? And why one of our instances wouldn't be affected, while the other 3 are?

    Thanks.

  • Try defrag your disks

    Wilfred
    The best things in life are the simple things

  • I suspect there may be some sort of authentication issue causing timeouts while checkinmg permissions. I would check the account dteails the jobs are under, permissions to those accounts, access to domain server etc etc. Only a hunch but it may help.

    Mike John

  • Other thing I was thinking of:

    If you delete a backup, SQL has to do some maintenance in the msdb database. So deleting an outdated backup results in:

    - a DOS (yeah, I know) delete statement

    - in some SQL DELETE/UPDATE statements (?)

    Maybe msdb is causing the problem? Try rebuilding the indexes in msdb

    Wilfred
    The best things in life are the simple things

  • Firstly, apologies for the ridiculously slow reply - I completely forgot that I posted about this!

    A couple of answers below, but I have also taken this up in a different thread as I've done more diagnosis / troubleshooting in the meantime. New thread here: http://www.sqlservercentral.com/Forums/Topic635989-357-1.aspx

    Wilfred - Try defrag your disks

    Thanks for the idea but it's not a disk performance issue.

    Mike - I suspect there may be some sort of authentication issue causing timeouts while checkinmg permissions. I would check the account dteails the jobs are under, permissions to those accounts, access to domain server etc etc. Only a hunch but it may help.

    As per my newer post, I've tried giving the SQL service accounts Full Control permissions to the share they're targetting (and even Domain Admin rights very temporarily) and this didn't help.

    Wilfred - Other thing I was thinking of:

    If you delete a backup, SQL has to do some maintenance in the msdb database. So deleting an outdated backup results in:

    - a DOS (yeah, I know) delete statement

    - in some SQL DELETE/UPDATE statements (?)

    Maybe msdb is causing the problem? Try rebuilding the indexes in msdb

    I haven't investigated this in details but I don't think it is the issue:

    1) Further diagnosis (see new thread) points at specifically the xp_delete_file stored proc call being the problem

    2) The affected SQL instances are all pretty small-fry and on over-powered hardware so I doubt msdb performance is an issue. e.g. msdb on all instances is 10-20mb and the server has 4GB of RAM, 20-30% CPU load accross 2 CPUs, and disks on a NetApp SAN.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply