Backup performance directly to a network location vs to local and then copy

  • Recently we have been looking at making backups directly to a network location (option A). I have read numerous recommendations to backup locally and then copy the backup to the network location (option B).

    Given our infrastructure, option B does not deliver much extra reliability. The probability of a backup failing due to problems with the network or the destination server is low, and even if it did occur we are implementing other disaster recovery solutions such as clustering and mirroring to a data centre in another city. Given the terabytes of extra server disk space required for staging the backups, the business has deemed option B unnecessary on a cost basis.

    My problem is with performance. I ran a series of tests with a 12 GB database to determine the relative performance and found that option B was approximately 30% faster than option A. For some of our systems the extra time is simply too long.

    Why does it take so much longer? Is there something I can do to reduce the discrepancy?

    - EBH

    If brute force is not working you're not using enough.

  • Your network link will likely be the bottleneck. Are you going over a WAN? WAN links can be even slower, because they are generally a leveraged/shared resource. I don't have the math behind it, but I'm sure that sending data through a PCI/fibre bus to fast local disks sustains a higher MB/s than a network link.

    So not only do you need fast cards in your servers (Gb speeds), you need a fat pipe on your WAN as well. And if you need it, make sure the minimum guaranteed bandwidth is high enough to meet your needs (there's a term for it, possibly 'committed information rate': e.g. you can pay for access to a 10 Mb pipe but only be guaranteed a minimum of 5 Mb during peak periods; it depends on how your networks are set up).
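
    A rough sanity check on those link speeds (the speeds below are illustrative assumptions, not measurements) can be run straight in T-SQL:

    ```sql
    -- Rough transfer-time estimates for a 12 GB backup file.
    -- 12 GB is roughly 96 gigabits of payload; real throughput will be
    -- lower once protocol overhead and link contention are counted.
    DECLARE @payload_gbits decimal(10,2) = 12.0 * 8;
    SELECT
        @payload_gbits / 1.0        AS seconds_at_1Gbps,   -- ~96 s, dedicated gigabit
        @payload_gbits / 0.1 / 60   AS minutes_at_100Mbps, -- ~16 min, fast ethernet
        @payload_gbits / 0.01 / 60  AS minutes_at_10Mbps;  -- ~160 min, shared WAN pipe
    ```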



    Scott Duncan

    MARCUS. Why dost thou laugh? It fits not with this hour.
    TITUS. Why, I have not another tear to shed;
    --Titus Andronicus, William Shakespeare


  • Well, as Scott mentioned, it's probably the network link being the bottleneck.

    One way might be to see if you can turn the 1600 lb gorilla into two 800 lb gorillas, or preferably four 400 lb gorillas. Meaning: if your destination is local, set up multiple network connections between the two servers, and run the ONE backup as multiple separate filegroup backups to "different destinations" which happen to all point to the same place. Of course, unless your switches are dead quiet and fully independent for the length of the backup process, it might STILL bog down....
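
    The split-the-gorilla idea maps onto SQL Server's striped backup syntax; a sketch, with hypothetical share names (all stripe files are required together at restore time):

    ```sql
    -- Stripe one backup across several destination files; SQL Server
    -- writes to all stripes in parallel.
    BACKUP DATABASE MyBigDB
    TO  DISK = '\\backupsrv1\stripes\MyBigDB_1.bak',
        DISK = '\\backupsrv1\stripes\MyBigDB_2.bak',
        DISK = '\\backupsrv1\stripes\MyBigDB_3.bak',
        DISK = '\\backupsrv1\stripes\MyBigDB_4.bak'
    WITH INIT;
    ```

    For the multiple-NIC trick to help, each path would have to resolve over a different network connection.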

    Of course, I'm not so sure this will hold up cost-wise, or scale to handle the many servers you're talking about. It might almost be cheaper (hardware-wise) to implement a high-speed local tape backup device (after all, 80 MB/s to a local tape array is roughly the same as 835 Mb/s over the network once you count the error correction, framing and other overhead the net adds, i.e. a lot faster than you'd likely get from just about any network connection)....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • The network connection between the SQL Server and the storage location is common to both scenarios. So while it may be a bottleneck, it is common to both options.

    My question is why backing up directly over the network takes 30% longer than backing up to a local disk and then copying the backup file over the same network. I would have thought backing up locally and then copying would take the same time or be marginally longer. After all, the backup-to-local-then-copy approach involves more work: first the backup file is written to disk, then it is read again for the copy.

    I have considered running the backups in parallel and it may be something I will have to do. However, given what I considered the counterintuitive result of the backup directly over the network taking 30% longer, I thought I might be missing something. If that could be resolved, i.e. the 30% overhead removed, we would be within requirements and the added complexity of parallelism would not be needed.

    - EBH

    If brute force is not working you're not using enough.

  • Backup is more than a single write operation. It makes roughly three passes through the data, so the 12 GB turns into a lot more traffic (there is at least a writing pass and a validation pass, plus a lot of chit-chat back and forth DURING the backup).

    By doing that locally, you're taking advantage of disks that are 10x faster than your net connection (assuming a single Gb connection). The COPY is then a single pass through the data.

    So yes, it is more work overall, but most of it is done on "faster storage" compared with the combined network-plus-remote-storage path.
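
    For reference, the backup-locally-then-copy pattern under discussion might be scripted like this (paths hypothetical; the copy step assumes xp_cmdshell is enabled, which many sites keep turned off):

    ```sql
    -- Step 1: back up to fast local disk (the multi-pass work stays local).
    BACKUP DATABASE MyBigDB
    TO DISK = 'D:\Backups\MyBigDB.bak'
    WITH INIT;

    -- Step 2: one sequential pass pushes the finished file over the network.
    EXEC master.dbo.xp_cmdshell
        'copy /Y D:\Backups\MyBigDB.bak \\backupsrv\backups\MyBigDB.bak';
    ```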

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • That makes perfect sense and is the answer I was looking for. Thanks for that.

    - EBH

    If brute force is not working you're not using enough.

  • - I'm told xcopy should go faster than copy.

    - Also, delete your old backup at the destination first and copy the new file afterward.

    - There are solutions that can compress your backups on the fly.

    e.g. http://www.hyperbac.com/products/sqlserver/walkthroughs/SQLServerBackupDatabase.asp

    Johan

    Learn to play, play to learn !

    Don't drive faster than your guardian angel can fly ...
    but keeping both feet on the ground won't get you anywhere


  • I used to do backups over a network that would take at least 4.5 hours for a 200 GB database. The target storage device had some idiosyncrasies that I won't go into, but sometimes it would take longer, up to 10 hours or more. I switched to doing a striped backup to four separate local drives (not the data drives), which finished in only 7 minutes. This was followed by copying over the network, which took around 2 hours, so it cut the total time in half. The background file copy doesn't noticeably affect SQL performance: no locks are held, and the server can get on with other overnight jobs. I don't have a technical explanation, but I think the Windows file copy operation has much lower latency than the BACKUP command. My guess is that the SQL BACKUP job gets suspended when all output buffers are full and there is a delay before it gets back to work when buffers become available, while the OS file copy is a much simpler operation that doesn't get distracted.

    I then moved to SQL Litespeed to write compressed backups, and use a medium compression setting to get the best compromise between CPU load for compression and file copy time. The backup now takes about an hour.
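
    For what it's worth, SQL Server 2008 and later build compression into the BACKUP command itself, so a third-party tool is not always required; a minimal sketch with a hypothetical path:

    ```sql
    -- Native backup compression: a smaller file to copy over the
    -- network, at the cost of extra CPU during the backup.
    BACKUP DATABASE MyBigDB
    TO DISK = 'D:\Backups\MyBigDB_compressed.bak'
    WITH INIT, COMPRESSION;
    ```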
