VLDB backup to network device (deduplication)

  • Hello all!

    I am running out of ways to speed up my full backup :hehe:. Here is what I have tried so far:

    Native full backup, split across 7 files: ETA on the full backup 15 hrs

    Redgate SQL Backup 5, split across 7 files: ETA on the full backup 10 hrs

    We are using 7 files as the magic number because we have two Data Domain devices:

    one located locally at the datacenter, and the other replicates to our DR site, so the split helps with the replication to our DR site as well.

    I am backing up to Data Domain (a network storage device that deduplicates at the block level), so compressing the DB is out of the question; Data Domain has its own type of compression.

    DB size: 2.7 TB (compressed)

    So I am open to all suggestions; I would love to see my full backups back down to 6 hrs 😀
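    For reference, the striped native backup described above looks roughly like this; the share path and database name are hypothetical placeholders, and only the striping across 7 DISK targets is the point:

    ```sql
    -- Hedged sketch: a native full backup striped across 7 files on the
    -- local Data Domain share. Paths and database name are made up.
    BACKUP DATABASE [BigDB]
    TO DISK = N'\\dd-local\sqlbackup\BigDB_1.bak',
       DISK = N'\\dd-local\sqlbackup\BigDB_2.bak',
       DISK = N'\\dd-local\sqlbackup\BigDB_3.bak',
       DISK = N'\\dd-local\sqlbackup\BigDB_4.bak',
       DISK = N'\\dd-local\sqlbackup\BigDB_5.bak',
       DISK = N'\\dd-local\sqlbackup\BigDB_6.bak',
       DISK = N'\\dd-local\sqlbackup\BigDB_7.bak'
    WITH CHECKSUM, STATS = 5;  -- verify pages; report progress every 5%
    ```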

  • I'm not sure I understand. Are you striping the backup across 7 files to one Data Domain device? Is that a single array/LUN/drive?

    The speed in backup comes from multiple physical devices writing as quickly as possible, so if there isn't sufficient IOPS on the target, I'm not sure you can do anything here. Are you getting a throughput to disk that is high compared to the specs of the device?

    As far as SQL Backup goes, I might still enable compression here. Even if that means the Data Domain device can't compress much more, you've reduced the network I/O, which might speed things up. Can you play with compression levels here to see what works? Do you have spare CPU that can be used here?

    Another thing: have you looked at diffs on some days to alleviate the time? I have done weekly fulls and daily diffs in the past when backup windows were an issue.
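    The weekly-full / daily-diff pattern, as a hedged native T-SQL sketch (database name and paths are hypothetical):

    ```sql
    -- Friday: full backup, which becomes the differential base.
    BACKUP DATABASE [BigDB]
    TO DISK = N'\\dd-local\sqlbackup\BigDB_full.bak'
    WITH CHECKSUM, STATS = 5;

    -- Saturday through Thursday: differentials, which contain only
    -- the extents changed since the last full backup.
    BACKUP DATABASE [BigDB]
    TO DISK = N'\\dd-local\sqlbackup\BigDB_diff.bak'
    WITH DIFFERENTIAL, CHECKSUM, STATS = 5;
    ```

    A restore then needs the latest full plus the latest diff (plus log backups), so the diff chain shortens the backup window at some cost to restore complexity.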

  • Consider going to a NetApp SAN.

    I can do a 500 GB backup in about 5 minutes. Restore is also that fast.

    http://www.netapp.com/us/solutions/applications/microsoft-sql/ms-sql-business-backup-recovery.html

    The backup time depends on how much of your data has changed since the last full backup.

    Of your 2.7 TB, how much of that data is static each day? In reality, most databases hold a ton of history, and that data does not change during the day.

    Probably less than 1% of your data is new or changed each day.

  • Steve Jones - SSC Editor (7/7/2011)


    I'm not sure I understand. Are you striping the backup across 7 files to one Data Domain device? Is that a single array/LUN/drive?

    We have two Data Domain devices, and they use context IDs for their peer points <-- not sure on the correct verbiage, but it's mainly for replicating the backups to our DR site; that is the only reason why =/

    The speed in backup comes from multiple physical devices writing as quickly as possible, so if there isn't sufficient IOPS on the target, I'm not sure you can do anything here. Are you getting a throughput to disk that is high compared to the specs of the device?

    I can't tell as far as the actual DD device goes; the Prod DB cluster is very calm.

    Server specs:

    32 CPUs

    192 GB RAM

    NetApp (SAN)

    10 GbE

    As far as SQL Backup goes, I might still enable compression here. Even if that means the Data Domain device can't compress much more, you've reduced the network I/O, which might speed things up. Can you play with compression levels here to see what works? Do you have spare CPU that can be used here?

    Another thing, have you looked at diffs some days to alleviate the time? I have done weekly fulls and daily diffs in the past when backup windows were an issue.

    Backup schedules:

    Sat-Thu: diffs; Fri: fulls

    I leave a 12 hr window for the TB database to be backed up, then have all the other DBs scheduled around it (just to make sure network traffic isn't being bogged down, as silly as that sounds, but Data Domain throws a lot of "network name no longer available" errors).

    I've tried compressing the 2.7 TB DB down and it kills Data Domain space-wise. I started reading about the maximum transfer size & maximum data block options that SQL Backup has; I didn't know whether those would help with sending the data to Data Domain, or whether they could kill the I/O on Data Domain.. grrr
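    For the native equivalents of those knobs, the syntax is below. The values are only starting points to benchmark against the Data Domain device, not recommendations, and the path and database name are hypothetical:

    ```sql
    -- Sketch: native transfer-size options. 4194304 bytes (4 MB) is
    -- the maximum MAXTRANSFERSIZE; BLOCKSIZE and BUFFERCOUNT are also
    -- worth varying while watching throughput on the DD side.
    BACKUP DATABASE [BigDB]
    TO DISK = N'\\dd-local\sqlbackup\BigDB.bak'
    WITH MAXTRANSFERSIZE = 4194304, -- bytes per transfer request
         BLOCKSIZE = 65536,         -- physical block size, in bytes
         BUFFERCOUNT = 64,          -- number of I/O buffers
         STATS = 5;
    ```

    Note that BUFFERCOUNT x MAXTRANSFERSIZE is memory taken from the buffer pool during the backup, so large values are worth testing off-hours first.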

  • We are planning on using the SMSQL tools, but we have a very unique environment; NetApp had to do some testing to verify that an SMSQL backup would work. We are still a few months out before I can use it =/

    edit - to answer your question: that is 90% historical data. This one DB grows roughly 10-18 GB a week

  • Scott Ohar (7/7/2011)


    edit - to answer your question: that is 90% historical data. This one DB grows roughly 10-18 GB a week

    If you can no longer fit within the window, or if you are approaching the limits, is an archive database an option?

    In our system we keep 1 year in the active OLTP database (500 GB roughly).

    Everything older is moved each month to the archive DBs.

    That would likely involve some development effort and is not a quick fix. But it sounds like it might be a necessary long term fix.

    18 GB per week X 52 weeks = 936 GB database to contain about 1 year of data.

    Does the application really need 2.7 TB of historical data?

    If 2 TB of data could be on a reporting database (backup only when it changes once per month), then many of your issues would disappear.

    Backup times, transfer times over the network to the DR site, size of backups (how many do you keep available on the Data Domain device? Multiply that by 2 TB for each).

  • Great minds think alike! Sadly it's not an option the company is willing to go with anytime in the near future =(

    Edit - ugh, I did it again!

    Full backups are kept for 30 days; for month-end backups, we keep 18 months' worth

  • Scott Ohar (7/7/2011)


    Great minds think alike! Sadly it's not an option the company is willing to go with anytime in the near future =(

    In that case, you need to just present the options in writing to your manager. Let them know you are at the limits of the backup window and what the WAN link to the DR site can achieve.

    If they are not willing to make some logical changes to the database (archive DB, etc.), then they need to start upgrading the infrastructure within the data center: faster drives for writing to the Data Domain, an upgraded WAN to the DR site, etc.

    At some point, management needs to understand the limits.

    Either throw hardware and money at the problem, or put some development time into a more efficient historical data solution for reporting.

    Are the full backups running over your window and affecting your important business hours?

    That happened on one of our production silos, and when I pointed out it was unavoidable, it got management focused on the issue and we made changes.

    We made the switch to NetApp.

  • JamesMorrison (7/7/2011)


    Scott Ohar (7/7/2011)


    Great minds think alike! Sadly it's not an option the company is willing to go with anytime in the near future =(

    In that case, you need to just present the options in writing to your manager. Let them know you are at the limits of the backup window and what the WAN link to the DR site can achieve.

    Yeah, the option I received back was SMSQL(netapp)

    If they are not willing to make some logical changes to the database (archive DB, etc.), then they need to start upgrading the infrastructure within the data center: faster drives for writing to the Data Domain, an upgraded WAN to the DR site, etc.

    At some point, management needs to understand the limits.

    Either throw hardware and money at the problem, or put some development time into a more efficient historical data solution for reporting.

    We have plans to archive, but just the development for it, with the type of shop we are, would more than likely take 12-14 months minimum. We plan to archive via a data warehouse.

    Are the full backups running over your window and affecting your important business hours?

    That happened on one of our production silos, and when I pointed out it was unavoidable, it got management focused on the issue and we made changes.

    We made the switch to NetApp.

    Backups aren't impacting anything except my knowing that a restore would take forever to get production data back online. My plan, once we switch over to NetApp SQL backups, is to use native SQL for my month-end backups and use the heck out of NetApp.

    My Prod environment:

    139 DBs total on an active/active cluster

    7 TB of LUN storage split across all DBs

    So yeah, I'm praying for NetApp lol

  • Scott Ohar (7/7/2011)

    My Prod environment:

    139 DBs total on an active/active cluster

    7 TB of LUN storage split across all DBs

    So yeah, I'm praying for NetApp lol

    Wow, that sounds like it could be a complicated dance to set up on NetApp.

    With 139 DBs, you are really going to have to map out which databases go on each LUN and volume.

    To use SnapManager for SQL Server, you have to be careful about the placement of the DBs. Maybe you already know this, but I had to put a lot of research into it when we first moved to NetApp.

    Large DBs that are in the FULL recovery model should get their own volume/LUN for the .MDF and a separate volume/LUN for the .LDF.

    You can put all of the SIMPLE-recovery DBs on the same volume/LUN because you are not worried about point-in-time recovery.

    Just to avoid having too many logical drives, we put all of the smaller databases (FULL recovery) in a group volume/LUN (.MDF and .LDF separated).

  • A couple of things to look at then.

    RTO: What's the time to restore in a disaster, and what does that mean for lost work/productivity or sales. Put numbers to it and then go back to your boss, or other bosses, with a real understanding of what it would mean for a restore. Make sure they understand that a restore could be required because of a mistake, or disk crash, or corruption (which cannot be prevented). But get costs involved, and give them a few time estimates. I assume you do log backups as well, so the RPO is covered, but let them know the restore times from a full, a full and a diff, and a full, diff, and max logs.

    The likelihood of disaster is low, but you want to prepare. If the RTO is too large, consider building a log-shipped server off your Data Domain DR device. It sounds like you use two devices, but one's a mirror of the other, so you're essentially backing up to one and copying to another. Can you restore the backups on the other side to a server? At the very least you might create a standby database that's 2.7 TB, so the files exist and don't have to be zeroed out in a restore.
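    A minimal sketch of the standby idea, assuming the DR-side backups are reachable from a DR server; names, paths, and the undo-file location are hypothetical:

    ```sql
    -- Restore the full backup WITH STANDBY: the database stays
    -- readable, and further diff/log backups can still be applied.
    RESTORE DATABASE [BigDB]
    FROM DISK = N'\\dd-dr\sqlbackup\BigDB_full.bak'
    WITH STANDBY = N'D:\Standby\BigDB_undo.dat', STATS = 5;

    -- As each log backup replicates across, roll it forward.
    RESTORE LOG [BigDB]
    FROM DISK = N'\\dd-dr\sqlbackup\BigDB_log.trn'
    WITH STANDBY = N'D:\Standby\BigDB_undo.dat';
    ```

    If the full is striped across 7 files, the RESTORE needs all 7 DISK clauses as well. The payoff is that at disaster time most of the 18 hours has already been paid in advance.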

  • Oh yeah, we are somewhat OK on the RTO side of things, which is why I am waiting on the NetApp tools; we have a lot of work to do on our side to get them to work with our system. Our DR site normally keeps 1 copy of the large DB for query tuning, but as far as disaster goes, restoring that one DB is an easy 18 hrs, then logs. :crazy:

  • Hi

    Just to get some things straight:

    You say the compressed backup is 2 TB, and you keep a month's worth of backups? 30 x 2 TB

    And once you are in the new month, you keep 1 copy of the previous month? Another 2 TB

    Is this correct?

    Can you give me a few raw sizes?

    Here is where I am going with this. Say you back up a VLDB of 2 TB, for argument's sake, every day.

    You technically cover yourself against the failure of 2 TB per day; are you thus implying your application writes a fresh 2 TB each day?

    [ok, just reread another post of yours]

    You keep, on top of this, 18 months' worth of full backups: another 36 TB.

    So you are saying at any point in time you have

    30 x 2 TB + 36 TB?

    Unless you are running the NYSE, which pushes around that amount per day, I don't find it worthwhile to back up everything EVERY DAY.

    Use filegroups.

    But give me more info on your situation

    Cheers

    Jannie
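    To make the filegroup suggestion concrete: if the static history lived in its own (ideally read-only) filegroup, the daily backup could cover only the read-write data. A hedged sketch, with hypothetical filegroup and path names:

    ```sql
    -- Daily: a partial backup of only the read-write filegroups.
    BACKUP DATABASE [BigDB] READ_WRITE_FILEGROUPS
    TO DISK = N'\\dd-local\sqlbackup\BigDB_rw.bak'
    WITH CHECKSUM, STATS = 5;

    -- One-time: back up the history filegroup after marking it
    -- read-only, since that data no longer changes.
    BACKUP DATABASE [BigDB]
       FILEGROUP = N'HISTORY'
    TO DISK = N'\\dd-local\sqlbackup\BigDB_history_fg.bak'
    WITH CHECKSUM;
    ```

    A piecemeal restore can then bring the read-write filegroups online first, which also speaks to the 18-hour restore concern, though moving 90% historical data into its own filegroup is itself a development effort.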

  • Jannie-186227 (7/20/2011)


    Hi

    Just to get some things straight:

    You say the compressed backup is 2 TB, and you keep a month's worth of backups? 30 x 2 TB

    And once you are in the new month, you keep 1 copy of the previous month? Another 2 TB

    No, we are using row compression. We cannot use backup compression going to Data Domain; if I did that, Data Domain would see it as a new block of data and waste space.

    Is this correct?

    Can you give me a few raw sizes?

    1 DB is 2.6 TB in size

    Here is where I am going with this. Say you back up a VLDB of 2 TB, for argument's sake, every day.

    You technically cover yourself against the failure of 2 TB per day; are you thus implying your application writes a fresh 2 TB each day?

    [ok, just reread another post of yours]

    You keep, on top of this, 18 months' worth of full backups: another 36 TB.

    So you are saying at any point in time you have

    30 x 2 TB + 36 TB?

    Unless you are running the NYSE, which pushes around that amount per day, I don't find it worthwhile to back up everything EVERY DAY.

    Use filegroups.

    Data Domain's total storage is roughly 40 TB, but it has its own type of internal compression and deduplication, so it only really uses storage when it sees anything new.

    Example of Data Domain's compression/dedupe process: at one point I had a total of 120 TB of backups for this one DB, and Data Domain's available storage was still at 60%. We grow roughly 10-17 GB per week on average.

    We are going to start working on an archive system next quarter, but I don't see that going into my production environment for at least 8 months =/

    Backup requirements: must keep a minimum of 18 months of month-end backups

    Regular backup schedule is:

    Sat-Thu: diffs

    Fri: full backup

    But give me more info on your situation

    Cheers

    Jannie

    But hey! I just looked: we have 1 table that now has 1.9 billion rows 😎

  • Scott Ohar (7/21/2011)

    But hey! I just looked: we have 1 table that now has 1.9 billion rows 😎

    Scott, it's not the rows that hurt. It's the columns!!! 😀
