SnapMirror SAN to SAN for DR

  • I haven't heard much about this for disaster recovery. From what I've heard, you are basically mirroring your live database files from one location to another. Are there any problems copying the live files like that? Pros and cons (other than lots of money)?

  • Some of these work OK with SQL Server and some do not. Talk to your SAN vendor and make sure the product is certified for MS SQL, and for the versions of MS SQL you are using.

  • It's NetApp's SnapManager and SnapMirror products. From what I'm reading, you take a full backup with this technology perhaps every hour, with "incremental" transaction-log-style backups more frequently, and they claim it has no more impact on users than SQL's native backups.

    --------------------------------------------------------------------------------------------

    SnapManager for Microsoft SQL Server http://www.netapp.com/us/library/technical-reports/tr-3431.html

    SnapManager for SQL Server paves the way for database and storage administrators to simplify data management utilizing the powerful capabilities of NetApp storage systems. For more information, please review the technical report TR3431: Best Practices Guide: Microsoft SQL Server 2000/2005 and NetApp Solutions.

    4.1 Backup Recommendations

    SMSQL's full backup using Snapshot is quick and does not adversely influence database performance compared to traditional backup with SQL Server BACKUP statements. Transaction log backup is functionally the same using SMSQL as the transaction log backup done using BACKUP LOG statements and is streamed out to the SnapInfo directory for safekeeping. The length of time required to back up a transaction log depends mainly on the active section of the transaction log that has to be extracted and copied to the dump directory. When planning the backup and restore process, the methodology or considerations are different from considerations when backup is done by Enterprise Manager or SQL Server Management Studio.

  • We've got an EMC Symmetrix and are using Replication Manager as a front end to TimeFinder to do single-SAN BCVs of our core DB servers for reporting purposes. It took several months to get it working properly, to the point where we were ready to abandon the product... It's been in testing now for over a month and we've had no issues with it. We're doing this for reporting purposes and not DR as you've stated.

    We are currently in the planning stages of our DR and have a CX700 to use as remote storage. It seems that if you want success, you need to stay with the same vendor across the hardware and software. We had enough trouble getting BCVs to work on a single SAN that I shudder to think what it would be like to mix vendors, software, and long-haul block-level replication. :unsure:

    Your friendly High-Tech Janitor... 🙂

  • test, test, test.

    Throw a load at the SQL Server, snap it, and then see if the other side is usable. There's always a chance of an issue, but if you can test a bunch, you might have some confidence in it. Be sure you really create lots of activity and see whether the snap can keep the image consistent while those changes are being written.

    I'd also check that the snap doesn't hold up updates for too long. That can be a performance issue.
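One way to make that consistency test concrete: have the load generator commit rows carrying a monotonically increasing sequence number, then check that the snapped copy contains an unbroken prefix of that sequence. This is a hypothetical sketch using SQLite as a stand-in for SQL Server; the names `generate_load` and `verify_snapshot` are invented for illustration.

```python
# Hypothetical consistency check for a snapshot taken under load.
# A writer commits rows with sequential ids; a crash-consistent snapshot
# should contain an unbroken 1..N prefix -- committed transactions only,
# no gaps and no partial rows. SQLite stands in for SQL Server here.
import sqlite3

def generate_load(conn, n_txns):
    """Commit n_txns single-row transactions with sequential ids."""
    conn.execute("CREATE TABLE IF NOT EXISTS t (seq INTEGER PRIMARY KEY)")
    for i in range(1, n_txns + 1):
        conn.execute("INSERT INTO t (seq) VALUES (?)", (i,))
        conn.commit()

def verify_snapshot(conn):
    """True if the copy holds a contiguous 1..N run of committed rows."""
    rows = [r[0] for r in conn.execute("SELECT seq FROM t ORDER BY seq")]
    return rows == list(range(1, len(rows) + 1))

conn = sqlite3.connect(":memory:")
generate_load(conn, 500)
ok = verify_snapshot(conn)  # on the real DR side: restore the snap, then check
```

In a real test you would run the load, trigger the snap mid-stream, restore it on the DR side, and run the verification there.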

  • Agreed. Testing is very important here; you don't want to impact your production system by taking the BCVs...

    Your friendly High-Tech Janitor... 🙂

  • We use a NetApp SAN with SQL Server, Exchange, and our ordinary CIFS shares. The CIFS shares work great. For SQL and Exchange we use NetApp tools which do the backup for us on a scheduled basis.

    On each backup we tell the tool to do a full backup, truncate the log, and verify. The backup runs very quickly. The sizes are not that bad, because NetApp uses the same concept as SQL does for snapshots: NetApp rebuilds the version based upon which version you want to restore to. If one version goes bad then you cannot restore, so we turn on verbose logging and have an Outlook rule that watches these emails for 'error'.

    What we also do nightly is extract the most recent NetApp snapshot and restore it as a test database in SQL, so that we know everything is OK.

    The NetApp tools/wizards are hard to work with, and sometimes they do the wrong thing during restore, so we resorted to using DOS commands to mount snapshots, copy the files from the mount, and then dismount.

    For tape backups we only copy the most recent snap folder, because we do hourly full backups, which only take about 10 minutes for our databases (app1, app2, master, msdb), about 10 GB of total space.
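Two of the housekeeping chores in that post are easy to script: picking the most recent snap folder (the only one copied to tape) and scanning the verbose backup log for 'error' lines, which the post catches with an Outlook rule. A hypothetical sketch; the folder names and demo layout are made up for illustration.

```python
# Hypothetical housekeeping sketch: find the newest snapshot folder and
# scan a verbose backup log for 'error'. Names and layout are invented.
import os
import tempfile
import time
from pathlib import Path

def newest_snapshot(snapinfo_dir):
    """Most recently modified snapshot folder, or None if there are none."""
    dirs = [d for d in Path(snapinfo_dir).iterdir() if d.is_dir()]
    return max(dirs, key=lambda d: d.stat().st_mtime, default=None)

def log_errors(log_text):
    """Lines mentioning 'error', case-insensitively."""
    return [ln for ln in log_text.splitlines() if "error" in ln.lower()]

# Demo against a throwaway directory layout with staged timestamps.
root = Path(tempfile.mkdtemp())
now = time.time()
for name, age_s in [("snap_0900", 7200), ("snap_1000", 3600), ("snap_1100", 0)]:
    d = root / name
    d.mkdir()
    os.utime(d, (now - age_s, now - age_s))

latest = newest_snapshot(root)
errors = log_errors("backup started\nERROR: verify failed\nbackup done")
```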

  • How quick is "very quick" on the hourly full backup using the NetApp tools (not tape backup)? Currently with log shipping we do a full backup every night with transaction log backups every 15 minutes, so we can be very current if we have to roll over to the standby SQL server.

    Theoretically the full backup and transaction log backups have no major impact on OLTP activity -- we consider it a 24/7 system. The mdf data file is currently 200 GB, but the overall size is expected to be several terabytes in the next year or so. We'll probably introduce multiple database files along with table partitioning.

    If the NetApp process doesn't "lock" the mdf/ldf any differently than native SQL backups, then it's probably OK. If it actually blocks transactions, then it would have to complete within seconds every time it backs up.
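As a sanity check on "very current": the worst-case data loss (RPO) is just the longest gap between restorable points, set by the log-backup interval when log backups exist and by the full-backup interval otherwise. A trivial sketch with the intervals from this thread plugged in; nothing here is NetApp-specific.

```python
# Back-of-envelope RPO comparison for the schedules discussed above.
# Worst-case data loss = longest gap between restorable points: the
# log-backup interval when logs are taken, else the full-backup interval.
def worst_case_rpo(full_interval_min, log_interval_min=None):
    """Worst-case minutes of lost work if the server dies just before
    the next backup fires."""
    if log_interval_min is not None:
        return log_interval_min
    return full_interval_min

log_shipping = worst_case_rpo(24 * 60, log_interval_min=15)  # nightly full + 15-min logs
hourly_snaps = worst_case_rpo(60)                            # hourly snapshot fulls only
```

So hourly snapshot fulls alone give a worse worst case than the existing log-shipping setup unless log-style backups are also taken in between, as the earlier post describes.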

  • When using the NetApp tools for backup of SQL data, it puts the DB in a state where it has exclusive use, then it does something, removes the lock, and that takes about a second or two in our case. Then it performs the SQL backup, and while the backup is running others can do whatever they want to the DB. After the backup, the data is mirrored to our DR site. It does some sort of track-level sync and only the changes are copied. The streamed databases (master, msdb, model, etc.) are handled differently by NetApp: the difference is that it copies the entire file to DR. Those are small, so it's not an issue for us.
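That "only the changes are copied" behaviour is block-level incremental replication. SnapMirror does this inside the storage layer, but the core idea can be sketched in a few lines; the block size, names, and hash-based comparison here are invented for illustration.

```python
# Hypothetical miniature of block-level incremental sync: compare source
# and DR copies block by block and transfer only the blocks that differ.
# The real product works on storage blocks; this is just the idea.
import hashlib

BLOCK = 4096  # assumed block size

def blocks(data):
    """Split a byte string into fixed-size blocks."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def incremental_sync(source, dr_copy):
    """Return (new_dr_copy, blocks_sent) after a block-level sync."""
    src, dst = blocks(source), blocks(dr_copy)
    out, sent = [], 0
    for i, sblk in enumerate(src):
        dblk = dst[i] if i < len(dst) else None
        if dblk is None or hashlib.sha256(sblk).digest() != hashlib.sha256(dblk).digest():
            out.append(sblk)   # changed block: transfer it
            sent += 1
        else:
            out.append(dblk)   # unchanged: reuse the DR side's copy
    return b"".join(out), sent

base = bytes(4 * BLOCK)                                       # 4 zero blocks, both sides
changed = base[:BLOCK] + b"\x01" * BLOCK + base[2 * BLOCK:]   # 1 block modified
synced, sent = incremental_sync(changed, base)                # sends just that block
```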

  • I'm sure that NetApp has thought this through, but freezing activity for a few minutes doesn't guarantee integrity. There could be changes in flight when they do that, or potential transactions that haven't committed.

    I'm not trying to talk you out of this, but be sure you really test things well and know what works, what doesn't, and what's being moved to the DR site.

  • NetApp only freezes the SQL DB for a second or two; it depends on how long it takes to get the lock. The entire process of locking and unlocking takes about a second.

    We have used the data at our DR site; all of our DR exercises run against real, live data. We cut over to DR, use the data for a 24-hour period, and then after the exercise we use NetApp to bring the data back to our production servers. We start the exercise on a Thursday, after we close the books and before the evening trading starts, and the test ends on Friday evening after the books are closed. We then use Friday evening to sync the DR data back to our production servers. We do 24-hour trading (Saturday is a non-trading day), so we cannot do the exercise on any other day of the week.

    We have never had a problem with data or transactions, etc., getting messed up. All appears to be running smoothly with NetApp.

  • When a VDI-based backup is taken, cache memory on the device taking the snapshot is used to hold the current and incoming disk I/O. This allows the snapshot to be taken with no interruption to the SQL Server during the VDI freeze and thaw I/O operations.

    Every SAN vendor I've used (four of them) has used this basic setup to do their VDI-based backups.

    Your friendly High-Tech Janitor... 🙂
