May 12, 2011 at 5:17 pm
For anyone who has used or considered using this product, I'm curious about the length of time it holds an "I/O freeze" on the SQL files. I presume you snap a full copy of the database at the weekend or at night, and from then on it just snaps incremental changes every 15 minutes (for example). The concern is, on a 1TB database, whether that freeze is quick enough not to kick people out of the database.
May 13, 2011 at 5:07 am
With SnapManager for SQL Server it does not matter how large or small the database is. This is because no data moves anywhere as part of the backup. The storage controller simply takes a 'photograph' of the disk map at that time.
If you do a SQL trace at the time of the snapshot job you will see a 2 millisecond period between the freeze and the thaw. That's it. 1GB, 100GB, 1TB. Same 2ms.
When I show this to customers I always make an effort to show what SQL thinks is going on so that DBAs etc. don't think that NetApp is doing something shady.
The entire SnapManager backup job will take a couple of minutes because it sets up, checks licenses, makes sure the LUNs are in the right place, and then tears down the job when it's done. Everything you care about from a SQL perspective is visible in the trace.
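If you'd rather not set up a full trace, the VSS freeze and thaw are also written to the SQL Server errorlog ("I/O is frozen on database ..." / "I/O was resumed on database ..."), so a quick sketch like this will show you the timestamps either side of the snapshot. `xp_readerrorlog` is an undocumented but widely used procedure; parameters here are log file number (0 = current), log type (1 = SQL Server) and a search string.

```sql
-- Show the freeze and thaw messages from the current errorlog so you can
-- see how short the window actually is.
EXEC sys.xp_readerrorlog 0, 1, N'I/O is frozen';
EXEC sys.xp_readerrorlog 0, 1, N'I/O was resumed';
```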
If you have SMSQL run the DBCC CHECKDB process afterwards, that will obviously take a good while on a 1TB database, but that process uses a FlexClone of the database and only impacts I/O on the disk, not the host. If you choose to run DBCC CHECKDB on a SnapMirror or SnapVault destination, that's fully supported and, indeed, recommended.
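For reference, the verification is the standard DBCC CHECKDB; run by hand against a mounted clone it would look something like this (the database name is just a placeholder):

```sql
-- Full consistency check against the clone. NO_INFOMSGS suppresses the
-- per-table row counts so only actual errors are reported.
DBCC CHECKDB (N'MyUserDb') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```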
Hope that helps you a bit.
May 13, 2011 at 5:16 am
That's annoying. Hit send before proof-reading.
Here's the bit about what to back up and when.
SMSQL doesn't really 'do' incremental backups.
If a database is in Simple recovery mode, the whole database gets a full backup every time. Because, as I wrote earlier, the time taken is so small, you can take backups as often as you need: every couple of hours, or once an hour. It all depends on your RPO. In Simple mode the transaction log is streamed from the directory it's in over to another LUN called "SNAPINFO". You set that up and size it based on the number of databases, how many days' worth of TLog you want to keep, and what the change rate is.
If a database is in Full recovery mode the process is similar to Simple, but because the log is in a different LUN and FlexVol you can take those transaction-log backups every few minutes. I see customers who go down to 10 minutes. For RPOs under 10 minutes you usually employ an application-based HA solution.
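To see which of your databases would fall into which camp, a quick check of the recovery model (SMSQL reads the same metadata) looks like this:

```sql
-- List each database and its recovery model. SIMPLE databases get
-- full-backup-only treatment; FULL databases also get log backups.
SELECT name, recovery_model_desc
FROM sys.databases
ORDER BY name;
```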
Your local NetApp representative or their VAR partner can set up the global demo system and let you play/drive if you want to see it in action and walk through some scenarios.
May 13, 2011 at 6:06 am
Thanks Mark. Glad to hear the "IO freeze" is so brief. Does SnapManager handle log backups? We've always run in Full recovery mode with log backups every 15 minutes, retaining about a week's worth, with weekly full backups and nightly differentials. What about the system databases?
I used to do regular SQL backups with SQL Agent jobs, along with native SQL log shipping. Currently our Systems department uses Commvault and has taken over all backups, restores and disaster recovery.
Commvault works but has a tendency to think the log chain is broken -- usually at the worst time, like during the business day or during our long weekend reindex/update-stats window -- and to begin a full backup.
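For what it's worth, when a backup product claims the chain is broken you can usually see why from msdb; a sketch like this (the database name is a placeholder) lists the backups in date order, so a gap or an out-of-band full backup stands out. A log chain break shows up as a log backup whose first_lsn doesn't match the previous backup's last_lsn.

```sql
-- Backup history for one database, oldest first.
-- Type codes: D = full, I = differential, L = transaction log.
SELECT backup_start_date, type, first_lsn, last_lsn
FROM msdb.dbo.backupset
WHERE database_name = N'MyUserDb'   -- placeholder name
ORDER BY backup_start_date;
```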
The plan now is to buy a new NetApp along with software such as SQL SnapManager. Systems doesn't know much about SQL but uses the general term "replicate" for getting the data to a secondary remote server location. Currently our production data is at a Raging Wire facility.
We expect to use Commvault to back up the secondary and relieve the backup load on the primary. Now that I'm to begin taking over the Commvault piece, I started asking questions about SQL SnapManager, which led to my IT manager suggesting I research it.
May 13, 2011 at 6:26 am
Well, there's a thing: the Commvault Simpana suite can actually take our snapshots, so if you have valid licenses you don't need to use SMSQL. On the flip side, there's a likelihood that you got all the SnapManager products as part of the software bundle when you purchased the storage controllers and disk.
SMSQL will handle all those backups for you exactly as you state in your first paragraph. The GUI has a wizard that you can use at first. Eventually you'll get to know the PowerShell and you can drop the GUI if you want. System databases (not temp, obviously) are streamed rather than snapped. This is because we are currently backwards compatible with older versions of SQL that don't support VSS; we use VDI. Eventually, when the older versions of SQL become unsupported and VSS becomes the only game in town for SQL, there might be a change. That's an NDA discussion for you to have with your NetApp rep. SNAPINFO is where the system databases are streamed to; I mentioned a little about SNAPINFO earlier.
You will need to have a precise conversation with the partner/NetApp professional services engagement so that they configure SMSQL in such a way as to not cause you problems with any databases that you're log-shipping. It's all supported but obviously we don't want to cause you problems with the log chain. It's just process and attention to detail really.
May 13, 2011 at 6:43 am
Thanks again. We're not doing log shipping any longer. I'll pass this information along. This article seems either to have inaccuracies or to be a bit out of date on the SnapManager topic.
May 13, 2011 at 3:29 pm
I wasn't clear: will SnapManager back up system databases such as master, msdb and tempdb?
May 13, 2011 at 3:39 pm
It does back up all the system databases except tempdb, which we don't allow you to back up.
The difference is that the system databases only get the streamed backup, not the proper snapshot backup.
May 13, 2011 at 3:54 pm
Of course, you can't back up tempdb. And by "streamed" we mean the normal backup method. Thanks again.
May 26, 2011 at 12:52 am
Can you also "stream" user databases with SnapManager for SQL? Our hosting provider likes to have traditional ".bak" files to then push to tape, and is resistant to mounting a snapshot and backing up the DB files, or using NDMP to back up the snapshot from the filer to tape.
I'd like to do hourly SnapShots and 15min SnapManager tlog backups and a "streamed" backup at night to go to tape.
I've read not to use native backup or third-party (LightSpeed) backup tools with SnapManager. Is it OK to do a COPY_ONLY native backup alongside snapshots?
May 26, 2011 at 5:22 am
That's not an option for you. My advice would be to pressure them into it, because it's better for your servers to use the snapshots and extract the data from the storage array (NDMP/FlexClone, like you say). Leverage your right to take your business elsewhere. Many, if not most, providers like pulling data from a FlexClone because they (A) are unlikely ever to have to give it back, because it's on a snapshot ($$ savings for them!), and (B) don't have to worry about the actual application backups, because all they're doing is pulling flat files off to their infrastructure.
Any company that doesn't want to shift that risk, and reduce the likelihood of ever having to give data back (restore), deserves a raised eyebrow.
And one last thing: are they going to discount you for all that additional space you need for the BAK files? You need to provide space for a couple of BAKs in case something fails and the provider can't get the BAK up into the cloud (grr, hate that word). The snaps are a lot more efficient on your storage.
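On the COPY_ONLY question from earlier: a copy-only backup is Microsoft's mechanism for an out-of-band backup that doesn't reset the differential base or interfere with the transaction-log chain, so if you do end up having to feed the provider a BAK, the syntax is along these lines (path and database name are placeholders):

```sql
-- Copy-only full backup: does not reset the differential base and does not
-- break the log chain, so it can coexist with another product's schedule.
BACKUP DATABASE [MyUserDb]
TO DISK = N'X:\Backups\MyUserDb_copyonly.bak'
WITH COPY_ONLY, INIT, CHECKSUM;
```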
May 26, 2011 at 10:47 pm
Hi Mark,
Thanks for the thoughtful answers. We are also in the process of implementing NetApp SQL SnapManager for a new project. We are planning to build a two-node cluster using VMware virtual machines as the nodes, with all shared disk on NetApp devices. My question is how to set up the DR part. Do we set up/clone the prod VMs at the DR site first, then use SnapManager + SnapMirror to shift the snapshots to the DR site? How do we "link back" the snapshot copies to the DR VM cluster once a DR is called? Can we use SnapManager to perform the restore? Is there an automated way to restore to DR?
Thanks
Arthur
May 27, 2011 at 4:26 am
I must preface with the fact that I am in no way an expert on VMware but here goes with the configurations at the basic level. The storage will be set up with SnapMirror relationships. You or the storage guys will do that part. Either the controllers themselves, Protection Manager or SnapManager will manage when that mirror gets triggered. The storage guys will work with you to determine when the mirror kicks and how often. That's all down to calculations with bandwidth and your SLAs to the business. Just assume for now that it's happened and happened on a schedule you're comfortable with.
What you have at the DR end is a lot of FlexVols that contain LUNs for SQL and (probably) NFS volumes for the OS. You can't see or touch them just yet; they're just SnapMirror targets.
When you need to test DR you can use FlexClone. The NetApp plug-ins to the VMware consoles allow you to do this without any real knowledge of the underlying storage. It's a relatively simple process.
Invoking DR isn't much different and is achievable with SRM or another mechanism.
Under the covers the SnapMirror targets are all broken* and the storage is mounted up to the VMware hosts. They start the servers. Depending on whether you're using RDMs or the iSCSI initiator you'll see the database/log LUNs appear. All this will have been mapped out in advance so the right LUNs will appear to the right hosts and guests. Because the SQL guest already knows about all of its database and log LUNs and the signatures haven't changed they will be presented up as the right drive letters or mount points.
*Breaking a SnapMirror relationship is not destroying it. The underlying relationship remains so that when you want to go back to the production site from DR you reestablish the sync in the other direction and only the changed blocks, rather than the entire database, goes back to production. It's a quicker and more bandwidth efficient process.
With a FlexClone you can do what you want with the storage and nothing will affect the mirror underneath. If you make changes, they will only affect that temporary copy. The mirroring will still be happening in case you need to invoke DR for real. When you're done you can disconnect the LUNs, which will destroy the FlexClones. Don't keep the FlexClones around for too long, though. The underlying snapshot cannot be deleted until there are no clones relying on it, so if you only keep a couple of days of snaps on primary disk, don't keep the clones around for a week or more!
May 29, 2011 at 7:40 pm
Thanks a lot for the detailed answer. I am sure I will refer back to this again and again during implementation.