I’ve talked about it before; you shouldn’t have a backup strategy, you should have a recovery strategy. I can’t possibly care if my backups succeed if I’m not bothering to test that they can be restored. And if they can’t be restored then, both technically and practically, I don’t have backups.
In one of the systems I manage, they built a very simple “test restore” process long before I became involved. Every night, it would pull the full backup for each database, restore it on a test system, and run DBCC CHECKDB
against it. It would alert on any failure, of course, but the primary purpose was to always be confident that the backups could, in fact, be restored.
The process could afford to be simple because it assumed the following would always be true:
- All databases took full backups every night.
- Each database was backed up to a single
.bak
file named<DatabaseName>.FULL.<YYYYMMDD>.bak
. - Each database we were restoring this way consisted of a single data file (called
DataFile
) and a single log file (LogFile
). - All databases were backed up to the same location (and in a subfolder exactly matching the database name), and all databases would always be able to fit in that location.
- The biggest database, once restored, would always fit on a single drive on the test system.
- All databases could be restored and
CHECKDB
‘d in under 24 hours.
There is a table that tells the code which databases to restore:
1 2 3 4 5 6 7 |
CREATE TABLE dbo.SourceDatabases ( DatabaseName sysname /* other columns not pertinent to this post */ ); |
And another table to tell the code, for example, where to find backups, and where to move data/log files on restore:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
CREATE TABLE dbo.RestoreConfig ( SourceFolder nvarchar(4000), DestinationDataFolder nvarchar(4000), DestinationLogFolder nvarchar(4000), TempDBFolder nvarchar(4000), ... ); INSERT dbo.RestoreConfig ( SourceFolder, DestinationDataFolder, DestinationLogFolder, TempDBFolder ) VALUES ( N'\\corp-backup-share\sql-backups\', N'D:\SQL\Data\', N'L:\SQL\Log\', N'T:\SQL\TempDB\' ); |
Then the (vastly simplified) code looked something like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 |
DECLARE @Source nvarchar(4000), @DataFolder nvarchar(4000), @LogFolder nvarchar(4000), @dbname sysname, @ymd char(8) = CONVERT(char(8), getutcdate(), 112), @sql nvarchar(max), @c cursor; SELECT @Source = SourceFolder, @DataFolder = DestinationDataFolder, @LogFolder = DestinationLogFolder FROM dbo.RestoreConfig; SET @c = CURSOR STATIC READ_ONLY FORWARD_ONLY FOR SELECT DatabaseName FROM dbo.SourceDatabases ORDER BY DatabaseName; OPEN @c; FETCH NEXT FROM @c INTO @dbname; WHILE @@FETCH_STATUS <> -1 BEGIN SET @sql = N'RESTORE DATABASE ' + QUOTENAME(@dbname + N'_testing') + N' FROM DISK = N''' + @Source + @dbname + N'\' + @dbname + N'.FULL.' + @ymd + N'.bak''' WITH REPLACE, RECOVERY, MOVE N''DataFile'' TO N''' + @DataFolder + @dbname + N'.' + @ymd + N'.mdf''', MOVE N''LogFile'' TO N''' + @LogFolder + @dbname + N'.' + @ymd + N'.ldf'';'; EXEC sys.sp_executesql @sql; /* then do CHECKDB things */ FETCH NEXT FROM @c INTO @dbname; END |
I’ll let you guess how many of those assumptions remain facts today. If you guessed zero, you’d be right. Life got complicated, the restore system is now also being used for other purposes, and source databases have grown significantly. I’d have to help adjust the code to deal with those changes, and this post is about how.
Here is their new reality relative to the “truths” above:
- They started performing full backups weekly, with differential backups in between.
- They started striping backups to multiple files.
- Many databases had additional files and filegroups; even those that didn’t started having logical file names straying from the existing convention.
- Different databases had to be backed up to different locations.
- The biggest database outgrew the
D:\
drive on the test system. - The biggest database also squeezed out all the others in terms of duration, so that they couldn’t all be backed up and
CHECKDB
‘d within a calendar day.
Some things did remain true, though:
- They always produce backups
WITH INIT
, so we never have to figure out which file (backup set) we need within any given.bak
file. - They don’t use features like
FILESTREAM
,FILETABLE
, or In-Memory OLTP, so we never have to consider those differences. They do use full-text search, but there are no dedicated files for full-text catalogs involved in backup/restore testing. - They encrypt backups, but we’re going to leave that complication out for now.
Also, while backups might be striped to 8 or 16 files (or any number, really), they produce file names in a predictable format, e.g. <DatabaseName>.FULL.<YYYYMMDD>.<file number>.bak
. Backups are generated using Ola Hallengren’s solution, so this is easy to do by passing in the right @FileName
and @NumberOfFiles
arguments:
1 2 3 4 5 6 |
EXEC dbo.DatabaseBackup @FileName = N'{DatabaseName}.{Type}.{Year}{Month}{Day}.{StripeNumber}.bak', @NumberOfFiles = 16 /* , ... other parameters */ |
We actually store that pattern in the (distributed) config table too, so we can change it in one place instead of in every SQL Server Agent backup job on every instance.
Similarly, I created table types for things we do often enough that we don’t want to define them over and over again. Because the test server is on a different subnet, and because the backups come from servers all over the network, we can’t always query msdb
on a source instance to get backup information about a given database. So, instead, we rely on looking at the folder and finding the most recent file, then examining the output of these RESTORE
commands:
RESTORE LABELONLY
(for determining how many stripes exist for a given backup):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
CREATE TYPE dbo.RestoreLabelOnlyOutput AS TABLE ( MediaName nvarchar(128), MediaSetID uniqueidentifier, FamilyCount int, FamilySequenceNumber int, MediaFamilityID uniqueidentifier, MediaSequenceNumber int, MediaLabelPresent tinyint, MediaDescription nvarchar(255), SoftwareName nvarchar(128), SoftwareVendorID int, MediaDate datetime, MirrorCount int, IsCompressed bit ); |
RESTORE FILELISTONLY
(for determining the logical data and log files present in the backup):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
CREATE TYPE dbo.RestoreFilelistOnlyOutput AS TABLE ( LogicalName nvarchar(128), PhysicalName nvarchar(260), Type char(1), FileGroupName nvarchar(128), Size numeric(20,0), MaxSize numeric(20,0), FileID bigint, CreateLSN numeric(25,0), DropLSN numeric(25,0), UniqueID uniqueidentifier, ReadOnlyLSN numeric(25,0), ReadWriteLSN numeric(25,0), BackupSizeInBytes bigint, SourceBlockSize int, FileGroupID int, LogGroupGUID uniqueidentifier, DifferentialBaseLSN numeric(25,0), DifferentialBaseGUID uniqueidentifier, IsReadOnly bit, IsPresent bit, TDEThumbprint varbinary(32), SnapshotUrl nvarchar(360) ); |
We don’t need all of those columns, of course, but there’s no way to say:
1 2 3 |
SELECT <only these columns> FROM (RESTORE FILELISTONLY ...); |
Next, we added columns to the SourceDatabases
table to indicate where each database sends its backups and on what day of the week the fulls run. Then we used snake sorting to distribute the backups and restores across days of the week as evenly as possible. Let’s say we have this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
CREATE TABLE dbo.SourceDatabases ( DatabaseName sysname, BackupFolder nvarchar(4000), FullBackupDay tinyint /* other columns not pertinent to this post */ ); INSERT dbo.SourceDatabases ( DatabaseName, BackupFolder, FullBackupDay ) VALUES(N'DB_A', N'\\backup-share-1\SourceServerA\Full\', 1), (N'DB_B', N'\\backup-share-1\SourceServerA\Full\', 1), (N'DB_C', N'\\backup-share-2\SourceServerB\Full\', 2); /* so, for example, DB_A is backed up to share 1 on Sundays */ |
With the table types in place and the knowledge of where and when to find backups, we could make the code a lot more dynamic and determine exactly how to restore each database – even if it has been striped to multiple backup files and contains multiple data files. If there are multiple data files, we can alternate them onto the two drives we have available – even though one drive is supposed to be for tempdb
, it can serve as a temporary workaround as it has plenty of available space.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 |
CREATE PROCEDURE dbo.TestRestores @RunCheckDB bit = 0, @debug bit = 0 AS BEGIN SET NOCOUNT ON; DECLARE @DataFolderA nvarchar(4000), @DataFolderB nvarchar(4000), @LogFolder nvarchar(4000), @SourceRoot nvarchar(4000), @Source nvarchar(4000), @DirCommand nvarchar(4000), @BaseFile nvarchar(4000), @FileCount tinyint, @TestDBName sysname, @DatabaseName sysname, @Today date = getutcdate(), @ymd char(8) = CONVERT(char(8), getutcdate(), 112), @Stripe tinyint, @FromCommands nvarchar(max), @MoveCommands nvarchar(max), @LabelCommand nvarchar(4000), @FileListCommand nvarchar(4000), @c cursor; SELECT @DataFolderA = DestinationDataFolder, @DataFolderB = TempDBFolder, @LogFolder = DestinationLogFolder FROM dbo.RestoreConfig; /* get only databases with full backups today */ SET @c = CURSOR STATIC READ_ONLY FORWARD_ONLY FOR SELECT DatabaseName, BackupFolder FROM dbo.SourceDatabases WHERE FullBackupDay = DATEPART(WEEKDAY, @Today) ORDER BY DatabaseName; OPEN @c; FETCH NEXT FROM @c INTO @DatabaseName, @SourceRoot; WHILE @@FETCH_STATUS <> -1 BEGIN /* where are we looking for backups? */ SELECT @Source = CONCAT(@SourceRoot, @DatabaseName, N'\'), @TestDBName = CONCAT(@DatabaseName, N'_Testing'); /* perform a directory listing, newest first, stuff into @table */ /* xp_cmdshell must be enabled, and service account needs access */ SET @DirCommand = CONCAT('dir /b /O:D "', @Source, '*.bak"'); DECLARE @List TABLE(fn nvarchar(128)); INSERT @List EXEC master.sys.xp_cmdshell @DirCommand; /* is there one created today? is it striped? */ /* bit messy since database names may contain delimiters */ /* uses SQL Server 2022's STRING_SPLIT with ordinal */ /* if on earlier version, find ordered string splitter */ SELECT @BaseFile = fn, @Stripe = REVERSE(value) FROM ( SELECT TOP (1) fn FROM @List WHERE fn LIKE @DatabaseName + N'.FULL.' + @ymd + N'.%bak' ORDER BY fn DESC ) AS y CROSS APPLY STRING_SPLIT(REVERSE(fn), N'.', 1) AS x WHERE ordinal = 2; IF @BaseFile IS NOT NULL BEGIN IF TRY_CONVERT(tinyint, @Stripe) BETWEEN 1 AND 99 BEGIN /* striped backup! Need to build FROM DISK = args */ SET @LabelCommand = CONCAT (N'RESTORE LABELONLY FROM DISK = N''', @Source, @BaseFile, N''';'); DECLARE @Labels dbo.RestoreLabelOnlyOutput; INSERT @Labels EXEC sys.sp_executesql @LabelCommand; SELECT @FileCount = FamilyCount FROM @Labels; /* relies on SQL Server 2017's STRING_AGG */ /* relies on SQL Server 2022's GENERATE_SERIES */ /* if earlier, find group concat / sequence generation functions */ SELECT @FromCommands = N' FROM ' + STRING_AGG (CONVERT(nvarchar(max), CONCAT(N' DISK = N''', @Source, @DatabaseName, N'.FULL.', @ymd, N'.', value, N'.bak''')), N',') FROM GENERATE_SERIES(1, CONVERT(int, @Stripe)) AS x; END ELSE BEGIN /* just a single file */ SET @FromCommands = CONCAT(N' FROM DISK = N''', @Source, @BaseFile, N''''); END END /* generate data/log file args from any file */ SET @FileListCommand = CONCAT (N'RESTORE FILELISTONLY FROM DISK = N''', @Source, @BaseFile, ''';'); DECLARE @LogicalFiles dbo.RestoreFilelistOnlyOutput; INSERT @LogicalFiles EXEC sys.sp_executesql @FileListCommand; /* alternate data files between data folder A & B */ /* note: assumes LogicalName is always valid in file name */ SELECT @MoveCommands = STRING_AGG (CONVERT(nvarchar(max), CONCAT(N' MOVE N''', LogicalName, N''' TO N''', CASE FileID % 2 WHEN 1 THEN @DataFolderA ELSE @DataFolderB END, @TestDBName, N'.', LogicalName, N'.', @ymd, N'.mdf''')), N',') FROM @LogicalFiles WHERE Type = 'D'; /* also add log file(s) - yes, that happens */ SELECT @MoveCommands += N',' + STRING_AGG (CONVERT(nvarchar(max), CONCAT(N' MOVE N''', LogicalName, N''' TO N''', @LogFolder, @TestDBName, N'.', LogicalName, N'.', @ymd, N'.ldf''')), N',') FROM @LogicalFiles WHERE Type = 'L'; DECLARE @RestoreCommand nvarchar(max) = CONCAT( N'RESTORE DATABASE ', QUOTENAME(@TestDBName), N' ', @FromCommands, N' WITH REPLACE, RECOVERY,', @MoveCommands, N';' ); IF @debug = 1 BEGIN PRINT @RestoreCommand; END ELSE BEGIN EXEC sys.sp_executesql @RestoreCommand; END IF @RunCheckDB = 1 BEGIN /* also check @debug here */ PRINT '/* then do CHECKDB things */'; END FETCH NEXT FROM @c INTO @DatabaseName, @SourceRoot; END END |
Imagine DB_A
has 4 data files and a log file, and is backed up to 8 stripes, this command would restore the database using the following command (though maybe not quite as pretty, formatting-wise):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
RESTORE DATABASE [DB_A_Testing] FROM DISK = N'\\backup-share-1\SourceServerA\Full\DB_A.FULL.20240501.01.bak', DISK = N'\\backup-share-1\SourceServerA\Full\DB_A.FULL.20240501.02.bak', DISK = N'\\backup-share-1\SourceServerA\Full\DB_A.FULL.20240501.03.bak', DISK = N'\\backup-share-1\SourceServerA\Full\DB_A.FULL.20240501.04.bak', DISK = N'\\backup-share-1\SourceServerA\Full\DB_A.FULL.20240501.05.bak', DISK = N'\\backup-share-1\SourceServerA\Full\DB_A.FULL.20240501.06.bak', DISK = N'\\backup-share-1\SourceServerA\Full\DB_A.FULL.20240501.07.bak', DISK = N'\\backup-share-1\SourceServerA\Full\DB_A.FULL.20240501.08.bak' WITH REPLACE, RECOVERY, MOVE N'A_DataFile1' TO N'D:\SQL\Data\DB_A_Testing.A_DataFile1.20240501.mdf', MOVE N'A_DataFile2' TO N'T:\SQL\TempDB\DB_A_Testing.A_DataFile2.20240501.mdf', MOVE N'A_DataFile3' TO N'D:\SQL\Data\DB_A_Testing.A_DataFile3.20240501.mdf', MOVE N'A_DataFile4' TO N'T:\SQL\TempDB\DB_A_Testing.A_DataFile5.20240501.mdf', MOVE N'A_Log' TO N'L:\SQL\Log\DB_A_Testing.A_Log.20240501.ldf'; |
This seems like complicated code, but we think it is worth the investment because we don’t just want a backup strategy. We want to always be confident that we can restore from our backups. And the only way to be truly confident is to always be doing it – and by always improving this process, it makes us that much more prepared for dealing with a disaster recovery situation when it happens for real.
Future considerations
The code could, in theory, determine in advance if a database has grown enough to come close to or even exceed the available space of both available drives. In that case, maybe it would be time to add some storage (if possible) or move these operations to a bigger server.
We also want to add into the mix occasional full + differential + log restores, since the current process only really tests full backups and not necessarily that we could recover to a specific point in time if we had to. This would complicate the code slightly – we’d have to identify the latest differential, and all subsequent log backups (or at least the first one), in order to generate the right sequence of RESTORE
commands. But, overall, the logic would remain unchanged.
And we may later have to change it if we can no longer test all the databases in a week. Maybe we only CHECKDB
any given database every other week, or distribute all the databases across two weeks, or only test each database monthly, or build additional test servers and split the work that way. That’s not a tomorrow or next week problem, but it’s coming someday.
Further reading
Some other information, particularly if you get snagged on an older version and need to swap in alternative functions:
- Ordered string splitters for older versions (Stack Overflow)
- Grouped concatenation for older versions (SQLPerformance)
- Sequence generators for older versions (Simple Talk)
- Minimizing the impact of DBCC CHECKDB (SQLPerformance)