The Importance of Validating Backups

  • QUESTION: The database maintenance plan dialogs in Microsoft SQLServer2000 (yes, we're still running it) has a "Verify the integrity of the backup upon completion" option.

    The problem with that option is that it takes forever with a large database. When we were using it the database backup times were twice as long. We had to stop using it due to the lack of time available each night. It blew through our window.

  • Yes, that's what does the WITH CHECKSUM option on RESTORE VERIFYONLY. Either do it on the prod system, or copy and restore somewhere else.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul Randal,

    What would you suggest for those that cannot do the 'verify backup integrity' due to time constraits?

    In my environment we have multiple production database clusters, each with multiple TB databases.

    1) It doesn't seem feasible to use 'verify backup integrity' because of the extra time requirements.

    2) Restoring all of the backups elsewhere does not seem viable just due to the number of backups that exist.

    We are requested to refresh QA, Dev or Performance testing environments every few weeks. So we do take prod backups and restore them elsewhere. But there is not a fixed schedule for this.

    So what would be suggested then? Thanks

  • Very nice article. I have started using this in my backup after reading it 4-5 month back in some other forum.

    Also, I have configured another job to verify the backup. This job includes

    1) Verify the backup

    2) Fill the result in a table

    3) Put the table in dailycheck list once you back in office.

    The below steps configured in job to verify the backup.

    begin try

    restore verifyonly from disk = 'your backup file path'

    with checksum

    end try

    begin catch

    insert into DBMaint.dbo.bkpverify (ErrorMessage ,dateNtime)

    ( SELECT

    ERROR_MESSAGE() AS ErrorMessage,getdate())

    END CATCH;

    The code to create this bkpverify table is :-

    create table DBMaint.dbo.bkpverify (ErrorMessage varchar(200),dateNtime datetime)

    Also, you can configure to purge the records in this table which are older than 5 days(or your preferred value)

    ----------
    Ashish

  • Paul Randal (9/24/2012)


    2) Did CHECKDB complete and find errors? Check the value of @@ERROR afterwards - guarantees to be non-zero if CHECKDB found/had a problem.

    Thanks Paul, that's a much better way to do it!

  • JamesMorrison (9/24/2012)


    What would you suggest for those that cannot do the 'verify backup integrity' due to time constraits?

    In my environment we have multiple production database clusters, each with multiple TB databases.

    1) It doesn't seem feasible to use 'verify backup integrity' because of the extra time requirements.

    2) Restoring all of the backups elsewhere does not seem viable just due to the number of backups that exist.

    One option would be to to the restore verify with checksum on another machine. That way you're not using the space it would take to restore but you're getting some level of assurance it's good. You can also look into a tool that lets you run a DB from a backup. RedGate has one as does Idera's backup software. I'm not sure if Quest's does it yet or not but if not I'm betting the next version or two will.

    JamesMorrison


    We are requested to refresh QA, Dev or Performance testing environments every few weeks. So we do take prod backups and restore them elsewhere. But there is not a fixed schedule for this.

    If the organization is open to a schedule for one of those environments that's something to consider. It may even be worth doing outside this since there are additional benefits to that which may outweigh the drawbacks.

  • Hi James,

    As as already been suggested, you could use a virtual restore product instead of full restores. However, as hard as it may be, even for many mutli-TB databases, if you're not testing your backups actually work, you're putting yourself at risk in the event of a disaster. I can only suggest you make it clear to the business that backups may fail to restore (on any platform, not just SQL Server) and that they provide some budget for a test restore system and time to implement an automated mechanism to do it.

    I know it's hard to do, but there's just no alternative. Even with the best redundancy system, it can fail and you're left relying on your backups.

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • I wasn't able to convince my boss to set SAN space aside to restore our databases but since drives for a workstation are cheaper I was able to convince him to get a drive big enough to restore our largest DB. It won't be a good performance test but it does let us restore any DB so we can automate restores.

  • Exactly - doesn't need to be superfast, just needs to be capable of doing it so you know the back at least restores.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • I run CHECKDB on the restored backup to make sure that, not only is the prod database healthy but also the backup and restore process doesn't create corruption.

    I have a SQL Agent job that backs up the prod database every night and restores it to a dev database which I then use to code against

    STEP 1 - BACKUP

    BACKUP DATABASE [ServicePro] TO DISK = N'D:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Backup\#####.bak' WITH NOFORMAT, INIT, NAME = N'#####-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10

    GO

    declare @backupSetId as int

    select @backupSetId = position from msdb..backupset where database_name=N'#####' and backup_set_id=(select max(backup_set_id) from msdb..backupset where database_name=N'####' )

    if @backupSetId is null begin raiserror(N'Verify failed. Backup information for database ''####'' not found.', 16, 1) end

    RESTORE VERIFYONLY FROM DISK = N'D:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Backup\#####.bak' WITH FILE = @backupSetId, NOUNLOAD, NOREWIND

    GO

    STEP 2 - RESTORE

    ALTER DATABASE ######DEV

    SET SINGLE_USER WITH ROLLBACK IMMEDIATE

    GO

    RESTORE DATABASE ######DEV

    FROM DISK = 'D:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Backup\####.bak'

    WITH MOVE '###_Data' TO 'D:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\####DEV_Data.MDF',

    MOVE '###_Log' TO 'D:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\###DEV_Log.LDF',

    REPLACE

    GO

    ALTER DATABASE #####DEV

    SET MULTI_USER WITH ROLLBACK IMMEDIATE

  • For those interested, it's worth mentioning that Red Gate SQL Backup[/url] has an automation routine built into the newest release that makes backup testing a lot easier. It will do the full restore, with options enabled, run a DBCC against it and clean it up for you, all on the fly as part of your scheduled processing. It might be worth checking out if you haven't yet built a bespoke version of this process.

    DISCLOSURE: I'm a proud & happy Red Gate employee.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Gazareth (9/24/2012)


    Yes, thanks for the article Paul.

    Is there actually an easy way to check the output from DBCC CHECKDB?

    I currently scan the SQL log using xp_readerrorlog for the DBCC output info, another way is to use TABLERESULTS clause, and check for the completion line.

    However, both methods could be called undocumented - you won't find either in BOL!

    This caused me much grief, also -- trying to capture the output from DBCC CHECKDB. I did a bit of research into this and found a very useful website (now 404) and info. from Paul Randal (including some very useful deliberately mangled databases that can be used for testing).

    Based on that -- and much testing and some painful code failures :ermm: -- I put together the following SP. It will run DBCC CHECKDB for every database on the server (with some hard-coded exceptions; change this to suit your needs) and send an HTML-formatted email to you if it finds any errors. There's just one input parameter: your email address.

    There's a lot of text in the notes section at the top that might be useful to you, including what the shortcomings of using a maintenance plan are, info. on the returned data types from CHECKDB, web links to relevant materials, etc. Note that one of Paul's broken databases deliberately produces a serious enough error that it will prevent the rest of the SP from running for other databases (see notes); to handle situations like that, I wrapped the DBCC CHECKDB in a TRY..CATCH so that I will be notified separately that the script didn't execute and I need to investigate pronto.

    It's been a while since I built this, and if folks here have suggestions or find problems, I'd appreciate hearing about them.

    USE [DBA]

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE PROC Run_CHECKDB @email VARCHAR(8000)

    As

    BEGIN

    /*

    Utility SP to run DBCC against a hardcoded list of DBs and report any found errors.

    Adapted from: http://www.lockergnome.com/sqlsquirrel/2009/03/11/inserting-dbcc-checkdb-results-into-a-table/

    (That page has gone 404 as of 2012-05-31)

    CREATED: 2011-05-21

    PURPOSE: For daily/weekly DB integrity checks. Why needed? See notes below. There are problems using a maintenance plan task.

    INPUTS:

    @email-a [;]-delimited list of recipients to be notified if errors found

    HARD-CODED:

    curDB-Cursor w/ WHERE clause of databases to in/exclude. Modify below as needed.

    NOTES:

    Overview

    -Create a cursor of databases and run DBCC for each.

    -Run DBCC WITH TABLERESULTS, NO_INFOMSGS to generate a table of output.

    -Encapsulate everything in TRY...CATCH b/c of the possibility of fatal errors when running DBCC CHECKDB against severely corrupted DBs.

    -Though I initially coded this to save the results to a temp table, I decided I'd like to have each day's output saved to the DB

    for later querying if errors were reported.

    DBCC CHECKDB reporting sucks.

    -There's no easy way to save/report the results.

    -If you just run DBCC CHECKDB maint. plan as an agent job and if CHECKDB finds any DB errors,

    a "failed job step" is logged in msdb, which isn't really true. This results in an

    unnecessary report from the Failed Job Steps agent job.

    -Furthermore, a maint. plan "notify operator task" is based on the success/failure of CHECKDB reports for ALL databases;

    no way to report on specific database success/failure.

    -Another problem w/ maint. plans: no way to script a dynamic list of databases to EXclude. You specify which ones

    to INclude, but if you later add new databases, they won't be included automatically.

    -Even using the method in this SP, there's nothing online about what all the columns mean, nor

    how to easily identify rows that represent a true error.

    Corrections to website/datatypes

    -Several column definitions from the website are wrong, others may be too....

    -Which raises another problem with DBCC CHECKDB: As confirmed by Paul Randall (lost the URL...),

    there is NO complete published list of the output data types. Not listed in BOL either. Grrrrrr...

    Testing

    -Tested against production and corrupted databases from Paul Randal's website: http://sqlskills.com/pastConferences.asp

    -One of his corrupted DBs, DemoFatalCorruption2, deliberately fails so bad that it terminates DBCC CHECKDB early,

    preventing subsequent databases from being run and which would cause the agent execution to log a failed job step.

    It returns simply: "Possible schema corruption. Run DBCC CHECKCATALOG."

    So, use TRY...CATCH w/ an error-handler to terminate gracefully and warn the DBA.

    How to return just the errors from DBCC?

    -Running DBCC CHECKDB WITH NO_INFOMSGS excludes the rows we're not interested in, so use that.

    -Before I found the WITH NO_INFOMSGS option, I tried various WHERE clauses to return just the errors from the

    table results. I could not find any consistent way to identify just the rows reporting errors.

    The only sure way I could determine was to look for the last lines of the report output, which are:

    LIKE 'CHECKDB found%' AND NOT LIKE 'CHECKDB found 0 allocation errors and 0 consistency errors%'

    See superseded code in comments at bottom if interested.

    */

    SET NOCOUNT ON

    DECLARE @db NVARCHAR(128);

    DECLARE @html VARCHAR(MAX);

    DECLARE @CrLf CHAR(2);

    SET @CrLf = CHAR(13) + CHAR(10);

    IF OBJECT_ID('DBCCResults', 'U') IS NOT NULL

    DROP TABLE DBCCResults;--Drop and recreate (instead of just TRUNCATE) so we can easily change column list & type below if more turn out to be incorrect.

    CREATE TABLE DBCCResults

    (

    --Columns I added to make reporting easier:

    RunDateTime DATETIME DEFAULT CURRENT_TIMESTAMP,

    ObjectName NVARCHAR(128),

    DBName NVARCHAR(128),

    --Columns matching DBCC WITH TABLERESULTS

    [Error] [int] NULL,

    [Level] [int] NULL,

    [State] [int] NULL,

    [MessageText] [varchar](7000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [RepairLevel] NVARCHAR(128) NULL,--Website incorrectly had this as INT.

    [Status] [int] NULL,

    [DbId] [int] NULL,

    [ObjectID] BIGINT NULL,--Website incorrectly had this as INT.

    [IndId] [int] NULL,

    [PartitionId] BIGINT NULL,--Website incorrectly had this as INT.

    [AllocUnitId] BIGINT NULL,--Website incorrectly had this as INT.

    [File] [int] NULL,

    [Page] [int] NULL,

    [Slot] [int] NULL,

    [RefFile] [int] NULL,

    [RefPage] [int] NULL,

    [RefSlot] [int] NULL,

    [Allocation] [int] NULL

    );

    --Create cursor of databases to test

    DECLARE curDB CURSOR LOCAL FOR

    SELECT [name]

    FROM master..sysdatabases As DB

    WHERE [name] NOT IN ('tempdb','Broken');

    --Wrap all in TRY..CATCH

    BEGIN TRY

    OPEN curDB

    FETCH NEXT FROM curDB INTO @db;

    WHILE (@@FETCH_STATUS = 0)

    BEGIN

    SET NOCOUNT ON

    INSERT INTO DBCCResults(

    Error,

    [Level],

    [State],

    MessageText,

    RepairLevel,

    [Status],

    [DbId],

    ObjectID,

    IndId,

    PartitionId,-- specific to SQL Server 2005, remove for SQL Server 2000 results

    AllocUnitId,-- specific to SQL Server 2005, remove for SQL Server 2000 results

    [File],

    Page,

    Slot,

    RefFile,

    RefPage,

    RefSlot,

    Allocation

    )

    EXEC ('DBCC CHECKDB(''' + @db + ''') WITH TABLERESULTS, NO_INFOMSGS');

    FETCH NEXT FROM curDB INTO @db;

    END

    CLOSE curDB;

    DEALLOCATE curDB;

    --Update our added columns, based on info. returned by DBCC

    UPDATE DBCCResults

    SET ObjectName = OBJECT_NAME(ObjectID, DbID),

    DBName = DB_Name(DbID);

    --If any errors found, send an email with an abbreviated list of errors

    IF EXISTS(SELECT * FROM DBCCResults)

    BEGIN

    --Create html

    SELECT @html =

    '<html>' + @CrLf +

    '<table border="1" cellpadding="2">' + @CrLf +

    '<tr><th>RunDateTime</th><th>Database</th><th>Object</th></tr>' + @CrLf;

    SELECT @html = @html +

    '<tr>' +

    '<td>' + Convert(VARCHAR(30), RunDateTime, 120) + '</td>' +

    '<td>' + COALESCE(DBName, '') + '</td>' +

    '<td>' + COALESCE(ObjectName, '') + '</td>' +

    '</tr>' + @CrLf

    FROM DBCCResults

    --ORDER BY DBName, ObjectName;

    SELECT @html = @html +

    '</table>' + @CrLf +

    '</html>';

    -- Send email out

    EXEC msdb.dbo.sp_send_dbmail

    @recipients = @email,

    @subject = 'DBCC CHECKDB Encountered Errors',

    @body = @html,

    @body_format = 'HTML';

    END

    END TRY

    BEGIN CATCH

    --If we're here, then a fatal error may have been thrown while checking a DB.

    --If cursor is still open, close it.

    IF CURSOR_STATUS('local', 'curDB') >= 0

    BEGIN

    CLOSE curDB;

    DEALLOCATE curDB;

    END

    EXEC msdb.dbo.sp_send_dbmail

    @recipients = @email,

    @subject = 'Fatal Errors Were Encountered Running SQL Agent DBCC CHECKDB!',

    @body = 'Run DBCC CHECKDB ASAP and verify most recent backups.',

    @body_format = 'HTML';

    END CATCH

    END

    GO

    I went to archive.org and found an old version of that now-404 web page and saved it off to a PDF. See attached. As indicated in the notes of my SP, some of the data types in that web page are incorrect. I discovered these during testing when I received "data type mismatch" errors or some such.

    Rich

    P.S. (added later). You'll see a reference in the notes to "Failed Job Steps agent job." This is a separate Agent job I created to query msdb for any failed agent job steps and notify me about them by email. I have many multi-step, non-sequential jobs that need to continue execution if any individual step fails, so I cannot simply rely on having an email sent if the job fails: in these cases, the job succeeds, only a step fails.

    P.P.S.: Developed and run on SQL 2005 Enterprise.

  • Hi

    this one is really a needed article.

    But apart from this I want to know one thing, that is I have a store procedure made by myself for (xyz thing) "sorry can't disclose it", so how can i get that patented to my name? I am from INDIA.

    Hope to see some helpful steps from your end.

    Thanks.

    [font="Comic Sans MS"]Rahul:-P[/font]

  • Rahul The Dba (9/27/2012)


    Hi

    this one is really a needed article.

    But apart from this I want to know one thing, that is I have a store procedure made by myself for (xyz thing) "sorry can't disclose it", so how can i get that patented to my name? I am from INDIA.

    Hope to see some helpful steps from your end.

    Thanks.

    Well, I can only assume your idea deals with backing up databases. If not, then please start a new topic.

    I don't know much about patents except that the whole issue of patents has pretty much run out of control - everywhere. Not to mention Apple vs Samsung. You have now motivated me to dig into that story and exactly how the Apple patents were violated by Samsung.

    But your question has raised many questions in my mind. Thus what you want to patent is not a stored procedure but an idea. This means that once your idea becomes public, anyone with some sql skills (say a million or so people) can propabily write the stored procedure that does exactly what your idea proposes. In your opinion, would this violate the patent? But what if I write a stored procedure that does the same thing, but is 10 times faster than yours. Or what if I write a process, not using an sql stored procedure, that does the same thing 50 times faster. Or what if I write something that does the same thing but is 3 times better than your idea?

    In any case, in my opinion, patenting stored procedures is no way to become rich quickly. And besides, this is a forum where you share things.

  • Oh yes, getting back to my original question, is there anyone using sqlcmd.exe to backup databases? Or has this become 'obsolete'?

Viewing 15 posts - 16 through 30 (of 65 total)

You must be logged in to reply to this topic. Login to reply