Verify backup files. How much does it help?

Question

Verify backup files. How much does it help?

Alexander Zhang

Default port

Points: 1497
More actions
February 28, 2018 at 2:17 pm

#410281

Since I start my DBA career, as the best practice, I always verify backup to make sure it is complete and entire backup is readable. But, actually, I never ran into the situation(Let's name it ProblemX) that backup is successfully done while verify shows failures. I have some questions
1. Did you ever ran into this issue?
2. If yes, what's the root cause?
Microsoft doc says, 'However, RESTORE VERIFYONLY does not attempt to verify the structure of the data contained in the backup volumes.'. It impresses me that it's helpful, but not so much.
Recently, I invested tons of time on some programming to consolidate backup & restore. One of my confusion is, whether it's necessary to implement backup and Verify in parallel (means, verify without need to wait backup to be done). Technically, the process is
1. fetch a slice (byte array) from backup stream
2. Make a copy. Totally have two identical copies (original and new copy) now.
3. One copy for thread 1 which do the verify job
4. The other copy for thread 2 which writes data to backup file
But, if root cause of ProblemX is right in step 4, the verify does NOT make any sense.
Please let me know your thoughts on it. Thanks.
GASQL.com - Focus on Database and Cloud

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

rvsc48 SSCertifiable Points: 7796 More actions · Answer 1

Experts say, rather than do a verify, simply prove the backup is a good one completely by restoring it, where if the restore is successful, you've proved the backup is a good one. Places I have worked, we have never seen a failure on the verify if it was in place. In fact, I even read where some people have even experienced a good verify of the backup and later found their backup was not a good one after all when trying to restore it.

Alexander Zhang Default port Points: 1497 More actions · Answer 2

RVSC48 - Wednesday, February 28, 2018 2:44 PM
Experts say, rather than do a verify, simply prove the backup is a good one completely by restoring it, where if the restore is successful, you've proved the backup is a good one. Places I have worked, we have never seen a failure on the verify if it was in place. In fact, I even read where some people have even experienced a good verify of the backup and later found their backup was not a good one after all when trying to restore it.

Thanks for sharing your thought. Do you think it's a good idea to remove Verify from backup job? You know, if DB is huge, Verify does spend much time.

GASQL.com - Focus on Database and Cloud

Sue_H SSC Guru Points: 90860 More actions · Answer 3

Alexander Zhang - Wednesday, February 28, 2018 2:17 PM
Recently, I invested tons of time on some programming to consolidate backup & restore. One of my confusion is, whether it's necessary to implement backup and Verify in parallel (means, verify without need to wait backup to be done).

The verify also sees if SQL Server can read the file so if the file hasn't been fully written to disk I don't think you can say it's a valid file that SQL Server can read from. Verify also reads some of the header and the header also has the date/time completed. It seems that writing the file header would need to be completed after the backup has completed if it includes that date/time.
But along the same lines as what RVSC48 said, the only assurance that it's a decent backup that can be used is by restoring it.

Sue

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 4

Unless you take your backup WITH CHECKSUM, the verify just reads the header. It's pretty worthless.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Alexander Zhang Default port Points: 1497 More actions · Answer 5

Thanks guys for your input. I will give up implementing backup and Verify in parallel. Instead, I will implement backup and restore(in other machine) in parallel.

It makes to think about another question. Why do we backup? The short answer is for restore. But when? Actually, disaster is a very small probability event. In most cases, we do that for new environment initialization or data migration. The tradition process is, to backup DB to local disk or share, then copy backup to target server, and then restore it. There are two problems.
(1) We can't restore until Backup is 100% finished.
(2) Extra storage is needed.

With backup and restore in parallel, Backup DB in Instance-A, and directly restore it in Instance-B at the same time, even though one or both of them stand in AWS EC2. No storage & No Wait! Sounds good?

GASQL.com - Focus on Database and Cloud

Anonymous SSC-Dedicated Points: 32891 More actions · Answer 6

Alexander Zhang - Friday, March 2, 2018 9:15 AM
Thanks guys for your input. I will give up implementing backup and Verify in parallel. Instead, I will implement backup and restore(in other machine) in parallel.
It makes to think about another question. Why do we backup? The short answer is for restore. But when? Actually, disaster is a very small probability event. In most cases, we do that for new environment initialization or data migration. The tradition process is, to backup DB to local disk or share, then copy backup to target server, and then restore it. There are two problems.
(1) We can't restore until Backup is 100% finished.
(2) Extra storage is needed.
With backup and restore in parallel, Backup DB in Instance-A, and directly restore it in Instance-B at the same time, even though one or both of them stand in AWS EC2. No storage & No Wait! Sounds good?

The database backup is also for a whoops, I deleted all of the data in the table because I didn't include a where clause, I updated the wrong rows, I have corruption in the database.... many, many reasons to have backups. It is just not for an actual disaster. You may also have a need to refresh a non production environment for testing purposes as well.

Alexander Zhang Default port Points: 1497 More actions · Answer 7

Summer90 - Monday, March 5, 2018 1:37 PM
Alexander Zhang - Friday, March 2, 2018 9:15 AM
Thanks guys for your input. I will give up implementing backup and Verify in parallel. Instead, I will implement backup and restore(in other machine) in parallel.
It makes to think about another question. Why do we backup? The short answer is for restore. But when? Actually, disaster is a very small probability event. In most cases, we do that for new environment initialization or data migration. The tradition process is, to backup DB to local disk or share, then copy backup to target server, and then restore it. There are two problems.
(1) We can't restore until Backup is 100% finished.
(2) Extra storage is needed.
With backup and restore in parallel, Backup DB in Instance-A, and directly restore it in Instance-B at the same time, even though one or both of them stand in AWS EC2. No storage & No Wait! Sounds good?
The database backup is also for a whoops, I deleted all of the data in the table because I didn't include a where clause, I updated the wrong rows, I have corruption in the database.... many, many reasons to have backups. It is just not for an actual disaster. You may also have a need to refresh a non production environment for testing purposes as well.

Yes, you are correct. It's very bad that I assume the issues I never ran into did not or rarely exist. Thanks for reminding me of them.

GASQL.com - Focus on Database and Cloud