Differential backup too big

Question

Post reply

Differential backup too big

Robert vd Berg

SSC Eights!

Points: 872
More actions
October 17, 2016 at 9:02 am

#314516

Since two weeks, my differential backups are getting too big. I have a 4 TB database. Our backup schedule: full backup every week, differential backup every night, log backup every 5 minutes. The full backup is 800 GB (compressed). After a few days of small differential backups, suddenly the differential backups become around 700 GB. This has happened this week, and last week. This database is not very active. All transaction logs are around 200kB, and the amount of changed extents is just above a thousand (according to a script from Paul Randal), less than 0.01%. So I'd expect a tiny differential backup.
There are two more things that are worth mentioning: the database is a subscriber in transactional replication, and CDC is used on these articles. Both replication and CDC jobs appear to be working fine.
At this moment, I have no idea where to look. Can anybody help?
Robert van den Berg
Freelance DBA
Author of:

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004355 More actions · Answer 1

Was the TR and CDC in place for a good period of time before the large DIF problem occurred?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 2

1) What is the exact script you used to identify changed extents?

2) What changed between the period where DIFFs were always small and the recent issues?

3) What activity was done the day/night that preceded the huge DIFF this week? Was last week the same scenario (small for one or two then huge on same day)?

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Jacob Wilkins One Orange Chip Points: 27976 More actions · Answer 3

Another good bit of info to get would be whether most of the data backed up for the large differentials is from the data files or the log file.

Something like the following query should point that out:

SELECT bs.database_name,

bs.backup_start_date,

backup_MB=SUM(bf.backed_up_page_count)/128,

bf.file_type

FROM msdb.dbo.backupset bs

INNER JOIN

msdb.dbo.backupfile bf ON bs.backup_set_id=bf.backup_set_id

WHERE bs.type='I' --differential backups

AND

bs.database_name='your_database_name'

GROUP BY bs.database_name,

bs.backup_start_date,

bf.file_type

ORDER BY bs.backup_start_date DESC,

backup_MB DESC

Cheers!

Robert vd Berg SSC Eights! Points: 872 More actions · Answer 4

@jeff: yes, the TR and CDC were in place several weeks before we've been having this problem.

@kevin:

1) What is the exact script you used to identify changed extents?

http://www.sqlskills.com/blogs/paul/new-script-how-much-of-the-database-has-changed-since-the-last-full-backup/

2) What changed between the period where DIFFs were always small and the recent issues?

The backup share filled up. Otherwise, I have no idea. All the "usual suspects" claim they did nothing, and I have no logging to suggest otherwise.

3) What activity was done the day/night that preceded the huge DIFF this week? Was last week the same scenario (small for one or two then huge on same day)?

The only thing I know of, is the database reorganisation. But since this finished in zero seconds, and didn't cause the transaction log backups to grow above 200 kB, I'm guessing that's not it.

@jacob: thanks for the script, I'll run that tomorrow morning.

Robert van den Berg

Freelance DBA
Author of:

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 5

Robert vd Berg (10/17/2016)
@Jeff: yes, the TR and CDC were in place several weeks before we've been having this problem.
@kevin:
1) What is the exact script you used to identify changed extents?
http://www.sqlskills.com/blogs/paul/new-script-how-much-of-the-database-has-changed-since-the-last-full-backup/
2) What changed between the period where DIFFs were always small and the recent issues?
The backup share filled up. Otherwise, I have no idea. All the "usual suspects" claim they did nothing, and I have no logging to suggest otherwise.
3) What activity was done the day/night that preceded the huge DIFF this week? Was last week the same scenario (small for one or two then huge on same day)?
The only thing I know of, is the database reorganisation. But since this finished in zero seconds, and didn't cause the transaction log backups to grow above 200 kB, I'm guessing that's not it.
@jacob: thanks for the script, I'll run that tomorrow morning.

When posting links, please use the [ url] and [ /url] (no spaces inside the square brackets) to make the link "clickable".

http://www.sqlskills.com/blogs/paul/new-script-how-much-of-the-database-has-changed-since-the-last-full-backup/

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 6

Thinking outside the box:

1) Got filestream by any chance?

2) In-Memory OLTP?

3) Columnstore indexes?

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Robert vd Berg SSC Eights! Points: 872 More actions · Answer 7

Hello Kevin,

"No" to all three.

Greetings,

Robert

Robert van den Berg

Freelance DBA
Author of:

x SSC-Insane Points: 23660 More actions · Answer 8

x

SSC-Insane

Points: 23660

October 17, 2016 at 12:37 pm

#1906977

Index reorgs / rebuilds?

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 9

Robert vd Berg (10/17/2016)
Hello Kevin,
"No" to all three.
Greetings,
Robert

Next guess is perhaps the algorithm Paul posted in 2008 is off for 2014 databases for some reason. I do note he specifically mentions large databases as a potential issue.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Jeff Moden SSC Guru Points: 1004355 More actions · Answer 10

Right now, I'm thinking that either someone made a code change that updates more of the database than before or someone "got the religion" about index maintenance or similar. I also think that the code that Jacob Wilkins posted above might give you a clue about that. Have you tried running his code? And are you sure that the weekly backup actually took place? Lot's of folks have backup code that won't fail if something goes wrong with a particular database backup but (hopefully) does log the failure. You might want to check MSDB for backup failures.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Robert vd Berg SSC Eights! Points: 872 More actions · Answer 11

I ran the code from Jacob. This is the result for last nights backup:

backup_start_datebackup_MBfile_type

2016-10-17 21:40:38.0001723566.125000D

2016-10-17 21:40:38.0001.609375L

That is what I would expect, since the log file is practically empty, and all log backups are small (these small log backups are also why I have no doubt about the script from Paul Randal).

@jeff: yes, I did double check that we in fact do have a full backup, both on disk and using the following code:

SELECTdatabase_name, type, backup_finish_date, backup_size

FROMmsdb..backupset

WHEREbackup_finish_date > '2016-10-01'

and database_name = 'my database'

and type <> 'L';

My next move will be to make a native backup (instead of using Red Gate). I doubt this will solve the problem, but it's worth a try.

Robert van den Berg

Freelance DBA
Author of:

Robert vd Berg SSC Eights! Points: 872 More actions · Answer 12

I found the cause, so I thought I'd let you know. Between the small diffs and the larger diffs was a failed full backup (due to the backup share being full). I replicated that, by forcing a full backup and, while that was running, killing the backup service. The next diff is currently running, and it will be large, even though the number of changed extents was lower after the crashed full backup than before.

Robert van den Berg

Freelance DBA
Author of:

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 13

Robert vd Berg (10/19/2016)
I found the cause, so I thought I'd let you know. Between the small diffs and the larger diffs was a failed full backup (due to the backup share being full). I replicated that, by forcing a full backup and, while that was running, killing the backup service. The next diff is currently running, and it will be large, even though the number of changed extents was lower after the crashed full backup than before.

Please open a ticket with Microsoft about this, or at least report it on Connect. I wonder if it should work that way and/or if there is something that can be done to make it not do that. I suspect not, but can't hurt to ask. At first blush it sounds like all the bit-mapped DIFF pages are wiped clean at the start of the FULL backup but I would think perhaps this shouldn't be done until the very last thing to avoid this very scenario.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Jacob Wilkins One Orange Chip Points: 27976 More actions · Answer 14

Thanks for that update!

From looking at the log it looks like it probably is something along the lines of what Kevin suggested.

For both failed and successful full backups, there's a transaction "Backup:InvalidateDiffMaps" that occurs very early in the backup, and Paul Randal suggests at http://www.sqlskills.com/blogs/paul/debunking-a-couple-of-myths-around-full-database-backups/ is the clearing of the DCM, which makes sense.

The failed backup just seems to be missing the commit of the differential base, and some writes to the file headers and boot pages, which is expected.

It seems then that SQL Server knows it can't rely on the diff maps for subsequent differentials, and does differentials differently in that case.

I can reproduce the increased size easily enough by cancelling a full backup and then running a diff.

Differential backups after a failed full also switch to processing every page in the DB, even though they doesn't necessarily include all of them in the backup.

This behavior was also pointed out in the blog post here: https://sqlscope.wordpress.com/2014/12/06/failed-full-backup-can-impact-next-differential-backup/

It definitely seems like something intentional on the part of MS, but it would be nice if it were better (at all) documented.

Cheers!

EDIT: Added what detail I was able to get from the log.