Big/huge database restore taking a lot of time

  • Hi

    We have a 500 GB database backup file which, while restoring, shows that it will take more than 48 hours to complete. The database size is around 4 TB. The server we are restoring to is basically a standby server where most of the databases are in standby/read-only state. What metrics should we capture to iron out issues? I mean, from Perfmon, counters like Avg. Disk sec/Write (or Read/Transfer) or Avg. Disk Queue Length. Are there any other counters to check that could point to a problem with the disk storage system, and what would be the ideal threshold values for these counters?

    Even restoring log backup files of around 5 to 15 GB takes more than 2-3 hours.

    Is there any other way we can reduce the restore time, as we have a limited maintenance window for it?

    Regards

    Anoop


  • More, bigger, faster hardware. That's the deal. There are no other magic "run faster" switches for a restore operation... although in SQL Server 2022 the recovery aspect of a restore is radically improved, so an upgrade may help.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I partially disagree - splitting the backup into 4 or 8 files, and changing the restore BUFFERCOUNT and MAXTRANSFERSIZE, do affect restore performance.
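
    For illustration, a rough sketch of that kind of restore - the database name, file paths and option values below are placeholders to adapt and test for your own hardware (MOVE clauses omitted for brevity):

    -- restore from a backup that was split across four files
    RESTORE DATABASE YourDb
    FROM DISK = N'X:\Backups\YourDb_1.bak',
         DISK = N'X:\Backups\YourDb_2.bak',
         DISK = N'X:\Backups\YourDb_3.bak',
         DISK = N'X:\Backups\YourDb_4.bak'
    WITH NORECOVERY,                 -- or STANDBY = N'<undo file>' for a log shipping secondary
         BUFFERCOUNT = 50,           -- example value only; test what works for you
         MAXTRANSFERSIZE = 4194304,  -- 4 MB, the largest allowed
         STATS = 10;                 -- progress message every 10%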

    But one thing that hugely affects restore is the number of VLFs in the log file - if this is too big, it is a major performance hit on restore.

    To the OP - can you give us the output of the following? If the number is high, the best advice is to do a one-off shrink of the log (see the sketch after the query).

    -- VLF count and size per database (a high VLF count slows down restore/recovery)
    select [name]
         , s.database_id
         , count(l.database_id) as [VLF Count]
         , sum(vlf_size_mb) as [VLF Size (MB)]
         , sum(cast(vlf_active as int)) as [Active VLF]
         , sum(vlf_active * vlf_size_mb) as [Active VLF Size (MB)]
         , count(l.database_id) - sum(cast(vlf_active as int)) as [In-active VLF]
         , sum(vlf_size_mb) - sum(vlf_active * vlf_size_mb) as [In-active VLF Size (MB)]
    from sys.databases s
    cross apply sys.dm_db_log_info(s.database_id) l
    where s.name = 'yourdbname' -- CHANGE THIS
    group by [name]
           , s.database_id
    order by [VLF Count] desc
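
    If the count comes back very high, here is a rough sketch of the one-off fix, to be run on the source database the backups come from - the logical log file name and the sizes are placeholders:

    USE yourdbname;   -- CHANGE THIS
    -- shrink the log to throw away the excess VLFs...
    DBCC SHRINKFILE (N'yourdbname_log', 1024);   -- target size in MB
    -- ...then grow it back in one step, so the log is rebuilt with far fewer, larger VLFs
    ALTER DATABASE yourdbname
        MODIFY FILE (NAME = N'yourdbname_log', SIZE = 64000MB);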

    and the output of this one as well

    -- Find all database files with percent growth or a 1 MB growth increment
    -- (sys.master_files.growth is in 8 KB pages when is_percent_growth = 0, so 128 pages = 1 MB)
    select d.name as database_name
         , mf.name as file_name
         , mf.physical_name as file_path
         , mf.type_desc as file_type
         , convert(decimal(20, 2), (convert(decimal, mf.size) / 128)) as filesizeMB
         , mf.growth as growth
         , 'Percent' as growth_increment
    from sys.master_files mf
    join sys.databases d
      on mf.database_id = d.database_id
    where is_percent_growth = 1
      and d.name = 'yourdbname' -- CHANGE THIS

    union

    select d.name as database_name
         , mf.name as file_name
         , mf.physical_name as file_path
         , mf.type_desc as file_type
         , convert(decimal(20, 2), (convert(decimal, mf.size) / 128)) as filesizeMB
         , case when mf.growth = 128 then 1 end as growth
         , 'MB' as growth_increment
    from sys.master_files mf
    join sys.databases d
      on mf.database_id = d.database_id
    where is_percent_growth = 0
      and d.name = 'yourdbname' -- CHANGE THIS
      -- and mf.growth = 128
    order by database_name
           , file_name
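
    And if that second query flags percent growth (or a tiny fixed increment), something like this - again, the logical file name is a placeholder - sets a sensible fixed increment on the source database:

    ALTER DATABASE yourdbname   -- CHANGE THIS
        MODIFY FILE (NAME = N'yourdbname_log', FILEGROWTH = 512MB);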

  • As Grant noted, hardware, disk or network. You have something slowing down.

    I would check the throughput you are getting, the peak reads and writes/s on the disks. Likely somewhere you will find hardware that is going slow. Are you reading directly from disk? Is this from some de-dupe tech? Restores from those are sometimes very sllllllloooooooowwwwwwwwww
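
    If it helps, a quick way to see per-file latency and volume from inside SQL Server is sys.dm_io_virtual_file_stats - this is just a sketch, and the numbers are cumulative since the last restart, so sample them before and during a restore and compare. A commonly quoted rule of thumb is that sustained averages much above 20-30 ms mean the storage is struggling.

    SELECT DB_NAME(vfs.database_id) AS database_name,
           mf.physical_name,
           vfs.num_of_reads,
           vfs.num_of_writes,
           vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
           vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms,
           vfs.num_of_bytes_read    / 1048576 AS mb_read,
           vfs.num_of_bytes_written / 1048576 AS mb_written
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    JOIN sys.master_files AS mf
      ON mf.database_id = vfs.database_id
     AND mf.file_id     = vfs.file_id
    ORDER BY avg_write_latency_ms DESC;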

  • For moving the files faster, use your network backbone.

    500GB is huge... Is it a compressed backup?

  • We do nightly restores of two of our primary databases.

    The large one is a 2TB database and has an 80GB log file with a compressed backup size of 500GB.

    The smaller one is a 300GB database and has a 67GB log file with a compressed backup size of 70GB.

    It takes almost exactly 1 hour to restore both, and the backups are stored on a NAS.  I do not set the number of buffers or the buffer size for the restores.  Unlike during backups, I've found such settings just slow the restores down.

    One of the KEYs to this performance is that the NAS must be on the same domain as the server.  When it's not, I get comparatively very slow restores.  Not the tens of hours you're seeing, though.

    We've also excluded MDF, LDF, and NDF files from any form of virus checking, which can be a huge problem if you don't.

    We've also been careful to ensure we don't have runaway growth of the number of VLFs in our log files, which could really slow things down.

    Of course, the speed of your network makes a lot of difference.

    Also, many NAS devices actually have a backup battery in them for cache.  If it has reached end of life and can no longer charge, that won't hurt restores but it'll seriously slow down backups.

    The big thing here will be to test the NAS for throughput, test the network for throughput, and check the server disks for throughput.
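
    If it's useful, here are two crude T-SQL-only throughput checks - the paths below are placeholders, and both read a lot of data, so run them outside of the restore window:

    -- Read throughput of the local data drives: back up to the NUL device.
    -- Nothing is written, and COPY_ONLY leaves the backup chain untouched.
    -- (Needs a read/write database on those drives; a database sitting in STANDBY/RESTORING can't be backed up.)
    BACKUP DATABASE YourDb TO DISK = N'NUL'
    WITH COPY_ONLY, STATS = 10;

    -- Read throughput of the NAS plus the network path: verify an existing backup.
    -- This reads every byte of the file across the wire and writes nothing locally.
    RESTORE VERIFYONLY FROM DISK = N'\\nas\backups\YourDb.bak'
    WITH STATS = 10;

    The backup-to-NUL reports MB/sec in its completion message; for the verify, just time it against the size of the backup file. Between the two you can see which leg is the slow one.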


    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • FWIW: I've read on multiple threads that, for "backup database", providing the parameters BUFFERCOUNT = 17 and MAXTRANSFERSIZE = 1048576 gives better performance in many cases.

    After some tests, I have indeed modified all our backup jobs to use those parameters.

    With regards to your problem restoring databases: "Restore Database" also has these parameters.

    Are you using them?

    Have you tested them?

    If not, I suggest you give it a try.
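
    For reference, this is the shape of the backup command with those two parameters - the option values are the ones mentioned above, everything else (database name, path) is a placeholder:

    BACKUP DATABASE YourDb
    TO DISK = N'X:\Backups\YourDb.bak'
    WITH BUFFERCOUNT = 17, MAXTRANSFERSIZE = 1048576, STATS = 10;

    The same two options can be appended to the WITH clause of RESTORE DATABASE / RESTORE LOG to try them on the restore side.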

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I can vouch for the settings that Johan mentions because those are the ones that changed my life for backups years ago.  However, I've found that they actually make restores worse.  YMMV.

    What I'm really concerned about is from the original post where the OP stated...

    Even restoring log backup files of around 5 to 15 GB takes more than 2-3 hours.

    That's worse than trying to do a restore from an old USB 1.0 thumb drive and that's going to be either a serious network issue or a storage device issue or both.  Even without any special settings, that should take only seconds to complete.  See my previous post above for more info.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Steve Jones - SSC Editor wrote:

    As Grant noted, hardware, disk or network. You have something slowing down.

    I would check the throughput you are getting, the peak reads and writes/s on the disks. Likely somewhere you will find hardware that is going slow. Are you reading directly from disk? Is this from some de-dupe tech? Restores from those are sometimes very sllllllloooooooowwwwwwwwww

     

    Sorry for asking, but how do I check the disk throughput and the peak reads and writes/s on the disks? I will check and post it here.

  • Emperor100 wrote:

    For moving the files faster, use your network backbone.

    500GB is huge... Is it a compressed backup?

     

    Most of our backup files are compressed.

  • Jeff Moden wrote:

    We do nightly restores of two of our primary databases.

    The large one is a 2TB database and has an 80GB log file with a compressed backup size of 500GB. The smaller one is a 300GB database and has a 67GB log file with a compressed backup size of 70GB.

    It takes almost exactly 1 hour to restore both, and the backups are stored on a NAS.  I do not set the number of buffers or the buffer size for the restores.  Unlike during backups, I've found such settings just slow the restores down.

    One of the KEYs to this performance is that the NAS must be on the same domain as the server.  When it's not, I get comparatively very slow restores.  Not the tens of hours you're seeing, though.

    We've also excluded MDF, LDF, and NDF files from any form of virus checking, which can be a huge problem if you don't.

    We've also been careful to ensure we don't have runaway growth of the number of VLFs in our log files, which could really slow things down.

    Of course, the speed of your network makes a lot of difference.

    Also, many NAS devices actually have a backup battery in them for cache.  If it has reached end of life and can no longer charge, that won't hurt restores but it'll seriously slow down backups.

    The big thing here will be to test the NAS for throughput, test the network for throughput, and check the server disks for throughput.

    Hi Jeff, please let me know how to test server disk throughput and network throughput. By the way, I think all database files are already exempted from antivirus scanning, so that shouldn't be the issue, I guess.

    Let me also give an update: this server acts as a subscriber for one database, and there are around 30+ databases which are log shipping secondaries. The size of these log-shipped databases varies from a few MB to several TB, sometimes around 4-5 TB.

    These log-shipped databases come from different primaries.

