Restore help needed for 1.5 TB database.

  • Hello,

    Please try the Red Gate SQL Backup tool (the evaluation edition) for the restore. I used the same tool in my own environment to restore a 900 GB database that was split across 100 files; the restore completed within about 5 hours.

    Good luck. :-P

  • Hugo Kornelis (3/31/2016)


    Chitown (3/31/2016)


    Hugo Kornelis (3/31/2016)


    1200 logical files? That is a lot!

    I recommend using this opportunity as a test case to check how long the restore will take. Then compare that with the recovery time objective (RTO) that is included in the SLA. If the production database goes down, are you even able to meet that RTO? If not, you will either have to renegotiate the SLA or change the backup/restore strategy to ensure that restores can be performed fast enough.

    We have HA in place, just in case this Prod goes down.

    That's a good start.

    Now what if someone accidentally forgets to add a WHERE clause to "DELETE FROM OrderData"?

    Why would anyone be running a query like that directly in production using, presumably, their own ID? Use automated tools to deploy scripts that have been fully tested and vetted in an acceptance environment. I cannot think of a valid reason to make on-the-fly updates in production, or for any users to have update access with their everyday ID.

  • Bear with me here, I want to make sure I've got things right:

    1. You've got 1200 (wow) logical files; I'm presuming these are the backup files and that you're striping the backups?

    2. When you try to set up a restore using the SSMS GUI, with the intent to script it out, it times out, presumably while it's trying to load the filelist?

    If I've got these right, then I'd be inclined to think that you've got just way too many files for the GUI to handle for a restore. In which case, the suggestion from tripleAxe to generate the script(s) for the restore yourself would be the right way to go (see the sketch at the end of this post). Arguably, I'd probably do it using PowerShell rather than within SSMS, partly because I'd half expect RESTORE FILELISTONLY to fail as well.

    Lastly, as Hugo has pointed out, this should also be used as a test (maybe once you've sorted out how you're going to handle the restore) of whether you can meet your RTOs with your current backups. Remember, HA =/= DR. You may have an HA solution in place and tested, but that doesn't necessarily mean you've got a working DR solution in place. For instance, if you've got a SQL cluster set up and tested, but both servers are in the same datacenter, what's your DR plan if the datacenter burns to the ground? If your backups are taken offsite, you need to be able to restore them, and right now it sounds like you can't. So, at the moment, you have no DR solution in place that you *KNOW* works. And, while everyone's favorite "what if" is a DELETE with no WHERE, what about an UPDATE with an incorrect WHERE? There are so many lovely, lovely ways that the data can get borked...

    Ideally, whatever solution you come up with will be something you can save to use in the future, rather than a "one-off" solution...
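
    For what it's worth, here's the rough shape of scripting the restore yourself in T-SQL rather than fighting the GUI. This is only a sketch: the database name, paths and logical file names are made up, and with 1200 files you would generate the long DISK = / MOVE lists from the RESTORE FILELISTONLY output (or from the msdb backup history tables) rather than typing them by hand.

        -- List the database files contained in the backup (can be slow on a huge backup set)
        RESTORE FILELISTONLY
        FROM DISK = N'\\backupshare\BigDB\BigDB_stripe01.bak';   -- hypothetical path

        -- Restore from a striped backup set; repeat DISK = ... for every stripe
        RESTORE DATABASE BigDB
        FROM DISK = N'\\backupshare\BigDB\BigDB_stripe01.bak',
             DISK = N'\\backupshare\BigDB\BigDB_stripe02.bak'        -- ...one entry per stripe
        WITH MOVE N'BigDB_Data001' TO N'Q:\Data\BigDB_Data001.ndf',  -- ...one MOVE per logical file
             NORECOVERY,   -- leave the database restoring if a diff or log backups will follow
             STATS = 5;    -- progress messages every 5%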

  • I'm thinking that the 1200 files could be due to the partitioning of 1 or more large tables. It's not unheard of for folks to partition by week or even by day, especially if they need to do something like float the last six months or year.
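
    Purely as a hypothetical illustration of how that adds up (none of these names come from the OP): daily partitioning with one filegroup, and one data file, per partition reaches 1200 files after a little over three years of boundaries.

        -- Hypothetical daily partitioning; ~1200 boundary values => ~1200 filegroups/files
        CREATE PARTITION FUNCTION pf_OrderDate (date)
            AS RANGE RIGHT FOR VALUES ('2013-01-01', '2013-01-02' /* ...one value per day... */);

        CREATE PARTITION SCHEME ps_OrderDate
            AS PARTITION pf_OrderDate
            TO (FG_20130101, FG_20130102 /* ...one filegroup per partition... */, [PRIMARY]);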

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/1/2016)


    I'm thinking that the 1200 files could be due to the partitioning of 1 or more large tables. It's not unheard of for folks to partition by week or even by day especially if they need to do something like float the last six months or year.

    It could be; that's why I asked if I had it right. If it is something like that, then he might be a candidate for looking at piecemeal filegroup restores (which I have no experience with, other than knowing they exist and that they have some fairly picky limitations).
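
    For reference, a piecemeal restore has roughly the following shape. This is only a sketch with hypothetical names and paths; the real sequence depends on edition, recovery model and which filegroups are read-only.

        -- Piecemeal restore sketch: bring PRIMARY online first, then the other filegroups
        RESTORE DATABASE BigDB
            FILEGROUP = 'PRIMARY'
            FROM DISK = N'\\backupshare\BigDB\BigDB_primary.bak'
            WITH PARTIAL, NORECOVERY;

        RESTORE DATABASE BigDB
            FILEGROUP = 'FG_Orders_2016'
            FROM DISK = N'\\backupshare\BigDB\BigDB_fg_orders_2016.bak'
            WITH NORECOVERY;

        -- ...apply any required log backups, then finish
        RESTORE DATABASE BigDB WITH RECOVERY;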

  • Ross McMicken (4/1/2016)


    Hugo Kornelis (3/31/2016)


    Chitown (3/31/2016)


    Hugo Kornelis (3/31/2016)


    1200 logical files? That is a lot!

    I recommend using this opportunity as a test case to check how long the restore will take. Then compare that with the recovery time objective (RTO) that is included in the SLA. If the production database goes down, are you even able to meet that RTO? If not, you will either have to renegotiate the SLA or change the backup/restore strategy to ensure that restores can be performed fast enough.

    We have HA in place, just in case this Prod goes down.

    That's a good start.

    Now what if someone accidentally forgets to add a WHERE clause to "DELETE FROM OrderData"?

    Why would anyone be running a query like that directly in production using, presumably, their own ID? Use automated tools to deploy scripts that have been fully tested and vetted in an acceptance environment. I cannot think of a valid reason to make on-the-fly updates in production, or for any users to have update access with their everyday ID.

    I must say that you exist in a VERY different SQL Server world than the one I have been consulting in for almost 20 years now. From mom-and-pop shops all the way to Fortune 100, EVERY single client I have ever come across does exactly the things you state you can't imagine happening. Multiple people (often LOTS, or even more often EVERYONE) have SA credentials. Direct DML is common. I can count on one or maybe two hands the clients that have had automated tools and deploy scripts. Almost none have "testing and vetting" of those things in an acceptance environment. I can count on 2 fingers the number of clients that have had an environment that truly mimicked production.

    Be very happy for the nice, safe world in which you reside. 😎

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • jay81 and jasona got there before me, but they have both said something very important. It may not help you with this restore, but could do so for the future.

    With a database this size, you need to divide the database into filegroups and do filegroup-based backups, with each backup striped across multiple files. This means your filegroup backups can run in parallel, and when you do a restore each filegroup can be restored in parallel, with additional parallelism coming from the striping (sketched below).

    This will give you a complicated backup and restore scenario, but the drivers behind all of it are first your RTO and second your backup window.

    Back in 2001 I dealt with a client whose largest database was 17TB, and who thought of their 2TB database systems as small. They did all of these things, and regularly practiced a full restore to make sure they knew how to do it. Their system was hosted on DB2, but the principles of filegroup (tablespace in DB2) backups with striping are the same as in SQL Server. RTO, RPO and complexity are also the same.

    One technique they used to meet the backup window was to do a full backup of a set of filegroups on day 1, with differentials for the rest of the filegroups. Over the next 6 days a different set of filegroups got the full backup, so that the oldest full backup was 1 week old. A restore to a given point in time (PIT) always needed full, diff and log backups to be restored. Complicated but workable.

    The limiting factor in your backup and restore process needs to be the saturation level of your storage subsystem. If it is not saturated with IO, then backup and restore are taking longer than they need to. If you are saturating your storage subsystem with IO and the RTO is still not being met, then it is time to talk to your storage vendor about the options and costs of improving things.
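
    Loosely, what that looks like in T-SQL (the filegroup names, stripe counts and paths here are invented for illustration):

        -- Full backup of one filegroup, striped across four files on separate volumes
        BACKUP DATABASE BigDB
            FILEGROUP = 'FG_Orders'
            TO DISK = N'K:\Backup\BigDB_FG_Orders_1of4.bak',
               DISK = N'L:\Backup\BigDB_FG_Orders_2of4.bak',
               DISK = N'M:\Backup\BigDB_FG_Orders_3of4.bak',
               DISK = N'N:\Backup\BigDB_FG_Orders_4of4.bak'
            WITH COMPRESSION, CHECKSUM, STATS = 10;

        -- Differential for a filegroup that did not get its full backup today
        BACKUP DATABASE BigDB
            FILEGROUP = 'FG_History'
            TO DISK = N'K:\Backup\BigDB_FG_History_diff.bak'
            WITH DIFFERENTIAL, COMPRESSION, CHECKSUM;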

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • My question is: are all of the files making up the database in a single filegroup?

    Ideally I would, for a database this size, perform all of the backups at the file/filegroup level. It makes the restore process for DR much faster, as you are only restoring the defective filegroup.

    In the case that you are describing, though, creating a production DB on a QA system, the restore time is going to depend on many factors, including disk/SAN speed, controller settings, network speed, and various such things. QA/Dev systems are normally not as beefy as your production boxes.

    Good luck. Grab a coffee and a good book 😉

    Steve Jimmo
    Sr DBA
    “If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • EdVassie (4/1/2016)


    This means your filegroup backups can run in parallel. When you do a restore each filegroup can be restored in parallel, with greater parallelism given from the striping.

    If they're all on the same spindle, it won't help. In fact, it will probably slow things down because of head chatter. Electronics and skill are still subject to the unimpeachable limits of the physical bottleneck known as the read/write head and arm. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If they're all on the same spindle, it won't help.

    ...very true!!

    If the OP is looking for the fastest way to back up and restore a 1.5TB database, they will try to spread the IO load across multiple spindles, and ideally across multiple IO channels. Using instant file initialisation (as mentioned in one of the posts) is also important (a quick check query is below). The bigger the DB you have to administer, the more thought, testing and money are needed.

    If they want the slowest backup and restore, a single file on a single spindle is a major part of achieving it...

    Most of us never see a single DB above (say) 500GB. For these smaller databases you can start with the same maintenance template for all of them and just do minor tweaks where needed. For a VLDB everything has to be custom designed.
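
    As an aside, on recent builds (SQL Server 2016 SP1 and later, if memory serves) you can confirm whether instant file initialisation is actually in effect for the engine service with a quick query:

        -- Reports, per service, whether instant file initialization is enabled
        SELECT servicename, instant_file_initialization_enabled
        FROM sys.dm_server_services;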

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • I have learned how to stay calm. Even though it felt like a lifetime, I scripted it out and restored the full backup and diff, with no TLOGs. Took forever but it's done.
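
    In script form, that sequence boils down to something like the following (paths and names are hypothetical; the key part is NORECOVERY on the full so the diff can still be applied):

        -- 1) Most recent full backup first, left in the RESTORING state
        RESTORE DATABASE BigDB
        FROM DISK = N'\\backupshare\BigDB\BigDB_full.bak'
        WITH NORECOVERY, STATS = 5;

        -- 2) Most recent differential, then recover (no log backups applied)
        RESTORE DATABASE BigDB
        FROM DISK = N'\\backupshare\BigDB\BigDB_diff.bak'
        WITH RECOVERY, STATS = 5;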

  • Excellent!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • jay81 (3/31/2016)


    Use backup file striping with the methods described in the blog below. I was able to reduce backup times from a few hours to within an hour for large databases.

    http://henkvandervalk.com/how-to-increase-sql-database-full-backup-speed-using-compression-and-solid-state-disks

    Thanks for this!
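
    In T-SQL terms, the striping-plus-compression approach from the article above looks roughly like this; the drive letters and tuning values are placeholders, not figures from the article, and need testing in your own environment.

        -- Striped, compressed full backup; tune BUFFERCOUNT/MAXTRANSFERSIZE by measurement
        BACKUP DATABASE BigDB
        TO DISK = N'K:\Backup\BigDB_1of4.bak',
           DISK = N'L:\Backup\BigDB_2of4.bak',
           DISK = N'M:\Backup\BigDB_3of4.bak',
           DISK = N'N:\Backup\BigDB_4of4.bak'
        WITH COMPRESSION,
             BUFFERCOUNT = 50,            -- placeholder value
             MAXTRANSFERSIZE = 4194304,   -- 4 MB, the largest transfer size allowed
             STATS = 10;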

  • xsevensinzx (4/9/2016)


    jay81 (3/31/2016)


    Use backup file striping with the methods described in the blog below. I was able to reduce backup times from a few hours to within an hour for large databases.

    http://henkvandervalk.com/how-to-increase-sql-database-full-backup-speed-using-compression-and-solid-state-disks

    Thanks for this!

    Just be careful. Backup "striping" will make backups run slower if you happen to be backing all the files up to the same physical spindle. Some SANs will try to spread the load, but (from what I've been told) that's not what will always happen.

    To add to that, if you're backing up to a SAN that also has your SQL Server data/log files on it and you lose the SAN, you're pretty much toast, because no one keeps tape backups up to the second. Don't laugh... it happened to us two weeks ago. We were lucky that it was "only" a Dev system, but we used the opportunity as a DR drill. The only good part about it was that I got to say "I told you so".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/1/2016)


    I'm thinking that the 1200 files could be due to the partitioning of 1 or more large tables. It's not unheard of for folks to partition by week or even by day especially if they need to do something like float the last six months or year.

    And usually they would be in separate filegroups, but there's no mention of that here from the OP.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉
