MCTS Training Kit Question

  • Below is the question and the two part answer that the training kit gives.

    My answer was only d. I disagree with doing e. taking a full backup of only a single filegroup a day, and alternating through the three of them. It would leave you with 3 filegroup back ups by the end of the week that are out of sync with each other (one from wed, thu, and fri).

    THE QUESTION

    You are designing a backup strategy for a very large database name Mineralogical. Testing has shown that it takes 10 hours to perform a full backup of this database. Unfortunately, except for the weekends , you have only a five-hour window (12:01A.M., to 5:00A.M.). The database has six tables spread across three filegroups of similar size. Each filegroup hosts two tables. All tables are updated heavily throughout the day during the workweek. You implemented a full backup to occur every Saturday morning at 12:15A.M.

    Which of the following strategies should you pursue? (Choose two)

    a. Configure a full backup of the database to occur every morning except Saturday at 12:15A.M.

    b. Configure a differential backup of the database to occur every morning except Saturday at 12:15A.M.

    c. Configure a full backup of all filegroups to occur each morning except Saturday at 12:15A.M.

    d. Configure a backup of the transaction log to occur every morning except Saturday at 5:25A.M.

    e. Configure a full backup of a single filegroup to occur each morning at 12:15A.M., except Saturday. Alternate the filegroup that is backed up each day.

    ANSWER d. & e.

    Explanation:

    Backing up a single alternating filegroup each day of the week except Saturday dramatically reduces the amount of time needed to back up the database. This strategy works only if you also back up the transaction logs, because when performing a backup of filegroups, it is necessary to ensure that transactions are backed up regularly as well.

    WHAT DO YOU THINK?

  • I agree with (d) and (e). Remember that a backup strategy should also be a restore strategy. If you only take log backups (d), and your database crashes 10 minutes before your full backup starts (Saturday, 12:05am), you will have to replay 1 week's worth of transaction logs.

    If you went for (d) and (e), and the database died in the same place, you would restore all filegroups, and then apply all transaction logs. However, there is less work to do - you can restore the filegroup backups from Wednesday, Thursday, Friday morning, and you then only have 3, 2, and 1 days of log activity to replay. It does not matter that they are out of synch - you apply all logs from the time of the oldest backup. Reading the log is fast and skipping over that which is already done is fast - performing the actual changes is slower. You'll end up replaying all of Friday's activity, 2/3 of Thursday's activity, and 1/3 of Wednesday's activity. This is a lot less than 3/3 for 7 days.

    I'm a little shocked that the transaction log is only backed up once per day, but it's probably a data warehouse that loads overnight, and has no activity after 5am.

  • Thanks for replying. I really want to understand this and it seems i'm missing something fundamental.

    I believed that restores had to happen in order: full, dif, trn OR full, trn OR trn, trn, trn. I believed that filegroup backups worked when they contained mutually exclusive data that didn't depend on other filegroups....alas I must be wrong.

    Will the restore operation sort itself out with the duplicate transactions (some will be both in t-log and some in filegroup back ups) and the LSNs that will be out of sync?

    In what order would you do such a restore?

    Thank you.

  • Take a look at http://msdn.microsoft.com/en-us/library/ms177425.aspx under the section "Piecemeal Restore Under the Full Recovery Model". There's an example as well.

    First, you want to take a tail log backup (if possible) to get the transactions between the last log backup (Friday 5am) and the current time. Otherwise you'll only be able to recover up to Friday 5am.

    RESTORE DATABASE Mineralogical FILEGROUP='PRIMARY' FROM DISK = 'LastWeekFullBackup.bak' WITH PARTIAL, NORECOVERY

    RESTORE DATABASE Mineralogical FILEGROUP='A' FROM DISK = 'WednesdayFGBackup.bak' WITH NORECOVERY

    RESTORE DATABASE Mineralogical FILEGROUP='B' FROM DISK = 'ThursdayFGBackup.bak' WITH NORECOVERY

    RESTORE DATABASE Mineralogical FILEGROUP='C' FROM DISK = 'FridayFGBackup.bak' WITH NORECOVERY

    RESTORE LOG Mineralogical FROM DISK = 'SaturdayLog5am.trn' WITH NORECOVERY

    RESTORE LOG Mineralogical FROM DISK = 'SundayLog5am.trn' WITH NORECOVERY

    RESTORE LOG Mineralogical FROM DISK = 'MondayLog5am.trn' WITH NORECOVERY

    ...

    RESTORE LOG Mineralogical FROM DISK = 'FridayLog5am.trn' WITH NORECOVERY

    RESTORE LOG Mineralogical FROM DISK = 'TailLogBackup.trn' WITH RECOVERY

    Note that the only reason we restore all transaction logs is for the PRIMARY filegroup. This must also be brought up to the current time. As the example didn't specify where the PRIMARY is located, we can only assume that it is only being backed up weekly.

    It's also possible to do a partial online restore - in this case, you could just specify FILEGROUP A, finish the process, and later on repeat the process for B and C. B and C just won't be accessible until they have been restored.

    When each record in the transaction log is analysed, it will determine if this change has already been applied to the database. For example, in the log backups prior to Wednesday's FG A backup, any modifications that will affect FG A will be ignored, as SQL Server already knows that Filegroup A is already past that point in time.

  • Thank you so much, Jim! I looked over the example and it makes sense to me and clarifies it all. I also read the question the same way, that it implied that the PRIMARY filegroup gets backed up during the weekly full.bak (well, actually didn't think about it head-on). And, now I understand about SQL Server being able to toss out transactions that are duplicates. I'm assuming that is by using the LSNs.

    OK, got it!:-)

  • Jim McLeod (7/18/2010)


    I agree with (d) and (e). Remember that a backup strategy should also be a restore strategy. If you only take log backups (d), and your database crashes 10 minutes before your full backup starts (Saturday, 12:05am), you will have to replay 1 week's worth of transaction logs.

    If you went for (d) and (e), and the database died in the same place, you would restore all filegroups, and then apply all transaction logs. However, there is less work to do - you can restore the filegroup backups from Wednesday, Thursday, Friday morning, and you then only have 3, 2, and 1 days of log activity to replay. It does not matter that they are out of synch - you apply all logs from the time of the oldest backup. Reading the log is fast and skipping over that which is already done is fast - performing the actual changes is slower. You'll end up replaying all of Friday's activity, 2/3 of Thursday's activity, and 1/3 of Wednesday's activity. This is a lot less than 3/3 for 7 days.

    I'm a little shocked that the transaction log is only backed up once per day, but it's probably a data warehouse that loads overnight, and has no activity after 5am.

    I think you're quite right to be shocked by the infrequency of the log backups as the question states "All tables are updated heavily during the day throughout the work week". This isn't the strategy I would have gone for. It is the best out of the options available though. Poor question IMO.

  • Chris Houghton (7/19/2010)


    Jim McLeod (7/18/2010)


    I'm a little shocked that the transaction log is only backed up once per day, but it's probably a data warehouse that loads overnight, and has no activity after 5am.

    I think you're quite right to be shocked by the infrequency of the log backups as the question states "All tables are updated heavily during the day throughout the work week". This isn't the strategy I would have gone for. It is the best out of the options available though. Poor question IMO.

    Thanks, Chris - I must have skipped over that sentence. In that case, I suppose the business is OK with losing up to a day's worth of data, but that definitely doesn't sit comfortably with me. That transaction log backup could be quite large! And in the event of having to restore the database, that large transaction log file will need to be zeroed out before use. Ouch.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply