MinLSN

  • Books Online states that the minimum LSN (MinLSN) is the earlier of the start of the oldest active transaction or the start of the checkpoint. Since the start of the oldest active transaction will always be before the start of the checkpoint, can we safely assume that the MinLSN will be the start of the checkpoint when there are no active transactions by the time the checkpoint is completed?

    I would appreciate if somebody answered this question.

    Karim

  • Definitely not.

    The point of the MinLSN is to handle large and long transactions. There may be many checkpoints that occur during these operations. There may also be checkpoints that occur while one of these transactions is being rolled back, and during database recovery.

    Consider the default for DTS data pumps and bulk loads, which is that an entire load operation is a single transaction. If the size of the file loaded is larger than RAM for the buffer pool, SQL Server will need to do several checkpoints during the load. If the last record has an error, the entire load must be rolled back. Since all the pages that were previously modified and flushed back to disk must be re-read and re-modified, there will be several checkpoints during this process. Until this is accomplished, MinLSN must be at most equal to the starting LSN of the load operation's transaction. The same applies to open transactions that may be caused because a user is just staring at the screen a long time (or getting coffee, in a meeting, etc.), or if a connection has dropped without SQL Server noticing.

    Transactional replication also affects MinLSN. Any part of the T-Log that has not been scanned and transmitted is treated as if it was part of an open transaction.

    When a checkpoint starts, it writes a T-Log entry with a table of all open transactions (presumably with their starting LSNs). It also writes a T-Log entry when done. However, as far as I know, it does not change MinLSN. However, the latest checkpoint LSNs are recoreded in the "boot pages" (Page 0 of primary data and log files), so SQL Server knows where to start processing the T-Log for a recovery operation.

    If there are no active transaction when a checkpoint happens, SQL Server would have a "pristine" data file, that would not need any recovery, rollback or rollforward operations. In this case, MinLSN is probably the next LSN to be assigned, since technically, no part of the T-Log is active. (This would be the LSN + 1 of the "checkpoint end" T-Log entry.)

    I believe MinLSN is only for the active part of the T-Log, and SQL Server keeps track of the recoverable part of the log separately. This is the part of the T-Log that must be kept so it can be written to Log Backup(s). Truncating the log resets this and makes this portion "reusable" again. However, the active part (and MinLSN) is not affected by either backing up or truncating the log.

    Hope this sheds some light on the subject.

    David Lathrop
    DBA
    WA Dept of Health

  • David,

    Thanks a lot for your reply.

  • David,

    Thanks a lot for your reply. I do have a couple of questions, however:

    • You said that there may also be checkpoints that occur...during database recovery. I am wondering why a checkpoint would occur during database recovery? What would be a possible reason for it?
    • Secondly, consider the t-log below:

    T1 - CHANGE 1

    T2 - CHANGE 1

    CHECKPOINT 1 - START

    T1 - COMMIT

    CHECKPOINT - END

    T1 - CHANGE 2

    CHECKPOINT 2 - START

    T2 - COMMIT

    CHECKPOINT - END

    In the case of this log, by the end of checkpoint 2, there are no active transactions. Won't the MinLSN after the end of the second checkpoint be the start of the second checkpoint? Please clarify.

    Thanks,

    Karim

  • A checkpoint is the process that transfers database pages from the buffer pool to the disk files. It happens whenever the buffer pool is so full of dirty pages that it impacts performance. It also happens when the number of dirty pages and transactions will cause recovery times to exceed the targeted recovery interval. This is true during normal operations, but also during recovery. (The Lazywriter will also write dirty pages, but only a few pages at a time, if they haven't been used recently.) 

    Consider the million record, single batch, DTS data load I mentioned earlier. Rather than an error, someone kicked the power cord just as the last record was being inserted. When SQL Server starts up, it has to roll back that entire transaction as part of the recovery phase. This won't just be removing data rows from table pages--you probably have data page allocations, index entry inserts, index page splits, index page allocations, possibly expansion of data files, etc. All of this may have to be rolled back. This could be several millions of pages to be read and modified, probably more than the buffer pool can store. And this is just one transaction, and there could be many more. So there will probably be multiple checkpoints to clear space in the buffer pool.

    The main thing about MinLSN is understanding its impact on T-Log size and operations, such as log backups, truncation, growing and shrinking the file, and database backup and recovery.

    At the end of your example, MinLSN will be set to the start of the second checkpoint. This is because SQL Server cannot guarantee that all of T2 has actually been permanently recorded on disk (see following), so the recovery process must begin with the T-Log entry for the start of Checkpoint 2.

    In addition to MinLSN, SQL Server also seperately keeps track of when the last completed checkpoint started. So after the "T1 - COMMIT" it remembers both "T2 - CHANGE 1" (MinLSN) and "CHECKPOINT 1 - START". Further, it knows that every page that was dirty at CHECKPOINT 1 - START was written to disk and made permanent. (CHECKPOINT 1 - END is not written to T-Log until the OS has confirmed all the checkpoint's disk writes.) However, there is no way to know if any modifications made during the checkpoint operation happened before or after the dirty page was written to disk (a page can only be written once in a checkpoint), or even that the page was written at all if it wasn't dirty before the checkpoint start.

    The recovery process begins with the "checkpoint start" T-Log entry for the the last completed checkpoint. It knows all prior work has been written to disk, but later work may or may not be. It will read the list of incomplete transactions and start at that point forward-reading T-Log entries, reproducing their actions, and updating the list of incomplete transactions. When it reaches the end of the T-Log, it figures out which transactions were still incomplete, and reads the T-Log backwards to rollback these transactions. (This backwards scan may or may not pass the last completed checkpoint.) Finally, if there is transactional replication needing to be done, it can start the replication log reader process reading forward from the MinLSN. The MinLSN indicates how far back the T-Log entries must be retained to make all this possible, not where it starts the recovery process.

    If there were no further changes to the database, and a third checkpoint occurred followed by a log backup, SQL Server could take advantage of the situation to completely restart the T-Log from the beginning. Notice that there would only be two "active" entries, for start and end of the last checkpoint, and these would be very small (zero incomplete transactions). This would allow the transaction log to be shrunk down to minimum size and the size of the virtual log files to be optimized. In this case MinLSN is moot, since there is no need for recovery or any T-Log entries, and a "dirty database" switch would suffice.

    By the way, when a transaction is rolled back, SQL Server creates T-Log entries for the changes it makes to undo the effects of the requested operations. So if the transaction included a CREATE INDEX, the T-Log will include both the index creation operations (SYSINDEXES insert, page alloctions, index insertions, etc.) and the index destruction operations (page deallocations, SYSINDEXES delete, etc.). The transaction is only "complete" when all of these undo T-Log entries have been written to the T-Log. SQL Server always tries to insure that there is sufficient space in the T-Log to guarentee these undo operations can be logged. These undo operations are included in, and repeated by, the recovery process during the first phase (forward scan). The backward recovery scan just fills in the missing undo operations, which it also adds to the T-Log. (This is when you can have checkpoints happening during recovery, and these are definitely recorded in the T-Log with the still "incomplete" transactions.) This way, if one restart fails for some reason, the next restart can pick up where it left off. Also, this allows the backup chain to remain valid across recovery operations. This is also the reason why you need to backup both the database and the log after a successful restore+recovery, or just a long restore.

    David Lathrop
    DBA
    WA Dept of Health

  • David,

    Your reply was very helpful and thought provoking. I didn't, however, understand the last couple of sentences in your response that state the following:

    Also, this allows the backup chain to remain valid across recovery operations. This is also the reason why you need to backup both the database and the log after a successful restore+recovery, or just a long restore.

    Could you kindly elaborate on this a bit more. I would appreciate it.

    Regards,

    Karim

  • The backup files contain the first and last LSNs in the backup. During a restore (before recovery), the database contains the last LSN that was applied, and will not allow you to restore a log backup if its starting LSN leaves a gap from the last LSN that was previously applied.

    During the recovery process, additional T-Log entries are created, advancing the current LSN. While there may be checkpoints taken during this process, this is not necessary; SQL Server can complete a minor recover and resume operation without doing a checkpoint and causing buffers to be written to disk. The goal is to minimize the time before the server is available for use, so it doesn't wait. However, it needs to know what actions were taken during recovery (written to disk in the T-Log) so a future recovery can recreate the point-in-time when it resumed normal operations.

    T-Log entries reflect low level details, such as "allocated page 13/56/256 to table 27" and "this row (before & after images) was inserted into slot 1 of page 13/56/256". That page may have been part of an index on table 14, and was freed up during recovery. If SQL Server cannot repeat the free operation, the later operations would fail. And don't forget that recovery may be undoing deletes that deallocated pages from tables which got reassigned to other tables for committed operations; recovery has to allocate a different page to undo the delete. Optimizations in operations like selecting a free page for allocation make these non-repeatable calculations; they have to be in the T-Log for SQL Server to figure out what was done.

    These recovery operations happen during normal startup and shutdown operations, so they have to be available in the T-Log for log backups. Otherwise, there would be a gap in LSNs, and your backup chain would become invalid everytime you shutdown and started up SQL Server.

    Its rare, but if you lose those recovery T-Log entries, you cannot apply further log backups. The main cause is that someone truncates the log to shrink it, not realizing the impact this has.

    David Lathrop
    DBA
    WA Dept of Health

  • David,

    I was able to follow your response up until the point you began talking about freeing up the index. I might be a little confused here. Did you mean restore here instead of recovery?

    Also, I am a little unsure about the content of the following two paras:

    "Optimizations in operations like selecting a free page for allocation make these non-repeatable calculations; they have to be in the T-Log for SQL Server to figure out what was done.

    "These recovery operations happen during normal startup and shutdown operations, so they have to be available in the T-Log for log backups. Otherwise, there would be a gap in LSNs, and your backup chain would become invalid everytime you shutdown and started up SQL Server."

    I am unsure as to what optimization in operation means. Also, why the recovery operations become unrepeatable calculations. Does the word "calculation" here mean the same thing as operation? Also, I would appreciate if you elaborated a bit more on the content of the second paragraph.

    Karim

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply