October 14, 2009 at 3:38 am
I have been trying to get a basic understanding of how SQL works. Despite extensive Googling I have not yet been able to find the answers I desire, so what follows is my understanding at present of what goes on:
I (think) I understand the concept of the transaction log. All the insert/update/delete statements stack up in the .trn file. At some point these are written to the .mdf and the matching items in the .trn are marked as 'done'.
First question: When does this process happen? When a processor gets a free minute?
Now, from reading up, I understand that the 'done' transactions don't get deleted, they stay in the .trn marked as 'done'. New transactions build up. This process continues until the end of time, or, your hard disk explodes.
To remove the 'done' entries you need to 'truncate' the log. This happens automatically when you backup a transaction log or you can do a TRUNCATE command.
Truncation does not remove the space used by the 'done' rows though. To do this you need to 'shrink' the log. But shrinking seems universally despised as it leads to fragmentation. Which is where my interest in this started...
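For concreteness, this is roughly what I understand the two operations to be in T-SQL (database and file names are made up):

[code]
-- A log backup truncates the log: inactive entries become reusable,
-- but the physical .ldf file stays the same size on disk.
BACKUP LOG MyDB TO DISK = N'D:\Backups\MyDB_log.trn';

-- Shrinking is the separate, physical operation: it releases unused
-- space back to the OS. 512 here is a target size in MB.
DBCC SHRINKFILE (MyDB_Log, 512);
[/code]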
I am trying to knock up a sensible SQL backup/restore/maintenance plan that minimises fragmentation and its associated performance hit. Thus far I believe I need to:
Turn off AutoShrink - it's plain evil.
Replace AutoGrow with a fixed database size - how does one guesstimate this?!
Have a nightly full backup and 4-hourly (say) transaction log backups (which will truncate my log files).
Create a tidyup job to delete backup files over 1 day old - we have the older files on tape so no need to keep a file copy, right?
Create a job to delete old log files - although reading suggests this is harder to do than it ought to be!
I would love some expert advice here in order to confirm/deny/rubbish my current understanding and plans. I have thick skin though so don't hold back 😉
Cheers
Will
October 14, 2009 at 4:11 am
1. Transactions are committed when the database command 'COMMIT TRANSACTION' is submitted, OR after an implicit transaction completes (http://msdn.microsoft.com/en-us/library/ms188317(SQL.90).aspx)
2. You can fix the log file size to a maximum, but BEWARE: the log file max size must be big enough to take the busiest day's transactions for your required time span. If it's not, you will get into the situation where you can no longer log transactions, which also means you can no longer write data to your database! Very bad news.
3. When deciding on your time intervals between transaction log backups, ask yourself: what period of time can we accept losing all data transactions from? Generally speaking, most people will do a log backup every hour or half hour. (I'll hasten to add you can do a tail-log backup if your db is knackered but your log files are OK.)
Also, don't forget you can do differential backups (I do these every 4 hours).
4. Look at the INIT and FORMAT options to overwrite your existing backups on tape (a rough T-SQL sketch of points 1-4 follows below).
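Here's a rough sketch of points 1-4 (database name, paths and sizes are all made up; you'd schedule the backups via SQL Agent):

[code]
-- 1. An explicit transaction is hardened in the log at COMMIT:
BEGIN TRANSACTION;
    UPDATE dbo.SomeTable SET SomeCol = 1 WHERE Id = 42;  -- hypothetical table
COMMIT TRANSACTION;

-- 2. Cap the log file so it cannot swallow the disk, but make the cap
--    big enough for your busiest day:
ALTER DATABASE MyDB MODIFY FILE (NAME = MyDB_Log, MAXSIZE = 20GB);

-- 3. Log backup (run every half hour/hour) and differential (every 4 hours):
BACKUP LOG MyDB TO DISK = N'D:\Backups\MyDB_log.trn';
BACKUP DATABASE MyDB TO DISK = N'D:\Backups\MyDB_diff.bak' WITH DIFFERENTIAL;

-- Tail-log backup, for when the data files are gone but the log is intact:
BACKUP LOG MyDB TO DISK = N'D:\Backups\MyDB_tail.trn' WITH NO_TRUNCATE;

-- 4. INIT overwrites existing backup sets on the media; FORMAT rewrites
--    the media header itself, so use it with care:
BACKUP DATABASE MyDB TO DISK = N'D:\Backups\MyDB_full.bak' WITH INIT;
[/code]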
I hope this helps.
_____________________________________________________________________________
MCITP: Business Intelligence Developer (2005)
October 14, 2009 at 4:40 am
dave-488664 (10/14/2009)
1. Transactions are committed when the database command 'COMMIT TRANSACTION' is submitted, OR after an implicit transaction completes (http://msdn.microsoft.com/en-us/library/ms188317(SQL.90).aspx)
OK - but *when* is this?! Should it effectively be in real time (unless your transaction stack is HUGE)?
2. You can fix the log file size to a maximum, but BEWARE: the log file max size must be big enough to take the busiest day's transactions for your required time span. If it's not, you will get into the situation where you can no longer log transactions, which also means you can no longer write data to your database! Very bad news.
So, better to leave autogrowth on (as a fail-safe option) but set the initial database size to something sensible (a guesstimated max size)?
3. When deciding on your time intervals between transaction log backups, ask yourself: what period of time can we accept losing all data transactions from? Generally speaking, most people will do a log backup every hour or half hour. (I'll hasten to add you can do a tail-log backup if your db is knackered but your log files are OK.)
Also, don't forget you can do differential backups (I do these every 4 hours).
Will do some more reading on differential backups...
4. Look at the INIT and FORMAT options to overwrite your existing backups on tape.
Will do!
I hope this helps.
Thanks for the reply - I feel like I am getting somewhere...
October 14, 2009 at 4:49 am
To help you further in your understanding, look at the CHECKPOINT statement in BOL.
To answer your further question about 'truncating the log': it happens automatically if the recovery model is simple, or when you do a log backup, as you mentioned.
Backing up the log increases the free space available inside the log file; it has no effect on the physical size of the file. To decrease the physical size of the log file you need to shrink it, which removes the space used by the 'done' transactions (as you've described them) from the log file.
Turn off autoshrink.
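For example (MyDB is a made-up name), you can check how full each log is and make sure autoshrink is off like this:

[code]
-- Percentage of each database's log that is actually in use:
DBCC SQLPERF (LOGSPACE);

-- Disable autoshrink:
ALTER DATABASE MyDB SET AUTO_SHRINK OFF;
[/code]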
Capacity planning is an integral part of being a DBA; if possible, size your database and log files to what you think they could possibly grow to.
To keep the physical size of the log file small, schedule more frequent log backups; some people run log backups anywhere from every 5 minutes to every 15 minutes.
If you have any more questions, let us know.
--------------------------------------------------------------------------------------
Recommended Articles on How to help us help you and solve commonly asked questions:
Forum Etiquette: How to post data/code on a forum to get the best help by Jeff Moden
Managing Transaction Logs by Gail Shaw
How to post Performance problems by Gail Shaw
Help, my database is corrupt. Now what? by Gail Shaw
October 14, 2009 at 5:44 am
1. Yes, I believe it is more or less real time.
2. Yeah I would. Try to manage the log file size by doing sensible backups and, if you have to, shrink the file post-backup. You can shrink it to a specified file size, to reduce the need for growing the log file.
3. If your database is new and empty, you'll need to work out roughly how much data you will be logging and take it from there. If, however, you already have your db, the initial file size should be your DB size plus, say, 10%-15% (see the query below).
The only real reason for leaving the file bigger than necessary is to reduce the need for database growth, which puts overhead on the server.
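A rough way to check the current size and free space per file for an existing database (run in that database; names are illustrative):

[code]
-- size is in 8KB pages, so /128 converts to MB:
SELECT name,
       size / 128 AS size_mb,
       (size - FILEPROPERTY(name, 'SpaceUsed')) / 128 AS free_mb
FROM sys.database_files;

-- If you do have to shrink post-backup, shrink to a sensible target
-- (1024 MB here) rather than to nothing:
DBCC SHRINKFILE (MyDB_Log, 1024);
[/code]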
_____________________________________________________________________________
MCITP: Business Intelligence Developer (2005)
October 14, 2009 at 6:51 am
Thanks for the advice - really useful stuff.
October 14, 2009 at 8:10 am
If you have not read this article by Gail Shaw, I suggest you do so.
http://www.sqlservercentral.com/articles/64582/
and by Paul Randal
Randal's blog post also contains links to several additional articles in TechNet Magazine which are equally informative.
Another good read replete with examples is:
October 14, 2009 at 9:57 am
WillC9999 (10/14/2009)
I (think) I understand the concept of the transaction log. All the insert/update/delete statements stack up in the .trn file. At some point these are written to the .mdf and the matching items in the .trn are marked as 'done'.
Not exactly.
When a data modification occurs (insert, update, delete or any DDL), the change is made to the data pages in memory. The change is then recorded in the log buffer (also in memory). At the point that the transaction commits, the contents of the log buffer are written to the ldf file. This is SQL's log-before-data rule (write-ahead logging). The changes must be hardened (on disk) in the transaction log file before the transaction is considered complete.
At this point the transaction has committed, the data changes have been done in memory and the record of those changes is in the actual transaction log file on disk.
Sometime later, one of two processes runs to write the dirty (modified) data pages to disk (see the blog post of mine that bitbucket recommended).
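As a commented walk-through of that sequence (dbo.Accounts is a hypothetical table):

[code]
BEGIN TRANSACTION;

    UPDATE dbo.Accounts
    SET Balance = Balance - 100
    WHERE AccountId = 7;
    -- Right now: the data page has changed in memory (buffer pool) and a
    -- log record describing the change sits in the log buffer, also in memory.

COMMIT TRANSACTION;
-- At COMMIT the log buffer is flushed to the .ldf file on disk.
-- The dirty data page is written to the .mdf later, by checkpoint
-- or the lazy writer.
[/code]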
Now, from reading up, I understand that the 'done' transactions don't get deleted, they stay in the .trn marked as 'done'. New transactions build up. This process continues until the end of time, or, your hard disk explodes.
Depends on the recovery model.
In simple recovery, when the checkpoint process runs (writing all dirty data pages to disk), the inactive log records are not needed any longer. That's because the changes have been written into the data file; if the server reboots suddenly at this point, those log records will not be needed for recovering the database. In simple recovery a checkpoint operation 'truncates' the log: it marks the space used by the inactive log records as reusable (this is a simplification, but it's sufficient for now).
In bulk-logged or full recovery model, SQL will retain even inactive log records. This is because when you put a DB into full/bulk-logged recovery you're saying that you want the log records to be retained so that you can back them up, for the purpose of being able to restore the DB to the point of failure.
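To see the difference (MyDB and the paths are made up):

[code]
-- Simple recovery: a checkpoint truncates the log.
ALTER DATABASE MyDB SET RECOVERY SIMPLE;
USE MyDB;
CHECKPOINT;
DBCC SQLPERF (LOGSPACE);   -- log percent-used should drop

-- Full recovery: only a log backup truncates, and the log chain starts
-- after the first full backup.
ALTER DATABASE MyDB SET RECOVERY FULL;
BACKUP DATABASE MyDB TO DISK = N'D:\Backups\MyDB_full.bak';
BACKUP LOG MyDB TO DISK = N'D:\Backups\MyDB_log.trn';
[/code]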
Truncation does not remove the space used by the 'done' rows though. To do this you need to 'shrink' the log. But shrinking seems universally despised as it leads to fragmentation. Which is where my interest in this started...
Shrinking a data file causes fragmentation of the indexes; shrinking a log file does not. Repeated shrinks and grows can result in fragmentation at the file level. However, the main reason for not shrinking the log is that it will just have to grow again, and when the transaction log grows, the changes occurring (insert/update/delete) have to wait until the grow is complete.
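A sketch of how you'd check and tame the growth settings (names and sizes are illustrative):

[code]
-- Current size and growth increment per file, in the current database:
SELECT name, size / 128 AS size_mb, growth, is_percent_growth
FROM sys.database_files;

-- A fixed increment makes growths rarer and more predictable than
-- the default 10%:
ALTER DATABASE MyDB MODIFY FILE (NAME = MyDB_Log, FILEGROWTH = 512MB);
[/code]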
I am trying to knock up a sensible SQL backup/restore/maintenance plan that minimises fragmentation and its associated performance hit.
Please read through this - Managing Transaction Logs
Also note that you *need* jobs to rebuild/reorg the indexes and you need to run regular integrity checks. These are in addition to whatever backups you have.
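For example (table and database names are made up; in practice these run as scheduled SQL Agent jobs):

[code]
-- Rebuild the indexes on a table (or REORGANIZE for light fragmentation):
ALTER INDEX ALL ON dbo.SomeBigTable REBUILD;

-- Regular integrity check:
DBCC CHECKDB (MyDB) WITH NO_INFOMSGS;
[/code]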
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability