August 13, 2009 at 2:12 pm
Greetings all 🙂
I have a Data Warehouse that is loaded once a night. I run a backup before and differential backup after the load. Seeing as there is no need to log transactions throughout the day, I have set this to a simple recovery mode since I do not need a point in time recovery option.
I do know from reading many articles and books online, that certain actions still are logged. One being an insert, delete, or update on a table with a Clustered Index.
I have a process that throws about 8GB of data into the transaction log because of this, and for the life of me I could have sworn I saw a command to prevent a script from logging to the transaction log.
Am I crazy or is there a command for this? Or should I live with truncating the transaction log after the transaction?
Link to my blog http://notyelf.com/
August 13, 2009 at 4:11 pm
You're crazy :hehe:
Every transaction gets logged in the transaction log, regardless of the recovery model. The difference is just in the details of how the transactions are dealt with after the fact.
In simple recovery model, once the transaction has been committed and a checkpoint has occurred - the space in the transaction log is marked as reusable.
In full recovery model - the space is not marked as reusable until a transaction log backup has occurred.
In bulk-logged recovery model, certain transactions only record the page allocations/deallocations instead of the full transaction. The space is not marked as reusable until a backup is performed - which backs up all of the data that was changed and not just the pages.
The problem with changing recovery models this way is that you are required to perform a full/differential backup after changing from simple back to full. If you don't, then your transaction log backups will fail. I'd recommend switching to bulk-logged instead, which does not require a backup after switching back to full. But, be aware that reorganizing an index is not a bulk-logged operation and only index rebuilds are.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
August 13, 2009 at 5:42 pm
Awesome thanks Jeffrey. And I guess since you are a SSC Crazy you would know :).
I do admit I need to research more on recovery models, back ups and most importantly the transaction log, 'cause honestly that one still confuses me a bit 😀
Link to my blog http://notyelf.com/
August 13, 2009 at 6:56 pm
Here's a great article from MVP Gail Shaw: http://www.sqlservercentral.com/articles/64582/
One thing you can think about is changing your process to commit at smaller intervals. If you can do that, the log will not grow as much.
August 13, 2009 at 7:03 pm
[font="Verdana"]How are you loading up the data into your data warehouse ETL? Are you using SSIS, bcp, bulk insert? If so, those options are likely to be minimally logged anyway.
Where you will likely find most of your transaction logging is within any SQL ETL. If you use SSIS for all of your ETL (pros and cons there), then this will reduce logging, but probably at the expense of complexity and maintainability.
If you have any long running ETL processes, you can look at batching up the changes (so do them in batches, rather than all in one go). That will reduce the logging footprint.
I'd need to know more about how you have your DW refresh structured before I could make any additional suggestions. Is there a reason why the transaction logging is actually an issue?
[/font]
August 13, 2009 at 10:25 pm
shannonjk (8/13/2009)
Awesome thanks Jeffrey. And I guess since you are a SSC Crazy you would know :).I do admit I need to research more on recovery models, back ups and most importantly the transaction log, 'cause honestly that one still confuses me a bit 😀
Hah, just cause I'm crazy doesn't mean I know anything :w00t:
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
August 18, 2009 at 12:24 pm
Sorry for the slow response Bruce, been a busy week!
I use an SSIS ETL process. We have a staging database that gets wiped and reloaded each night from an as400 source. From there the data is evaluated and loaded into the DW (update records, new records, normal stuff you get the idea).
The transactions logging isn't an issue per se, but why create data that is unnecessary? There are no transactions after this until the data warehouse is loaded again the next day. So there is no need for a point in time recovery. My mindset is efficiently executing things, and part of that would be not expanding a transaction log that in my mind is unnecessary for this specific solution.
Link to my blog http://notyelf.com/
August 18, 2009 at 12:36 pm
The transaction log in this kind of situation is being used as a data safeguard. It helps prevent the database from becoming corrupted by partial writes and things of that nature.
Read up on what it does, it'll make sense. It's not just about point-in-time recovery, there's a LOT more to it.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 18, 2009 at 12:44 pm
Awesome thank you for the info! I definitely need to read up on this when I get the chance.
Link to my blog http://notyelf.com/
August 18, 2009 at 2:37 pm
Look into what "ACID" means with regards to databases and transactions. That'll point you in the right direction on this.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 18, 2009 at 3:07 pm
shannonjk (8/18/2009)
There are no transactions after this until the data warehouse is loaded again the next day. So there is no need for a point in time recovery. My mindset is efficiently executing things, and part of that would be not expanding a transaction log that in my mind is unnecessary for this specific solution.
And what happens if the SQL instance crashes half-way through (power failure, for eg)?
If an unlogged operation was possible and an instance restart happened part way through, the database would have to be marked suspect on restart.
The log's primary function is not point-in-time recovery. If it was, there would be no logging in simple recovery. The log's primary purpose is to ensure that the database is transactionally consistent, no matter what happens.
If you do research ACID, the log is used for all but one of those properties (isolation)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply