March 7, 2011 at 9:17 pm
Comments posted to this topic are about the item Avoiding Logging
March 7, 2011 at 9:32 pm
Great editorial, Steve. You can't make solid, durable furniture (or databases) without logging!
March 7, 2011 at 10:27 pm
HA!
If it wasn't for the Murphys of the world, we wouldn't need a tlog in the first place. But here we are, and now everyone gets to become very familiar with recoverability strategies. You're welcome for the contribution.
I think some folks believe that if the db is in the Simple recovery model, then the tlog is not used. Wrong. The .ldf is still locked by the OS, which shows that it is still in use. And deleting the .ldf with the services stopped is, uh, bad. Why? Because SQL Server needs and uses the tlog, even on dbs that are in Simple. It just doesn't keep the log records LONG TERM: once the next checkpoint occurs, the inactive part of the log can be reused.
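A rough way to see this for yourself, assuming a hypothetical scratch database named StagingDB that is already in the Simple recovery model (the table name dbo.LogDemo is also made up for the illustration). This is only a sketch; the exact percentages depend on your log size and on what else is running.

```sql
USE StagingDB;  -- hypothetical database in SIMPLE recovery
GO

-- Note the "Log Space Used (%)" value for StagingDB before the test.
DBCC SQLPERF(LOGSPACE);
GO

-- Generate some fully logged work: a plain INSERT into a new heap.
CREATE TABLE dbo.LogDemo (id int NOT NULL, filler char(200) NOT NULL);

INSERT INTO dbo.LogDemo (id, filler)
SELECT TOP (50000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 'x'
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;
GO

-- Log usage should have gone up, even though the database is in SIMPLE.
DBCC SQLPERF(LOGSPACE);
GO

-- After a checkpoint the inactive log can be reused, so usage typically drops.
CHECKPOINT;
DBCC SQLPERF(LOGSPACE);
GO

DROP TABLE dbo.LogDemo;
```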
I'm changing my name to Jones. Just trying to keep up with Steve.
Jim
Jim Murphy
http://www.sqlwatchmen.com
@SQLMurph
March 8, 2011 at 1:38 am
Have to agree Steve. Here in Christchurch, thanks to the earthquake, I'm becoming intimately familiar with the other side of disaster recovery. Fortunately without too much drama, and very thankful it's so robust!
Tim Elley
Christchurch, New Zealand
March 8, 2011 at 2:02 am
In the somewhat special case of ETL, I'd sometimes like to turn off logging.
If a data load fails, the destination table can be truncated and the load can start over again, so you would not have to worry about inconsistency.
I'm only talking about the import layer here (the E of ETL). If updates are performed on datasets in the database, I would very much like logging, as I would like to go back to a previous state if necessary. But for imports, nah, I don't need logging 🙂
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
March 8, 2011 at 6:26 am
Koen Verbeeck (3/8/2011)
In the somewhat special case of ETL, I'd sometimes like to turn off logging. If a data load fails, the destination table can be truncated and the load can start over again, so you would not have to worry about inconsistency.
I'm only talking about the import layer here (the E of ETL). If updates are performed on datasets in the database, I would very much like logging, as I would like to go back to a previous state if necessary. But for imports, nah, I don't need logging 🙂
Would the Bulk Logged recovery model accomplish what you want on that?
You can also have a staging database, where you bulk import, et al, kept in Simple recovery, and just leave it out of the backup and maintenance plans. The log will grow to accommodate your imports, but it's simpler and less critical than a "real" database. If needed/wanted, keep that one on a cheap RAID 0 array. If it crashes and burns, replace the disks and re-run the create script from source control, and don't worry about recovery. Just make sure it's set up so that you don't lose anything that matters if you lose the whole database.
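A minimal sketch of both options, assuming hypothetical database names (SalesDB for the "real" database, StagingDB for the staging one) and a made-up backup path. Adjust to your own environment.

```sql
-- Hypothetical database names and backup path.

-- Option 1: switch a Full-recovery database to Bulk Logged around the load,
-- then switch back and take a log backup afterwards.
ALTER DATABASE SalesDB SET RECOVERY BULK_LOGGED;
-- ... run the bulk import (bcp, BULK INSERT, SELECT INTO) here ...
ALTER DATABASE SalesDB SET RECOVERY FULL;
BACKUP LOG SalesDB TO DISK = N'X:\Backups\SalesDB_after_load.trn';

-- Option 2: a separate staging database kept in Simple recovery.
ALTER DATABASE StagingDB SET RECOVERY SIMPLE;

-- Check which recovery model each database is actually using.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name IN (N'SalesDB', N'StagingDB');
```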
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
March 8, 2011 at 6:41 am
Actually, it is possible to do this in Oracle. I had a nice discussion about it with one of the Oracle gurus where I work. He would never recommend using this capability on an online database, but would use it in a tightly controlled batch process where online activity is prevented from accessing the database. He would also take precautions, including ensuring that there was a backup prior to and after the batch process.
Perhaps it is because this can be done in Oracle that people think SQL Server has a similar capability.
March 8, 2011 at 6:47 am
GSquared (3/8/2011)
Koen Verbeeck (3/8/2011)
In the somewhat special case of ETL, I'd sometimes like to turn off logging. If a data load fails, the destination table can be truncated and the load can start over again, so you would not have to worry about inconsistency.
I'm only talking about the import layer here (the E of ETL). If updates are performed on datasets in the database, I would very much like logging, as I would like to go back to a previous state if necessary. But for imports, nah, I don't need logging 🙂
Would the Bulk Logged recovery model accomplish what you want on that?
You can also have a staging database, where you bulk import, et al, kept in Simple recovery, and just leave it out of the backup and maintenance plans. The log will grow to accommodate your imports, but it's simpler and less critical than a "real" database. If needed/wanted, keep that one on a cheap RAID 0 array. If it crashes and burns, replace the disks and re-run the create script from source control, and don't worry about recovery. Just make sure it's set up so that you don't lose anything that matters if you lose the whole database.
Bulk Logged recovery model certainly is an option. So is the Simple recovery model.
My point is that when the ETL tightly controls the batch process and the destination database is only used as a "dump" for the data (aka a volatile staging area, where destination tables are cleared before the import process), all the logging is just extra overhead interfering with (BULK) INSERT performance. For the same reason it is also recommended not to have constraints (be it foreign keys or check constraints) and to minimize indexing (you can even drop the indexes and recreate them after the import process).
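As a sketch of that kind of load, assuming a hypothetical staging table dbo.StagingOrders with one nonclustered index and a made-up file path (none of these names come from a real system):

```sql
-- Hypothetical names: dbo.StagingOrders, IX_StagingOrders_CustomerId,
-- and the file path are placeholders.

-- 1. Clear the volatile staging table.
TRUNCATE TABLE dbo.StagingOrders;

-- 2. Disable the nonclustered index so the load does not have to maintain it.
ALTER INDEX IX_StagingOrders_CustomerId ON dbo.StagingOrders DISABLE;

-- 3. Bulk load. TABLOCK is one of the prerequisites for minimal logging
--    under the Simple or Bulk Logged recovery model.
BULK INSERT dbo.StagingOrders
FROM 'C:\etl\orders.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

-- 4. Rebuild the index once the data is in.
ALTER INDEX IX_StagingOrders_CustomerId ON dbo.StagingOrders REBUILD;
```

Even here the load is minimally logged rather than unlogged; allocations are still written to the log.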
If it is a non-volatile staging area, then a backup before the import process is sufficient, as Lynn already mentioned.
But maybe I'm preaching to the choir 🙂
March 8, 2011 at 7:26 am
Earlier versions of SQL Server did support No Logging.
Bulk Copy was the reason. In those days BCP was non-transacted, which gave higher performance. For it to be non-transactional you had to set your recovery mode to simple (there was no real Recovery Mode back then, but that is the closest thing we have today). In the oldest versions (4.21, say) a bulk copy of this kind using the BCP.EXE command didn't even update indexes.
After doing a BCP you had to update statistics, rebuild indexes and backup your database. Basically, it was the fastest way to get staging data into SQL Server. It was a pain. IT WAS FAST!
As you say, this mode is not supported with SQL Server today, but the idea is not a fantasy... it used to exist. Why doesn't it exist today? I don't know or care. Frankly, with today's hardware I see reasonable performance with logging, so why turn it off?
A second method of not using the transaction log was:
SELECT ... INTO #SomeTempTable FROM ...
This query did not use the transaction log in any of the databases including TempDB. There were problems with this technique fixed in version 7 where the SYSOBJECTS table in Tempdb was locked until the select statement completed (A potential disaster). So the technique was not often used.
In SQL Server 2005 and later this technique still performs faster than creating a temp table and inserting data into it with a select statement. However, I know there is logging involved, because I use it inside transactions with save points. I don't know how much is logged or why the performance is better, but I do have performance history demonstrating that the technique is still valid.
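To make the comparison concrete, here is a small sketch of the two forms, plus a rollback that shows SELECT ... INTO is still transactional. The temp table names are made up, and sys.objects is used only as a convenient source of rows.

```sql
-- Form 1: SELECT ... INTO creates and fills the temp table in one statement.
SELECT o.object_id, o.name
INTO #ObjectsA
FROM sys.objects AS o;

-- Form 2: create the temp table first, then INSERT ... SELECT into it.
CREATE TABLE #ObjectsB (object_id int NOT NULL, name sysname NOT NULL);

INSERT INTO #ObjectsB (object_id, name)
SELECT o.object_id, o.name
FROM sys.objects AS o;

-- SELECT ... INTO inside a transaction can be rolled back,
-- which would not be possible if nothing were logged.
BEGIN TRANSACTION;

SELECT o.object_id
INTO #ObjectsC
FROM sys.objects AS o;

ROLLBACK TRANSACTION;

-- The rollback also undoes the creation of #ObjectsC.
SELECT OBJECT_ID('tempdb..#ObjectsC') AS ObjectsC_id;

DROP TABLE #ObjectsA;
DROP TABLE #ObjectsB;
```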
There were lots of things we did, or had to do, in the old days that are no longer relevant, or where the methods have changed. That doesn't mean they were never true. For example, MS best practices used to teach us to use a non-sequential column with low data distribution for a clustered index. It was on their SQL Server tests for certification. Today the best practice is the exact opposite. MS recommends a sequential value as a clustered index, even if it is not the primary key.
Thanks for reminding me how old I am, and how long I have been working with SQL Server. 🙂
Ben
March 8, 2011 at 7:44 am
taylor_benjamin (3/8/2011)
Thanks for reminding me how old I am, and how long I have been working with SQL Server. 🙂
There was a 4.21 version??? 😛
March 8, 2011 at 7:49 am
Koen Verbeeck (3/8/2011)
If a data load fails, the destination table can be truncated and the load can start over again, so you would not have to worry about inconsistency.
What would happen if the server failed at the point SQL was modifying the allocation constructs or system tables in the database and there was no logging? Not so easy to fix.
Logging is not just for the user data. It's for page allocations, allocation page modifications, system table modifications and a whole lot more.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 8, 2011 at 8:14 am
Transaction logging usually isn't an issue unless you're inserting 100,000+ records one.. at.. a.. time.. in a loop, or the DBA allows the transaction log to grow until it fills up available disk space. It's rarely an issue during the normal operation of an OLTP database. You can mitigate the negative performance effects of logging by placing the transaction log files on a separate drive system, using the BULK COPY utility for your batch loads, and perhaps the Bulk Logged recovery model on occasion.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
March 8, 2011 at 8:43 am
GilaMonster (3/8/2011)
Koen Verbeeck (3/8/2011)
If a data load fails, the destination table can be truncated and the load can start over again, so you would not have to worry about inconsistency.
What would happen if the server failed at the point SQL was modifying the allocation constructs or system tables in the database and there was no logging? Not so easy to fix.
Logging is not just for the user data. It's for page allocations, allocation page modifications, system table modifications and a whole lot more.
I had this argument with someone recently about loads. They were assuming that the load would fail and they'd restart it, but you need logic to allow restarts. Not everyone builds this into their processes and even if you don't have restart logic, the cleanup of old data needs to be transactional. So have you saved anything if we allowed imports w/o logging? Not sure, and honestly, not sure the vast majority of people are qualified to decide this.
Even if you are, is that the best decision for the company? The next person that does your job might not understand this type of feature and use it in other places.
If you don't need transactional logging during imports, then commit periodically, inserting batches of 10,000 or so, and running some log backups. If you're that busy and you have that much data, then I assume you ought to have some money to put log backups on separate spindles, and the t-log on separate spindles, and get better performance.
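A minimal sketch of that kind of batched load, assuming hypothetical tables dbo.ImportSource and dbo.ImportTarget that share a positive, ever-increasing int key named id, and assuming the target is only filled by this process. Picking up from the highest id already loaded is one simple form of restart logic.

```sql
-- Hypothetical tables: dbo.ImportSource (rows to load) and dbo.ImportTarget
-- (destination), both with an int key column named id and a payload column.
DECLARE @BatchSize int;
DECLARE @LastId int;
DECLARE @Rows int;

SET @BatchSize = 10000;
SET @Rows = 1;

-- Restart point: continue after the highest id already in the target.
SELECT @LastId = ISNULL(MAX(id), 0) FROM dbo.ImportTarget;

WHILE @Rows > 0
BEGIN
    BEGIN TRANSACTION;

    -- Copy the next batch in key order.
    INSERT INTO dbo.ImportTarget (id, payload)
    SELECT TOP (@BatchSize) s.id, s.payload
    FROM dbo.ImportSource AS s
    WHERE s.id > @LastId
    ORDER BY s.id;

    SET @Rows = @@ROWCOUNT;

    -- Advance the restart point to the last row just loaded.
    IF @Rows > 0
        SELECT @LastId = MAX(id) FROM dbo.ImportTarget;

    -- Each batch commits on its own, so the log space it used
    -- can be reclaimed by the next log backup or checkpoint.
    COMMIT TRANSACTION;
END;
```

The log backups themselves would run from a separate scheduled job; BACKUP LOG cannot be issued inside the open transaction.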
Ultimately, I don't think it's worth the risk to relational data to allow this. If the data isn't that important, or can be reloaded easily, perhaps it would be better to put this into some other structures. Maybe a NoSQL or columnar construct.
March 8, 2011 at 8:44 am
Koen Verbeeck (3/8/2011)
taylor_benjamin (3/8/2011)
Thanks for reminding me how old I am, and how long I have been working with SQL Server. 🙂
There was a 4.21 version??? 😛
There was a version that didn't run on Windows 😉
March 8, 2011 at 9:00 am
Wow! That really goes back. Yes, the first MS version of SQL Server was to take Sybase and port it to run on OS/2.
I am not sure, but as I recall, version 4.21 was the first version to run under the Windows OS (Windows NT Advanced Server was the branding, as I recall).
Version 4.21a was the first version of SQL Server to break away from Sybase as a purely MS code base.