May 2, 2008 at 7:48 am
You can see this yourself.
Create a new database and set the recovery model to full.
Add a single table with one INTEGER field.
Insert a record into your table.
Run: DBCC LOG('<YourDatabaseName>', 3)
You should see that the transaction log records contain information like the LSN for the backups, the operation, the transaction ID, record length information, allocation information, page information, lock counts, etc. That is all in addition to the actual data being logged. So the log entry for a transaction not only includes the data; it includes a great deal of descriptive information as well.
Now, if your table is tiny - like just an integer field - it is pretty easy for the descriptive information to take up more space than the data involved in the transaction.
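Here is a minimal sketch of that walkthrough. The database, table, and column names are placeholders of my own, not anything from the original post:
create database LogDemo
go
alter database LogDemo set recovery full
go
use LogDemo
go
create table dbo.Demo (Val int)
go
insert into dbo.Demo (Val) values (1)
go
-- level 3 output includes the descriptive columns mentioned above (LSN, operation, transaction id, lengths, locks, ...)
DBCC LOG('LogDemo', 3)
go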
May 2, 2008 at 8:22 am
You are missing that the OP is inserting 10M rows in a single transaction. I do agree that if you did a series of single-row inserts, the overhead stuff would be significant.
create database test
go
use test
go
DBCC LOG('test',3) --71 rows here now, just for empty database
go
create table test (a int)
go
DBCC LOG('test',3) --115 rows now, so 44 for the create table action
go
insert test
select id from syscolumns --420 rows inserted
go
DBCC LOG('test',3)
--566 rows in log now, 420 of which are the actual row inserts,
--leaving only 31 rows of 'overhead' associated with the insert statement itself
--thus as the number of rows inserted goes up, the ratio of transaction 'overhead'
--to rows inserted approaches zero ((31 + allocation logging) / 10M rows of actual inserts)
go
use master
go
drop database test
go
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
May 2, 2008 at 8:34 am
Yes, the number of rows is not the same. You get some additional log entries in the transaction log to go along with the inserts. That is not the issue, though.
Pretend the transaction log is a table for a second. Your table has a single field in it (an integer). The transaction log "table" has a variable length field in it to hold your data plus 10 other integer fields to describe what is happening (it's not 10 and they are not integers, but that makes the math easy). Your table requires 4 bytes per row. The log table would require your 4 bytes per row plus the other 40 bytes.
The ratio is not exactly like this, but one of the things logged with each piece of data is a transaction identifier that ties the record to its transaction. In the case of a table with a single integer field, the transaction identifier alone doubles the amount of data to be logged.
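One way to see the per-record overhead directly is the undocumented fn_dblog function. This is only a sketch: the column names ([Operation], [Transaction ID], [Log Record Length]) and the exact behavior are undocumented and may vary by version, so treat them as assumptions rather than a guaranteed interface.
-- run this in the database being tested; fn_dblog is undocumented, column names assumed
-- each LOP_INSERT_ROWS record carries a transaction ID, length info, etc.,
-- on top of the 4-byte integer actually being inserted
select [Current LSN], [Operation], [Transaction ID], [Log Record Length]
from ::fn_dblog(null, null)
where [Operation] = 'LOP_INSERT_ROWS'
go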
May 2, 2008 at 8:43 am
Treading lightly in the company of so many gurus....
Have you considered doing a BULK INSERT and/or checking what the fill factors are? If the files are set up without a lot of free space, you could have rippling effects from page splits, etc. Insert enough records to split a page... then insert more, which causes both of those to split again... and so on.
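For what it's worth, a rough sketch of both checks. The table name, file path, and options are hypothetical, and a minimally logged bulk load also depends on the recovery model (bulk-logged or simple) and hints like TABLOCK:
-- check the fill factor on the table's indexes (hypothetical table name)
select name, fill_factor
from sys.indexes
where object_id = object_id('dbo.BigTable')
go
-- bulk load instead of one giant INSERT; batches keep individual transactions smaller
bulk insert dbo.BigTable
from 'C:\load\bigtable.dat'
with (tablock, batchsize = 100000)
go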
Toni
*** corrected grammar ***
May 2, 2008 at 8:46 am
never tread lightly....speak your mind.
May 2, 2008 at 8:52 am
Michael Earl (5/2/2008)
never tread lightly....speak your mind.
Definitely agree with this!!
Didn't the OP state it was a heap table? In that case fill factor isn't applicable. It would be interesting to see how much extra logging a clustered index with a fill factor of 1 would take due to page/extent splits and allocations. It would certainly create a bloated database though! 🙂
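If anyone wants to try that comparison, here is a rough sketch. It reuses the test table from the script above (keep the database around instead of dropping it), the index name is made up, and remember that fill factor only takes effect when the index is built or rebuilt:
-- hypothetical index name; fill factor 1 leaves pages almost empty at build time
create clustered index cix_test on test (a) with (fillfactor = 1)
go
insert test
select id from syscolumns
go
-- compare the record count against the heap numbers above
DBCC LOG('test', 3)
go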
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
May 5, 2008 at 6:16 pm
While facing a similar problem earlier, I had given up because I did not have much time to spend on it. Now I think I have the answer to this problem.
Thanks to all the gurus here.