May 2, 2008 at 7:48 am
You can see this yourself.
Create a new database and set the recovery model to full.
Add a single table with one INTEGER field.
Insert a record into your table.
Run: DBCC LOG('<YourDatabaseName>', 3)
You should see that the transaction log records contain information like the LSN for the backups, the operation, the transaction ID, record length information, allocation information, page information, lock counts, etc. That is all in addition to the actual data being logged. So the log entry for a transaction not only includes the data; it includes a great deal of descriptive information as well.
Now, if your table is tiny - like just an integer field - it is pretty easy for the descriptive information to take up more space than the data involved in the transaction.
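Here is a minimal sketch of that walkthrough. The database, table, and column names are placeholders of my own, not anything from the original post:
create database LogDemo
go
alter database LogDemo set recovery full
go
use LogDemo
go
create table dbo.Demo (Val int)
go
insert into dbo.Demo (Val) values (1)
go
-- level 3 output includes the descriptive columns mentioned above (LSN, operation, transaction id, lengths, locks, ...)
DBCC LOG('LogDemo', 3)
go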
May 2, 2008 at 8:22 am
You are missing that the OP is inserting 10M rows in a single transaction. I do agree that if you did a series of single-row inserts, the overhead stuff would be significant.
create database test
go
use test
go
DBCC LOG('test',3) --71 rows here now, just for empty database
go
create table test (a int)
go
DBCC LOG('test',3) --115 rows now, so 44 for the create table action
go
insert test
select id from syscolumns --420 rows inserted
go
DBCC LOG('test',3)
--566 rows in log now, 420 of which are the actual row inserts,
--leaving only 31 rows of 'overhead' associated with the insert statement itself
--thus as the number of rows inserted goes up, the ratio of transaction 'overhead'
--to rows inserted approaches zero ((31 + allocation logging) / 10M rows of actual inserts)
go
use master
go
drop database test
go
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
May 2, 2008 at 8:34 am
Yes, the number of rows is not the same. You get some additional log entries in the transaction log to go along with the inserts. That is not the issue, though.
Pretend the transaction log is a table for a second. Your table has a single field in it (an integer). The transaction log "table" has a variable length field in it to hold your data plus 10 other integer fields to describe what is happening (it's not 10 and they are not integers, but that makes the math easy). Your table requires 4 bytes per row. The log table would require your 4 bytes per row plus the other 40 bytes.
The ratio is not exactly like this, but one of the things logged with each piece of data is a transaction identifier that ties the record to its transaction. In the case of a table with a single integer field, the transaction identifier alone doubles the amount of data to be logged.
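One way to see the per-record overhead directly is the undocumented fn_dblog function. This is only a sketch: the column names ([Operation], [Transaction ID], [Log Record Length]) and the exact behavior are undocumented and may vary by version, so treat them as assumptions rather than a guaranteed interface.
-- run this in the database being tested; fn_dblog is undocumented, column names assumed
-- each LOP_INSERT_ROWS record carries a transaction ID, length info, etc.,
-- on top of the 4-byte integer actually being inserted
select [Current LSN], [Operation], [Transaction ID], [Log Record Length]
from ::fn_dblog(null, null)
where [Operation] = 'LOP_INSERT_ROWS'
go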
May 2, 2008 at 8:43 am
Treading lightly in the company of so many gurus....
Have you considered doing a BULK INSERT and/or checking what the fill factors are? If the files are set up without a lot of free space, you could have rippling effects from page splits, etc. Insert enough records to split a page... then insert more, which causes both of those to split again... and so on.
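For what it's worth, a rough sketch of both checks. The table name, file path, and options are hypothetical, and a minimally logged bulk load also depends on the recovery model (bulk-logged or simple) and hints like TABLOCK:
-- check the fill factor on the table's indexes (hypothetical table name)
select name, fill_factor
from sys.indexes
where object_id = object_id('dbo.BigTable')
go
-- bulk load instead of one giant INSERT; batches keep individual transactions smaller
bulk insert dbo.BigTable
from 'C:\load\bigtable.dat'
with (tablock, batchsize = 100000)
go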
Toni
*** corrected grammar ***
May 2, 2008 at 8:46 am
never tread lightly....speak your mind.
May 2, 2008 at 8:52 am
Michael Earl (5/2/2008)
never tread lightly....speak your mind.
Definitely agree with this!!
Didn't the OP state it was a heap table? In that case fill factor isn't applicable. It would be interesting to see how much extra logging a clustered index with a fill factor of 1 would take due to page/extent splits and allocations. It would certainly create a bloated database though! 🙂
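If anyone wants to try that comparison, here is a rough sketch. It reuses the test table from the script above (keep the database around instead of dropping it), the index name is made up, and remember that fill factor only takes effect when the index is built or rebuilt:
-- hypothetical index name; fill factor 1 leaves pages almost empty at build time
create clustered index cix_test on test (a) with (fillfactor = 1)
go
insert test
select id from syscolumns
go
-- compare the record count against the heap numbers above
DBCC LOG('test', 3)
go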
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
May 5, 2008 at 6:16 pm
While facing a similar problem earlier, I had given up because I did not have much time to spend on it. Now I think I have the answer to this problem.
Thanks to all the gurus here.