It's definitely a SQL Server bug (table with 80 million rows)

  • I faced a strange issue and debugged it down to the lowest level and simplified it.... here are the details...

    One table with only one column and 80 million rows: try to insert that data into another table with the same single column (not the SELECT INTO method) and the log file grows to 17 times the size of the data...

    insert into pullVendor

    select id from tvendor

    facts:

    pullVendor is an empty table and has an id column with the same data type as tvendor (numeric(19,0))

    database is in the simple recovery model

    SQL 2005 Standard Edition, 64-bit, SP2, 4 GB RAM, quad core (I tried the same thing on SQL Server 2000 as well; same logging behavior)

    total reserved data size of the tvendor table is 1.2 GB

    SELECT INTO works just fine, no issues; INSERT into another table grows the log file to 20 GB+

    so overall it looks like SQL Server behaves strangely while handling 80 million rows; no logic suggests a simple insert of 1.2 GB of data should create a 20+ GB log

    Tried and tested with many options and settings, and also verified the database is not corrupt

    This is the first time I am handling this number of rows, so I just want to check whether anybody thinks this is not a bug and whether something can be done to fix this logging issue...
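
    For what it's worth, a quick way to watch the numbers while the statement runs (standard commands, shown here as a side note rather than part of my original test):

    -- current log size and percent used for every database on the instance
    DBCC SQLPERF(LOGSPACE);

    -- reserved/used space for the two tables involved
    EXEC sp_spaceused 'tvendor';
    EXEC sp_spaceused 'pullVendor';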

    Awaiting your responses...

    Thanks....Prakash:)

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA

  • I'm going to throw a guess at this one. You have autogrow on the database and the autogrow process is causing the log file to grow. Try making sure your database file has enough space to add the data and see whether the same issue occurs.
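
    If it does turn out to be file growth, pre-sizing the files is one ALTER DATABASE call per file. A sketch only: the database and logical file names below are placeholders, so check sys.database_files for the real ones.

    -- pre-grow the data and log files so autogrow does not fire mid-insert
    ALTER DATABASE YourDatabaseName
        MODIFY FILE (NAME = YourDatabaseName_Data, SIZE = 40GB);
    ALTER DATABASE YourDatabaseName
        MODIFY FILE (NAME = YourDatabaseName_Log, SIZE = 30GB);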

  • Jack Corbett (4/30/2008)


    I'm going to throw a guess at this one. You have autogrow on the database and the autogrow process is causing the log file to grow. Try making sure your database file has enough space to add the data and see whether the same issue occurs.

    This is already done; with the log file pre-sized to 30 GB it works fine and completes in 24 minutes. Autogrow is on, but it was not used during this query execution.

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA

  • Are there any indexes on the created table? Select Into will not create indexes.

    Is the behavior the same if you do a subset of rows?

  • Are there indexes on the table being inserted into ?

    Are there triggers on the table being inserted into ?

    Are there any indexed (materialised) views that reference the table being inserted into ?
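
    A quick way to check the first two (a sketch using the SQL 2005 catalog views, with the table name from this thread):

    -- indexes (a heap shows up as a single row with type_desc = 'HEAP')
    SELECT name, type_desc
    FROM sys.indexes
    WHERE object_id = OBJECT_ID('pullVendor');

    -- triggers on the table
    SELECT name
    FROM sys.triggers
    WHERE parent_id = OBJECT_ID('pullVendor');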

  • PW (4/30/2008)


    Are there indexes on the table being inserted into ?

    Are there triggers on the table being inserted into ?

    Are there any indexed (materialised) views that reference the table being inserted into ?

    The table being inserted into does not have any indexes; the heap method is preferred. The source table does have an index on the id column, though.

    No triggers, nothing fancy like materialised views. As I mentioned, I debugged it down to the lowest level, which means: create a table with one bigint column, insert 80 million rows, then try inserting the data from that table into another table and you should see this issue (a rough repro sketch follows below)...
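
    For anyone who wants to try it, here is a repro sketch. The table names and the row generator are illustrative and the row count is scaled down; add another CROSS JOIN or use a tally table to get towards 80 million rows.

    CREATE TABLE srcVendor (id BIGINT NOT NULL);
    CREATE TABLE destVendor (id BIGINT NOT NULL);

    -- fill the source with sequential ids; sys.all_objects is just a handy row source
    INSERT INTO srcVendor (id)
    SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_objects a CROSS JOIN sys.all_objects b;

    -- the statement whose log usage is in question (fully logged, row by row, into a heap)
    INSERT INTO destVendor (id)
    SELECT id FROM srcVendor;

    -- for comparison, SELECT INTO can be minimally logged under the simple recovery model
    SELECT id INTO destVendor2 FROM srcVendor;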

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA

  • You have a table that has a single integer field and it is a heap.

    Every time a transaction is logged against this table, it has to include both the piece of data, and the information about what is happening. Because your table is so simple, the overhead of the other information far out-weighs the actual data.

    What you are describing is to be expected.
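
    As a rough back-of-the-envelope check (approximate numbers, not measured from this system): 20 GB of log spread over 80 million rows is roughly 21,474,836,480 / 80,000,000, or about 270 bytes of log per row, while the row itself is only about 9 bytes for a numeric(19,0) value. The difference is per-record bookkeeping (log record header, transaction ID, page and slot information, allocation records and so on), which is why a one-column heap shows about the worst data-to-log ratio you can get.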

  • Michael Earl (4/30/2008)


    You have a table that has a single integer field and it is a heap.

    Every time a transaction is logged against this table, it has to include both the piece of data, and the information about what is happening. Because your table is so simple, the overhead of the other information far out-weighs the actual data.

    What you are describing is to be expected.

    1.2 GB of data translates into 20+ GB of log files... that is about 17 times the data. It seems to me you are suggesting it has nothing to do with 80 million rows, but that this is simply how SQL Server works....

    I would beg to differ here, as even a clustered index creation gets by with about 3 times the space in the data and log files.....

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA

  • Do you have an application that will allow you to look at the contents of your log file? If you do, look at what is logged. You will find that for each row, the data from your table is logged, along with the transaction ID (that alone makes the log entry twice as large as the data), the type of operation, and a bunch of other information that escapes me right now.

  • Michael Earl (4/30/2008)


    Do you have an application that will allow you to look at the contents of your log file? If you do, look at what is logged. You will find that for each row, the data from your table is logged, along with the transaction ID (that alone makes the log entry twice as large as the data), the type of operation, and a bunch of other information that escapes me right now.

    Hi Michael,

    thanks for your input... there are some external utilities that let you look into the transaction log, and I remember there was also a DBCC command which does that to some extent.... to validate the concept you suggested, I am planning to run the same exercise with 10 million rows and see how much the log grows; if it is the same ratio, that proves this is normal for SQL Server.... how does that sound?

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA

  • The command is

    DBCC Log('YourDatabaseName', 1)

    (2nd parameter is level of detail).

    Run:

    Backup Log YourDatabaseName With truncate_only

    Checkpoint

    The DBCC results should show only 2 entries

    Now run your insert statement. It shouldn't take 10 million rows to illustrate what's going on.
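
    If you would rather see totals than scroll through individual records, the undocumented fn_dblog function can be used in the same spirit. A sketch only; being undocumented, its output columns can vary between builds.

    -- total log bytes and record counts by operation for the current database
    SELECT [Operation],
           COUNT(*) AS LogRecords,
           SUM([Log Record Length]) AS TotalLogBytes
    FROM fn_dblog(NULL, NULL)
    GROUP BY [Operation]
    ORDER BY TotalLogBytes DESC;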

  • PW (4/30/2008)


    The command is

    DBCC Log('YourDatabaseName', 1)

    (2nd parameter is level of detail).

    Run:

    Backup Log YourDatabaseName With truncate_only

    Checkpoint

    The DBCC results should show only 2 entries

    Now run your insert statement. It shouldn't take 10 million rows to illustrate what's going on.

    Thanks, I tried these and saw all the rows in the log file as expected....

    So the conclusion seems to be that SQL Server does generate log files 17 times bigger than the data in some cases like this one.... and this is acceptable behavior, as nothing can be done to improve the log file usage...

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA

  • Yeah you can... insert a million rows at a time instead of all 80 million. Log won't grow but a bit on simple recovery.
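
    Something along these lines, for example. This is a sketch only: it assumes tvendor.id is unique and indexed, as mentioned earlier in the thread, and that a simple id watermark fits your data.

    DECLARE @BatchSize INT;
    DECLARE @LastId NUMERIC(19,0);
    DECLARE @NextId NUMERIC(19,0);
    SET @BatchSize = 1000000;
    SET @LastId = -1;   -- assumes ids are >= 0; pick any value below MIN(id)

    WHILE 1 = 1
    BEGIN
        -- upper bound of the next batch of ids
        SELECT @NextId = MAX(id)
        FROM (SELECT TOP (@BatchSize) id
              FROM tvendor
              WHERE id > @LastId
              ORDER BY id) AS batch;

        IF @NextId IS NULL BREAK;   -- nothing left to copy

        INSERT INTO pullVendor (id)
        SELECT id FROM tvendor
        WHERE id > @LastId AND id <= @NextId;

        SET @LastId = @NextId;
        CHECKPOINT;   -- under the simple recovery model this lets the inactive log be reused
    END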

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/30/2008)


    Yeah you can... insert a million rows at a time instead of all 80 million. Log won't grow but a bit on simple recovery.

    Thanks Jeff. I was worried about 1.2 GB of data translating into a 20+ GB log file, and the explanation given by Michael seems to justify why it happens and why it is acceptable behavior from SQL Server....

    Thanks Guys and especially to Michael....

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA

  • Michael Earl (4/30/2008)


    You have a table that has a single integer field and it is a heap.

    Every time a transaction is logged against this table, it has to include both the piece of data, and the information about what is happening. Because your table is so simple, the overhead of the other information far out-weighs the actual data.

    What you are describing is to be expected.

    Sorry Michael, I don't follow what you are saying, especially since this is a single transaction. Can you please elaborate and/or provide some references? TIA.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
