March 12, 2013 at 8:49 pm
Hi,
Which will perform faster/better: inserting a million rows into a table as individual statements, or putting all the inserts into a single transaction and then committing?
Does it make a considerable difference?
Example
insert into ...
insert into ...
insert into ...
insert into ...
insert into ...
or
begin transaction
insert into ...
insert into ...
insert into ...
insert into ...
insert into ...
commit transaction;
Thanks!
March 12, 2013 at 10:44 pm
Single INSERT INTO with UNION ALL in SELECTs will be waaaay faster.
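A minimal sketch of the idea (the target table and values here are made up):
INSERT INTO dbo.Target (Id, Name)
SELECT 1, 'Alpha' UNION ALL
SELECT 2, 'Beta' UNION ALL
SELECT 3, 'Gamma';
-- one statement, one implicit transaction, instead of one round trip per row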
_____________
Code for TallyGenerator
March 13, 2013 at 3:01 am
Sergiy (3/12/2013)
Single INSERT INTO with UNION ALL in SELECTs will be waaaay faster.
UNION ALL a million rows? I don't think so!
What is the source? Where is the data coming from?
BTW, a million separate inserts in one transaction will be faster than without a transaction, provided you have enough space in your transaction log to cover it without growing.
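For example (a sketch; the database and log file names here are assumptions), you can check log headroom and pre-grow the log before the big transaction:
-- How full is each database's log right now?
DBCC SQLPERF(LOGSPACE);
-- Pre-grow the log so the load doesn't stall on autogrow events
ALTER DATABASE MyDb
MODIFY FILE (NAME = MyDb_log, SIZE = 4GB);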
March 13, 2013 at 3:43 am
Actually, building such a query for a million rows will produce about a GB of script.
It will take ages just to parse it.
For million-row uploads there are the BULK INSERT and BCP commands.
Everything else is so bad that there is no point even in discussing which is better.
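For illustration only (the file path, table, and terminators are assumptions), the two commands look like this:
-- T-SQL, reading a flat file visible to the server
BULK INSERT dbo.Target
FROM 'C:\data\target_rows.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, BATCHSIZE = 100000);
-- bcp from the command line (-c = character mode, -T = trusted connection, -b = batch size)
bcp MyDb.dbo.Target in C:\data\target_rows.csv -c -t, -S MyServer -T -b 100000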
_____________
Code for TallyGenerator
March 13, 2013 at 4:01 am
Sergiy (3/13/2013)
Actually, building such a query for a million rows will produce about a GB of script. It will take ages just to parse it.
For million-row uploads there are the BULK INSERT and BCP commands.
Everything else is so bad that there is no point even in discussing which is better.
That is why I am asking OP what is the source.
It may be that BULK INSERT, BCP, or SSIS should be used, or, if the source is just other tables in the same database, then a simple INSERT INTO ... SELECT can be used (or SELECT INTO ...).
There are not enough details provided to make any meaningful suggestion...
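To illustrate the same-database case (table names here are hypothetical):
-- Into an existing table
INSERT INTO dbo.Target (Id, Name)
SELECT Id, Name
FROM dbo.Source;
-- Or create the target on the fly
SELECT Id, Name
INTO dbo.Target2
FROM dbo.Source;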
March 13, 2013 at 4:36 am
From how it looks, it's about scripting DB deployment as part of an installation package.
That's how common tools build those scripts.
Not sure what kind of million-row table could be part of such a package, though...
But anyway, saving the data in files and scripting a BULK INSERT upload would be the most effective approach.
_____________
Code for TallyGenerator
March 13, 2013 at 4:59 am
A million individual insert statements is a horrible approach. Period. Doing it within a single transaction will be painful because you're going to need a huge transaction log to support it. If the log has to grow while you're running the transaction, it will slow things down even more. You need to find a more batch-oriented approach. If you're pulling data from another location, see if you can pass it as a table variable or XML and then write a single insert statement that moves all one million rows. That's the better approach.
But, if you have to do individual inserts (and I'd be curious why), then small sets of transactions, say 10,000 at a time, would probably be better than all one million or one at a time.
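A sketch of that batching pattern, assuming the rows have already been staged in a hypothetical dbo.Staging table:
DECLARE @batch int = 10000, @moved int = 1;
WHILE @moved > 0
BEGIN
    BEGIN TRANSACTION;
    -- Move up to @batch rows per transaction from staging to target
    DELETE TOP (@batch) FROM dbo.Staging
    OUTPUT deleted.Id, deleted.Name
    INTO dbo.Target (Id, Name);
    SET @moved = @@ROWCOUNT;  -- capture before COMMIT resets it
    COMMIT TRANSACTION;
END;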
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 13, 2013 at 6:07 am
I agree with Grant
March 13, 2013 at 4:31 pm
Yeah, hard to argue.
Except, probably, for this one:
Grant Fritchey (3/13/2013)
But, if you have to do individual inserts (and I'd be curious why), then small sets of transactions, say 10,000 at a time, would probably be better than all one million or one at a time.
Upload via an ODBC driver is "one row at a time" because of the nature of ODBC drivers.
And still, a single load always outperforms a set of partial ones (in my experience).
_____________
Code for TallyGenerator
March 13, 2013 at 5:32 pm
Sergiy (3/13/2013)
Yeah, hard to argue. Except, probably, for this one:
Grant Fritchey (3/13/2013)
But, if you have to do individual inserts (and I'd be curious why), then small sets of transactions, say 10,000 at a time, would probably be better than all one million or one at a time.
Upload via an ODBC driver is "one row at a time" because of the nature of ODBC drivers.
And still, a single load always outperforms a set of partial ones (in my experience).
I guess it really depends on how you do the test. I haven't done precise measurements on these for several years. But a few years ago I did a series of tests, not from ODBC but through ADO.NET, and we saw serious performance degradation when using stacked single inserts. But I can't supply the numbers because it was a while back.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 13, 2013 at 7:34 pm
Grant Fritchey (3/13/2013)
But a few years ago I did a series of tests, not from ODBC but through ADO.NET, and we saw serious performance degradation when using stacked single inserts. But I can't supply the numbers because it was a while back.
Were those single inserts wrapped in a single transaction?
Because I was talking about comparing a single transaction for 1 million inserts against 100 transactions of 10,000 inserts each.
Multiple inserts within a single transaction is what happens when you load data from, say, a linked server or OPENROWSET.
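For example (the server name, connection string, and tables are all assumptions), both of these run as a single INSERT, so every row lands in one transaction:
-- Via a linked server
INSERT INTO dbo.Target (Id, Name)
SELECT Id, Name
FROM RemoteSrv.SourceDb.dbo.Source;
-- Via OPENROWSET (ad hoc distributed queries must be enabled)
INSERT INTO dbo.Target (Id, Name)
SELECT Id, Name
FROM OPENROWSET('SQLNCLI',
    'Server=RemoteSrv;Trusted_Connection=yes;',
    'SELECT Id, Name FROM SourceDb.dbo.Source');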
_____________
Code for TallyGenerator
March 14, 2013 at 1:46 am
Thanks for the insights.
I don't really have an implementation for this. My boss just asked me that sort of question, and I was wondering if some of you might have real-life experience.
March 14, 2013 at 4:39 am
Sergiy (3/13/2013)
Grant Fritchey (3/13/2013)
But a few years ago I did a series of tests, not from ODBC but through ADO.NET, and we saw serious performance degradation when using stacked single inserts. But I can't supply the numbers because it was a while back.
Were those single inserts wrapped in a single transaction?
Because I was talking about comparing a single transaction for 1 million inserts against 100 transactions of 10,000 inserts each.
Multiple inserts within a single transaction is what happens when you load data from, say, a linked server or OPENROWSET.
We did a series of tests going from single transactions with single inserts, to single transactions with multiple single inserts, to single transactions with batches of inserts. The performance order was the reverse of that list: batches were best, followed by groups of single inserts within a transaction, followed by single transactions with single inserts.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 14, 2013 at 7:12 am
The log buffer flushes from the individual inserts without a wrapping begin/commit tran will absolutely crush your throughput.
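One way to see that pressure for yourself (a diagnostic sketch, not a fix) is to watch WRITELOG waits while the autocommit inserts run:
-- Each autocommit INSERT forces its own log flush; WRITELOG waits pile up
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'WRITELOG';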
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service