March 12, 2013 at 8:49 pm
Hi,
Which will perform faster/better: inserting a million rows into a table as individual statements, or putting all the inserts into a single transaction and then committing?
Does it make a considerable difference?
Example
insert into ...
insert into ...
insert into ...
insert into ...
insert into ...
or
begin transaction
insert into ...
insert into ...
insert into ...
insert into ...
insert into ...
commit transaction;
Thanks!
March 12, 2013 at 10:44 pm
Single INSERT INTO with UNION ALL in SELECTs will be waaaay faster.
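A minimal sketch of the idea (the target table and values here are made up):
INSERT INTO dbo.Target (Id, Name)
SELECT 1, 'Alpha' UNION ALL
SELECT 2, 'Beta' UNION ALL
SELECT 3, 'Gamma';
-- one statement, one implicit transaction, instead of one round trip per row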
_____________
Code for TallyGenerator
March 13, 2013 at 3:01 am
Sergiy (3/12/2013)
Single INSERT INTO with UNION ALL in SELECTs will be waaaay faster.
UNION ALL a million rows? I don't think so!
What is the source? Where is the data coming from?
BTW, a million separate inserts in one transaction will be faster than without a transaction, provided you have enough space in your transaction log to cover it without growing.
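For example (a sketch; the database and log file names here are assumptions), you can check log headroom and pre-grow the log before the big transaction:
-- How full is each database's log right now?
DBCC SQLPERF(LOGSPACE);
-- Pre-grow the log so the load doesn't stall on autogrow events
ALTER DATABASE MyDb
MODIFY FILE (NAME = MyDb_log, SIZE = 4GB);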
March 13, 2013 at 3:43 am
Actually, building such a query for a million rows will produce about a GB of script.
It will take ages just to parse it.
For million-row uploads there are the BULK INSERT and BCP commands.
Everything else is so bad that there is no point even in discussing which is better.
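For illustration only (the file path, table, and terminators are assumptions), the two commands look like this:
-- T-SQL, reading a flat file visible to the server
BULK INSERT dbo.Target
FROM 'C:\data\target_rows.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, BATCHSIZE = 100000);
-- bcp from the command line (-c = character mode, -T = trusted connection, -b = batch size)
bcp MyDb.dbo.Target in C:\data\target_rows.csv -c -t, -S MyServer -T -b 100000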
_____________
Code for TallyGenerator
March 13, 2013 at 4:01 am
Sergiy (3/13/2013)
Actually, building such a query for a million rows will produce about a GB of script. It will take ages just to parse it.
For million-row uploads there are the BULK INSERT and BCP commands.
Everything else is so bad that there is no point even in discussing which is better.
That is why I am asking OP what is the source.
It may be that BULK INSERT, BCP, or SSIS should be used, or, if the source is just other tables in the same database, then a simple INSERT INTO ... SELECT can be used (or SELECT INTO ...).
There are not enough details provided to make any meaningful suggestion...
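To illustrate the same-database case (table names here are hypothetical):
-- Into an existing table
INSERT INTO dbo.Target (Id, Name)
SELECT Id, Name
FROM dbo.Source;
-- Or create the target on the fly
SELECT Id, Name
INTO dbo.Target2
FROM dbo.Source;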
March 13, 2013 at 4:36 am
From how it looks, it's about scripting DB deployment as part of an installation package.
That's how common tools build those scripts.
Not sure what kind of million-row table could be part of such a package, though...
But anyway, saving the data in files and scripting a BULK INSERT upload would be the most effective approach.
_____________
Code for TallyGenerator
March 13, 2013 at 4:59 am
A million individual insert statements is a horrible approach. Period. Doing it within a single transaction will be painful because you're going to need a huge transaction log to support it. If the log has to grow while you're running the transaction, it will slow things down even more. You need to find a more batch-oriented approach. If you're pulling data from another location, see if you can pass it as a table variable or XML and then write a single insert statement that moves all one million rows. That's the better approach.
But, if you have to do individual inserts (and I'd be curious why), then small sets of transactions, say 10,000 at a time, would probably be better than all one million or one at a time.
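A sketch of that batching pattern, assuming the rows have already been staged in a hypothetical dbo.Staging table:
DECLARE @batch int = 10000, @moved int = 1;
WHILE @moved > 0
BEGIN
    BEGIN TRANSACTION;
    -- Move up to @batch rows per transaction from staging to target
    DELETE TOP (@batch) FROM dbo.Staging
    OUTPUT deleted.Id, deleted.Name
    INTO dbo.Target (Id, Name);
    SET @moved = @@ROWCOUNT;  -- capture before COMMIT resets it
    COMMIT TRANSACTION;
END;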
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 13, 2013 at 6:07 am
I agree with Grant
March 13, 2013 at 4:31 pm
Yeah, hard to argue.
Except, probably, for this one:
Grant Fritchey (3/13/2013)
But, if you have to do individual inserts (and I'd be curious why), then small sets of transactions, say 10,000 at a time, would probably be better than all one million or one at a time.
Upload via an ODBC driver is "one row at a time" because of the nature of ODBC drivers.
And still, a single load always outperforms a set of partial ones (in my experience).
_____________
Code for TallyGenerator
March 13, 2013 at 5:32 pm
Sergiy (3/13/2013)
Yeah, hard to argue. Except, probably, for this one:
Grant Fritchey (3/13/2013)
But, if you have to do individual inserts (and I'd be curious why), then small sets of transactions, say 10,000 at a time, would probably be better than all one million or one at a time.
Upload via an ODBC driver is "one row at a time" because of the nature of ODBC drivers.
And still, a single load always outperforms a set of partial ones (in my experience).
I guess it really depends on how you do the test. I haven't done precise measurements on these for several years. But a few years ago I did a series of tests, not from ODBC but through ADO.NET, and we saw serious performance degradation when using stacked single inserts. But I can't supply the numbers because it was a while back.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 13, 2013 at 7:34 pm
Grant Fritchey (3/13/2013)
But a few years ago I did a series of tests, not from ODBC but through ADO.NET, and we saw serious performance degradation when using stacked single inserts. But I can't supply the numbers because it was a while back.
Were those single inserts wrapped in a single transaction?
Because I was talking about comparing a single transaction for 1 million inserts against 100 transactions of 10,000 inserts each.
Multiple inserts within a single transaction is what happens when you load data from, say, a linked server or OPENROWSET.
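For example (the server name, connection string, and tables are all assumptions), both of these run as a single INSERT, so every row lands in one transaction:
-- Via a linked server
INSERT INTO dbo.Target (Id, Name)
SELECT Id, Name
FROM RemoteSrv.SourceDb.dbo.Source;
-- Via OPENROWSET (ad hoc distributed queries must be enabled)
INSERT INTO dbo.Target (Id, Name)
SELECT Id, Name
FROM OPENROWSET('SQLNCLI',
    'Server=RemoteSrv;Trusted_Connection=yes;',
    'SELECT Id, Name FROM SourceDb.dbo.Source');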
_____________
Code for TallyGenerator
March 14, 2013 at 1:46 am
Thanks for the insights.
I don't really have an implementation for this. My boss just asked me that sort of question, and I was wondering if some of you might have real-life experience.
March 14, 2013 at 4:39 am
Sergiy (3/13/2013)
Grant Fritchey (3/13/2013)
But a few years ago I did a series of tests, not from ODBC but through ADO.NET, and we saw serious performance degradation when using stacked single inserts. But I can't supply the numbers because it was a while back.
Were those single inserts wrapped in a single transaction?
Because I was talking about comparing a single transaction for 1 million inserts against 100 transactions of 10,000 inserts each.
Multiple inserts within a single transaction is what happens when you load data from, say, a linked server or OPENROWSET.
We did a series of tests going from single transactions with single inserts, to single transactions with multiple single inserts, to single transactions with batches of inserts. The performance order was the reverse of that list: batches were best, followed by groups of single inserts within a transaction, followed by single transactions with single inserts.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 14, 2013 at 7:12 am
The log buffer flushes from the individual inserts without a wrapping begin/commit tran will absolutely crush your throughput.
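One way to see that pressure for yourself (a diagnostic sketch, not a fix) is to watch WRITELOG waits while the autocommit inserts run:
-- Each autocommit INSERT forces its own log flush; WRITELOG waits pile up
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'WRITELOG';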
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service