March 18, 2011 at 1:01 am
Hi All,
Have a doubt.
I wanted to insert 6500+ email addresses in a new table.
That table has only one column "email".
I tried the following.
Insert into newtable (email)
select 'email_1'
union select 'email_2'
union select 'email_3'
union select 'email_4'
.
.
union select 'email_6500'
It was taking hell loooot of time and I had to cancel it.
But when I tried with the simple insert statement, it got completed in few seconds.
Insert into newtable (email) values 'email_1'
Insert into newtable (email) values 'email_2'
Insert into newtable (email) values 'email_3'
Insert into newtable (email) values 'email_4'
Can anyone explain me what exactly happens internaly that makes the first query that slow.?
Thanks in advance,
San
March 18, 2011 at 3:26 am
San, this doesn't make sense from this side. If you're selecting email_1, email_2 ... that sounds like different columns, yet you state it's one column.
can you right-click, script the table and post the definition here?
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
March 18, 2011 at 3:36 am
Craig Farrell (3/18/2011)
San, this doesn't make sense from this side. If you're selecting email_1, email_2 ... that sounds like different columns, yet you state it's one column.can you right-click, script the table and post the definition here?
Thanks for your quick reply.
But I din't select it that way. I have "UNION" in between which wil get me the resule as only one column right.?
I repeat, I have only one column in my table... infact I dont even need to keep that table in DB, using it just for joining in a query.
Thanks again.
March 18, 2011 at 3:52 am
Ahhh, sorry San, I misread it on the first pass. Have to bear with me, it's ~3AM here, after a day in America celebrated by much alcoholic intake. 😀
I've ran into this problem in the past building out large UNION ALL structures (union's even worse, it DISTINCTs before it's done), and I don't know the reason offhand other then the combination of TEMPDB, the windows pagefile, and your RAM just choke after a certain point.
Build your INSERT INTO out in about 1k-2k row increments, and make sure the target DB's logfile isn't choking on the size of the data you're putting in (IE: make sure it's not trying to autogrow while you're doing this).
As a side note, if I shove 20k records into a temp table first, I don't have this problem. There's something about the UNION ALL specifically in 2k5 that just self-destructs when it gets too big.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
March 18, 2011 at 5:40 am
Each of those INSERT statements is treated as a miniature transaction and the INSERT ... SELECT ... UNION is a giant transaction. That's probably where the differences lie.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 21, 2011 at 4:38 am
Thank you.
Anybody has got any other explanation.?
Thanks.
March 21, 2011 at 6:44 am
I think the difference lies in the fact that in the first insert command, SQL Server needs to build the data that needs to be inserted into the table. Using just 2000 UNIONs, the execution statistics were as follows on my system:
SQL Server parse and compile time:
CPU time = 13812 ms, elapsed time = 13822 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 189 ms.
Using 2000 UNION ALLs, the execution statistics were as follows:
SQL Server parse and compile time:
CPU time = 5258 ms, elapsed time = 5258 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 158 ms.
Slightly better because SQL Server did not have to build a temporary table in tempdb to check for duplicates, but still slow.
Using 2000 discrete INSERTs, the execution statistics for each insert were as follows:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
It actually took only 1 second to run the entire 2000 INSERTs.
In the first 2 inserts using UNIONs, the parse and compile time took up almost all the processing time. If I then just run the SELECT portions e.g.
SELECT 'email1'
UNION ALL SELECT 'email2'
UNION ALL SELECT 'email3'
...
Creating a result set that returns only 1000 rows took 2 seconds, for 2000 rows it was 5 seconds, for 3000 rows it was 14 seconds, for 4000 rows it was 27 seconds, and finally for 5000 rows it took 48 seconds. I guess UNION/UNION ALL just isn't very efficient when it comes to constants.
Interestingly, I got the following error on SQL Server 2000 when running the SELECT query to return 3000 rows:
Server: Msg 8621, Level 17, State 88, Line 1
Internal Query Processor Error: The query processor ran out of stack space during query optimization.
And on SQL Server 2008 R2, the timings for the SELECT of 2000, 3000, 4000 and 5000 rows took 3, 8,
14 and 24 seconds respectively, so it would seem that things have improved somewhat.
SQL BAK Explorer - read SQL Server backup file details without SQL Server.
Supports backup files created with SQL Server 2005 up to SQL Server 2017.
March 21, 2011 at 6:46 am
I think Craig has got it , the amount of work the optimizer is doing to provide a 'good enough' plan is enormous.
Is the 'compile time' big in the execution plan ?
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply