is UNION ALL inherantly slow?

Question

Post reply

is UNION ALL inherantly slow?

BenWard

SSCertifiable

Points: 5903
More actions
February 28, 2011 at 5:01 am

#232753
Hi,
We're trying to import data from CSV files into a SQL Server database. I REALLY REALLY want to avoid using SSIS - it's clunky and awkward and means the team have to become familiar with yet another (overly complicated) technology.
To get around this I have written a PHP script that reads the csv file and then creates a sql query that basically consists of:
insert into [tblTableName] (
)
select [row 2 of csv]
union all
select [row 3 of csv]
...
union all
select [row n of csv]
for a file that is about 10 columns wide and a few hundred lines long it works quite happily. once we try to load a file that is 52 columns wide and 1850 lines long it takes 6 minutes JUST to process the select statement. This seems like an excessively long time to me. it doesnt necessarily matter as it's an overnight process, it's just annoying me.
whats faster, insert -> select -> union all -> select...
or insert -> select ->insert -> select?
I would have though the union all option would have been faster...
anyone got any bright ideas?
Thanks
Ben
^ Thats me!
----------------------------------------
01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
----------------------------------------

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 1

Seems to me that BCP or OPENQUERY may be a better route for you.

Clear Sky SQL
My Blog[/url]

Lowell SSC Guru Points: 323518 More actions · Answer 2

I agree With Dave B; bcp, or BULK INSERT, or openquery would all be blazingly fast in comparison;

I've used bulk insert to load files with a million rows in under a minute.

BULK INSERT YourTable FROM 'c:\Export_o.txt'

WITH (

DATAFILETYPE = 'char',

FIELDTERMINATOR = ',',

ROWTERMINATOR = ' \n',

FIRSTROW = 1

)

--or

SELECT *

FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',

'Text;Database=C:\Files\CSV;HDR=YES;FMT=Delimited',

'SELECT * FROM example.csv')

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

BenWard SSCertifiable Points: 5903 More actions · Answer 3

Ah genius! thanks fellas 🙂

having done some testing using insert instead of union is VASTLY quicker and requires less re-write of our existing programs so we'll probably end up going with that (completes in 35 seconds now).

I mjust admit I haven't really used openquery before, didnt really know it existed. looking at it I think it's going to solve some enourmous headaches elsewhere in my little world so thanks very much for the pointer 🙂

Thanks

Ben

^ Thats me!

----------------------------------------
01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
----------------------------------------

Grant Fritchey SSC Guru Points: 398882 More actions · Answer 4

Just a side comment, frequently when dealing with UNION, UNION ALL is a way of speeding things up, not slowing things down. If you're hitting problems, it must have to do with the queries involved, not UNION ALL itself.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Jeff Moden SSC Guru Points: 1004318 More actions · Answer 5

BenWard (2/28/2011)
Ah genius! thanks fellas 🙂
having done some testing using insert instead of union is VASTLY quicker and requires less re-write of our existing programs so we'll probably end up going with that (completes in 35 seconds now).
I mjust admit I haven't really used openquery before, didnt really know it existed. looking at it I think it's going to solve some enourmous headaches elsewhere in my little world so thanks very much for the pointer 🙂
Thanks

1850 lines of data with 52 columns in 35 seconds? You might want to look beyond OPENROWSET and it's near cousins. Look into BULK INSERT. On my poor ol' 8 year old single CPU desktop, it'll load 1.2 million rows 20 columns wide with a good amount of built in data cleansing in 51 seconds flat.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

BenWard SSCertifiable Points: 5903 More actions · Answer 6

heh heh I know it's dreadful compared to how it *could* run but in terms of our requirements it does the job and doesn't require us to rewrite the entire import process just yet.

When the second phase of our project comes through I'll look at re writing our import processes to follow some better practise like bulk insert.

cheers for your help guys, I learn something new every time I come here!

Ben

^ Thats me!

----------------------------------------
01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
----------------------------------------

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 7

I'm with the BULK INSERT camp when it can be used on the particular file type.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service