February 28, 2011 at 5:01 am
Hi,
We're trying to import data from CSV files into a SQL Server database. I REALLY REALLY want to avoid using SSIS - it's clunky and awkward and means the team have to become familiar with yet another (overly complicated) technology.
To get around this I have written a PHP script that reads the csv file and then creates a sql query that basically consists of:
insert into [tblTableName] (
)
select [row 2 of csv]
union all
select [row 3 of csv]
...
union all
select [row n of csv]
for a file that is about 10 columns wide and a few hundred lines long it works quite happily. once we try to load a file that is 52 columns wide and 1850 lines long it takes 6 minutes JUST to process the select statement. This seems like an excessively long time to me. it doesnt necessarily matter as it's an overnight process, it's just annoying me.
whats faster, insert -> select -> union all -> select...
or insert -> select ->insert -> select?
I would have though the union all option would have been faster...
anyone got any bright ideas?
Thanks
Ben
^ Thats me!
----------------------------------------
01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
----------------------------------------
February 28, 2011 at 5:10 am
Seems to me that BCP or OPENQUERY may be a better route for you.
February 28, 2011 at 5:34 am
I agree With Dave B; bcp, or BULK INSERT, or openquery would all be blazingly fast in comparison;
I've used bulk insert to load files with a million rows in under a minute.
BULK INSERT YourTable FROM 'c:\Export_o.txt'
WITH (
DATAFILETYPE = 'char',
FIELDTERMINATOR = ',',
ROWTERMINATOR = ' \n',
FIRSTROW = 1
)
--or
SELECT *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Text;Database=C:\Files\CSV;HDR=YES;FMT=Delimited',
'SELECT * FROM example.csv')
Lowell
February 28, 2011 at 5:51 am
Ah genius! thanks fellas 🙂
having done some testing using insert instead of union is VASTLY quicker and requires less re-write of our existing programs so we'll probably end up going with that (completes in 35 seconds now).
I mjust admit I haven't really used openquery before, didnt really know it existed. looking at it I think it's going to solve some enourmous headaches elsewhere in my little world so thanks very much for the pointer 🙂
Thanks
Ben
^ Thats me!
----------------------------------------
01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
----------------------------------------
February 28, 2011 at 1:29 pm
Just a side comment, frequently when dealing with UNION, UNION ALL is a way of speeding things up, not slowing things down. If you're hitting problems, it must have to do with the queries involved, not UNION ALL itself.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 28, 2011 at 7:22 pm
BenWard (2/28/2011)
Ah genius! thanks fellas 🙂having done some testing using insert instead of union is VASTLY quicker and requires less re-write of our existing programs so we'll probably end up going with that (completes in 35 seconds now).
I mjust admit I haven't really used openquery before, didnt really know it existed. looking at it I think it's going to solve some enourmous headaches elsewhere in my little world so thanks very much for the pointer 🙂
Thanks
1850 lines of data with 52 columns in 35 seconds? You might want to look beyond OPENROWSET and it's near cousins. Look into BULK INSERT. On my poor ol' 8 year old single CPU desktop, it'll load 1.2 million rows 20 columns wide with a good amount of built in data cleansing in 51 seconds flat.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 1, 2011 at 4:33 am
heh heh I know it's dreadful compared to how it *could* run but in terms of our requirements it does the job and doesn't require us to rewrite the entire import process just yet.
When the second phase of our project comes through I'll look at re writing our import processes to follow some better practise like bulk insert.
cheers for your help guys, I learn something new every time I come here!
Ben
^ Thats me!
----------------------------------------
01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
----------------------------------------
March 2, 2011 at 4:05 pm
I'm with the BULK INSERT camp when it can be used on the particular file type.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply