is UNION ALL inherantly slow?

  • Hi,

    We're trying to import data from CSV files into a SQL Server database. I REALLY REALLY want to avoid using SSIS - it's clunky and awkward and means the team have to become familiar with yet another (overly complicated) technology.

    To get around this I have written a PHP script that reads the csv file and then creates a sql query that basically consists of:

    insert into [tblTableName] (

      )

      select [row 2 of csv]

      union all

      select [row 3 of csv]

      ...

      union all

      select [row n of csv]

      for a file that is about 10 columns wide and a few hundred lines long it works quite happily. once we try to load a file that is 52 columns wide and 1850 lines long it takes 6 minutes JUST to process the select statement. This seems like an excessively long time to me. it doesnt necessarily matter as it's an overnight process, it's just annoying me.

      whats faster, insert -> select -> union all -> select...

      or insert -> select ->insert -> select?

      I would have though the union all option would have been faster...

      anyone got any bright ideas?

      Thanks

      Ben

      ^ Thats me!

      ----------------------------------------
      01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
      ----------------------------------------

    1. Seems to me that BCP or OPENQUERY may be a better route for you.



      Clear Sky SQL
      My Blog[/url]

    2. I agree With Dave B; bcp, or BULK INSERT, or openquery would all be blazingly fast in comparison;

      I've used bulk insert to load files with a million rows in under a minute.

      BULK INSERT YourTable FROM 'c:\Export_o.txt'

      WITH (

      DATAFILETYPE = 'char',

      FIELDTERMINATOR = ',',

      ROWTERMINATOR = ' \n',

      FIRSTROW = 1

      )

      --or

      SELECT *

      FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',

      'Text;Database=C:\Files\CSV;HDR=YES;FMT=Delimited',

      'SELECT * FROM example.csv')

      Lowell


      --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

    3. Ah genius! thanks fellas 🙂

      having done some testing using insert instead of union is VASTLY quicker and requires less re-write of our existing programs so we'll probably end up going with that (completes in 35 seconds now).

      I mjust admit I haven't really used openquery before, didnt really know it existed. looking at it I think it's going to solve some enourmous headaches elsewhere in my little world so thanks very much for the pointer 🙂

      Thanks

      Ben

      ^ Thats me!

      ----------------------------------------
      01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
      ----------------------------------------

    4. Just a side comment, frequently when dealing with UNION, UNION ALL is a way of speeding things up, not slowing things down. If you're hitting problems, it must have to do with the queries involved, not UNION ALL itself.

      "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
      - Theodore Roosevelt

      Author of:
      SQL Server Execution Plans
      SQL Server Query Performance Tuning

    5. BenWard (2/28/2011)


      Ah genius! thanks fellas 🙂

      having done some testing using insert instead of union is VASTLY quicker and requires less re-write of our existing programs so we'll probably end up going with that (completes in 35 seconds now).

      I mjust admit I haven't really used openquery before, didnt really know it existed. looking at it I think it's going to solve some enourmous headaches elsewhere in my little world so thanks very much for the pointer 🙂

      Thanks

      1850 lines of data with 52 columns in 35 seconds? You might want to look beyond OPENROWSET and it's near cousins. Look into BULK INSERT. On my poor ol' 8 year old single CPU desktop, it'll load 1.2 million rows 20 columns wide with a good amount of built in data cleansing in 51 seconds flat.

      --Jeff Moden


      RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
      First step towards the paradigm shift of writing Set Based code:
      ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

      Change is inevitable... Change for the better is not.


      Helpful Links:
      How to post code problems
      How to Post Performance Problems
      Create a Tally Function (fnTally)

    6. heh heh I know it's dreadful compared to how it *could* run but in terms of our requirements it does the job and doesn't require us to rewrite the entire import process just yet.

      When the second phase of our project comes through I'll look at re writing our import processes to follow some better practise like bulk insert.

      cheers for your help guys, I learn something new every time I come here!

      Ben

      ^ Thats me!

      ----------------------------------------
      01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
      ----------------------------------------

    7. I'm with the BULK INSERT camp when it can be used on the particular file type.

      Best,
      Kevin G. Boles
      SQL Server Consultant
      SQL MVP 2007-2012
      TheSQLGuru on googles mail service

    Viewing 8 posts - 1 through 7 (of 7 total)

    You must be logged in to reply to this topic. Login to reply