Bulk Data Transfer

  • I have a temporary table that contains 12,000,000 records that was imported from CSV files.

    I have to move certain columns of ALL the records to a specific table.

    How can I do this without having my SQL Server tired out.

    Example:

    Can I use CURSORS to retrieve batches of data based on record dates.

    Is there a resource safe INSERT Statement.

    Will doing a "INSERT INTO xxx FROM yyy" cause my SQL Server to become REALLY slow.

    [font="Courier New"]-----------------------------------------------------------------------------------------------------------------------
    😀 Arguing with Programmers are like wrestling a pig in mud, soon you realize the pig is enjoying it. 😀
    -----------------------------------------------------------------------------------------------------------------------[/font]

  • I actaully used the following:

    "[font="Courier New"]INSERT INTO xxx FROM yyy[/font]"

    The statement was running for over 8min when one of the other developers noticed that his connection to SQL Server has failed. Canceling the statement didn't even help, cause the SQL Server was inaccessible to us.

    We ended up restarting the TEMP Server PC we were working using as a SQL Server for testing.

    I really need a better solution than the above mentioned. I cant afford to have the main Server seize

    up like this.

    [font="Courier New"]-----------------------------------------------------------------------------------------------------------------------
    😀 Arguing with Programmers are like wrestling a pig in mud, soon you realize the pig is enjoying it. 😀
    -----------------------------------------------------------------------------------------------------------------------[/font]

  • Ok,

    How static is the old data.

  • The data I'm working with is Stock Exchange Closing Prices per Ric

    This data will never change again but is used for statistical purposes.

    The reason its in a temp table is because the data needed to be formatted and adjusted to suite our needs, and have look-ups done for ID columns.

    Once the data is moved to the MAIN Closing Price Table, the temp table will be removed.

    I just need the easy, safe, resource friendly method of transferring 12mil records from one table to another.

    [font="Courier New"]-----------------------------------------------------------------------------------------------------------------------
    😀 Arguing with Programmers are like wrestling a pig in mud, soon you realize the pig is enjoying it. 😀
    -----------------------------------------------------------------------------------------------------------------------[/font]

  • You can easily create the new target table either manually or by using a SELECT/INTO with a WHERE 1=0 if you're lazy like me.

    Since every server has a "tipping point" (and I believe you saw what happens when you reach the "tipping point"), you may want to write a control cursor (one of the few places where a cursor should be used) or While Loop to loop through batches of, say, 250,000 to 1,000,000 rows at a time to insert the data in the new table.

    If you're "cursor phobic" like I am ;-), you can write some dynamic SQL to do the job without any looping at all. The dynamic SQL would come out as multiple insert statements with the necessary criteria to limit the "batch sizes" to whatever number you chose.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • p.s. Also, unless you're in the "Simple" Recovery Mode, watch your log file. You may want to do some backups between "steps" in the multi-insert process to keep the size under control.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I kinda crashed our SQL Server using a [font="Courier New"]INSERT INTO tblX SELECT FROM tblY[/font] method. Haha!

    After damage control, I decided to take a safer approach by using SSIS ... and WOW!

    It works like a charm. :w00t:

    Where as SQL statement abducts every resource on a server, eventually driving it into a brick wall at 300km/h, SSIS is way more resource friendly.

    I created a detailed process to take each of original CSV files (via Loop), import their data into a temp table, validate the data, and move it to the real table. The temp table is truncated and the next file is taken. It even sends me a mail when an n-amount of files has been successful meaning I don't have to sit around and keep an eye on it. It so safe, if a file fails the process just halts.

    Using SQL Heart Beat i was able to spot the HUGE difference in resource usage by the two methods (t-SQL and SSIS). Where t-SQL takes the server hostage and eventually kills it, SSIS shows a "Catch-And-Release" result, using resource for only a few second per CSV file.

    I would highly recommend using SSIS for these kind of tasks.

    [font="Courier New"]-----------------------------------------------------------------------------------------------------------------------
    😀 Arguing with Programmers are like wrestling a pig in mud, soon you realize the pig is enjoying it. 😀
    -----------------------------------------------------------------------------------------------------------------------[/font]

  • If T-SQL killed the server or uses as much resources as you say for such a simple ETL process, the T-SQL was simply written incorrectly.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply