Bulk Insert Peformance Question

  • We have a SQL Server document repository that we need to get content into. I currently have a flat file with a list of paths to the content files located on a network share that need to be inserted into the DB. I'm looking for guidance/advice on what would be the most efficient way to insert this content. Would BCP be the best way? I am also very open to using SSIS, however I have just started creating packages, so I'm not sure if SSIS has a better, more efficient way of handling this kind of task. The BCP is currently being done by a developer's .NET code, and we have only been loading subsets of the content at a time (4-5GB). We could however have a need to insert all of the content at once (350GB). Anyone have any suggestions? Let me know if you'd like more information from me.

    Thanks,

    Adam

  • Just to make sure I understand:

    1. You have a file (file 1) that contains a list of other files.

    2. All of the files listed in file 1 need to be loaded into sql.

    Correct?

    Are the files being loaded as a blob, where the entire file is loaded into one column (i.e. a word or pdf document)?

    Or are the files to be loaded as data, i.e. a csv or similar file?

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Hi Wayne,

    Thanks for the response. Correct, I have one flat file that contains a list of paths to other files that need to be loaded into SQL Server. The files are blobs, and are being inserted into a varbinary field that has filestream enabled. We're basically trying to find out what the most efficient way to insert the files will be. It seems like SSIS could be a good candidate for this, although as I mentioned I'm just beginning to dive into SSIS and I'm not entirely sure..

  • ackreul (9/3/2010)


    Hi Wayne,

    Thanks for the response. Correct, I have one flat file that contains a list of paths to other files that need to be loaded into SQL Server. The files are blobs, and are being inserted into a varbinary field that has filestream enabled. We're basically trying to find out what the most efficient way to insert the files will be. It seems like SSIS could be a good candidate for this, although as I mentioned I'm just beginning to dive into SSIS and I'm not entirely sure..

    Of course "It Depends" but I'd likely not load the actually blobs into the database especially if they contained 350GB in total. Imagine what that will do to a "Restore" if you ever need to do one.

    I believe I'd only load the UNC path to the file and do the rest from there.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Again, the column where the data is held has filestream enabled, so the actual data will not be stored in the database, storing only the paths is not an option for us. I'm basically just trying to find out if it's worth my time to explore SSIS as a viable option or not.

  • ackreul (9/3/2010)


    Again, the column where the data is held has filestream enabled, so the actual data will not be stored in the database, storing only the paths is not an option for us. I'm basically just trying to find out if it's worth my time to explore SSIS as a viable option or not.

    You could use OPENROWSET BULK to first pick up the paths from the flat file and then use OPENROWSET BULK again, this time with the SINGLE_BLOB option, to load the data

  • What about the 'import column' function in SSIS? Does anyone know anything about this feature and if it would be faster than a bulk insert?

    Here's a link for reference:

    http://www.bimonkey.com/2009/09/the-import-column-transformation/

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply