Importing binary files with SSIS

I have gotten a number of emails over the past few days asking how I import binary files with SSIS and how to improve throughput by running tasks in parallel. Articles scattered throughout this blog show bits and pieces: how to use the import file task, how to use the Enhanced Threading Framework (ETF) I put together, how to use SHA-1 to find duplicates, and so on. That is all well and good for someone looking for a specific piece of the puzzle, but what about the person who wants the whole puzzle? I put together a sample SSIS package that has all of the components (including SQL scripts to build the database pieces):

  • Script task to create a list of files to import (.pdf in the sample, but the extension can easily be changed)
  • Data Flow Task to import that list into a table
  • Structures to show the Enhanced Threading Framework (2 engines in the example)
  • Within each ETF Engine:

    • Call to the procedure that extracts a single file to be imported
    • Script Component that sources the file to be imported
    • Import File Component
    • Script Component that calculates the SHA-1 hash
    • Lookup to only push files that do not exist (based upon hash)
    • Data destination to push the file and hash into the table

    The source can be downloaded here.
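
    For readers who want to see the flow of the package before opening it, here is a rough sketch of the same logic in Python. It is not the code from the SSIS package: the function names, the in-memory dict standing in for the destination table, and the thread pool standing in for the two ETF engines are all assumptions made for illustration. It also simplifies one step: the duplicate check runs in the calling thread after each file is hashed, whereas the package does its lookup inside each engine.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def list_files(root, extension=".pdf"):
    """Build the list of files to import (hypothetical stand-in for the script task)."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == extension
    )


def load_with_hash(path):
    """Read one file and compute the SHA-1 hash of its contents."""
    data = Path(path).read_bytes()
    return path, hashlib.sha1(data).hexdigest(), data


def import_files(root, store, engines=2):
    """Import every file whose hash is not already in `store`.

    `store` is a dict keyed by SHA-1 hex digest, used here in place of
    the destination table. Returns the list of paths actually imported.
    """
    imported = []
    # Reading and hashing run in parallel; results come back in input order.
    with ThreadPoolExecutor(max_workers=engines) as pool:
        for path, digest, data in pool.map(load_with_hash, list_files(root)):
            if digest in store:
                continue  # same content already imported, skip it
            store[digest] = data
            imported.append(path)
    return imported
```

    Calling `import_files("some_folder", {})` returns the paths that were imported. Running it a second time against the same dict imports nothing, because every hash is already present.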