File Loading

  • Hey Everyone,

    I have a question about loading data...Here is the situation:

    There are 10 to 12 centers that send their data on a quarterly basis in a zipped folder...They have had this data since 2007 and now they want a DBA to load their data into a database...So each center has a zipped folder containing round abouts 20 .DBF and .FPT files for each quarter for the past 2 years...Thats about 3,000+ files in total that needs to be loaded into a SQL database...

    Is there an easy way to load this data? I mean manually I know I can load each file separately but with over 3,000 files, its very tedious and time consuming...Is there a way of automating it to load without having to load each file separately?

    Any suggestions are welcome!!!

    Thanks,

    Adam

  • i would create a SSIS packages to unzip the files to a working directory and then use the For-Each loop to loop through the files and import that data from there.

    This could be either easy or difficult depending on how consistent the files are,

    do alll the files have the same structure or are they from different databases?

  • I agree with Steve.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • So I can unzip the folders in SSIS? I didnt know if that would work or not...

    I am not familiar with For-Each loop to loop through the files???

    Some of the files are consistent but there is one file from each folder that the 1st 6 fields are the same but the rest is all different...

  • you can unzip in ssis, you can use the execute process task to execute the unzip application with a dynamic filename, you can also use a script task and do the unzip in .NET

    A for-each loop is a container that will execute the items in the container, similar to a for-each loop in code.

    you can loop through the files in a folder and then pass the file name into a data flow that could have a dyamic source based on the filename, if the file are consitent then this will be fairly easy, there are plenty of examples on google/bing that show how to use ssis to loop through files

    for the one file that is different you may have to create a seperated for-each loop so it can use its own data source as creating dynamic datasource columns can be a pain. Are you able to recognise these files based on filename?

  • Unzipping can be done in several ways in SSIS. The easiest one is to run the "command line" command for whatever unzipping program you have on that server. You can also do it from a Script Task using VB or C#. I haven't done that myself, but I have seen scripts for doing so.

    There are also customs tasks that can be aquired from different companies that do unzipping. I'm not sure if any of those are available for free.

    FYI, I was typing my reply at the same time as Steve, so I had not yet seen his reply.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Thank you guys both...I will google for each loop in ssis to see if I can use that...The file that is different has the name of nc04_use from each facility BUT they all have different amount of columns...

    So can I google SSIS unzipping folders and it will give me an example of how to do this in SSIS?

    Thanks again guys for your input!!!

  • there are a lot of ssis resource on

    http://www.sqlis.com"> http://www.sqlis.com

    in particular

    http://www.sqlis.com/post/Looping-over-files-with-the-Foreach-Loop.aspx"> http://www.sqlis.com/post/Looping-over-files-with-the-Foreach-Loop.aspx

    unzipping shouldn't be a problem as you are just executing an application in the same way as you would from the cmd line.

    Let me know if you run into any issues with the SSIS package..

  • Thanks steveb for helping me on these issues...I am going to start with one then go to the other...Easiest first...I want to unzip these folders...In this directory, there are 160 zipped folders...So instead of unzipping them one at a time, I can create a SSIS package to do this, correct? I opened up Buisness Intelligence Studio and started a new package (Unzip Folders)...I drug the Execute Process Task to the middle of screen...It has a red x at the side...Do I need to use the connection manager to connect to anything? ...When I double click on the Execute Process Task --> it starts with General Tab where to enter Name and Description...I entered Unzip All Folders as Name and left Description blank...---> Moved down to Process Tab...Not sure what I need to set or what I set these too? Unzipping it one by one, I just right clicked and selected Extract All...and a wizard pops up and it goes through the unzipping of that folder...Do I need to download a program that does this? What do you prefer? After I download that, if needed, What goes in these fields under the Process tab:Executeable, Arguments, WorkingDirectory, StandardInputVariable, StandardOutput Variable, WindowStyle????

    I apologize for this...Im new to SSIS and being a DBA really so my experience is not very high!!!

    Thank you again for helping me!

    Im look forward to your response to this!!!

  • steveb?

    what do I need to download that is free to unzip all the folders?

  • 7-Zip. Has a command line interface, as well.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You can use any tool that supports comand line unzipping, i have used WinRar and as Jeff mentioned 7-Zip would work as well. You would use pretty much the same format to call either

    in the process tab for the task in executables you would put the name of your unzip application ,and in arugments you put the parameters needed , since you have a number of folders to unzip you would need to loop through the files and assign the argument dynamically using a variable, a good example of this

    http://www.sql-server-performance.com/faq/ssis_unzip_p1.aspx"> http://www.sql-server-performance.com/faq/ssis_unzip_p1.aspx

  • Thank you again!

    That example is very helpful as well as the example you provided for the ssis package to load the data...Although I understand the concept of the package, surprise, surprise I have another question...Under the arguments, how do I know what to put under there? You mentioned parameters...I dont understand the parameters? In the example, they used the following:

    "e " +@[User::currentFile] + " -o+"

    as the argument for when using the foreachloop...

    For the single file, they used:

    e [Location of zipped file] -o+

    And I understand that one but the one for the foreachloop, the @[User::currentFile] confused me...Do I just put the location of where all the zipped folders are? Or what do I need to put?

    Again thank you so much for your input and your help!

  • For the parameters it would depend on the unzip util you are using, in that example the 'e' mean extract and the 'o' means overwirte.

    so they are building up the extract string dynamically with the file name on each iteration of the loop the filename would be updated.

    And I understand that one but the one for the foreachloop, the @[User::currentFile] confused me..

    .Do I just put the location of where all the zipped folders are? Or what do I need to put?

    so @[User::currentFile] would hold the filename and path on each iteration. eg 'C:\document\myfile.zip'

    To do this you need to create a variable that will hold the filename in the loop container click on the variable mapping options and use the drop down box to create new variable. This will hold the filename and path

    in your execute process task you can use this varaible to build the extract parameters so in each iteration of the loop

    "e " +@[User::currentFile] + " -o+"

    would evaluate to something like

    e 'C:\document\myfile.zip -o

    which if run in the command line would be

    winrar.exe e 'C:\document\myfile.zip -o which is the command for winrar to extract a folder.

    i have not tested this so the code may take some tweaking but the basic structure is the same.

  • I am using 7-zip so does that mean e still means extract and o still means overwrite or is it different?

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply