How to unzip the Updates files and avoid which are already unzipped?

  • Hi everyone,

    I recently started using WinSCP.

    I download some .zip files using it from a remote server on a daily basis.

    These files are basically updates that arrive on the server on a daily basis and save to my local folder: D:\Updates\

    Named like these:

    UPDATE_20220805.zip, UPDATE_20220804.zip, UPDATE_20220803.zip, UPDATE_20220802.zip .....

     

    I want to know two things:

    1) How can I automate this process of downloading the files using SSIS or may be some other method?

    2) How do I ONLY unzip the new file(s) that arrive daily.

    For example today I received UPDATE_20220805.zip

    and yesterday I received UPDATE_20220804.zip

    I already unzipped the yesterday file using the ssis package so I don't want to extract it again.

    • This topic was modified 2 years, 3 months ago by  Jobs90312.
  • Best practice is to archive unzipped files after processing them. Just move them to an Archive folder. That way, you know that anything in your updates folder has not yet been processed.

    WINSCP can be used in SSIS. Have a look here for some details.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin wrote:

    Best practice is to archive unzipped files after processing them. Just move them to an Archive folder. That way, you know that anything in your updates folder has not yet been processed.

    WINSCP can be used in SSIS. Have a look here for some details.

     

    Thing is my local folder basically synchronizes with remote server folder.

    Files at remote server are same as local folder.

    If I move the local files to a archive folder so when I synchronize on WinSCP again to download latest files, it will also download the missing files again...

    How do I avoid this?

    • This reply was modified 2 years, 3 months ago by  Jobs90312.
  • Yuck!

    If you can't change that sync stuff, here is one idea.

    Create a table called FileNamesProcessed (or whatever) and write the 'current' file name to that at the end of a successful load.

    Modify your ETL process to ignore any file names which exist in this table.

    Run times will gradually deteriorate, as more and more files are added to the folder, but it should do the job.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • multiple ways - most commons are

    • have a table on SQL server where you store the name of the zip files you already processed
    • only process files with a create/modified date greater than the last time you processed (again store on a table or on a file

    both options easy on your case as the filenames already contain a date on their name.

    implementing it should be easy enough with SSIS - although not with other synchronization software or with likes of robocopy (except for option 2 above)

    on a different note - this should be done on your SQL Server server - not on your own PC - and with a proper process in place to do this process.

  • You asked how to use WinSCP in SSIS - the assumption would be that you are currently using WinSCP manually to synchronize the folders.  Is that correct?

    My next question would be: Why are you keeping the files on the sFTP server once they have been downloaded?  The better option would be to remove the files once they have been downloaded.  If the files must be kept on that server (why?) - another option would be to move them from the root folder to an archive folder on the sFTP server.

    If you don't have permissions to remove the files or move them - then you really need to get that permission.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams wrote:

    You asked how to use WinSCP in SSIS - the assumption would be that you are currently using WinSCP manually to synchronize the folders.  Is that correct?

    My next question would be: Why are you keeping the files on the sFTP server once they have been downloaded?  The better option would be to remove the files once they have been downloaded.  If the files must be kept on that server (why?) - another option would be to move them from the root folder to an archive folder on the sFTP server.

    If you don't have permissions to remove the files or move them - then you really need to get that permission.

    Yes I am synchronizing manually.

    And yes I do not have access/rights to remove or delete files from the remote side. I can only download these files on my local directory and use them how ever I want.

    Lets just say whoever manages the server cannot give this access to me.

     

  • Frederico and I have both suggested solutions. Have you decided that they are not applicable in your case? If so, please explain why.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Jobs90312 wrote:

    Jeffrey Williams wrote:

    You asked how to use WinSCP in SSIS - the assumption would be that you are currently using WinSCP manually to synchronize the folders.  Is that correct?

    My next question would be: Why are you keeping the files on the sFTP server once they have been downloaded?  The better option would be to remove the files once they have been downloaded.  If the files must be kept on that server (why?) - another option would be to move them from the root folder to an archive folder on the sFTP server.

    If you don't have permissions to remove the files or move them - then you really need to get that permission.

    Yes I am synchronizing manually.

    And yes I do not have access/rights to remove or delete files from the remote side. I can only download these files on my local directory and use them how ever I want.

    Lets just say whoever manages the server cannot give this access to me.

    https://winscp.net/eng/docs/faq_script_modified_files

    WinSCP already has documented solutions to automatically download only new files.  This may not work for you without some additional modifications but it would be a good starting point.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply