Yes, there are many clients available in the market for FTP needs and they are good and highly intuitive. So why use the SSIS FTP Task? My response will be this task has very close integration with other data transformation tasks such as loading the file to a table. SSIS has a built-in feature for FTP, but point to be noted here is that Integration Services has native support for only FTP and not SFTP.
In this post, we shall design a SSIS package to send files or upload files to a local server from a FTP site using the File Transfer Protocol.
We have two distinct approaches available:
- Develop a script task either in C# or VB.NET which will perform actions such as create a remote session, perform the required activities such as looping to the designated directory , selecting the file and port the file to the local system at the path chosen, on the server and lastly gracefully exit the session.
- Use a FTP Task and configure the task (establish a connection; choose the action desired to perform) that suits the business need.
Business Case
Suppose you get a periodic load from a remote FTP site to a staging environment where you process the load. This is a very common scenario when a source, systemA, spits an aggregated data in form of a file (.CSV, .TXT, etc.). This file is ready to be consumed by systemB, but this needs to be made available in the local staging environment
Open the SSDT and drag the FTP Task component on the designer window.
Since, no connection manager exists at design time; you will see a red cross mark on the task. At this point, the designer has two options:
- Supply static connection details such as username, password and of course the FTP site name
- Create a re-usable connection manager, by supplying the necessary details dynamically at run-time; remember to set Delay Validation property of the connection manager to TRUE.
As shown in the graphic below, the designer can assign run time values to the necessary attributes of the connection manager and thereby making the component re-usable in a multi server environment.
For the sake of simplicity, let’s just stick to option 1
Configuring the conection manager
Now double click on the component. This will open the FTP Task Editor.
Using this window, one can establish a new FTP connection manager; configure the task for a specific FTP operation (such as Send Files, Receive Files etc.) and also add any expression to tweak the task execution. Click on the down arrow at the extreme right of FTPConnection. This will result into a new dialog box, wherein the user can provide necessary connection details and required behavior.
Note:
- You may configure the FTP is passive mode by checking the option – Use Passive Mode
- Chunk Size is the size of each packet involved in the transfer. This has a default of 1KB and can have a maximum size of 1MB in SSIS. This is the upper limit.
- Using this design, only one file can be sent to the remote destination. If this is not the requirement, then the package designer will have to introduce a looping mechanism, in order to loop over a set of files
- Also, data transfer can take place either in ASCII or in Binary format. ASCII can be used for plain text files and files which have strings numbers, on the hand if we have files such as {.zip, .exe, image files or rich text files}. Binary transfer mode is to be used.
Adding an Action to the Task
Click on the option File Transfer. This is where the designer would define the behavior of the task. Various options available are Send file, Receive file, Delete local files, Delete remote files etc.
There is a provision of supplying varying local and remote path(s) at run time, but such an action would require a constant loop mechanism over a dataset to populate the required values.
Now, for the file(s) that that are required in the local environment, all that is required is to specify their name in the RemotePath field. This field supports regular expressions. So if a value ending with say <<remote_path>>/<<home_directory>>/*.csv is supplied, the task will fetch all CSV(comma separated) files with the path specified.
The third and last option in the task configuration is the place where the designer can tweak the natural behavior of the task using custom expressions.
Tweak the task
Suppose you wish to overwrite the destination(the local or staging environement) with the new and updated file(s). Assuming there is no file archival process(move the file to a different folder once the file has been processed), the FTP task when scheduled to run will fail with a message like "File name already exists."
The FTP Task here should be altered in a manner to overwrite the destination, so that the task does not fail. This option is directly not made available on the main configuration screen, but the task can be modified to overcome the issue by applying a short tweak. Click on Expressions, and now select Overwrite destination. This field accepts a boolean type value (TRUE/FALSE).
To achieve this, create a SSIS variable of type bool and assign this variable to the attribute. This has been depicted in the graphic attached below.
Take aways
To sum it all up. You should now be able to configure the FTP Task in Integration Services to achieve:
- Single FTP server mode by supplying a fixed set of credentials at design time.
- Multiple FTP server mode by supplying a fixed set of credentials at run time using Integration Services variables.
- And most important, to recieve or get file(s) from a FTP site to local or staging environemnt for further processing.
References
http://msdn.microsoft.com/en-us/library/ms137656.aspx
http://msdn.microsoft.com/en-us/library/ms141015.aspx