There are a few ways to get data from a client to your system; FTP or email. Internally we tend to use email for two reasons: everyone has it and it provides an audit trail (I know who sent what and when). The corporate email standard is MS Outlook, which would normally pose a problem since getting information out of the system, not using POP3 (another corporate decision), required my department to have a special program written.
This program, DataLoader, was written many moons ago by a consultant. It's job was to monitor certain email accounts and then detach the attachments and place them in certain folders for processing by DTS packages ( I did say many moons ago). This worked for a while, but as we moved away from DTS and towards SSIS we realized that we should either use POP3 or another mechanism to get this information. Using email to transmit files was not the best method (file size limitations for one thing). If we were to recommend another solution we had to provide the same audit trail and we did, using SSIS and what I call a 'drop box'
A 'drop box' is nothing more than file share with a set of folders and permissions assigned to a given user. The structure we have looks like this:
\\file share\Josef\inbound
\\file share\Josef\rejected
\\file share\Mary\inbound
\\file share\Mary\rejected
\\file share\Archive
We assign permissions so that only Mary can access her folders and only Josef can access his. This is the audit trail. The SSIS package will be able to access all folders within \\file share\. This allows the package to move files from the inbound to rejected folder as well as from the inbound to the Archive folder. One of the benefits of the DataLoader program was that it was able to provide the email address that sent the original file, so correspondence was between the DTS package and the sender. So far we have been able to provide the audit trail, but now we need to make sure that if Mary is the one working on a given file, only she receives the email correspondence ('file was loaded','errors in the file, please correct','etc). There are a few components and some programming required to do this and I'll show you how.
The first piece of this puzzle is a component called File Watcher from Konesans. This is a neat little task which will sit and monitor a directory for a file(s) which match your criteria. It can use * as a wildcard or ? as a single value placeholder. I tried more complex regex expressions, but was unable to get them to work. I explain our solution to this problem later on. The component looks like this:
The important parts are the path, filter, and output variable. The path is \\file share\Josef\Inbound defined within a File Connection object. . The filter looks for any file that begins with Sony_ and has an extension of .xls within the path defined. For this process the input file is Sony_mmddyy.xls The output variable holds the full path, including filename. So in this particular case the the variable filename would contain
\\file share\Josef\Inbound\Sony_022410.xls or \\file share\Mary\Inbound\Sony_022410.xls I have two people who could drop this file for processing, so I have two FileWatcher tasks. It looks like this:
Once the filewatcher is activated, I use a Script Task to fill system variables ensuring any email goes to the proper person and any files go to the proper directories. Inside the Script Task the code is rather simple:
Since the drop boxes have been designed with a user name (in this example we are using first names, but in our actual code we use last names so there is less chance of a collision), we can pull the name out of the filename variable and know who the user we are going to interact with is. Using this technique offers additional benefits, such as dynamic file creation.
One of the requirements of the new system is to create a spreadsheet of invalid data. Using the variables defined by the drop box we do just that. First we define an Excel Connection Manager with an expression for the Excel path:
This Connection Manager is used by an Excel Destination Component:
The last piece of this puzzle is setting up the email to let the correct user know if things were processed properly or if there were errors and where to find the corresponding xls.
How did we solve the regex problem of the FileWatcher task? The name of each client we process is the prefix of the filename; i.e. Sony_mmddyy.xls and we have an SSIS package per client. Now you might say "why have one per client, why not have one master one". Remember we are reworking an existing process and the original process had one massive stored procedure and one massive DTS package to handle ALL clients. We found that each client had a slight twist that made them different. The difference between some were negligible, but where there was noticeable differences the exception code proved to be more of a hassle than a help. Trying to make a change for a single customer or find a bug affecting a customer was a maintenance nightmare. We decided to build this shell within SSIS and make the changes necessary to accommodate each client. More packages, yes, but easier to maintain.
There you have it. A skeleton example of how to set up drop boxes as a way to manage data from users or clients. You can have an ftp process (through SSIS or some external program) drop files from different clients into named directories and the process would be the same.