Overview
One common requirement is to import the latest file in a folder to append to a table for processing. The following package uses a ActiveX task to get the latest file name, a Dynamic Properties Task to set the source connection to it, followed by the usual processing.
Setup
Lets break this down. The first task is to setup 3 Global Variables in the Package, that will be used in the first ActiveX task.
The 3 Global Variables (GV’s) used are FileFolder, FilePrefix and FileName
- FileFolder holds the UNC path to the folder where the file you wish to process will reside.
- FilePrefix is the string the file always starts with. This enables you to choose a subset of files from a common area.
- FileName is the variable that will hold the name of the latest file. In the screenshot above, it is currently empty.
Now add an Connection of type Text File (Source)
. Choose a valid file name at this point, which we will reset later. Choose the appropriate format as you would do normally.
We now need to setup the initial ActiveX task that finds the name of
the latest file. Add an ActiveX task using your preferred method and add
the following code.
'********************************************************************** ' Visual Basic ActiveX Script '************************************************************************ Function Main() Dim filePreFix, folderName, fileName folderName = DTSGlobalVariables("FileFolder").Value filePreFix = DTSGlobalVariables("FilePrefix").Value 'Call the function, passing in the two Global Variables obtained above. fileName = LatestFile(folderName, filePreFix) DTSGlobalVariables("FileName").Value = fileName If DTSGlobalVariables("FileName").Value <> "" Then Main = DTSTaskExecResult_Success Else Main = DTSTaskExecResult_Failure End If End Function '---------------------------------------------------------------------- Private Function LatestFile(MyFolderName, MyFilePrefix) '---------------------------------------------------------------------- On Error Resume Next Dim oFSO, HighestDate, MyResultName, myFolder, file Set oFSO = CreateObject("Scripting.FileSystemObject") Set myFolder = oFSO.GetFolder(MyFolderName) MyResultName = "" HighestDate = DateValue("1/1/1995") For Each file In myFolder.files 'Check to make sure the file starts with the right stuff If UCase(Left(file.name,Len(MyFilePrefix))) = UCase(MyFilePrefix) Then 'Check last modified date If file.DateLastModified > HighestDate Then MyResultName = file.path HighestDate = file.DateLastModified End If End If Next LatestFile = MyResultName Set myFolder = Nothing Set oFSO = Nothing End Function
It should now look like
We now add an ActiveX task to run if the above
fails, which it will do if there are no files to process.
This will log that there was nothing to do. The main reason for this is
so you can check that the package actually ran. You will note that the
error logging script is longer than the actual code. 1
'********************************************************************** ' Visual Basic ActiveX Script '********************************************************************** Function Main() LogAction "C:\Logs\", "FileImport.Log", "No files found to process!" Main = DTSTaskExecResult_Success End Function '********************************************************************** Sub LogAction (folder, filename, strEntry) '********************************************************************** Dim strErrMsg, f, LogFile, oFS Set oFS = CreateObject("Scripting.FileSystemObject") MakeSureDirectoryTreeExists(folder) LogFile = folder & filename Const ForReading = 1, ForWriting = 2, ForAppending = 8 On Error Resume Next If Not (oFS.FileExists(LogFile)) Then oFS.CreateTextFile(LogFile) End If set f = oFS.OpenTextFile(LogFile, ForAppending) f.WriteLine "[ " & Now & " ] - " & strEntry f.close On Error Goto 0 set f = Nothing Set oFS = Nothing End Sub '********************************************************************** Function MakeSureDirectoryTreeExists(dirName) '********************************************************************** 'like it says on the tin Dim oFS, aFolders, newFolder, i Set oFS = CreateObject("Scripting.FileSystemObject") ' Check the folder's existence If Not oFS.FolderExists(dirName) Then ' Split the various components of the folder's name aFolders = split(dirName, "\") ' Get the root of the drive newFolder = oFS.BuildPath(aFolders(0), "\") ' Scan the various folder and create them For i = 1 To UBound(aFolders) newFolder = oFS.BuildPath(newFolder, aFolders(i)) If Not oFS.FolderExists(newFolder) Then oFS.CreateFolder newFolder End If Next End If Set oFS = Nothing End Function
It should now look like
We now add a On Failure
workflow
to point from the first ActiveX task to the Second. As discussed
earlier, this will only fire if the first Task finds no files to
process.
The next step is to add a Dynamic Properties Task, that sets
connection 1 to point to the file recovered in the first ActiveX step.
Setting this up can get a little baffling as the first screen rather unhelpfully looks like this:
The first thing to do is press Add...
and then navigate to the Connection 1 Datasource, highlight it and press Set...
Choose Global Variable (GV) for the Source and FileName as the Variable.
It should now look like
Click OK
twice and you should now have a One Connection, two ActiveX tasks, and a Dynamic Properties Task. Now add a On Success
Workflow from the first ActiveX task to a Dynamic Properties task.
Now add a Database Connection, Connection 2, so we can add a Data
Pump task between the Connection 1 & 2. At this point, Connection 1
should still be pointing at a valid file, so you can set source,
destination and transformations as you would do normally.
You may then wish to add a Optional SQL task to do any further
processing. In my case, a file is produced by an external task run on a
web server, which produces a file of new customers to be imported into
the database. The SQL Task loads the data into the customer tables, and
then truncates the import table.
Testing
Let’s review what we should now have:
If we execute the Dynamic Property Task, we
should see that the property of Connection 1 gets set to an empty
string, because the GV is currently empty. (Note: You need to press cancel after reviewing the Properties of Connection 1). If we execute the first ActiveX task, we should see the GV is set to the name of the latest file the folder specified in the GV
FileFolder. Execute the Dynamic Properties task again & you should
see that the Properties of Connection 1 now reflect this.
Move temporarily all files from the folder specified in the GV
FileFolder, and run the entire package. This should fail and the
workflow process will take you to the second ActiveX step and write out
an error log in the location specified.
Reset the GV FileName back to blank, run the
Dynamic Properties step & save your package. This will save
Connection 1 as a blank string, which is probably wise. You can now
move a selection of files back to the folder specified in the GV
FileFolder, and execute the package end to end. All being well, the
first step should get the latest file name, the second set Connection 1
to it and the Data Pump task pull the data across. If you added the
optional SQL Task, this should also fire at the correct time.
Remember to reset the GV FileName back to blank and re-run the Dynamic Properties task before you save your final version.
1 This is why properly implemented error handling rarely gets done. 😉