March 12, 2009 at 8:26 pm
Hi,
I am looking to use SSIS to sort out a heap of files of the same type. That's easy enough, I am getting a value from the text file (putting it into a variable) and putting them into a directory of the same name. The problem is that there will be any number of files that will have the same filename. These may or may not have the same information. (Someone else will have to resolve these later).
What I have done before in VBA is check if the file is already in that directory, then add a new directory called Copy_02, check if the file is in there, if not then write it there if not create a directory called Copy_03 and continue until there is somewhere to write the file.
I put it in the different directories so that I don't mess with the filenames, but I guess I could also put a suffix on the filename to achieve the same thing. I am working on copies so it doesn't matter how I structure the dataset.
I can do the first part - move files into the new directory, but don't know where to start with dealing with the duplicates.
Jason
March 13, 2009 at 8:57 am
Jason,
You'll most likely need to do this with a Script Task. You can check for the existence of a file using the System.IO namespace, and rename the file using the Move() method. I would recommend writing the files to the same directory using a different name (add an 001, 002, etc., to the filename), but this methodology would work just as well if you create a new directory and move the file to that location.
Let me know if you need some examples of the above.
hth,
Tim
Tim Mitchell, Microsoft Data Platform MVP
Data Warehouse and ETL Consultant
TimMitchell.net | @Tim_Mitchell | Tyleris.com
ETL Best Practices
March 13, 2009 at 11:36 am
another suggestion may be to use SSIS Expressions to rename the file.
See attached for an example of adding the date to a filename.
ArkWare
March 15, 2009 at 3:00 am
Hi,
I think you are right - renaming the file would be neater. I will change my approach. I got this far with the folder option, but am having trouble with variables. I can't get the hang of these - always seem to not be able to update them, or get the right values for. This is what I have got:
Dim LoadedFolder As String
Public Sub Main()
'
' Add your code here
'
Dim XCounter As Int32
Dim FName As String
Dim BDir As String
BDir = "C:\\Import LAS\\TestData\\Loaded\\" + Dts.Variables("HoleID").Value.ToString
LoadedFolder = BDir
XCounter = 1
FName = My.Computer.FileSystem.GetName(Dts.Variables("varFileName").Value.ToString)
Do Until My.Computer.FileSystem.FileExists(LoadedFolder + "\\" + FName) = False
If My.Computer.FileSystem.DirectoryExists(LoadedFolder) = False Then
My.Computer.FileSystem.CreateDirectory(LoadedFolder)
End If
LoadedFolder = BDir + "\\" + FName + "_Copy_" + CStr(XCounter)
XCounter = XCounter + 1
Loop
Dts.TaskResult = Dts.Results.Success
Dts.VariableDispenser.LockForWrite("varLoadedFolder")
Dts.Variables("varLoadedFolder").Value = LoadedFolder
End Sub
End Class
What am I doing wrong?
Jason
March 16, 2009 at 8:28 pm
In the properties of the script task, add the variable into the read/write variables list.
Putting User::variableName or just the variable name should work (provided you didnt change the namespace).
that will open the variables up to being modified from within the script task.
One other suggestion would be to append the date/time to the file instead of using 001,002, etc. I used to do this for files that sometimes needed to be re-imported. I would note the date/time in the filename to know when the file was pulled in in case someone asked "what was the difference between the 8AM and the 10AM file?"
filename_20090316_1015PM.csv might help out in the long run in knowing when the file ran (instead of looking at the modified date and properties of the file).
Best of luck,
Steve
March 16, 2009 at 9:45 pm
SK (3/16/2009)
---filename_20090316_1015PM.csv might help out in the long run in knowing when the file ran (instead of looking at the modified date and properties of the file).
If I ever do this sort of thing, I put the date 'bit' at the beginning of the filename: YYYYMMDD_HHMM_Filename.csv - allowing you to do a folder-level sort on the filename and quickly see the file-load order.
Phil
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply