November 11, 2009 at 2:32 pm
Hi All,
I have a requirement to load few files of same format & similar filename format using an SSIS package.
Filename format - example : ABC_PQR_??????.csv where ?????? can be any integer.
The import directory will have many files of this above format.
I know this can be done using a foreach loop container to loop through foreach file enumerator - as explained here:
But - here is the catch - the requirement is to process only one of the file while the package executes. When the package is triggered through SQL Agent - it will process another (any one amongst the available files). This is to have a better flexibility and control so that which execution has processed which file can be determined.
Is there a way to use this foreach loop container to pick one file and process - and then not to process the other files?
or to put otherwise:
How to process a single (random) file with dynamic filename from bunch of similar named file using SSIS?
Thanks ...
[font="Comic Sans MS"]--
Sabya[/font]
November 11, 2009 at 8:20 pm
One approach is to create a boolean variable and set it to false before the for each loop. It will indicate if a file has been selected.
The first step in the for each loop will check the file name.
If the boolean variable is true or the file name is not wanted, skip the rest of the steps. If the file name is wanted then process it. At the end, set the boolean variable to true.
Another approach is to get a list of the files in the folder. One way to do this is with a script task. Then call a data flow task to select one file from the list and put its name into a variable for the flat file connection manager.
Then call a data flow task to process the file.
November 11, 2009 at 9:53 pm
Just curious what kind of process would require loading a single random file from a given directory?
Also to be considered is if you can process the same file twice? If your business requirement doesn't allow this, you should archive the file after processing.
hth,
Tim
Tim Mitchell
TimMitchell.net | @tmitch.net | Tyleris.com
ETL Best Practices
November 12, 2009 at 7:02 am
You don't have to use the foreach loop if only one file need to process a time but you can use it to read all the file names at the beginning. Here is the steps you can try
1. Use foreach loop to read all the files. Inside the foreach loop write a script file, so the file names can be written to a table e.g. ProcessFile. The table has two columns: ProcessFileName ( contains file name) and ProcessIndicator ( boolean Y/N). The default value is (N)
2. Create a SQL Task to read from the table ProcessFile and select the first file name that has a ProcessIndicator (N)
3. Process the file name read from step 2
4. Create another SQL Task to change the ProcessIndicator to (Y)
November 12, 2009 at 7:59 am
Thanks guys 🙂
@tim-2 : It's not a business requirement - more of audit requirement that each load (identified by an unique loadId) would process only a single file - so that any problem (or record) found on the staging table can be identified with the loadid ..
and yes - archiving is being done after the feed is processed.
@kbatta : I couldn't quite catch the first option you provided. But the 2nd approach is simalar to what Shuanna has suggested.
@Shuanna : Thanks for the heads up.
[font="Comic Sans MS"]--
Sabya[/font]
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply