December 15, 2016 at 9:45 am
I've got a situation where csv files are superseded by another in a given month; they gather on an FTP site for collection at my convenience.
they do follow a decent naming convention, like
FileName_20161211.csv,
FileName_20161212.csv,
FileName_20161220.csv
since i need to process only the "latest" file,the only way i could think of doing it was in a script task: find parse the file names, pull out the datelike-substring, and either assign a variable for the one-true-filename to be processed, or delete/archive the files that don't match the latest; then i'd let a foreach loop process the only file that was left.
would anyone suggest doing it any different than that?
Lowell
December 15, 2016 at 10:01 am
That's one way of doing it yes.
If you didn't want to use a Script Task, you could use a For Loop container and then assign the filename to a variable and extract the datestring from there (which you can convert to a DATE type varaible) and compare there. It might be easier to create than writing c#/vb.net to do, as it would only require a loop with a single Formula Expression node inside.
Edit:
I imagine a statement like this would work (I haven't tested this as I don't have SSDT on this laptop, but...)
--I'm assuming that the filename, without extension, is passed.
(DT_DATE) LEFT(RIGHT(@User::Filename, 8),4) + "-" + LEFT(RIGHT(@User::Filename, 4),2) + "-" + RIGHT(@User::Filename) > @User::MaxDate ? @User::MaxDate = LEFT(RIGHT(@User::Filename, 8),4) + "-" + LEFT(RIGHT(@User::Filename, 4),2) + "-" + RIGHT(@User::Filename) : @User::MaxDate = @User::MaxDate
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
December 15, 2016 at 10:01 am
Lowell (12/15/2016)
I've got a situation where csv files are superseded by another in a given month; they gather on an FTP site for collection at my convenience.they do follow a decent naming convention, like
FileName_20161211.csv,
FileName_20161212.csv,
FileName_20161220.csv
since i need to process only the "latest" file,the only way i could think of doing it was in a script task: find parse the file names, pull out the datelike-substring, and either assign a variable for the one-true-filename to be processed, or delete/archive the files that don't match the latest; then i'd let a foreach loop process the only file that was left.
would anyone suggest doing it any different than that?
I'd do this using the script task method. No need for any fancy parsing though, do it like this:
string[] files = Directory.GetFiles(sourceFolder, "*.csv");
Array.Sort(files);
Array.Reverse(files);
After doing the above, the 'latest' file name should be in files[0], which you can pass to an SSIS variable and use for your subsequent data source.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
December 16, 2016 at 7:11 am
both examples were extremely helpful, folks, thank you!
I did it both ways as my proof of concept; I was already leaning towards Phil's example of a script task, but I've got a lot of folks whose eyes cross when they have to read c# code, so i left my code in place, but disabled.
since i didn't really need to parse the name, i just modified Thom's example to use the highest filename in the loop;
my syntax was very similar:
@[User::MaxFileName] =@[User::CurrentFileName] > @[User::MaxFileName] ?@[User::CurrentFileName]:@[User::MaxFileName]
my code for my script task:
try
{
//User::ArchiveDirectory,User::FTPFileFilter,User::WorkingDirectory
//User::ForEachFileFilter
//files follow a nice naming convention like Filename_20161201_081015.csv
//so simply sorting and getting the highest file returns the "latest" file received.
//this would not work fi the datestring in filename was mm/dd/yyyy style
string ArchiveDirectory = (string)Dts.Variables["ArchiveDirectory"].Value;
string WorkingDirectory = (string)Dts.Variables["WorkingDirectory"].Value;
string FTPFileFilter = (string)Dts.Variables["FTPFileFilter"].Value;
string[] files = System.IO.Directory.GetFiles(WorkingDirectory, FTPFileFilter);
Array.Sort(files);
Array.Reverse(files);
string JustTheFileName = System.IO.Path.GetFileName(files[0]);
Dts.Variables["ForEachFileFilter"].Value = JustTheFileName;
//MessageBox.Show(JustTheFileName);
Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception ex)
{
Dts.Events.FireError(-1, "Main", ex.Message, "", 0);
Dts.TaskResult = (int)ScriptResults.Failure;
}
Lowell
December 16, 2016 at 7:35 am
Phil's definitely is cleaner and faster, but yes does depend if you want to introduce a different language. Personally, I find Phil's very easy to read, and I don't write C# (although it's not exactly that different to VB.net, which I can write). It is only a few lines! :hehe:
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
December 16, 2016 at 11:16 am
Glad you solved it. Not sure whether you know this, but if you add
using System.IO;
into your Using block (towards the top of your code module)
it allows you to use the shorthand version (Directory.GetFiles() rather than System.IO.Directory.GetFiles()), which is a bit easier to read, IMO.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
December 16, 2016 at 11:34 am
Phil Parkin (12/16/2016)
Glad you solved it. Not sure whether you know this, but if you addusing System.IO;
into your Using block (towards the top of your code module)
it allows you to use the shorthand version (Directory.GetFiles() rather than System.IO.Directory.GetFiles()), which is a bit easier to read, IMO.
Yes, I knew that Phil thanks!
Right now its a force of habit from previous job; that was the style they wanted in their scripts, is it's still sticking with me for now.
I heard that at that previous place, they had a developer that got frustrated copy pasting code from the interwebz without finding the Using blocks, and he had to hunt for the appropriate library references, so to make him happy, they adopted that style.
Lowell
December 16, 2016 at 12:13 pm
Lowell (12/16/2016)
Phil Parkin (12/16/2016)
Glad you solved it. Not sure whether you know this, but if you addusing System.IO;
into your Using block (towards the top of your code module)
it allows you to use the shorthand version (Directory.GetFiles() rather than System.IO.Directory.GetFiles()), which is a bit easier to read, IMO.
Yes, I knew that Phil thanks!
Right now its a force of habit from previous job; that was the style they wanted in their scripts, is it's still sticking with me for now.
I heard that at that previous place, they had a developer that got frustrated copy pasting code from the interwebz without finding the Using blocks, and he had to hunt for the appropriate library references, so to make him happy, they adopted that style.
My pleasure. If anyone were to try to persuade me using that argument, they'd better come prepared for a fight!!
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply