SSIS Process only the "Latest" file based on name

  • I've got a situation where csv files are superseded by another in a given month; they gather on an FTP site for collection at my convenience.

    they do follow a decent naming convention, like

    FileName_20161211.csv,

    FileName_20161212.csv,

    FileName_20161220.csv

    since i need to process only the "latest" file,the only way i could think of doing it was in a script task: find parse the file names, pull out the datelike-substring, and either assign a variable for the one-true-filename to be processed, or delete/archive the files that don't match the latest; then i'd let a foreach loop process the only file that was left.

    would anyone suggest doing it any different than that?

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • That's one way of doing it yes.

    If you didn't want to use a Script Task, you could use a For Loop container and then assign the filename to a variable and extract the datestring from there (which you can convert to a DATE type varaible) and compare there. It might be easier to create than writing c#/vb.net to do, as it would only require a loop with a single Formula Expression node inside.

    Edit:

    I imagine a statement like this would work (I haven't tested this as I don't have SSDT on this laptop, but...)

    --I'm assuming that the filename, without extension, is passed.

    (DT_DATE) LEFT(RIGHT(@User::Filename, 8),4) + "-" + LEFT(RIGHT(@User::Filename, 4),2) + "-" + RIGHT(@User::Filename) > @User::MaxDate ? @User::MaxDate = LEFT(RIGHT(@User::Filename, 8),4) + "-" + LEFT(RIGHT(@User::Filename, 4),2) + "-" + RIGHT(@User::Filename) : @User::MaxDate = @User::MaxDate

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Lowell (12/15/2016)


    I've got a situation where csv files are superseded by another in a given month; they gather on an FTP site for collection at my convenience.

    they do follow a decent naming convention, like

    FileName_20161211.csv,

    FileName_20161212.csv,

    FileName_20161220.csv

    since i need to process only the "latest" file,the only way i could think of doing it was in a script task: find parse the file names, pull out the datelike-substring, and either assign a variable for the one-true-filename to be processed, or delete/archive the files that don't match the latest; then i'd let a foreach loop process the only file that was left.

    would anyone suggest doing it any different than that?

    I'd do this using the script task method. No need for any fancy parsing though, do it like this:

    string[] files = Directory.GetFiles(sourceFolder, "*.csv");

    Array.Sort(files);

    Array.Reverse(files);

    After doing the above, the 'latest' file name should be in files[0], which you can pass to an SSIS variable and use for your subsequent data source.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • both examples were extremely helpful, folks, thank you!

    I did it both ways as my proof of concept; I was already leaning towards Phil's example of a script task, but I've got a lot of folks whose eyes cross when they have to read c# code, so i left my code in place, but disabled.

    since i didn't really need to parse the name, i just modified Thom's example to use the highest filename in the loop;

    my syntax was very similar:

    @[User::MaxFileName] =@[User::CurrentFileName] > @[User::MaxFileName] ?@[User::CurrentFileName]:@[User::MaxFileName]

    my code for my script task:

    try

    {

    //User::ArchiveDirectory,User::FTPFileFilter,User::WorkingDirectory

    //User::ForEachFileFilter

    //files follow a nice naming convention like Filename_20161201_081015.csv

    //so simply sorting and getting the highest file returns the "latest" file received.

    //this would not work fi the datestring in filename was mm/dd/yyyy style

    string ArchiveDirectory = (string)Dts.Variables["ArchiveDirectory"].Value;

    string WorkingDirectory = (string)Dts.Variables["WorkingDirectory"].Value;

    string FTPFileFilter = (string)Dts.Variables["FTPFileFilter"].Value;

    string[] files = System.IO.Directory.GetFiles(WorkingDirectory, FTPFileFilter);

    Array.Sort(files);

    Array.Reverse(files);

    string JustTheFileName = System.IO.Path.GetFileName(files[0]);

    Dts.Variables["ForEachFileFilter"].Value = JustTheFileName;

    //MessageBox.Show(JustTheFileName);

    Dts.TaskResult = (int)ScriptResults.Success;

    }

    catch (Exception ex)

    {

    Dts.Events.FireError(-1, "Main", ex.Message, "", 0);

    Dts.TaskResult = (int)ScriptResults.Failure;

    }

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Phil's definitely is cleaner and faster, but yes does depend if you want to introduce a different language. Personally, I find Phil's very easy to read, and I don't write C# (although it's not exactly that different to VB.net, which I can write). It is only a few lines! :hehe:

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Glad you solved it. Not sure whether you know this, but if you add

    using System.IO;

    into your Using block (towards the top of your code module)

    it allows you to use the shorthand version (Directory.GetFiles() rather than System.IO.Directory.GetFiles()), which is a bit easier to read, IMO.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Phil Parkin (12/16/2016)


    Glad you solved it. Not sure whether you know this, but if you add

    using System.IO;

    into your Using block (towards the top of your code module)

    it allows you to use the shorthand version (Directory.GetFiles() rather than System.IO.Directory.GetFiles()), which is a bit easier to read, IMO.

    Yes, I knew that Phil thanks!

    Right now its a force of habit from previous job; that was the style they wanted in their scripts, is it's still sticking with me for now.

    I heard that at that previous place, they had a developer that got frustrated copy pasting code from the interwebz without finding the Using blocks, and he had to hunt for the appropriate library references, so to make him happy, they adopted that style.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (12/16/2016)


    Phil Parkin (12/16/2016)


    Glad you solved it. Not sure whether you know this, but if you add

    using System.IO;

    into your Using block (towards the top of your code module)

    it allows you to use the shorthand version (Directory.GetFiles() rather than System.IO.Directory.GetFiles()), which is a bit easier to read, IMO.

    Yes, I knew that Phil thanks!

    Right now its a force of habit from previous job; that was the style they wanted in their scripts, is it's still sticking with me for now.

    I heard that at that previous place, they had a developer that got frustrated copy pasting code from the interwebz without finding the Using blocks, and he had to hunt for the appropriate library references, so to make him happy, they adopted that style.

    My pleasure. If anyone were to try to persuade me using that argument, they'd better come prepared for a fight!!

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply