How to consider multiple files in source directory, with oldest file first

  • Hello Sir,

    With only one file in source folder, and i hard coded that file name & path in the script task, my package is complete (as below) and works fine

    1) Do Clear Some Temp Tables in Sql Server (Execute Sql Task)

    2) Generate a new formatted File (Script Task: from onse source file adding commas between each value then generate a new file)

    3) move the source file to processed directory

    4) load the newly generated file to temp table in sql server (Data flow task)

    5) Move the newly generated file to SQLPROCESS folder

    6) remove dupicates in temp table and update some values based on quert (using Execute Sql Task)

    7) Copy the data to mail table (using Execute Sql Task)

    8) Log the information to sql server table

    now the trick (new change) is they can have multiple source files in the directory

    if it is, then it needs to consider which one the oldest file (by file creation date) then execute all 8 steps on that file, then next old file for all steps , then next old (i think it needs to handle in step2)

    for example

    -------------

    file1(creation date is May/25/2010 10:23:01 AM)

    file2(creation date is May/24/2010 11:27:01 AM)

    file3(creation date is May/25/2010 01:03:10 PM)

    so in the above case it needs to consider first file2 then file 1, file 3

    Please help me give me your valuable advices, please....

    Thanks in advance

  • just here i am adding my script task code...

    Public Class ScriptMain

    Public Sub Main()

    Try

    Dim line As String

    Dim semiAt As Int64

    Dim fullLine As String

    Dim textFileOutput As String

    Dim textFileErrOutput As String

    Dim isPreviousLineIsSingleRow As Boolean

    Dim ctr As Int64

    Dim failCtr As Int64

    Dts.Variables("InsertedRecCount").Value = ctr

    Dim currentDate As Date = DateTime.Now

    Dim writeFileName As String

    Dim path As String = "C:\Users\asiti\Desktop\ChaseFeed SSIS INPUT\Final\Q2.TXT"

    Dim sr As StreamReader = New StreamReader(path)

    Dim isInitialScrap As Boolean

    Dim temp As String

    ctr = 0

    failCtr = 0

    Dim fileCreateDate As Date = File.GetCreationTime(path)

    Dts.Variables.Item("User::FileCreatedDate").Value = fileCreateDate

    Do

    line = sr.ReadLine()

    If (line Is Nothing) Then Exit Do

    'End If

    If (Not (line.StartsWith(""""))) Then

    Continue Do

    End If

    If (Not (isInitialScrap)) Then

    line = sr.ReadLine()

    line = sr.ReadLine()

    line = sr.ReadLine()

    isInitialScrap = True

    Continue Do

    End If

    semiAt = line.IndexOf(";")

    'Eliminating record that is empty (no value for all columns)

    If ((Len(line.Replace("""", "").Trim()) < 2) And (Len(fullLine) = 0)) Then

    If (semiAt < 0) Then line = sr.ReadLine()

    Continue Do

    End If

    If (semiAt > 0) Then

    If (Len(fullLine) = 0) Then

    line = Trim(line.Replace(vbCr & vbLf, "").Replace(""" """, """,""").Replace(";", ""))

    line = line & vbCrLf

    'ctr = ctr + 1

    textFileOutput += line

    Else

    fullLine += " " & line

    fullLine = Trim(fullLine.Replace(vbCr & vbLf, "").Replace(""" """, """,""").Replace(";", ""))

    fullLine += vbCrLf

    textFileOutput += fullLine

    End If

    ctr = ctr + 1

    isPreviousLineIsSingleRow = True

    Else

    fullLine = line

    End If

    If (isPreviousLineIsSingleRow) Then

    fullLine = ""

    isPreviousLineIsSingleRow = False

    End If

    Loop Until line Is Nothing

    writeFileName = currentDate.Now.ToString

    writeFileName = writeFileName.Replace("/", "-").Replace(":", "-").Replace("M", "M)").Replace(" ", "(").Replace("(P", "P").Replace("(A", "A")

    writeFileName = "Feed " + Trim(writeFileName) + ".txt"

    Dts.Variables("DBFeedFile").Value = writeFileName

    Using writer As StreamWriter = New StreamWriter("C:\Users\asiti\Desktop\ChaseFeed SSIS INPUT\Final\Database Feed\" + writeFileName)

    writer.Write(textFileOutput)

    End Using

    Dts.Variables("FailedRecCount").Value = failCtr

    Dts.Variables("InsertedRecCount").Value = ctr

    sr.Close()

    Dts.TaskResult = Dts.Results.Success

    Catch E As Exception

    MsgBox(E.Message)

    End Try

    End Sub

    End Class

    Thanks & Regards

    asita

  • Could any one please share ideas......

    Thanks

    asita

  • I would possibly take the below approach..

    1. using script task collect all filenames and their time stamps if not part of the fileName

    2. Load into temp table and then order them based on the time stamp in Execute SQL task and Pass each as a parameter which will take each result set in the same order to the For Each Loop task or script task.. so that they are processed in order required for sure

  • You can also use T-SQL (some dedication needed!) to get file listings. Check out this link[/url].

    I am thinking that it should be possible to send file name and creation date to a physical table which you've built for the purpose. Then just process the files in any order desired by feeding the table entries into a foreach loop.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • You can sort the files into an object variable in a Script Task, which can then be iterated through in a For Each Loop container (ADO Enumerator). Attached is a modified script that I found (can't take full credit for it) and modified to use to go through production files in order.

    The main vars that you'll need to set up are an object (SortedDataFileTable), a string for the file mask with wild cards (FileMask). You'll also need a connection manager (InputDir, pointing the directory where your files are). In the script, set sortOrder (line 31) to what you need to sort on, using the DataColumn name choices in the CreateFileAttributeCols sub (Name, CreationTime, etc.)

    In the Script Designer itself, add a reference to System.XML. Attached is the script itself for this scenario.

    In the FEL, set it to Rows in the first table (selecting the object variable you created). For the var mappings, set a CurrentFile var on index 0.

    This will assigned a file to the CurrentFile var in ascending CreationDate order.

  • You can use WMI Task for the same.

  • Here it is what i am doing

    thanks to all for your best ideas......

    just schedule the job for every one hour then in my script (second step ) take the oldes file (i paste the code below) then it executes the old file only

    since my feed is acually once a day, so even if it is hourly check (just checks any file exist in the directory if not then dont execute the package)

    so i am thinking it is a little bit ok

    but here in my problem it is ok because i usually got one feed a day, if it is morethan very frequent then i need to look into something else??

    if anyone has any comments please respond

    but i really appreciate your all help

    ======================================== Code

    public class Scriptmain

    Public Sub Main()

    .....................

    ....................

    ....................

    Dim di As New DirectoryInfo("C:\Users\asitti\Desktop\INPUT\")

    Dim aryFi As FileInfo() = di.GetFiles("*.txt")

    Array.Sort(aryFi, AddressOf FileInfoComparison)

    Dim fi As FileInfo

    Dim dirinfo As DirectoryInfo

    Dim allFiles() As FileInfo

    Dim path As String = ""

    'For Each fi In aryFi

    ' strFileSize = (Math.Round(fi.Length / 1024)).ToString()

    ' MsgBox("File Name: {0}" + fi.Name)

    ' MsgBox("File Full Name: {0}" + fi.FullName)

    ' MsgBox("File Size (KB): {0}" + strFileSize)

    ' MsgBox("File Extension: {0}" + fi.Extension)

    ' MsgBox("Last Accessed: {0}" + fi.LastAccessTime.ToString)

    'Next

    path = aryFi(0).FullName.ToString

    Dim sr As StreamReader = New StreamReader(path)

    Dim isInitialScrap As Boolean

    Dim temp As String

    ctr = 0

    failCtr = 0

    Dim fileCreateDate As Date = File.GetCreationTime(path)

    Dts.Variables.Item("User::FileCreatedDate").Value = fileCreateDate

    .................

    ........................

    ...................................

    .........................................

    end sub

    Private Shared Function FileInfoComparison(ByVal fi1 As FileInfo, ByVal fi2 As FileInfo) As Integer

    Return Date.Compare(fi1.CreationTime, fi2.CreationTime)

    End Function

    end class

    =-=======================================

    Thanks alot

    asita

  • how about this

    1. Use WMI Task to get latest file based on the time stamp

    2. use a Data flow task to perform the operation

    In case you need to process multiple files you can keep the Step 1 and 2 in a ForEach loop . Also while adding the data keep filename as the stamp for the record -- easy for you to keep a log and track the records back to a file , if needed.

    3. Once the load is done copy the source file to a seperate location for audit pupose if you are not doing this by any other process.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply