April 2, 2013 at 7:14 am
Hi all,
I have a csv file with data from 2005, and data is appended to it daily. I'm able to use a conditional split to include data for the current 2 years and exclude the rest.
But I just wanted to know if something can be done in the Script Task (Control Flow) itself, so that the flat file connection manager picks up only the current year's data. I don't want to pass all the rows to the conditional split and then exclude them... any help on this?
Thanks
April 2, 2013 at 7:25 am
Why not archive the file (or the very old data in the file) rather than letting it grow?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 2, 2013 at 7:28 am
Archiving the file is not in my scope; the file is generated by a third-party vendor on a daily basis, and I'm just accessing their backup csv files to feed data into my database.
Is there any way to read only the current year's (or last 30 days') data from the CSV file, instead of reading everything and using the split?
thanks
April 8, 2013 at 11:59 pm
How are your C# or VB.NET skills? I am thinking pre-processing the file might be one way. You could use a Script Task to read the file in reverse order until you found data that has already been imported. Depending on how much data you expect and how much memory you have available, you could either store the data as you worked backwards through the file and then write it out in one shot when you found old data, or skip storing it and, once you found old data, change course and read the file forward to the end, writing a new file to disk as you went. Either way you would end up with a file that you could import straightaway using a Data Flow Task. It would be tedious but not terribly difficult if you have some .NET skills.
See the second example in this article for how to read a file in reverse order; the first example shows how to write a file byte by byte. It is for .NET 4.0, but these techniques work in 2.0 as well; the 2.0 article just lacks the nice example of reading a file in reverse using Seek.
http://msdn.microsoft.com/en-us/library/system.io.filestream.seek(v=vs.100).aspx
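In case it helps, here is a rough, untested sketch of that reverse-scan idea in VB.NET. The file paths, the single-byte (ASCII) encoding, and the IsOldData test are all placeholders to adjust; it also assumes the date sits at the start of each row and that rows are in chronological order:

```vbnet
Imports System
Imports System.IO
Imports System.Text

Public Class ReverseScan

    ' Placeholder test: adjust to the real date format in the file.
    ' Here we assume each row starts with the four-digit year.
    Private Shared Function IsOldData(ByVal line As String) As Boolean
        Return Not line.StartsWith(DateTime.Now.Year.ToString())
    End Function

    ' Read the single line that starts at the given byte offset
    ' (assumes a single-byte encoding such as ASCII).
    Private Shared Function ReadLineAt(ByVal fs As FileStream, ByVal offset As Long) As String
        fs.Seek(offset, SeekOrigin.Begin)
        Dim sb As New StringBuilder()
        Dim b As Integer = fs.ReadByte()
        While b <> -1 AndAlso b <> 10 ' 10 = line feed
            If b <> 13 Then sb.Append(ChrW(b)) ' skip carriage return
            b = fs.ReadByte()
        End While
        Return sb.ToString()
    End Function

    ' Scan backwards from the end of the file until an old row is found,
    ' then copy everything from the first new row onward to a new file.
    Public Shared Sub TrimFile(ByVal inputPath As String, ByVal outputPath As String)
        Dim cutoff As Long = 0 ' Copy the whole file if no old row is found.
        Using fs As New FileStream(inputPath, FileMode.Open, FileAccess.Read)
            Dim newStart As Long = fs.Length ' Start of the earliest new row seen so far.
            Dim pos As Long = fs.Length - 1
            While pos > 0
                fs.Seek(pos, SeekOrigin.Begin)
                ' Every line feed marks the start of the next line.
                If fs.ReadByte() = 10 AndAlso pos + 1 < fs.Length Then
                    Dim line As String = ReadLineAt(fs, pos + 1)
                    If IsOldData(line) Then
                        cutoff = newStart
                        Exit While
                    End If
                    newStart = pos + 1
                End If
                pos -= 1
            End While
        End Using
        ' Copy everything from the cutoff onward into a fresh file.
        Using inFs As New FileStream(inputPath, FileMode.Open, FileAccess.Read)
            Using outFs As New FileStream(outputPath, FileMode.Create, FileAccess.Write)
                inFs.Seek(cutoff, SeekOrigin.Begin)
                Dim buffer(8191) As Byte
                Dim bytesRead As Integer = inFs.Read(buffer, 0, buffer.Length)
                While bytesRead > 0
                    outFs.Write(buffer, 0, bytesRead)
                    bytesRead = inFs.Read(buffer, 0, buffer.Length)
                End While
            End Using
        End Using
    End Sub

End Class
```

Once TrimFile has run, the Data Flow Task can point its Flat File Connection Manager at the trimmed output file. (If the file has a header row, you would need to copy that separately.)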
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
April 9, 2013 at 12:46 am
opc.three (4/8/2013)
How are your C# or VB.NET skills? I am thinking pre-processing the file might be one way. You could use a Script Task to read the file in reverse order until you found data that has already been imported. Depending on how much data you expect and how much memory you have available, you could either store the data as you worked backwards through the file and then write it out in one shot when you found old data, or skip storing it and, once you found old data, change course and read the file forward to the end, writing a new file to disk as you went. Either way you would end up with a file that you could import straightaway using a Data Flow Task. It would be tedious but not terribly difficult if you have some .NET skills. See the second example in this article for how to read a file in reverse order; the first example shows how to write a file byte by byte. It is for .NET 4.0, but these techniques work in 2.0 as well; the 2.0 article just lacks the nice example of reading a file in reverse using Seek.
http://msdn.microsoft.com/en-us/library/system.io.filestream.seek(v=vs.100).aspx
Neat idea. Maybe using a Script Component source would mean that this could be done in one hit in a data flow (though I'm not sure how you would gracefully 'escape' from the data flow when the old records are reached).
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 9, 2013 at 10:08 am
Phil Parkin (4/9/2013)
opc.three (4/8/2013)
How are your C# or VB.NET skills? I am thinking pre-processing the file might be one way. You could use a Script Task to read the file in reverse order until you found data that has already been imported. Depending on how much data you expect and how much memory you have available, you could either store the data as you worked backwards through the file and then write it out in one shot when you found old data, or skip storing it and, once you found old data, change course and read the file forward to the end, writing a new file to disk as you went. Either way you would end up with a file that you could import straightaway using a Data Flow Task. It would be tedious but not terribly difficult if you have some .NET skills. See the second example in this article for how to read a file in reverse order; the first example shows how to write a file byte by byte. It is for .NET 4.0, but these techniques work in 2.0 as well; the 2.0 article just lacks the nice example of reading a file in reverse using Seek.
http://msdn.microsoft.com/en-us/library/system.io.filestream.seek(v=vs.100).aspx
Neat idea. Maybe using a Script Component source would mean that this could be done in one hit in a data flow (though I'm not sure how you would gracefully 'escape' from the data flow when the old records are reached).
Good thought. I think a Script Component set up as a Source could keep it all in one Data Flow, which might avoid an intermediate file. The only tradeoff I can think of is having to parse the row manually in .NET, or maybe in a Derived Column Transform if pushing a single-column row down the pipeline. I have not used a Script Component Source in a long time, but I think after we pushed all the new data onto the pipeline as new rows and found the first old data, we would just set the result to Success and the Data Flow would happily move on.
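For what it's worth, here is a minimal, untested sketch of how that might look inside the generated ScriptMain. Everything named here is an assumption: the output is assumed to have a single column called RawLine, IsOldData is a placeholder date test, and ReadLinesInReverse stands in for a hypothetical helper (not shown) that yields lines from the end of the file backwards, e.g. using the Seek technique from the article above:

```vbnet
' Script Component configured as a Source, with one output column (RawLine).
Public Class ScriptMain
    Inherits UserComponent

    Public Overrides Sub CreateNewOutputRows()
        ' ReadLinesInReverse is a hypothetical helper yielding lines newest-first.
        For Each line As String In ReadLinesInReverse("C:\temp\input.txt")
            ' As soon as an already-imported row shows up, stop pushing rows;
            ' returning from here ends the source and the Data Flow moves on.
            If IsOldData(line) Then Exit For
            Output0Buffer.AddRow()
            Output0Buffer.RawLine = line
        Next
    End Sub
End Class
```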
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
April 10, 2013 at 5:22 am
You don't tell us how big the file is; you might not be able to hold it all in memory to read it in reverse.
Anyhow, you can read the file line by line and write it to another file.
There must be something in each record which tells the date or year.
If the file is sorted, you can use a pretty simple and fast method like:
Imports System
Imports System.IO

Public Class Testa
    Public Shared Sub Test()
        Using reader As New StreamReader("C:\temp\input.txt")
            Using writer As New StreamWriter("C:\temp\output.txt")
                ' Loop over each line in the file until the end is reached.
                Dim line As String = reader.ReadLine()
                Do While line IsNot Nothing
                    ' Keep only rows newer than the cutoff value.
                    If Mid(line, 1, 9) > "something" Then ' Adjust length and cutoff
                        writer.WriteLine(line)
                    End If
                    line = reader.ReadLine()
                Loop
            End Using
        End Using
    End Sub
End Class
Thanks
Gosta
April 10, 2013 at 5:25 am
Gosta Munktell (4/10/2013)
You don't tell us how big the file is; you might not be able to hold it all in memory to read it in reverse.
You do not need to store it as you read through it in reverse, or you could; both options were mentioned above, and the size of the file and available memory would likely dictate the choice.
Anyhow, you can read the file line by line and write it to another file.
There must be something in each record which tells the date or year.
If the file is sorted, you can use a pretty simple and fast method like:
Imports System
Imports System.IO

Public Class Testa
    Public Shared Sub Test()
        Using reader As New StreamReader("C:\temp\input.txt")
            Using writer As New StreamWriter("C:\temp\output.txt")
                ' Loop over each line in the file until the end is reached.
                Dim line As String = reader.ReadLine()
                Do While line IsNot Nothing
                    ' Keep only rows newer than the cutoff value.
                    If Mid(line, 1, 9) > "something" Then ' Adjust length and cutoff
                        writer.WriteLine(line)
                    End If
                    line = reader.ReadLine()
                Loop
            End Using
        End Using
    End Sub
End Class
Thanks
Gosta
What do you think using a Flat File Connection Manager to read the file and then applying the Conditional Split in the Data Flow accomplishes? :Whistling:
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
April 10, 2013 at 5:51 am
Hi All,
Thanks for the replies.
As you say, to keep it simple I use a conditional split to filter the records from the flat file and load them. But the problem is that the flat file source reads all the records. I wanted to restrict it to read only the required records, not the entire file. Probably I have to implement some logic within the script component to read only the required data from the flat file.