How to read data from file (not well formatted) with ssis

  • Hello All,

    could you please assist me how can i read this attched file data to the sql server table

    i need to load this text file data to sql table using ssis, (the file provider cannot change the format or structure)

    here actual row delimiter is ; (some rows split into two rows) and column delimiter is nothing seems to be a single space

    here what i am thinking is (the possible ways could be),

    1) through script task add some column delimiter (such as , or - or . or some thing like) then make everuthing record as a single row(no splits)

    if i am planning on right way please tell me some samples how to read each line and how to add delimiter to it

    or give me is there other easy ways to do this kind of scrap

    Please Help me

    Greatful to you alll

    thanks in advance

    asita

  • I am a programmer first, so I would think of writing a file parsing routine in a .Net language (VB/C#) and then reference the library to retrieve the parsed resultset in SSIS package and put the data in the database tables.

    Now that I've written the above sentence, I think it should be doable through a script task too.

    1.) Create a collection with each item containing the entire row with all values in ""

    2.) Iterate through the collection and then get each individual items in another collection

    3.) Remove the quotes and you have your value.

    Does this make sense?

  • Hello Rjn,

    Thanks for your reply, i know program a little bit , but not my main stream so could you please give some sample or some example or could you please direct me for any sample links

    please

    Thanks in advance

    asita

  • please share any ideas

    please

    thanks

    asita

  • asita (5/17/2010)


    please share any ideas

    please

    thanks

    asita

    A BCP format file and a BULK INSERT in a stored proc would load this in a flash. Are you allowed to make a BCP format file in the same directory as the file you want to load? I'm no SSIS expert, but I believe the BULK TASK (not sure what it's called exactly) can also make use of a BCP format file.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • On second thought.... doesn't SSIS allow you to spec a "space" as a delimiter and to also include a quote as a text qualifier?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The problem here is not one of column and row delimiters, rather that some of the source rows are split over two rows and some only one, making a straightforward flat-file connection fail.

    I can think of a few ways to get this done in SSIS, here are two:

    1) Pre-process the file in script to get it in one-row-per-record format and then use standard SSIS tools to process it further.

    2) Set up a flat-file connection with a single text column per row and feed that into a Script Component which can then create the various output variables after doing the parsing.

    (2) will probably appeal to the 'single-pass' enthusiasts out there, but (1) is simpler.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • I am not a core programmer.. i have used similar code as below in the past (got it from a website, not sure where) .. I have modified it a little for your purpose, It wont give you the final output but might be a good starting point.. Someone good at .Net might help you to make it better

    Link with similar code - http://www.bimonkey.com/2009/06/flat-file-source-error-the-column-delimiter-for-column-columnname-was-not-found/ [/URL]

    Text File Properties

    Format - delimited

    Text qualifier - None

    Header/Row delimeter - ;

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

    '

    ' Add your code here

    '

    Dim strRow As String

    Dim strColSeperator As String

    Dim rowValues As String()

    rowValues = Row.Data.Split((" """))

    If rowValues.GetUpperBound(0) = 3 Then

    Row.Column1 = rowValues.GetValue(0).ToString()

    Row.Column2 = rowValues.GetValue(1).ToString()

    Row.Column3 = rowValues.GetValue(2).ToString()

    Else

    Row.Column1 = rowValues.GetValue(0).ToString()

    Row.Column2 = rowValues.GetValue(1).ToString()

    Row.Column3 = rowValues.GetValue(2).ToString()

    Row.Column4 = rowValues.GetValue(3).ToString()

    Row.Column5 = rowValues.GetValue(4).ToString()

    Row.Column6 = rowValues.GetValue(5).ToString()

    Row.Column7 = rowValues.GetValue(6).ToString()

    Row.Column8 = rowValues.GetValue(7).ToString()

    Row.Column9 = rowValues.GetValue(8).ToString()

    Row.Column10 = rowValues.GetValue(9).ToString()

    Row.Column11 = rowValues.GetValue(10).ToString()

    Row.Column12 = rowValues.GetValue(11).ToString()

    Row.Column13 = rowValues.GetValue(12).ToString()

    Row.Column14 = rowValues.GetValue(13).ToString()

    Row.Column15 = rowValues.GetValue(14).ToString()

    Row.Column16 = rowValues.GetValue(15).ToString()

    End If

    End Sub

    End Class

  • asita (5/17/2010)


    Thanks for your reply, i know program a little bit , but not my main stream so could you please give some sample or some example or could you please direct me for any sample links

    I can just give you pointers on how to go about it and you'll have to work the details or search for samples (I do not have a ready link to demonstrate what I mean).

    In the script task

    1.) Read the file into a StremReader object

    Dim oFile as System.IO.File

    Dim oRead as System.IO.StreamReader

    oRead = oFile.OpenText(“C:\sample.txt”)

    2.) Use split function on the text in the stream to get an array containing single lines (like: " " "CODE" " " " " " " " " " " " " " " " " " " " " "PAYABLE" "UNITS" " " " " ; )

    And use "split" function further on the single line values you have in a collection. This part of the step may be tricky (it may not be as well, I'm saying because I haven't tried it) as you have space as the separator and then you've spaces within the double quotes.

    3.) Use the values obtained in step above to store as Output columns of the script in a data flow and use the output of script as input for your OLEDB destination component.

  • Hi Phil

    thank you very much to all for your valuable responses.

    but seems to be phill is exactly matches what i am thinking(as of my knowledge also it is simple and stright forward)

    1 step is the best one, but for this i need to use script task in that i have to get each line and had a separator coma (to be safe & easy for file connection manager), then make it split rows as a single row

    could any one please give me some code

    which opens the file and reads each line data

    please

    Thanks in advance

    asita

  • asita (5/18/2010)


    1 step is the best one, but for this i need to use script task in that i have to get each line and had a separator coma (to be safe & easy for file connection manager), then make it split rows as a single row

    What have you tried so far on this idea? You will have to proceed with the package to stumble and then ask a question.

    asita (5/18/2010)


    could any one please give me some code

    which opens the file and reads each line data

    Did you look at @divyanth's code? That gives a pretty good idea on how to proceed with some sample code file. I also gave a sample snippet on how to read the file in script. I would have tried to get started with them.

  • Use script to remove carriage return (CR) line feed (LF) characters. Then use the regular flat file source to load the file. For a sample script, which can help take a look here.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • Hello All,

    with all your suggestions, i did like this

    i left at one more thing that is how can i substitute comma between each field,

    at present my data looks like as in below two lines, i can simply open the file and replace " " with "," that solves my problem,

    but how can i automate(so each field separate by comma) in my code please.....

    Data

    ************************

    "ABC" "CY FOUN" "CY FOUN" "AEN" "ALF" "XLF" "FEB10" "4" "TQV" " " ".00" " " "0.00" "12.0" "0.00" "12.0" ;

    "ABC" "EGACY FOUN" "AMERIFOUN" "APL" "DLF" "AAF" "FEB10" "4" "QTV" " " ".00" " " "0.00" "10.0" "0.00" "10.0" ;

    ************************

    Code

    Public Sub Main()

    Try

    ' Create an instance of StreamReader to read from a file.

    Dim sr As StreamReader = New StreamReader("C:\01049_edit.txt")

    Dim line As String

    Dim semiAt As Integer

    Dim fullLine As String

    Dim textFileOutput As String

    Dim isPreviousLineIsSingleRow As Boolean

    ' Read and display the lines from the file until the end

    ' of the file is reached.

    Do

    line = sr.ReadLine()

    'Console.WriteLine(line)

    MsgBox(line)

    semiAt = line.IndexOf(";")

    MsgBox(semiAt)

    If ((semiAt > 0) And (fullLine.Length() = 0)) Then

    fullLine += line

    textFileOutput += fullLine

    isPreviousLineIsSingleRow = True

    Else

    fullLine = line

    End If

    If (isPreviousLineIsSingleRow) Then

    fullLine = ""

    isPreviousLineIsSingleRow = False

    End If

    Loop Until line Is Nothing

    sr.Close()

    Catch E As Exception

    ' Let the user know what went wrong.

    Console.WriteLine("The file could not be read:")

    Console.WriteLine(E.Message)

    End Try

    End Sub

    Please Help me

    Thanks in advance

    asita

  • It appears what CozyRoc is suggesting is to modify the source file to make it a comma delimited file. I have never gone down that path so can't be much help there.

    The code below splits the file content into string array elements based on the ";". Then iterates through each row and removes any CR or LF characters and replaces the " " with a "," and splits it into individual elements.

    Dim testString As String = """ABC"" ""CY FOUN"" ""CY FOUN"" ""AEN"" ""ALF"" ""XLF"" ""FEB10"" ""4"" ""TQV"" "" "" "".00"" "" "" ""0.00"" ""12.0"" ""0.00"" ""12.0"" ;" & vbCr & vbLf & " ""ABC"" ""EGACY FOUN"" ""AMERIFOUN"" ""APL"" ""DLF"" ""AAF"" ""FEB10"" ""4"" ""QTV"" "" "" "".00"" "" "" ""0.00"" ""10.0"" ""0.00"" ""10.0"" ;"

    Dim singleLines As String() = testString.Split(";".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)

    For Each oneLine As String In singleLines

    ''oneLine contains "ABC" "CY FOUN" "CY FOUN" "AEN" "ALF" "XLF" "FEB10" "4" "TQV" " " ".00" " " "0.00" "12.0" "0.00" "12.0"

        Dim individualValues As String() = oneLine.Replace(vbCr & vbLf, "").Replace(""" """, ",").Split(",".ToCharArray())

    Next

    Next step would be to iterate through the individualValues array and store them in SSIS recordset.

  • Hello Rjv,

    thanks alot for your code, it gave me good idea , and i substituted in my code it is working fine

    so now i have the whole cleaned data in a string variable how can i create a file with the string variable data

    so that i can use that file to input then ssis will read it into the sql table

    mean while, i will also try it with oposite to reading file

    for alll thanks for your valuable suggestions , i really greatful to u alllll

    Thanks & Regards

    asita

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply