Parsing Flat File with SSIS

  • Hey Guys,

    i need to set up a SSIS package that imports data from a {,} delimited Flatfile,

    There is a field that has a comma in between.

    a good row would like like this.

    01,test,test name,date SSIS parses 01|test|test name|Date|

    My bad row looks like this

    04,test1,test name1,testname2,date SSIS parses 01|test1|test name1|test name2|date|

    I need this to be parsed as follows 01|test1|test name1, test name2|date|

    Is this Possible??

    Thanks In advance

  • Not really well. Bad data like this is a problem with most parsing software.

    If you only have one field that the problem can happen in, you are not in horrible shape. An option is to use comma delimiters for all of the fields before the problem field and then put the problem column and every other column after it into one big field. From there, you can use one or more derived column components to parse apart the remainder of the string from the end by the comma delimiter. Since you know you have a specific number of fields before and after the problem field, you can determine that anything after field X from the left and anything after field Y from the right must be field Z - regardless of the number of columns.

  • Hi,

    Actually, SSIS can handle this very well if you include a script task in your process. If you are familiar with writing a bit of VB.net code you can use a script task to "scrub" the file prior to setting up the parsing task(s). Inside the script task, use the streamreader and streamwriter .net objects to read the flat file, replace the bad commas, and write the file back out. you'd then have a clean file that the subsequent tasks in SSIS can handle.

    Thanks,

    Strick

  • you can use script task as source and change the bad formatting

    in Script task read each line through stream reader

    Use the split function of string. Since the split function will give a string array check the length of the array

    In this case if length is 4 then give them to output of script function

    If the length is greater than 4 then use a forloop to concatenate all the columns after second and before the last column.

    Note: This only handles the specific case u mentioned

    Regards
    Venkat
    http://sqlblogging.blogspot.com[/url]

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply