Parsing a flat file: script component or something else?

  • Good Afternoon. I have a flat file that was generated by a “Print to File” from a legacy system’s reporting tool. There are occasionally “1”s (page-break markers) in column one, and rows upon rows of irrelevant data making up the report header and footer pages. Unfortunately, this is the only way to source this system’s data.

    The rows containing the data I need to extract are easy to identify, and I could write a script component to parse the relevant information. But the flat file also contains a section holding the metadata (column start position, column length, and column name) for the relevant data rows.

    I foresee having to extract data from many of these report files, with varying numbers of columns. Is there a way in the ETL process to use this metadata and avoid coding a parser in a script component?

    Thank you for any creative ideas you may have.
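    One way to picture the metadata-driven approach is a small sketch like the one below (Python here purely for illustration; in SSIS this logic would live in a script component or a preprocessing step). The `name,start,length` metadata format and the sample values are assumptions, not taken from the actual report file:

    ```python
    # Hypothetical sketch: use the report's own metadata section to slice
    # the fixed-width data rows, instead of hard-coding column positions.
    # Assumes each metadata line looks like "ColumnName,start,length"
    # with a 1-based start position.

    def parse_metadata(meta_lines):
        """Return a list of (name, start, length) tuples from metadata lines."""
        columns = []
        for line in meta_lines:
            name, start, length = line.strip().split(",")
            columns.append((name, int(start), int(length)))
        return columns

    def slice_row(row, columns):
        """Cut one fixed-width data row into a dict keyed by column name."""
        return {
            name: row[start - 1 : start - 1 + length].strip()
            for name, start, length in columns
        }

    meta = ["CustomerId,1,6", "Amount,7,10"]      # assumed sample metadata
    columns = parse_metadata(meta)
    record = slice_row("000123    45.67   ", columns)
    # record -> {"CustomerId": "000123", "Amount": "45.67"}
    ```

    Because the column definitions are read from the file itself, the same code handles reports with varying numbers of columns without any per-report changes.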

  • You could preprocess the files with Perl, which would easily filter out the useless information and output fresh files with the metadata in the first line. It would then be simple to read the new flat files in.

    It is a little unclear from your post, and this may be exactly the process you are trying to avoid, but we have found that with Perl's text-processing strengths it is quicker to handle difficult flat files like this up front than to try to do everything directly in SSIS.

    Hope this helps.
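    The preprocessing idea might look something like this (the reply suggests Perl; the same idea is sketched in Python here). The `D|` data-row marker, the pipe delimiter, and the header text are invented for illustration and would need to match the real report layout:

    ```python
    # Hypothetical sketch of the preprocessing step: drop page-break rows
    # ("1" in column one) and other non-data lines, then emit a clean
    # delimited stream whose first line is a header, which SSIS (or any
    # ETL tool) can read with a plain flat-file source.

    def preprocess(lines, is_data_row, header):
        """Yield a header line, then only the lines that look like data."""
        yield header
        for line in lines:
            if line.startswith("1"):      # page-break marker in column one
                continue
            if is_data_row(line):
                yield line.rstrip("\n")

    raw = ["REPORT HEADER\n", "1\n", "D|000123|45.67\n", "FOOTER\n"]
    clean = list(preprocess(raw, lambda l: l.startswith("D|"), "flag|id|amount"))
    # clean -> ["flag|id|amount", "D|000123|45.67"]
    ```

    Writing the cleaned lines to a new file gives you a conventional delimited flat file, so the downstream SSIS package stays simple even when the source reports vary.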
