Breaking down a flat file

  • I have a folder of flatfiles that look like this

    a|b|c|d|e

    1|2|3

    4|5|6

    f|g|h|i|j|l

    l|m|n|o|p|q

    r|s|t|u|v|w

    all letters and numbers from the first 4 rows are different columns, row 1 goes in one table, row 2 and 3 go in another table and the rest of the rows go in a third table.

    rows 4 plus are in the same 6 columns as row 4.

    I know how to delimite the columns with a pipe delimiter but how do I d the end of lines. When I try using {cr}{LF} I get unknown squares and the last column in the first row will not split from the first column in the second row. What am I doing wrong?

  • I could break this down into 4 flatfiles, Its easy to get rid of the the first line to read the next ones but how do you read only the first line?

  • To peel the first row without grabbing the others, have you tried just the {LF} or the {CR} alone as a row delimiter? I've had the former in some of the flat files that I've had to import. If you want to know what they are for sure, try a hex viewer like "V" (http://www.fileviewer.com/). I use it a lot for raw file analysis.

    If you find out what the unknown characters are, and want to replace them, the Replace ASCII Text Utility (Rat) from http://www.gidsoftware.com/free.htm has a command line mode that can be invoked from a SSIS task.

    As a last ditch, if nothing else works, you could create a script task to get the position of the 5th pipe delimiter, and MID backwards from there to separate it from the second row.

  • I have tried {CR} and {LF} alone and nothing. I also got a hex editor and found I hav 0d 0d 0a at the end of the lines. I wasnt sure what to do with that though, can those be replaced?

    "As a last ditch, if nothing else works, you could create a script task to get the position of the 5th pipe delimiter, and MID backwards from there to separate it from the second row. "

    I'm not sure how to do this but its somewhere for me to start. Thanks for the reply.

  • Since it looks like you have {CR}{CR}{LF}, try selecting the Row Delimiter, and putting that in rather than accepting one of the choices in the pulldown. I don't know if it will take it, but I know that the column delimiter pulldown will, I've had to put in custom values in there before for files with non-standard delimiters in them. If it will take the override, it may save you a lot of headache.

  • Easy to get rid of that dangling 0x0D with a derived column transform.

    Create a new column with this as the expression REPLACE(theColumn, HEX(13), "")

    might need this REPLACE(theColumn, (DT_STR, 1, 1252)HEX(13), "")

    or REPLACE(theColumn, "", "")

    One of those.

    Norman

    DTS Package Search

    http://www.dtspackagesearch.com/

  • If you have a small file example (emphasis on "small"), attach it to a post and let's take a crack at it... 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply