Breaking down a flat file

Question

Breaking down a flat file

mejo1111

Old Hand

Points: 380
More actions
December 4, 2007 at 8:51 am

#122541

I have a folder of flatfiles that look like this
a|b|c|d|e
1|2|3
4|5|6
f|g|h|i|j|l
l|m|n|o|p|q
r|s|t|u|v|w
all letters and numbers from the first 4 rows are different columns, row 1 goes in one table, row 2 and 3 go in another table and the rest of the rows go in a third table.
rows 4 plus are in the same 6 columns as row 4.
I know how to delimite the columns with a pipe delimiter but how do I d the end of lines. When I try using {cr}{LF} I get unknown squares and the last column in the first row will not split from the first column in the second row. What am I doing wrong?

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

mejo1111 Old Hand Points: 380 More actions · Answer 1

I could break this down into 4 flatfiles, Its easy to get rid of the the first line to read the next ones but how do you read only the first line?

Dave23 Hall of Fame Points: 3903 More actions · Answer 2

To peel the first row without grabbing the others, have you tried just the {LF} or the {CR} alone as a row delimiter? I've had the former in some of the flat files that I've had to import. If you want to know what they are for sure, try a hex viewer like "V" (http://www.fileviewer.com/). I use it a lot for raw file analysis.

If you find out what the unknown characters are, and want to replace them, the Replace ASCII Text Utility (Rat) from http://www.gidsoftware.com/free.htm has a command line mode that can be invoked from a SSIS task.

As a last ditch, if nothing else works, you could create a script task to get the position of the 5th pipe delimiter, and MID backwards from there to separate it from the second row.

mejo1111 Old Hand Points: 380 More actions · Answer 3

I have tried {CR} and {LF} alone and nothing. I also got a hex editor and found I hav 0d 0d 0a at the end of the lines. I wasnt sure what to do with that though, can those be replaced?

"As a last ditch, if nothing else works, you could create a script task to get the position of the 5th pipe delimiter, and MID backwards from there to separate it from the second row. "

I'm not sure how to do this but its somewhere for me to start. Thanks for the reply.

Dave23 Hall of Fame Points: 3903 More actions · Answer 4

Since it looks like you have {CR}{CR}{LF}, try selecting the Row Delimiter, and putting that in rather than accepting one of the choices in the pulldown. I don't know if it will take it, but I know that the column delimiter pulldown will, I've had to put in custom values in there before for files with non-standard delimiters in them. If it will take the override, it may save you a lot of headache.

Norman Kelm SSCommitted Points: 1922 More actions · Answer 5

Easy to get rid of that dangling 0x0D with a derived column transform.

Create a new column with this as the expression REPLACE(theColumn, HEX(13), "")

might need this REPLACE(theColumn, (DT_STR, 1, 1252)HEX(13), "")

or REPLACE(theColumn, "", "")

One of those.

Norman

DTS Package Search

http://www.dtspackagesearch.com/

Jeff Moden SSC Guru Points: 1004339 More actions · Answer 6

If you have a small file example (emphasis on "small"), attach it to a post and let's take a crack at it... 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)