November 2, 2007 at 3:09 pm
I have a fixed-width txt file that has two sets of data in it. The first section should insert into one table while the second section should insert into a different table. Can I create a SSIS package to insert into two different tables if the file has different fixed-widths, or do I have to seperate this into 2 files?
-----------------------------------------
FIELD START TYPE LENGTH
A 1 STRING 9 --SECTION 1 BEGINS HERE
B 10 STRING 10
C 20 STRING 5
D 1 STRING 2 --SECTION 2 BEGINS HERE
E 3 STRING 15
F 18 STRING 15
-----------------------------------------
November 2, 2007 at 3:34 pm
Personally, different data should be in seperate files.
November 2, 2007 at 5:15 pm
I totally agree; however, this is a vendor txt file that is being dropped on our FTP and I have no control over how it is formatted or built.:crazy:
Does anyone have any ideas how I can accomplish this? The only way I can think of is to either create seperate files or to add the second set of data as columns on the first set of data and just ignore the irrelevent data via column mapping. Both of these methods require manually intervention. I am looking for something more automated.
Thanks,
Adam
November 2, 2007 at 5:25 pm
Is there anything in the data to indicate when the data changes from one format to the other?
November 2, 2007 at 11:06 pm
Yes, It actually says "total record - Last record on file", before the second set of data starts.
the second set of data is actually meta data. It sums up the important stuff such as the total number of rows in the file, the sum of the invoice etc.
November 5, 2007 at 9:11 am
November 5, 2007 at 2:08 pm
I think your easiest method would be to write a script that splits it into two files based on the logic you describe (look for the separator text between the two sections) before you suck it into SSIS.
This assumes that you have a reliable method for identifying the split point in the file, or a reliable method for splitting the file based on row length if your rows are mixed together.
Back in my Unix days, I'd have done this using sed or awk, or perhaps with a perl script. You can get these tools on Windows, or you could use pretty much any scripting language to perform this task.
If you absolutely need to bring it into SSIS first, then you'll have to define your columns in such a way that you can piece it back together properly for either data set, then you could use a conditional split transform to send the data to the proper table, including derived column transforms as necessary to put the right columns back together again the way they should be.
For example, say you have columns like this, where X represents one column and y represents the next:
Row type 1:
XXXXYYYYXXXX
Row type 2:
XXXYYYXXX
You'd need to define columns like:
ColA: 1-3
ColB: 4-4
ColC: 5-6
ColD: 7-8
ColE: 9-9
ColF: 10-12
Use a conditional split to evaluate the total length of each row, then, based on the total length of the row, you could used two derived column transforms (one in each data stream) to concatenate these column "parts" back into the right set of columns:
Row type 1:
firstcol = ColA + ColB
secondcol = ColC + ColD
thirdcol = ColE+ColF
Rowtype 2:
firstcol = ColA
secondcol = ColB+ColC
thirdcol = ColD+ColE
I don't know if this is your best option, but it is one way to do it.
November 5, 2007 at 9:48 pm
What would help would be to see some sample data, just as it is in the original file, just reduce the number of rows to something manageable.
November 7, 2007 at 12:22 am
I agree on splitting the file first. But this also can be done in SSIS.
Read the file as if it just has one large text column. Pass it through a Script transformation in which you add one extra column. This column indicates the type of the record. The script transformation will only set this record type.
Then, using a conditional split based on the record type, you can separate the two record types into two new flat files.
The flat files can then be processed separately.
November 7, 2007 at 10:17 am
Thanks guys for the post :D. I will take a look at the suggested methods. Lynn, per your request, I am attaching a sample of the text file.
Note: The second set of data is only one row, but has different fixed widths. This row will be used as meta data for a web application.
March 14, 2011 at 1:14 pm
I completely agree with R Kool.
Use SSIS to have three data flows in the control flow.
1st. Create a data flow to split the file into 2 flat files.
2nd Handle the first file.
3rd Handle the second file.
It would probably be best to put the 3 dataflows into a container so that if any of the dataflows fail you don't have any partial loads.
IRS files have this format all the time. Usually an identifier in the the first column to determine what record type the record is. Also make sure you are not losing the header record ID in the details. Sometimes their isn't a direct link in the record and it is known just because it is under the master record.
Good Luck!
March 14, 2011 at 1:41 pm
Valuable input, but this thread is > 3 years old, so I imagine that the OP has either solved the problem or given up by now 🙂
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
March 14, 2011 at 1:45 pm
Ha. I bet it has. New to this today. Was looking for an answer to a similar question and had to put my two sense in before realizing it was an ancient post.
March 14, 2011 at 2:03 pm
Yeah, I saw you were new so I went easy on you;-)
By the way, OP = 'original poster' and 'Adam' sounds more like a 'he' than an 'it' to me:-D
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
March 15, 2011 at 1:14 am
Phil Parkin (3/14/2011)
Yeah, I saw you were new so I went easy on you;-)
Ah keithlangleymba, you're in luck! Phil is notorious for making people cry in this forum 😉
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply