August 14, 2010 at 12:10 pm
Hello everybody
Need help solving the following problem. I Have a flat file that has id and data related to that id separated with a pipe delimiter
IDData for ID
101Some thing related to 101
102adsfasdf
103afaf
101adfasdf
102adsfasdf
104qeaqera
102Aefasdfa
105Dadfaf
How can I separate the data based on ID to different tables in SSIS or to some different flat files. Also is it possible to name that file with that id automatically. Say, I want all the records with id 101 in one table, 102 in another table and so on. Could be possible that file has large number of records. And load to some destination or some other file, that is irrelevant.
Basically what I think is that we can use conditional split for this purpose, but donβt know what condition I should use for this purpose.
If it can be done by any other way also please do let me know.
August 16, 2010 at 4:34 am
Ravi,
Note: This description is worth implementing only if we have limited number of distinct ID's say at maximum I would bet on 10.
Incase, we have more, please refer later section.
Your design will look like below:
As per the description provided, we need to conditional split the data on the basis of ID's into multiple streams.Assign each stream to a destination. Job done.
If you have many distinct ID's in that case the best approach would be conditional split data not on the basis of ID's or else you should complement your split with multiple columns....
Hope this helps.
Raunak J
August 16, 2010 at 4:48 am
This is what I came up with after reading your question. Its works well for large number of source ids and is based on assumption that you want to create a new file for each Id in source.
Create following connections:
1. Source flat file connection named SourceConn
2. Target flat file connection named TargetConn (this will be made in dynamic coming steps)
Create following variables:
1. Name: Id_List Type: Object
2. Name: Id Type: String
Now:
Step 1: Create a Data flow task (call it DFT1): Read all ids (first column) from source file into a recordset destination (use @Id_List here).
Step 2: For each Loop Container: Loop through the recordset you just created in DFT1 and update string variable @Id each time.
Step 3: Inside for each loop, add new data flow task(DFT2): Read data again from same source file, use a conditional split with only one output with condition SourceId == @Id and insert into a flat file destination (Use TargetConn created above).
Step 4: Edit expressions for TargetConn. Set Connection string expression to "C:\\Temp\\"+ @[User::Id] +".txt"
Step 5: Save and Run the package. It will create a new file for each Id with Id as name.
I am not sure this is the optimal way of doing it. But this worked when I did it. So something that works is better than nothing. π
_____________________________________________________________
[font="Tahoma"]'Some people still are alive simply because it is illegal to shoot them... o_O π '[/font]
August 16, 2010 at 4:51 am
Consider this you have 100 distinct id's...by your approach i wil land up having 100 text files and not 100 data flow tasks to load in to SQL tables
This is bad...really bad:-P:-P:-P
Raunak J
August 16, 2010 at 4:57 am
Raunak,
You are right. It is so bad that server will go mad :w00t:
I have read something like this on some other forum before. Original poster Ravi must be doing this for some system which stores data per customer separately. Sometimes I don't understand the need for this really. :unsure:
_____________________________________________________________
[font="Tahoma"]'Some people still are alive simply because it is illegal to shoot them... o_O π '[/font]
August 16, 2010 at 4:59 am
Well the best approach should be to club multiple columns and split the data accordingly, what say brother???
Raunak J
August 16, 2010 at 5:03 am
Of course!! Yes.
_____________________________________________________________
[font="Tahoma"]'Some people still are alive simply because it is illegal to shoot them... o_O π '[/font]
August 17, 2010 at 5:56 am
Raunak
Thanks for the help, but can u please explain a bit more on the later part you mentioned about when having
" If you have many distinct ID's in that case the best approach would be conditional split data not on the basis of ID's or else you should complement your split with multiple columns...."
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply