February 25, 2009 at 7:03 pm
Hi All, i am Krish and i am new to this forum as well as SSIS. I am in learning stage and i have scenario like this:
I have 5 customer files(Flat) in my C:\ drive. My file has data like this
Cust_US(File name) 1790(No.of records)
111023 EDW123 james blake TX US
111024 EDW124 Andy Roddick LA US
111024 EDW125 Serena Williams CA US
My all 5 files structure is same. First row will be name of the file and record count.
now i want to load my first record in one table that is Audit table and the other data into my Customer dimension table. I am using Foreach Loop to load without first record. I was succeded doing that. I was stuck at splittin the file and load it into two different tables.
I have no idea how to identify first recod, becuase the first row structure and rest of the data structure is different.
I am still working on this scenario trying to identify the ways.
Can someone help me in this scenario. I really appreciate someones help here
Thanks in advance.
Thanks
Krish
February 25, 2009 at 10:11 pm
If you've got a source file with dissimilar structure from one line of data to the next, you're most likely going to have to use a Script Component to address this. You can configure the Script Component as a source and, using the System.IO namespace, read through the file and send the data to one of 2 (or more, if necessary) outputs.
If you're not familiar with the Script Component, feel free to post back and I'll work up a few examples for you.
hth,
Tim
Tim Mitchell, Microsoft Data Platform MVP
Data Warehouse and ETL Consultant
TimMitchell.net | @Tim_Mitchell | Tyleris.com
ETL Best Practices
February 26, 2009 at 1:50 am
do you know the data transformation 'conditional split' ?
February 26, 2009 at 5:50 am
The conditional split alone won't get you there. The number and types of columns are different within the file.
Tim Mitchell, Microsoft Data Platform MVP
Data Warehouse and ETL Consultant
TimMitchell.net | @Tim_Mitchell | Tyleris.com
ETL Best Practices
February 26, 2009 at 6:38 am
Hi Tim, thanks for the reply. I did not use script component so far. I tried with conditional transformation seems like that does not work for my scenario. Can you please post few examples how to use script component for this scenario. Mean time i will try to work on script component.
Thanks
Krish
February 26, 2009 at 9:23 am
Hi, i do know we have Conditional split transformation. I tried with that but i wasn't successed.
As per Tim's suggession i am trying using script component to acheive my scenario.
I defined two outputs using Script component, one for Audit and another for Customer and i am trying to connect these two outputs to two OLE DB destinations.
Everything looks good, but when i run the Package, i see all the data going through either one output.
Here is the script i used(I copied it from another place)
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Public Class ScriptMain
Inherits UserComponent
Public Overrides Sub input0_ProcessInputRow(ByVal Row As Input0Buffer)
Row.Col1 = Tokenise(Row.Column0, ",", 1)
Row.Col2 = Tokenise(Row.Column0, ",", 2)
End Sub
Public Overrides Sub input0_ProcessInputRow(ByVal Row As Input0Buffer)
Row.Col3 = Tokenise(Row.Column0, ",", 3)
Row.Col4 = Tokenise(Row.Column0, ",", 4)
Row.Col5 = Tokenise(Row.Column0, ",", 5)
Row.Col6 = Tokenise(Row.Column0, ",", 6)
End Sub
Private Function Tokenise(ByVal input As String, ByVal delimiter As String, ByVal token As Integer) As String
Dim tokenArray As String()
tokenArray = input.Split(delimiter.ToCharArray) 'Split the string by the delimiter
If tokenArray.Length < token Then 'Protect against a request for a token that doesn't exist
Return ""
Else
Return tokenArray(token - 1)
End If
End Function
End Class
The script returs error message like we Multiple definitions of Public Overrides.
Can someone help here..
February 26, 2009 at 10:45 am
Krish,
I'm finishing up an article for publication on this site to demonstrate an easy way to handle this. I'll send you a preview copy later tonight or tomorrow which should get you there.
hth,
Tim
Tim Mitchell, Microsoft Data Platform MVP
Data Warehouse and ETL Consultant
TimMitchell.net | @Tim_Mitchell | Tyleris.com
ETL Best Practices
February 26, 2009 at 11:22 am
Thanks a lot Tim. I really appreciate your help on this.
February 26, 2009 at 9:33 pm
Hi Tim, added to above reply i tried another way to do this but i wasn't succeded. Here is the way i tried
My falt file looks like this(for refference)
File_Name = Cust_US
----------------------
Cust_US(File name) 1790(No.of records)
111023 EDW123 james blake TX US
111024 EDW124 Andy Roddick LA US
111024 EDW125 Serena Williams CA US
I defined a data flow in my foreach Loop
In the data flow i defined my flat file connection
Then i added a conditional split transormation
in conditional split i wrote an expression like Substring(Column1,1,2,) == 'Cu" (this is case1)
Then i defined my OLE DB destination for above case1 and i mapped it to Audit table.
When i run the package Audit table loaded successfully with 5 records(Becauase i have 5 files).
Then for the second condition in the Conditaionl split tranformation i defined a script component to defined the columns. Idefined new columns in Advanced table in Script component like this
col1
col2
col3
col4
col5
col6
I wrote the script to identify all the newly defined column in Script page and i mapped to my target Customer table.
But when i run the package, it was failing becuase it is still taking Cust_US as first record.
Hope this makes claear to you and everyone.
Thanks
February 26, 2009 at 10:51 pm
What you'll need to do is to skip the conditional split altogether; you'll use the Script component to separate the record types.
First, create your Script Component as a "Source". Next, and this is most important, is to create a second output in the list of Outputs in the settings for the Script Component. You'll see that there is already an output (probably named "Output 0" in the list; click "Add Output" and add a new one to the list. You can name these whatever you want to make your script component easier to use. You'll also see when you unfold the properties for each Output that you can define the columns for each. Go ahead and configure the columns for your 2 different output types; this will be fairly self explanatory.
Once you've got your outputs configured, open the script editor. You'll need to open the file using the StreamReader object (be sure to include the Imports System.IO directive). You can use the StreamReader object to read the file one line at a time using the ReadLine() method. You can test each line in the file, and send it to the appropriate output.
To add a line to either of the outputs, call the AddRow() method first. For example, if you use the default output name of "Output 0", you'll call Output0Buffer.AddRow() to add a new row to the output. Then, use Output0Buffer. with each column name. Note that you'll do this for each line in the file, handling each line in turn and sending it to the appropriate output.
I know this is a lot of information to throw at you, particularly if you're not yet comfortable with the Script Component. Feel free to post back if any of this needs clarification. I'll be publishing an article shortly on this topic, which will be complete with screenshots and examples.
hth,
Tim
Tim Mitchell, Microsoft Data Platform MVP
Data Warehouse and ETL Consultant
TimMitchell.net | @Tim_Mitchell | Tyleris.com
ETL Best Practices
February 27, 2009 at 1:42 pm
Tim, thanks for the reply. To get clear idea, i am waiting for your article. Can you please post it whenever you are ready. Mean time i will try the way you said.
Thanks
March 2, 2009 at 9:10 am
Hi tim, good morning! Can we use same flat file source multiple times for Header record and for detail records? And i want to use this in two different dataflows connected each other. One data flow loads only one record i.e. Header record(with a conditional split) and another data flow loads detailed records(Selected in source file connection properties as skip first row).
Can i folllow the above method using Foreach loop container??
Thanks
Raj
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply