May 23, 2008 at 7:24 am
Using Derived column I am creating a new column called journal_line_nbr
The journal line number is a sequential number starting with 1 and advancing to the next number for each row of a journal entry.
What is the syntax/expression for this in Derived column? None of the functions appear to have this.
May 23, 2008 at 7:28 am
I think you will need to create a package variable that you increment in order to do this as SSIS processes row by row not by the set. This is really something you should do in your source statement if at all possible. So if SQL Server 2005 is your source you could use the Row_Number function in your source statement.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 23, 2008 at 7:37 am
Hi Jack,
My source is an Excel file going into a SQL 2005 table. So will the variable function still work?
May 23, 2008 at 7:43 am
You can use a simple script component to number the rows as well. This assumes that the input table is "nicknamed" input0, and that the script component outputs a derived column called RID (which you would add manually to the output components of the script).
The script looks like:
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Public Class ScriptMain
Inherits UserComponent
Private _RowNumber As Integer = 1
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Row.RID = _RowNumber
_RowNumber = _RowNumber + 1
'
' Add your code here
'
End Sub
End Class
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 23, 2008 at 7:50 am
Just want to clear this up as i am still learning SSIS.
So after my Derived Column transformation task I want to connect a script component transformation task?
The code you provide indicated 'enter your code here' would I have to define the columns again?
Sorry if these questions are trival...
May 23, 2008 at 8:16 am
No - you wouldn't have to add anything else. Just like a derived column does, this script component would "add" a new column to the list of available items. (The 'add your code here' is the default text you get when you go into a script component, which I neglected to remove.)
Like you said - just wire up the output from the derived column to the script component (i.e. drag the green arrow from the derived column component onto the script component.)
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 23, 2008 at 8:26 am
I was just going to go where Matt took you. To explain what Matt is saying, you should replace your Derived Column with a script component. If I understand your needs I correctly I would do the following:
1. Create 2 variables scoped to the dataflow. journal_line_nbr as int32 and LastJournalEntry typed to the Unique ID of the journal entry
2. Add a Script Component selecting a type of Transformation
3. On the Input Columns tab of the script component select the JournalEntry column
4. On the Inputs ans Output tab of the script component, Expand Output 0, Select Output Columns, and Click the Add Column button to add the journal_line_nbr column to the output using Int as the datatype
5. On the script tab of the component enter your 2 variables in the ReadWrite variable line separated by a comma (variable1, variable2)
6. Click Design Script and create a script like this:
[font="Courier New"]Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
' this checks to see if we are in the same journal entry
' Row.journal_entry is the current journal entry
' Variables.LastJournalEntry is the variable we defined
If Row.journal_entry = Variables.LastJournalEntry Then
' increment the journal line number variable
Variables.journal_line_nbr += 1
Else
' new journal entry so reset journal_line_nbr variable
Variables.journal_line_nbr = 1
' set the LastJournalEntry variable to current row
Variables.LastJournalEntry = Row.journal_entry
End If
' Row.journal_line_nbr is the output column created so
' set it to the Variable
Row.journal_line_nbr = Variables.journal_line_nbr
End Sub
[/font]
Now your output from the script will contain all the columns in your excel sheet + the journal_line_nbr.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 23, 2008 at 8:50 am
Jack/Matt,
I really do appreciate this, I am understanding how this works I have two questions:
1. Do I need to create a blank column in Derived column task for the journal number? I think the answer is yes otherwise the script transformation task cannot transform the column (this is just to clarify and for my learning)
2. Jack on your example you provide two variables the number and the journal entry (like the journal ID or identifier) I am assuming you placed this variable in here so that it could 'loop' through journal IDs?
If I am correct, then what would the modified script look like if the output is to go to one journal ID/entry only?
May 23, 2008 at 9:20 am
1. Define them as outputs in the script component. After adding the script component, right, go into Edit, then inputs and outputs, and add the output names you want. then use them in the code as Jack described.
2. Essentially - Jack's code example allows for two variables (one being the sheet and one being the row), and mine just deals with one (just a row number).
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
May 23, 2008 at 9:24 am
jsheldon (5/23/2008)
1. Do I need to create a blank column in Derived column task for the journal number? I think the answer is yes otherwise the script transformation task cannot transform the column (this is just to clarify and for my learning)
If all you are creating is a new column for journal_line_nbr then you do not need a derived column task. The script component transformation task as I outlined creates the column.
jsheldon (5/23/2008)
2. Jack on your example you provide two variables the number and the journal entry (like the journal ID or identifier) I am assuming you placed this variable in here so that it could 'loop' through journal IDs?
I had assumed that you had data like this:
journal entry data
1 test1
1 test2
2 test1
3 test1
3 test2
3 test3
and you wanted a result like this:
journal entry data journal line nbr
1 test1 1
1 test2 2
2 test1 1
3 test1 1
3 test2 2
3 test3 3
This is what my code would return.
jsheldon (5/23/2008)
If I am correct, then what would the modified script look like if the output is to go to one journal ID/entry only?
I think this may be what you are looking for:
[font="Courier New"]Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
' increment the journal line number variable
Variables.journal_line_nbr += 1
' Row.journal_line_nbr is the output column created so
' set it to the Variable
Row.journal_line_nbr = Variables.journal_line_nbr
End Sub
[/font]
If all you want is a sequential number for each row in the excel sheet why not add an identity column to your destination table? It's a lot simpler than what you are trying to do in code.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 23, 2008 at 10:14 am
If your package is inserting the data into a staging area why not just add an identity column to the staging table for row number? You might also just insert into a transient table that can be truncated each time. Not sure what the requirements are but this might prove to perform better than adding a script task...
Ben Sullins
bensullins.com
Beer is my primary key...
May 23, 2008 at 11:48 am
Hi Jack,
Ok I am getting this thanks..one more question my script if I gloss over it states 'journal_line_nbr is not a member of 'ScriptComponent_....'
To answer your question, why not make an identity column on the destination table. That would make life too easy 😉
The data is a telephone activity going into a transaction table of an ERP system. If I add or even modify the table schema then the loads and postings will not function. I don't mess with third-party vendor databases...
May 23, 2008 at 11:55 am
To answer your question, why not make an identity column on the destination table. That would make life too easy 😉
The data is a telephone activity going into a transaction table of an ERP system. If I add or even modify the table schema then the loads and postings will not function. I don't mess with third-party vendor databases...
Right but do you not write the data to a staging area first? If not then yeah use a script task or derived column but if you are writing it to a staging area I would just modify the staging table for this purpose...
Good Luck!
Ben Sullins
bensullins.com
Beer is my primary key...
May 23, 2008 at 12:01 pm
The staging area is the table it is going to...bad ERP desgin what can I say.
So my raw source goes to project_jrnl_load_trans and then the process moves it to the project_jrnl_line table
Let me ask this, within my excel source data task can I add an identity column? That would be much easier than this script task.
May 23, 2008 at 12:23 pm
jsheldon (5/23/2008)
Let me ask this, within my excel source data task can I add an identity column? That would be much easier than this script task.
If your configured properly then yes you can...here is what I did locally to test it:
1.) Added an OLE DB Source to my data flow.
2.) Selected a SQL 2005 DB Server for my connection manager.
3.) Chose 'SQL Command' as my Data access mode.
4.) Used the RowNumber() function combined with the OPENROWSET() connection method to compose a query which pulled all the data in from the Excel sheet adding in the RowNumber.
--In order for this to work your SQL 2005 Surface Area Configuration must allow the OPENROWSET() connection method.
--You'll most like want to define this SQL Statement dynamically through a package configuration or some other method.
Here is the query:
SELECT
ROW_NUMBER() OVER (order by [VendorAccountNo]) AS [RowNumber]
,[VendorAccountNo]
,[LocalAccountNo]
,[CompanyCd]
,[CUSIP]
FROM
OPENROWSET('Microsoft.Jet.OLEDB.4.0','Excel 8.0;Database=C:\temp\MySheet.xls;HDR=0; IMEX=1;','SELECT * FROM [Sheet1$]')
Hope this helps!
Ben Sullins
bensullins.com
Beer is my primary key...
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply