January 29, 2010 at 6:55 pm
SSIS 2008 (I don't see a SSIS 2008 forum, I hope this is the right place)
So I have a data flow that pumps millions of records from a SQL Server table to a flat file. Works great. Now I have a new requirement. One of the columns is DollarAmount. Now I need to write a footer row to the file that has the MAX(DollarAmount) and the MIN(DollarAmount).
My first instinct is to include a script component in the data flow. Then in the ProcessInputRow check each DollarAmount, if it's less than the current min, or greater than the current max, update the current min/max user variable:
if(Row.DollarAmount > CurMaxDollarAmt)
CurMaxDollarAmt = Row.DollarAmount
Is this a good approach? Would this sort of thing go in the ProcessInputRow, OR does it belong in the PostExecute? This is the only approach I can think of where I don't have to execute some form of the query multiple times. Once for the data, and then again for the min/max.
I'm having trouble updating a user var from the ProcessInputRow. I could have sworn I read that this was a common practice. Here's the error:
"The collection of variables locked for read and write access is not available outside of PostExecute."
.
January 30, 2010 at 2:55 pm
You have to include a class attribute of type double and use the algorithm you have correctly described to find the highest current value in the ProcessInputRow method. Store the determined highest value in package variable in the PostExecute method.
February 1, 2010 at 12:51 am
To find the min/max value(s) you can use a Multicast transform and feed one flow to an Aggregate Task.
Write the output to a temporary file (eg RAW).
Add a second Data Flow Task, read from the temporary file and append (uncheck Overwrite data in the file on the second Flat File Destination) the footer to your flat file created in the first step. .
February 1, 2010 at 1:12 am
wschampheleer (2/1/2010)
To find the min/max value(s) you can use a Multicast transform and feed one flow to an Aggregate Task.Write the output to a temporary file (eg RAW).
Add a second Data Flow Task, read from the temporary file and append (uncheck Overwrite data in the file on the second Flat File Destination) the footer to your flat file created in the first step. .
Tidy solution Willem, but with millions of rows of data being involved, perhaps this also introduces quite an overhead?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
February 1, 2010 at 2:18 am
Phil Parkin (2/1/2010)
Tidy solution Willem, but with millions of rows of data being involved, perhaps this also introduces quite an overhead?
From my own experience, the overhead of the mutlicast and aggregator is very limited (although I have no idea how it compares to your script task). The millions of rows follow the same route in both scenarios - no differences here. The second data flow task only reads/writes one row, so it's impact is negligible. Speaking of which, how you would append the footer to the file when using the script task?
February 1, 2010 at 2:34 am
I do not have the experience to argue either way - just thought I'd raise it as a point to be considered.
You would not use the Script Task to do the append, but rather the PostExecute method of the Script Component.
Phil
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply