Find Min/Max values in an export

  • SSIS 2008 (I don't see a SSIS 2008 forum, I hope this is the right place)

    So I have a data flow that pumps millions of records from a SQL Server table to a flat file. Works great. Now I have a new requirement. One of the columns is DollarAmount. Now I need to write a footer row to the file that has the MAX(DollarAmount) and the MIN(DollarAmount).

    My first instinct is to include a script component in the data flow. Then in the ProcessInputRow check each DollarAmount, if it's less than the current min, or greater than the current max, update the current min/max user variable:

    if(Row.DollarAmount > CurMaxDollarAmt)

    CurMaxDollarAmt = Row.DollarAmount

    Is this a good approach? Would this sort of thing go in the ProcessInputRow, OR does it belong in the PostExecute? This is the only approach I can think of where I don't have to execute some form of the query multiple times. Once for the data, and then again for the min/max.

    I'm having trouble updating a user var from the ProcessInputRow. I could have sworn I read that this was a common practice. Here's the error:

    "The collection of variables locked for read and write access is not available outside of PostExecute."

    .

  • You have to include a class attribute of type double and use the algorithm you have correctly described to find the highest current value in the ProcessInputRow method. Store the determined highest value in package variable in the PostExecute method.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • You can check this script for ideas.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • To find the min/max value(s) you can use a Multicast transform and feed one flow to an Aggregate Task.

    Write the output to a temporary file (eg RAW).

    Add a second Data Flow Task, read from the temporary file and append (uncheck Overwrite data in the file on the second Flat File Destination) the footer to your flat file created in the first step. .

    Regards,

    Willem
    http://wschampheleer.wordpress.com[/url]

  • wschampheleer (2/1/2010)


    To find the min/max value(s) you can use a Multicast transform and feed one flow to an Aggregate Task.

    Write the output to a temporary file (eg RAW).

    Add a second Data Flow Task, read from the temporary file and append (uncheck Overwrite data in the file on the second Flat File Destination) the footer to your flat file created in the first step. .

    Tidy solution Willem, but with millions of rows of data being involved, perhaps this also introduces quite an overhead?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (2/1/2010)


    Tidy solution Willem, but with millions of rows of data being involved, perhaps this also introduces quite an overhead?

    From my own experience, the overhead of the mutlicast and aggregator is very limited (although I have no idea how it compares to your script task). The millions of rows follow the same route in both scenarios - no differences here. The second data flow task only reads/writes one row, so it's impact is negligible. Speaking of which, how you would append the footer to the file when using the script task?

    Regards,

    Willem
    http://wschampheleer.wordpress.com[/url]

  • I do not have the experience to argue either way - just thought I'd raise it as a point to be considered.

    You would not use the Script Task to do the append, but rather the PostExecute method of the Script Component.

    Phil

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply