Change multiple rows to Distinct in Data Flow Task

  • This is a bit of a weird one.

    I have SQL data being sent to a mainframe that has to be unpivoted and sent in 8 rows per Contract to the mainframe. In an error-checking task, I have to check 20+ fields for bad values before the final load. Because of this, my source SQL proc is pulling back the 8 rows per contract in the data flow task. However, when an error is found on one field, I want it to parse down to 1 distinct row for the error, not 8 rows for the same contract & error.

    Example:

    ContractNum LastName FirstName

    1234 Tarvin NULL

    FirstName & LastName can't be blank. They are required. So I have a conditional split checking for that. Then it splits into two Derived Column transformations which add FieldName and FieldValue.

    My final "error" flat file consists of the following fields (so the business unit knows what they have to correct):

    ContractNum FieldName FieldValue

    1234 FirstName NULL

    However, because the initial source is pulling 8 rows, the same information gets repeated 8 times over in my flat file (as below).

    ContractNum FieldName FieldValue

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    I cannot change the source. There are distinct values in each of the 8 rows and those distinct values need to be error-checked. So I need a way of taking those 8 rows AFTER the error discovery and making it DISTINCT down to 1 row.

    I don't see a way of doing that save the OLE DB Command transformation, but I can't figure out a good command to use for this.

    Any thoughts?

    Thanks in advance.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Assuming that all 8 errors lines show the exact same output, you could perhaps create a staging table and send the errors into it. This would cause the table to contain 8 rows per unique error. In a second dataflow task you could query the distinct records and output to the error file.

  • Assuming that all 8 errors lines show the exact same output, you could perhaps create a staging table and send the errors into it. This would cause the table to contain 8 rows per unique error. In a second dataflow task you could query the distinct records and output to the error file.

  • All eight rows will show the same output. But package design was already completed a couple of weeks ago. I'm hoping to avoid major redesign given the dependencies on this particular Data Flow Task.

    Any other possibilities?

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • -- edit -- see below.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Add a process at the end to remove the dups from the flat file ... ?

    Not elegant, but avoids the major redesign.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • After getting all the error rows just keep an aggregate transformation task.

    Do a group by on all the three fields in the aggregate transformation task (Just check the column name and set the operation as group by for each column selected). So that it will group all the repeating rows as a single distinct row and then send it to OLEDB destination.

  • sriram979 (4/14/2009)


    After getting all the error rows just keep an aggregate transformation task.

    &lt blink &gt I didn't even think of that one. Thanks! I'll give it a try.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply