Change multiple rows to Distinct in Data Flow Task

Question

Change multiple rows to Distinct in Data Flow Task

Brandie Tarvin

SSC Guru

Points: 173068
More actions
April 13, 2009 at 5:16 am

#129299

This is a bit of a weird one.
I have SQL data being sent to a mainframe that has to be unpivoted and sent in 8 rows per Contract to the mainframe. In an error-checking task, I have to check 20+ fields for bad values before the final load. Because of this, my source SQL proc is pulling back the 8 rows per contract in the data flow task. However, when an error is found on one field, I want it to parse down to 1 distinct row for the error, not 8 rows for the same contract & error.
Example:
ContractNum LastName FirstName
1234 Tarvin NULL
FirstName & LastName can't be blank. They are required. So I have a conditional split checking for that. Then it splits into two Derived Column transformations which add FieldName and FieldValue.
My final "error" flat file consists of the following fields (so the business unit knows what they have to correct):
ContractNum FieldName FieldValue
1234 FirstName NULL
However, because the initial source is pulling 8 rows, the same information gets repeated 8 times over in my flat file (as below).
ContractNum FieldName FieldValue
1234 FirstName NULL
1234 FirstName NULL
1234 FirstName NULL
1234 FirstName NULL
1234 FirstName NULL
1234 FirstName NULL
1234 FirstName NULL
1234 FirstName NULL
1234 FirstName NULL
I cannot change the source. There are distinct values in each of the 8 rows and those distinct values need to be error-checked. So I need a way of taking those 8 rows AFTER the error discovery and making it DISTINCT down to 1 row.
I don't see a way of doing that save the OLE DB Command transformation, but I can't figure out a good command to use for this.
Any thoughts?
Thanks in advance.
Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Girish Bhat Ten Centuries Points: 1323 More actions · Answer 1

Assuming that all 8 errors lines show the exact same output, you could perhaps create a staging table and send the errors into it. This would cause the table to contain 8 rows per unique error. In a second dataflow task you could query the distinct records and output to the error file.

Girish Bhat Ten Centuries Points: 1323 More actions · Answer 2

Assuming that all 8 errors lines show the exact same output, you could perhaps create a staging table and send the errors into it. This would cause the table to contain 8 rows per unique error. In a second dataflow task you could query the distinct records and output to the error file.

Brandie Tarvin SSC Guru Points: 173068 More actions · Answer 3

All eight rows will show the same output. But package design was already completed a couple of weeks ago. I'm hoping to avoid major redesign given the dependencies on this particular Data Flow Task.

Any other possibilities?

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Phil Parkin SSC Guru Points: 246964 More actions · Answer 4

-- edit -- see below.

The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin

Phil Parkin SSC Guru Points: 246964 More actions · Answer 5

Add a process at the end to remove the dups from the flat file ... ?

Not elegant, but avoids the major redesign.

The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin

sriram979 Old Hand Points: 300 More actions · Answer 6

After getting all the error rows just keep an aggregate transformation task.

Do a group by on all the three fields in the aggregate transformation task (Just check the column name and set the operation as group by for each column selected). So that it will group all the repeating rows as a single distinct row and then send it to OLEDB destination.

Brandie Tarvin SSC Guru Points: 173068 More actions · Answer 7

sriram979 (4/14/2009)
After getting all the error rows just keep an aggregate transformation task.

&lt blink &gt I didn't even think of that one. Thanks! I'll give it a try.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.