Appending values from second row to first row

Question

Appending values from second row to first row

MattieMich7

Old Hand

Points: 367
More actions
October 23, 2013 at 9:43 am

#395713

I have .csv source file that looks like this:
Type,ColA,ColB
Test,ValueA,ValueB
Test,ValueA,ValueB
AP,ValueA,ValueB
Test,ValueA,ValueB
For every row where Type="Test", I want to ignore it if it's followed by a row where Type<>"Test". So in my example, I'd want to ignore rows 2 and 3. My plan was to create a script component where for each row in the file, it would grab the first value from the subsequent row and populate a new output column (i.e. NextRowTypeValue). The output would look something like this:
Type,ColA,ColB,NextRowTypeValue
Test,ValueA,ValueB,Test
Test,ValueA,ValueB,AP
AP,ValueA,ValueB,Test
Test,ValueA,ValueB...
I've found many script examples for capturing previous row values (i.e. http://ericwisdahl.wordpress.com/2009/08/02/ssis-retain-values-from-previous-rows/), but not for capturing subsequent row values. Any ideas? Thanks.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Nevyn SSChampion Points: 14210 More actions · Answer 1

I suspect that this requirement would be way easier to do loading the data into a staging table and then using T-SQL.

Maybe some more veteran SSIS users have a quick script solution but if I had to do this without a staging table I'd be applying row numbers, then sorting descending, then loading previous, then sorting ascending. And T-SQL would be way better than that.

MattieMich7 Old Hand Points: 367 More actions · Answer 2

Nevyn (10/23/2013)
I suspect that this requirement would be way easier to do loading the data into a staging table and then using T-SQL.
Maybe some more veteran SSIS users have a quick script solution but if I had to do this without a staging table I'd be applying row numbers, then sorting descending, then loading previous, then sorting ascending. And T-SQL would be way better than that.

Thanks for your reply. That makes sense, was just hoping there was a more efficient way to do it via a script component. I was able to get it working by doing as you suggested...apply row number, sort descending, etc.

Nevyn SSChampion Points: 14210 More actions · Answer 3

There might be a better way. Im new to script transforms.

As I said, I'd load it into a staging table and then do it via tsql. Easier and would probably run faster.

Phil Parkin SSC Guru Points: 246993 More actions · Answer 4

This can definitely be done using a Script Component. It should be asynchronous, because not every input row will have a corresponding output row.

I did not reply sooner because your logic puzzled me.

For every row where Type="Test", I want to ignore it if it's followed by a row where Type<>"Test"

Applying this logic to your sample data (with added row numbers for clarity):

Input

1 Test,ValueA,ValueB

2 Test,ValueA,ValueB

3 AP,ValueA,ValueB

4 Test,ValueA,ValueB

Output

1 Test,ValueA,ValueB

3 AP,ValueA,ValueB

4 Test,ValueA,ValueB

Only row 2 is ignored, because it is the only one which matches your exclusion rule.

You have said that rows 2 and 3 will be ignored (why row 3, its type <> 'Test'?). And why does your sample output include all of the rows?

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

MattieMich7 Old Hand Points: 367 More actions · Answer 5

Only row 2 is ignored, because it is the only one which matches your exclusion rule.
You have said that rows 2 and 3 will be ignored (why row 3, its type <> 'Test'?). And why does your sample output include all of the rows?

You are correct, I was wrong in saying both 2 and 3 will be ignored. My sample output included all the rows because I had planned on using a conditional split on that output to exclude the appropriate rows. In other words, my sample output was not the "final output", it was just getting the data in a state where I could perform my conditional logic. I apologize if I did not word my question properly.

Anyway, I'd be interested in hearing your approach with the script component. Thanks.

Phil Parkin SSC Guru Points: 246993 More actions · Answer 6

Some rough pseudo-code:

rowA = readRow;

rowB = readRow;

do

if (rowA.Type != "Test" || rowB.Type != "Test")

ignore

else

output rowA;

rowA = rowB;

rowB = readRow;

until (EOF)

-Edit: fix logic error in pseudo-code & simplify

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel