How to generate a sequence for a particular group of rows in SSIS

Question

How to generate a sequence for a particular group of rows in SSIS

Raj-1120270

SSC Veteran

Points: 281
More actions
July 24, 2009 at 5:00 pm

#133220

Hi There,
I have a SSIS package which joins a flat file and 3 more database tables.
From Merge Join Output, If my data is like
ColA ColB colC ColD
123 4 xyz 12
123 4 xyz 12
23456A 5 abc 10
23456A 5 abc 10
23456A 5 abc 10
23456A 5 abc 10
Then I need to Replace ColB with values like
ColA ColB colC ColD
123 1 xyz 12
123 2 xyz 12
23456A 1 abc 10
23456A 2 abc 10
23456A 3 abc 10
23456A 4 abc 10
ColB should be auto incremented based on ColA. If COlA has value 123 two times then ColB should be 1 for first row and 2 for the second row.
If you look at the value of COlA "23456A" has been repeate for 4 times so my ColB should be 1 for first row n 2 for second rown and 3 for third row and 4 for fourth row.
I know how to achieve this one using T-SQL. However It is in between of the SSIS package so I can't write T-SQL.
IF I write T-SQL the query should be like this
select ColA,ColC, COlD , Row_Number() over (partition by colA order by ColA) as COlB from sometable
will give me the result....Can some one please help me how to get this one using SSIS. These four columns works as Primary key to my destination table. It was working fire about a week ago when there were no duplicates. Now I found some duplicate rows which will throw me an error primary key violation. So, I need to replace colB with Incrementing nmber...
Plz Help me... I really appriciate any suggestion
THans In Advance

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Lowell SSC Guru Points: 323755 More actions · Answer 1

not an expert on SSIS, but as you identified,the row_number function can do what you want, for sure...maybe stick everything in a table and do the row_number in a Script Task?

working code:

CREATE TABLE #Example(

ColA varchar(50),

ColB int,

ColC varchar(50),

ColD int )

INSERT INTO #Example

SELECT '123', 4,'xyz', 12 UNION ALL

SELECT '123', 4,' xyz', 12 UNION ALL

SELECT '23456A', 5,' abc', 10 UNION ALL

SELECT '23456A', 5,' abc', 10

Select

ColA,

row_number() OVER (partition by ColA Order by ColA) As ColB,

ColC,

ColD from #Example

--Results:

/*

ColA ColB ColC ColD

------- ----- ----- -----

123 1 xyz 12

123 2 xyz 12

23456A 1 abc 10

23456A 2 abc 10

23456A 3 abc 10

23456A 4 abc 10

*/

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

drew.allen SSC Guru Points: 77014 More actions · Answer 2

I did a Google search and quickly came up with a number of web pages that tell you how to do this. There is a third-party transformation, but that doesn't have the flexibility that you need. The other pages I looked at tell how to do this with a script task.

http://support.microsoft.com/kb/908460

http://weblogs.sqlteam.com/jamesn/archive/2008/02/13/60509.aspx

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

dspohn23 SSC-Addicted Points: 453 More actions · Answer 3

Use a Sort Transformation to first make sure your rows are sorted properly (ie, sorted by ColA).

Then use a Script Component to "reassign" ColB.

In the script component you can create a variable then holds the value for the previous ColA value. Also create a variable that is a counter (initially 1).

Check if ColA for the current row is the same as ColA for the previous row. If so, then increment the counter by 1 assign the counter value to ColB. Otherwise if the Current Row ColA is not equal to the Previous Row ColA then set the Counter = 1 and assign the counter value to ColB.

I can post an actual code example tomorrow, when I am in the office.

Hope this is what you are looking for.

TSQL Tutorials

dspohn23 SSC-Addicted Points: 453 More actions · Answer 4

Choose ColA and ColB as Input Columns

Heres a Script Component code example:

Public Class ScriptMain

Inherits UserComponent

Dim Counter As Integer = 1

Dim LastColA As String = ""

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

If LastColA Row.ColA.ToString Then

Counter = 1

End If

Row.ColB = Counter

LastColA = Row.ColA.ToString

Counter += 1

End Sub

End Class

TSQL Tutorials

Raj-1120270 SSC Veteran Points: 281 More actions · Answer 5

Raj-1120270

SSC Veteran

Points: 281

July 27, 2009 at 8:30 am

#1030551

I really appriciate your help

Thanks,

Raj

jannie.engelbrecht-1072033 Grasshopper Points: 16 More actions · Answer 6

Please elaborate on how you get the variable populated for the previouse row "LastColA".

Phil Parkin SSC Guru Points: 247228 More actions · Answer 7

The Input0_ProcessInputRow sub is called for every row in the input data.

As long as the input data is in the required order, this allows the LastColA variable to be populated (and remain in memory, because it is defined outside of the sub and therefore remains in scope) every time a row is processed. It is merely holding what was in the previous row to be read - it does not go and seek anything.

jannie.engelbrecht-1072033 Grasshopper Points: 16 More actions · Answer 8

Thanks for the reply. I'm no developer and therefore steer clear of the script tasks and components, but in this case I need a solution fast. I have created a sample package for this example, and using just the script pasted, my output is:

ColA,ColB,ColC,ColD

123,1,xyz,12

12346B,1,dfe,13

23456A,1,abc,10

ColB is set to 1 always,I therefore still need to get LastColA variable populated. I can google how to populate a variable via the script component, but do I do this in a new component task prior to this one or can this be performed in the start of this script. Any sample code is much appreaciated.

Phil Parkin SSC Guru Points: 247228 More actions · Answer 9

Based on the sample data and code, I cannot see why it would not work for you, as you appear to be doing the right things.

Or, as you are in a hurry, perhaps you should consider use of the Row_Number() solution, as exemplified by Lowell, which can be achieved directly in SQL Server, to remove some of the SSIS complexities.