January 4, 2012 at 8:05 am
Hi All,
Hope you can help me out with the following issue.
Requirement:
Find the most Frequent date (mode) in the column and store it in a variable to populate a column with this constant value.
Data Source: flat file
Aggregate Transform: This is used to perform a count and group by on the date column: Eg Output
From(Date Column) Count
01-01-2012 1
02-01-2012 15
03-01-2012 3
Script Component: This will then pick out the highest value for the 'Count' column and store the related date value in a variable called 'BillDate'. In this case it will store '02-01-2012'.
This variable will then be used to populate an entire new column which will be joined to the existing dataset.
Code:
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _
<CLSCompliant(False)> _
Public Class ScriptMain
Inherits UserComponent
Public MaxCount As Integer
Public StrBillDate As String
Dim myvar As IDTSVariables100
Public Overrides Sub PreExecute()
MyBase.PreExecute()
'
' Add your code here for preprocessing or remove if not needed
'
MaxCount = 0
End Sub
Public Overrides Sub PostExecute()
MyBase.PostExecute()
'
' Add your code here for postprocessing or remove if not needed
' You can set read/write variables here, for example:
' Me.Variables.MyIntVar = 100
'
Me.Variables.BillDate = StrBillDate
'MessageBox.Show(Me.Variables.BillDate)
End Sub
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
'
' Add your code here
'
If MaxCount < Row.InputCountAll Then
MaxCount = Row.InputCountAll
StrBillDate = Row.InputFromBillDate
Row.DateOutput = StrBillDate
End If
End Sub
Package Variable: BillDate
I have attached a screen print which may help.
Any help would be appreciated.
Many thanks,
January 5, 2012 at 9:08 am
I realize that your data source is a flat file, but could you import the flat file into a SQL Server table or temp table. Then you could do something as simple as:
SELECT TOP 1 COUNT(*), DateColumn
FROM dbo.TEMP_TABLE_FROM_FLAT_FILE
GROUP BY DateColumn
ORDER BY COUNT(*) DESC;
(not great naming conventions, but you get the idea)
This would get you the mode date; put this in an execute SQL Task, and can send that to a package variable if you desire.
HTH,
Rob
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply