March 13, 2013 at 1:20 am
Currently working on a project for a client which involves me building a matching solution in SSIS. One of the packages I am building needs a conditional lookup. i.e. Do a lookup on a column if it is not blank/null. There are total of 5 columns. If all 5 are blank then don't do the lookup. If 3 are filled do the lookup on the 3 fields etc. Anybody have experience implementing this sort of logic in ssis? I have built a package where I have implanted the soom look up logic. The attached image shows my data flow.
the problem with this solution is that if an incoming row has values in both the fingerprintID field and ISRC field. A lookup is done on only the fingerprintID as it is the first condition in the conditional split .
March 13, 2013 at 7:17 am
Have you tried writing a SQL scalar function to accomplish it? That would be the way I'd begin.
March 13, 2013 at 7:21 am
I am not sure what you mean. Could you expand a little please
March 13, 2013 at 7:48 am
Something like:
create function dbo.Conditional_Lookup
(
@vID int
)
RETURNS varchar(255)
as
BEGIN
declare @vOutput varchar(255)
declare @v1 varchar(255)
declare @v2 varchar(255)
select @v1 = IsNull(MyField, '') from MyTable where ID=@vID
if (@v1 = '')
begin
select @v1 = IsNull(MyField2, '') from MyTable where ID=@vID
end
...
set @vOutput = @v1
RETURN @vOutput
END
I'm not trying to reproduce your specific logic in this example function. The point being conveyed is that a scalar function can contain your complex conditional logic AND be part of your select list; i.e.,
select dbo.Conditional_Lookup(myTable.ID), myTable.Field2, etc.
If this still doesn't make sense, please take a few minutes to review scalar-functions.
March 13, 2013 at 8:09 am
Hey thanks for that. That has given me some ideas. Will go away and test.
March 13, 2013 at 3:26 pm
It will get very ugly, but this can be done with a conditional split, multiple outputs, lookups, and union alls. Not ideal by any means, but can be done. I'd be interested to know how the returned values differ between, say, a 3/5 column lookup versus a 5/5 column lookup - from a business perspective.
March 14, 2013 at 3:44 am
Have you thought about doing the lookups in sequence, regardless of blank value or not, and setting the 'Specify how to handle rows with no matching entries' property of the lookup to 'Ignore failure'?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
March 14, 2013 at 6:42 pm
After a while I have decided to create the lookup functionality in T-SQL. I have achieved this by using left outer joins and returning the same column for each join with a different alias. My code is as follows:
WITH testCTE (AutoMatchID, TITLE, [FINGERPRINTID], LabelCode, SetCatalogueNumber, ISAN, ReferenceNumber, DSPID,CATALOGUENUMBER,RESOURCEID, [INTERNATIONALSTANDARDRECORDINGCODE],[INTERNATIONALSTANDARDWORKCODE],[EANBARCODE],ParameterSetID ,RECID,[TableID])
AS
(
--- USAGEJOURNALTRANS_ONLINE
SELECT B.AutoMatchID
, A.TITLE
, NULL AS [FINGERPRINTID]
, NULL AS LabelCode
, NULL AS SetCatalogueNumber
, NULL AS ISAN
, NULL AS ReferenceNumber
, A.DSPID
, A.CATALOGUENUMBER
, A.RESOURCEID
, A.[INTERNATIONALSTANDARDRECORDINGCODE]
, A.[INTERNATIONALSTANDARDWORKCODE]
, A.[EANBARCODE]
, B.ParameterSetID
, B.RECID
, B.[TableID]
FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_ONLINE] A INNER JOIN [AutoMatch].[dbo].[AID] B
ON A.RECID = B.RECID
AND B.[TableID] = 102787
UNION ALL
--USAGEJOURNALTRANS_CL
SELECT B.AutoMatchID
, A.TITLE
, NULL AS [FINGERPRINTID]
, NULL AS LabelCode
, A.SetCatalogueNumber
, NULL AS ISAN
, NULL AS ReferenceNumber
, NULL AS DSPID
, A.CATALOGUENUMBER
, NULL AS RESOURCEID
, NULL AS [INTERNATIONALSTANDARDRECORDINGCODE]
, NULL AS [INTERNATIONALSTANDARDWORKCODE]
, NULL AS [EANBARCODE]
, B.ParameterSetID
, B.RECID
, B.[TableID]
FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_CL] A INNER JOIN [AutoMatch].[dbo].[AID] B
ON A.RECID = B.RECID
AND B.[TableID] = 102791
UNION ALL
--USAGEJOURNALTRANS_FOREIGN
SELECT B.AutoMatchID
, A.TITLE
, NULL AS [FINGERPRINTID]
, NULL AS LabelCode
, NULL AS SetCatalogueNumber
, A.ISAN
, A.ReferenceNumber
, NULL AS DSPID
, A.CATALOGUENUMBER
, NULL AS RESOURCEID
, A.[INTERNATIONALSTANDARDRECORDINGCODE]
, A.[INTERNATIONALSTANDARDWORKCODE]
, A.[EANBARCODE]
, B.ParameterSetID
, B.RECID
, B.[TableID]
FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_FOREIGN] A INNER JOIN [AutoMatch].[dbo].[AID] B
ON A.RECID = B.RECID
AND B.[TableID] = 102793
UNION ALL
--USAGEJOURNALTRANS_INDUSTRY
SELECT B.AutoMatchID
, A.TITLE
, NULL AS [FINGERPRINTID]
, A.LabelCode
, NULL AS SetCatalogueNumber
, NULL AS ISAN
, A.ReferenceNumber
, NULL AS DSPID
, A.CATALOGUENUMBER
, NULL AS RESOURCEID
, NULL AS [INTERNATIONALSTANDARDRECORDINGCODE]
, NULL AS [INTERNATIONALSTANDARDWORKCODE]
, NULL AS [EANBARCODE]
, B.ParameterSetID
, B.RECID
, B.[TableID]
FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_INDUSTRY] A INNER JOIN [AutoMatch].[dbo].[AID] B
ON A.RECID = B.RECID
AND B.[TableID] = 102785
UNION ALL
--USAGEJOURNALTRANS_LIVEEVENTS
SELECT B.AutoMatchID
, A.TITLE
, A. [FINGERPRINTID]
, NULL AS LabelCode
, NULL AS SetCatalogueNumber
, NULL AS ISAN
, NULL AS ReferenceNumber
, NULL AS DSPID
, NULL AS CATALOGUENUMBER
, NULL AS RESOURCEID
, NULL AS [INTERNATIONALSTANDARDRECORDINGCODE]
, NULL AS [INTERNATIONALSTANDARDWORKCODE]
, NULL AS [EANBARCODE]
, B.ParameterSetID
, B.RECID
, B.[TableID]
FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_LIVEEVENTS] A INNER JOIN [AutoMatch].[dbo].[AID] B
ON A.RECID = B.RECID
AND B.[TableID] = 102783
UNION ALL
--USAGEJOURNALTRANS_MEDIA
SELECT B.AutoMatchID
, A.TITLE
, A. [FINGERPRINTID]
, NULL AS LabelCode
, NULL AS SetCatalogueNumber
, NULL AS ISAN
, NULL AS ReferenceNumber
, NULL AS DSPID
, A. CATALOGUENUMBER
, NULL AS RESOURCEID
, A. [INTERNATIONALSTANDARDRECORDINGCODE]
, NULL AS [INTERNATIONALSTANDARDWORKCODE]
, NULL AS [EANBARCODE]
, B.ParameterSetID
, B.RECID
, B.[TableID]
FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_MEDIA] A INNER JOIN [AutoMatch].[dbo].[AID] B
ON A.RECID = B.RECID
AND B.[TableID] = 102781
)
(
SELECT A.*
, COALESCE(B.[CREATIONID],'') AS F_CreationID
, COALESCE(C.[CREATIONID],'') AS Ind_CreationID
, COALESCE(D.[CREATIONID],'') AS ISWC_CreationID
, COALESCE(E.[CREATIONID],'') AS ISRC_CreationID
, COALESCE(F.[CREATIONID],'') AS ISAN_CreationID
, COALESCE(G.[CREATIONID],'') AS EAN_CreationID
FROM testCTE A
LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] B
ON B.[CRNUMBER] = A.[FINGERPRINTID] AND B.[CRNUMBERTYPEID] = 'fingerprin'
LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] C
ON c.[CRNUMBER] = A.[FINGERPRINTID] AND C.[CRNUMBERTYPEID] = 'Ind Conv'
LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] D
ON D.[CRNUMBER] = A.[INTERNATIONALSTANDARDWORKCODE] AND D.[CRNUMBERTYPEID] = 'ISWC'
LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] E
ON E.[CRNUMBER] = A.[INTERNATIONALSTANDARDRECORDINGCODE] AND E.[CRNUMBERTYPEID] = 'ISRC'
LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] F
ON F.[CRNUMBER] = A.[ISAN] AND F.[CRNUMBERTYPEID] = 'ISAN'
LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] G
ON G.[CRNUMBER] = A.[EANBARCODE] AND G.[CRNUMBERTYPEID] = 'EAN'
)
I would like to run the data through script component and loop through the _CreationID columns per row which are not null or blank and check that the column values are the same. However I am struggling to write the vb code int the script editor. I think the process should be
loop through columns whose names contain _creationID and are not blank/null
load values into an array
check if all elements in array are have the same value
if the elements have the same value create new output column with value 'matched'
if the elements are not matched create new output column with value 'non_matched'
My code thus far is:
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Dim rowType As Type = Row.GetType()
Dim columnValue As PropertyInfo
Dim previousRow As String
For Each columnValue In Row.GetType().GetProperties()
If (columnValue.Name.EndsWith("_CreationID") And columnValue.GetValue(Row, Nothing).ToString() = "") Then
Columnval = columnValue.GetValue(Row, Nothing).ToString()
End If
Next
'
End Sub
End Class
Any ideas?
March 18, 2013 at 4:27 pm
I managed to get this working with some tweaking. I used a derived column transform to create a field that concatenates the lookup fields with a semi colon delimeter. The expression was as follows:
(ISNULL(F_CreationID) ? "" : F_CreationID) + ";" + (ISNULL(Ind_CreationID) ? "" : Ind_CreationID) + ";" + (ISNULL(ISWC_CreationID) ? "" : ISWC_CreationID) + ";" + (ISNULL(ISRC_CreationID) ? "" : ISRC_CreationID) + ";" + (ISNULL(ISAN_CreationID) ? "" : ISAN_CreationID) + ";" + (ISNULL(EAN_CreationID) ? "" : EAN_CreationID)
I then script transfom. I the loaded the concatenated column into an array. I then deleted elements and checked if the elements were equal and then redirected outputs appropriately. The script was as follows.
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
'
' Add your code here
'
'Call is IsAlphaNumeric function to check if field contains only semi colons. If true then there is no match
If IsAlphaNumeric(Row.concatCreationID) Then
Dim vals() As String = Strings.Split(Row.concatCreationID, ";")
'Creat an split sting by delimeter and load array
Dim ListVals As List(Of String) = vals.ToList()
'Load array contents to list. List is chosen so we can easily add and remore elements
Dim g As Integer
'remove non=empty elements from list
For g = ListVals.Count - 1 To 0 Step -1
If ListVals(g) = "" Then
ListVals.RemoveAt(g)
End If
Next
'If list contains only 1 item send it directl to Match output
If ListVals.Count = 1 Then
Row.MatchCreationID = ListVals(0)
Row.DirectRowToMatch()
'If list contains more than 1 element and List check returns true i.e. elements are the same
'direct row to Match output
ElseIf ListVals.Count > 1 And Listcheck(ListVals) Then
Row.MatchCreationID = ListVals(0)
Row.DirectRowToMatch()
Else
Row.DirectRowToSuggestions()
End If
Else
Row.DirectRowToNoMatch()
End If
End Sub
Public Function IsAlphaNumeric(ByVal strToCheck As String) As Boolean
Dim pattern As Regex = New Regex("[^;*]")
Return pattern.IsMatch(strToCheck)
End Function
Public Function Listcheck(ByVal ListToCheck As List(Of String)) As Boolean
Listcheck = True
For I As Integer = 0 To ListToCheck.Count
If ListToCheck(0) <> ListToCheck(I) Then
Listcheck = False
Exit For
End If
Next
End Function
Seems to have done the trick. Many thanks for the help
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply