Conditional Lookup

  • Currently working on a project for a client which involves me building a matching solution in SSIS. One of the packages I am building needs a conditional lookup. i.e. Do a lookup on a column if it is not blank/null. There are total of 5 columns. If all 5 are blank then don't do the lookup. If 3 are filled do the lookup on the 3 fields etc. Anybody have experience implementing this sort of logic in ssis? I have built a package where I have implanted the soom look up logic. The attached image shows my data flow.

    the problem with this solution is that if an incoming row has values in both the fingerprintID field and ISRC field. A lookup is done on only the fingerprintID as it is the first condition in the conditional split .

  • Have you tried writing a SQL scalar function to accomplish it? That would be the way I'd begin.

  • I am not sure what you mean. Could you expand a little please

  • Something like:

    create function dbo.Conditional_Lookup

    (

    @vID int

    )

    RETURNS varchar(255)

    as

    BEGIN

    declare @vOutput varchar(255)

    declare @v1 varchar(255)

    declare @v2 varchar(255)

    select @v1 = IsNull(MyField, '') from MyTable where ID=@vID

    if (@v1 = '')

    begin

    select @v1 = IsNull(MyField2, '') from MyTable where ID=@vID

    end

    ...

    set @vOutput = @v1

    RETURN @vOutput

    END

    I'm not trying to reproduce your specific logic in this example function. The point being conveyed is that a scalar function can contain your complex conditional logic AND be part of your select list; i.e.,

    select dbo.Conditional_Lookup(myTable.ID), myTable.Field2, etc.

    If this still doesn't make sense, please take a few minutes to review scalar-functions.

  • Hey thanks for that. That has given me some ideas. Will go away and test.

  • It will get very ugly, but this can be done with a conditional split, multiple outputs, lookups, and union alls. Not ideal by any means, but can be done. I'd be interested to know how the returned values differ between, say, a 3/5 column lookup versus a 5/5 column lookup - from a business perspective.

  • Have you thought about doing the lookups in sequence, regardless of blank value or not, and setting the 'Specify how to handle rows with no matching entries' property of the lookup to 'Ignore failure'?

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • After a while I have decided to create the lookup functionality in T-SQL. I have achieved this by using left outer joins and returning the same column for each join with a different alias. My code is as follows:

    WITH testCTE (AutoMatchID, TITLE, [FINGERPRINTID], LabelCode, SetCatalogueNumber, ISAN, ReferenceNumber, DSPID,CATALOGUENUMBER,RESOURCEID, [INTERNATIONALSTANDARDRECORDINGCODE],[INTERNATIONALSTANDARDWORKCODE],[EANBARCODE],ParameterSetID ,RECID,[TableID])

    AS

    (

    --- USAGEJOURNALTRANS_ONLINE

    SELECT B.AutoMatchID

    , A.TITLE

    , NULL AS [FINGERPRINTID]

    , NULL AS LabelCode

    , NULL AS SetCatalogueNumber

    , NULL AS ISAN

    , NULL AS ReferenceNumber

    , A.DSPID

    , A.CATALOGUENUMBER

    , A.RESOURCEID

    , A.[INTERNATIONALSTANDARDRECORDINGCODE]

    , A.[INTERNATIONALSTANDARDWORKCODE]

    , A.[EANBARCODE]

    , B.ParameterSetID

    , B.RECID

    , B.[TableID]

    FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_ONLINE] A INNER JOIN [AutoMatch].[dbo].[AID] B

    ON A.RECID = B.RECID

    AND B.[TableID] = 102787

    UNION ALL

    --USAGEJOURNALTRANS_CL

    SELECT B.AutoMatchID

    , A.TITLE

    , NULL AS [FINGERPRINTID]

    , NULL AS LabelCode

    , A.SetCatalogueNumber

    , NULL AS ISAN

    , NULL AS ReferenceNumber

    , NULL AS DSPID

    , A.CATALOGUENUMBER

    , NULL AS RESOURCEID

    , NULL AS [INTERNATIONALSTANDARDRECORDINGCODE]

    , NULL AS [INTERNATIONALSTANDARDWORKCODE]

    , NULL AS [EANBARCODE]

    , B.ParameterSetID

    , B.RECID

    , B.[TableID]

    FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_CL] A INNER JOIN [AutoMatch].[dbo].[AID] B

    ON A.RECID = B.RECID

    AND B.[TableID] = 102791

    UNION ALL

    --USAGEJOURNALTRANS_FOREIGN

    SELECT B.AutoMatchID

    , A.TITLE

    , NULL AS [FINGERPRINTID]

    , NULL AS LabelCode

    , NULL AS SetCatalogueNumber

    , A.ISAN

    , A.ReferenceNumber

    , NULL AS DSPID

    , A.CATALOGUENUMBER

    , NULL AS RESOURCEID

    , A.[INTERNATIONALSTANDARDRECORDINGCODE]

    , A.[INTERNATIONALSTANDARDWORKCODE]

    , A.[EANBARCODE]

    , B.ParameterSetID

    , B.RECID

    , B.[TableID]

    FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_FOREIGN] A INNER JOIN [AutoMatch].[dbo].[AID] B

    ON A.RECID = B.RECID

    AND B.[TableID] = 102793

    UNION ALL

    --USAGEJOURNALTRANS_INDUSTRY

    SELECT B.AutoMatchID

    , A.TITLE

    , NULL AS [FINGERPRINTID]

    , A.LabelCode

    , NULL AS SetCatalogueNumber

    , NULL AS ISAN

    , A.ReferenceNumber

    , NULL AS DSPID

    , A.CATALOGUENUMBER

    , NULL AS RESOURCEID

    , NULL AS [INTERNATIONALSTANDARDRECORDINGCODE]

    , NULL AS [INTERNATIONALSTANDARDWORKCODE]

    , NULL AS [EANBARCODE]

    , B.ParameterSetID

    , B.RECID

    , B.[TableID]

    FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_INDUSTRY] A INNER JOIN [AutoMatch].[dbo].[AID] B

    ON A.RECID = B.RECID

    AND B.[TableID] = 102785

    UNION ALL

    --USAGEJOURNALTRANS_LIVEEVENTS

    SELECT B.AutoMatchID

    , A.TITLE

    , A. [FINGERPRINTID]

    , NULL AS LabelCode

    , NULL AS SetCatalogueNumber

    , NULL AS ISAN

    , NULL AS ReferenceNumber

    , NULL AS DSPID

    , NULL AS CATALOGUENUMBER

    , NULL AS RESOURCEID

    , NULL AS [INTERNATIONALSTANDARDRECORDINGCODE]

    , NULL AS [INTERNATIONALSTANDARDWORKCODE]

    , NULL AS [EANBARCODE]

    , B.ParameterSetID

    , B.RECID

    , B.[TableID]

    FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_LIVEEVENTS] A INNER JOIN [AutoMatch].[dbo].[AID] B

    ON A.RECID = B.RECID

    AND B.[TableID] = 102783

    UNION ALL

    --USAGEJOURNALTRANS_MEDIA

    SELECT B.AutoMatchID

    , A.TITLE

    , A. [FINGERPRINTID]

    , NULL AS LabelCode

    , NULL AS SetCatalogueNumber

    , NULL AS ISAN

    , NULL AS ReferenceNumber

    , NULL AS DSPID

    , A. CATALOGUENUMBER

    , NULL AS RESOURCEID

    , A. [INTERNATIONALSTANDARDRECORDINGCODE]

    , NULL AS [INTERNATIONALSTANDARDWORKCODE]

    , NULL AS [EANBARCODE]

    , B.ParameterSetID

    , B.RECID

    , B.[TableID]

    FROM [DAX_ACC].[dbo].[USAGEJOURNALTRANS_MEDIA] A INNER JOIN [AutoMatch].[dbo].[AID] B

    ON A.RECID = B.RECID

    AND B.[TableID] = 102781

    )

    (

    SELECT A.*

    , COALESCE(B.[CREATIONID],'') AS F_CreationID

    , COALESCE(C.[CREATIONID],'') AS Ind_CreationID

    , COALESCE(D.[CREATIONID],'') AS ISWC_CreationID

    , COALESCE(E.[CREATIONID],'') AS ISRC_CreationID

    , COALESCE(F.[CREATIONID],'') AS ISAN_CreationID

    , COALESCE(G.[CREATIONID],'') AS EAN_CreationID

    FROM testCTE A

    LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] B

    ON B.[CRNUMBER] = A.[FINGERPRINTID] AND B.[CRNUMBERTYPEID] = 'fingerprin'

    LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] C

    ON c.[CRNUMBER] = A.[FINGERPRINTID] AND C.[CRNUMBERTYPEID] = 'Ind Conv'

    LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] D

    ON D.[CRNUMBER] = A.[INTERNATIONALSTANDARDWORKCODE] AND D.[CRNUMBERTYPEID] = 'ISWC'

    LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] E

    ON E.[CRNUMBER] = A.[INTERNATIONALSTANDARDRECORDINGCODE] AND E.[CRNUMBERTYPEID] = 'ISRC'

    LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] F

    ON F.[CRNUMBER] = A.[ISAN] AND F.[CRNUMBERTYPEID] = 'ISAN'

    LEFT OUTER JOIN [ReferenceData].[dbo].[CRALTNUMBERS] G

    ON G.[CRNUMBER] = A.[EANBARCODE] AND G.[CRNUMBERTYPEID] = 'EAN'

    )

    I would like to run the data through script component and loop through the _CreationID columns per row which are not null or blank and check that the column values are the same. However I am struggling to write the vb code int the script editor. I think the process should be

    loop through columns whose names contain _creationID and are not blank/null

    load values into an array

    check if all elements in array are have the same value

    if the elements have the same value create new output column with value 'matched'

    if the elements are not matched create new output column with value 'non_matched'

    My code thus far is:

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

    Dim rowType As Type = Row.GetType()

    Dim columnValue As PropertyInfo

    Dim previousRow As String

    For Each columnValue In Row.GetType().GetProperties()

    If (columnValue.Name.EndsWith("_CreationID") And columnValue.GetValue(Row, Nothing).ToString() = "") Then

    Columnval = columnValue.GetValue(Row, Nothing).ToString()

    End If

    Next

    '

    End Sub

    End Class

    Any ideas?

  • I managed to get this working with some tweaking. I used a derived column transform to create a field that concatenates the lookup fields with a semi colon delimeter. The expression was as follows:

    (ISNULL(F_CreationID) ? "" : F_CreationID) + ";" + (ISNULL(Ind_CreationID) ? "" : Ind_CreationID) + ";" + (ISNULL(ISWC_CreationID) ? "" : ISWC_CreationID) + ";" + (ISNULL(ISRC_CreationID) ? "" : ISRC_CreationID) + ";" + (ISNULL(ISAN_CreationID) ? "" : ISAN_CreationID) + ";" + (ISNULL(EAN_CreationID) ? "" : EAN_CreationID)

    I then script transfom. I the loaded the concatenated column into an array. I then deleted elements and checked if the elements were equal and then redirected outputs appropriately. The script was as follows.

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

    '

    ' Add your code here

    '

    'Call is IsAlphaNumeric function to check if field contains only semi colons. If true then there is no match

    If IsAlphaNumeric(Row.concatCreationID) Then

    Dim vals() As String = Strings.Split(Row.concatCreationID, ";")

    'Creat an split sting by delimeter and load array

    Dim ListVals As List(Of String) = vals.ToList()

    'Load array contents to list. List is chosen so we can easily add and remore elements

    Dim g As Integer

    'remove non=empty elements from list

    For g = ListVals.Count - 1 To 0 Step -1

    If ListVals(g) = "" Then

    ListVals.RemoveAt(g)

    End If

    Next

    'If list contains only 1 item send it directl to Match output

    If ListVals.Count = 1 Then

    Row.MatchCreationID = ListVals(0)

    Row.DirectRowToMatch()

    'If list contains more than 1 element and List check returns true i.e. elements are the same

    'direct row to Match output

    ElseIf ListVals.Count > 1 And Listcheck(ListVals) Then

    Row.MatchCreationID = ListVals(0)

    Row.DirectRowToMatch()

    Else

    Row.DirectRowToSuggestions()

    End If

    Else

    Row.DirectRowToNoMatch()

    End If

    End Sub

    Public Function IsAlphaNumeric(ByVal strToCheck As String) As Boolean

    Dim pattern As Regex = New Regex("[^;*]")

    Return pattern.IsMatch(strToCheck)

    End Function

    Public Function Listcheck(ByVal ListToCheck As List(Of String)) As Boolean

    Listcheck = True

    For I As Integer = 0 To ListToCheck.Count

    If ListToCheck(0) <> ListToCheck(I) Then

    Listcheck = False

    Exit For

    End If

    Next

    End Function

    Seems to have done the trick. Many thanks for the help

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply