Optional Foreign Key Lookup Design Pattern

  • Hi,

    SSIS 2008 R2 packages

    The scenario is I am importing data from a file into a table which has a foreign key field which is optional. The foreign key is an integer surrogate key in a reference table.

    The file contains the alternate key value, e.g. for products it would be the name of the product (ProdA, ProdB, ProdC) as opposed to the product key (100,101,102). Hence the need for the Lookup Transformation.

    Only some rows in the import file contain a Product Name value.

    And... if a product name is supplied and is not a valid product name no errors should occur. I have logging system where I can capture issues like this which are handled differently than errors.

    Therefore I have the following potential values for the 'Product Name' in the import:

    * No Product Name

    * Valid Product Name

    * Invalid Product Name

    I currently handle this by:

    1. Use Conditional Split to process the records which have the foreign key value separately

    2. Use Lookup to key the key for the 'Product Name', with redirect no matches to my logging process

    3. Use Union All to combine the Lookup Matches and the Record with no 'Product Name'

    Assuming I can't have an external process which validates and manipulate the import files before hand, does this seem like a valid approach.

    Thanks

    Steve

  • Steve

    I'm finding it hard to visualise this. Please will you show us the table structure and sample data, the SQL to create the foreign key constraint, and some of the data from the file?

    Thanks

    John

  • Hi John thanks for the response.

    The following is made up structures and data:

    -- table of products

    CREATE TABLE xyzProducts (pkProducts INT NOT NULL ,ProductName VARCHAR(20) NOT NULL )

    ALTER TABLE xyzProducts ADD CONSTRAINT [pkkyProducts] PRIMARY KEY (pkProducts)

    INSERT INTO xyzProducts (pkProducts, ProductName)

    VALUES ( 1,'ProductA' )

    ,(2,'Product B' )

    ,(3,'Product C' )

    SELECT * FROM dbo.xyzProducts xp

    -- Tables of customer bill

    -- They will also be billed monthly fee

    -- but may or may not have purchased a product

    CREATE TABLE xyzCustomers (

    pkCustomers INT IDENTITY NOT NULL

    ,CustName VARCHAR(20) NOT NULL

    ,MonthlyFee INT NOT NULL

    ,fkExtraProduct INT NULL)

    ALTER TABLE xyzCustomers ADD CONSTRAINT relCustomerExtraProduct FOREIGN KEY (fkExtraProduct) REFERENCES xyzProducts ([pkProducts])

    INSERT INTO dbo.xyzCustomers ( CustName, MonthlyFee, fkExtraProduct)

    VALUES( 'Bob', 10, 1)

    ,( 'Jane', 10, NULL)

    ,( 'Harry', 10, 2)

    SELECT * FROM dbo.xyzCustomers xc

    --IF OBJECT_ID('xyzCustomers') IS NOT NULL DROP TABLE xyzCustomers

    --IF OBJECT_ID('xyzProducts') IS NOT NULL DROP TABLE xyzProducts

    So, assuming I would be inserting the above data the file, with an additional row which has an invalid product name:

    CustName,MonthlyFee,Product

    Bob,10,ProductA

    Jane,10,

    Harry,10,ProductB

    Betty,20,ProductX

    I hope that helps.

    Thanks

    Steve

  • Steve

    I think what I would do is this:

    (1) Import your file into a staging table

    (2) Move the rows with no product or valid product to the destination table

    (3) Handle the remaining rows with your error process

    I hope that makes sense, and that I haven't oversimplified it in my mind.

    John

  • SteveD SQL (11/1/2016)


    Hi,

    SSIS 2008 R2 packages

    The scenario is I am importing data from a file into a table which has a foreign key field which is optional. The foreign key is an integer surrogate key in a reference table.

    The file contains the alternate key value, e.g. for products it would be the name of the product (ProdA, ProdB, ProdC) as opposed to the product key (100,101,102). Hence the need for the Lookup Transformation.

    Only some rows in the import file contain a Product Name value.

    And... if a product name is supplied and is not a valid product name no errors should occur. I have logging system where I can capture issues like this which are handled differently than errors.

    Therefore I have the following potential values for the 'Product Name' in the import:

    * No Product Name

    * Valid Product Name

    * Invalid Product Name

    I currently handle this by:

    1. Use Conditional Split to process the records which have the foreign key value separately

    2. Use Lookup to key the key for the 'Product Name', with redirect no matches to my logging process

    3. Use Union All to combine the Lookup Matches and the Record with no 'Product Name'

    Assuming I can't have an external process which validates and manipulate the import files before hand, does this seem like a valid approach.

    Thanks

    Steve

    Steve, this pattern makes sense to me.

    I have one suggestion for you to consider, and that is to default the 'empty' product names to a known value 'MissingName' (or whatever), before doing the lookup. Ensure that 'MissingName' appears in your Product dimension. Doing this means that your lookup becomes simpler (match or no match) and avoids the UNION ALL, so may be faster.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • @john-2 - thanks for your suggestion but I try to avoid using staging tables when I can.

    @phil - Sounds like a good suggestions, I will definitely give it a try.

    Thanks to you both

    Steve

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply