Efficient way to insert from foreign key lookups.

  • This is a simple problem to solve, but looking for an efficient way to do the following.

    An SSIS task imports data from a flat file and inserts the data into a staging table. The staging table holds the data in its raw form.

    A second process then selects the data from the staging table, looking up the foreign key id's for raw data values, and then inserts the data into the live table.

    SQL - Only key columns shown for clarity

    -- Staging Table

    CREATE TABLE Staging

    (Information VARCHAR(10),

    MachineName VARCHAR(10),

    Status VARCHAR(10))

    -- Live Table:

    CREATE TABLE LiveTable

    (Information VARCHAR(10),

    MachineID TINYINT, -- Foreign key of Machine table ID

    StatusId TINYINT) -- Foreign key of Status table ID

    -- Machine Table:

    CREATE TABLE Machine

    (MachineId TINYINT IDENTITY (1,1),

    MachineName VARCHAR(10),

    --- -Lots of other columns of machine information

    )

    -- Status Table

    CREATE TABLE Status

    (StatusId TINYINT IDENTITY (1,1),

    Status VARCHAR(10)

    )

    Example of Staging data could be:

    Information = "A123456"

    Machine = "Machine1"

    Status = "Success"

    The insert into the live table should look up the id for machine 1, and the id of status success and insert the foreign key values into the live table for the row.

    There could be 1000's of rows for the output of machine 1 all with different status's - (all pre set in the Status table, i.e success, failure, rerun) and the same for lots of other machines held in the machine table.

    What is the best to insert this data all in one go, rather than reading each row of the staging table one by one, looking up the foreign key values depending on the machine and status values, then inserting the data.

    I was thinking along the lines of:

    INSERT INTO dbo.LiveTable

    (Information, MachineID, StatusId)

    SELECT Staging.Information, dbo.Machine.MachineId, dbo.Status.StatusId

    FROM dbo.Staging

    JOIN Machine ON Machine.MachineName = Staging.MachineName

    JOIN STATUS ON Status.Status = Staging.Status

    But I notice the problem with this is, it doubles up the inserts!

  • ?

    INSERT INTO dbo.LiveTable

    (Information, MachineID, StatusId)

    SELECT Staging.Information, dbo.Machine.MachineId, dbo.Status.StatusId

    FROM dbo.Staging

    JOIN Machine ON Machine.MachineName = Staging.MachineName

    JOIN STATUS ON Status.Status = Staging.Status

    WHERE NOT EXISTS

    (

    SELECT NULL

    FROM dbo.LiveTable AS LT

    WHERE

    LT.Information = Staging.Information AND

    LT.MachineID = dbo.Machine.MachineId AND

    LT.StatusId = dbo.Status.StatusId

    )


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • INSERT INTO dbo.LiveTable

    (Information, MachineId, StatusId)

    SELECT s.Information,

    (SELECT m.MachineId

    FROM dbo.Machine m

    WHERE

    m.MachineName = s.MachineName) AS MachineID,

    (SELECT st.StatusId

    FROM dbo.Status st

    WHERE

    st.Status = s.Status ) AS StatusId

    FROM dbo.Staging s

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply