Type 2 SCD without merge

  • I 'lost' the code that was used to do type 2 SCD without a merge statement. I'm still hoping to find it but in the meantime I was wondering if other people had done the same.

    I don't want to use the merge statement due to the need to output to a temp table for the change information. I know I could, I know a lot of people do, but the code I lost was an interesting and elegant way to avoid it. Merge can be rather resource intensive and skews the transactions since it shows up as both an insert and update or so I remember.

    Anyone have this in their code library?

  • C'mon. Someone must want to figure out this process.

    From what I remember we did an insert of all new records (changed and truly new). Then there was an update to expire the old record. But for the life of me I can't remember the syntax of the joins.

  • something like this is what i slapped together; does this look right?

    UPDATE MyTarget

    SET

    MyTarget.[IsCurrent] =0,

    MyTarget.[ValidToDate] = getdate()

    FROM [StagingDataDump] MySource

    INNER JOIN [DataDump] MyTarget

    ON MySource.[PK] = MyTarget.[PK]

    AND MySource.[Col2] = MyTarget.[Col2]

    INSERT INTO [DataDump](ColumnList,IsCurrent,ValidFromDate,ValidToDate)

    SELECT MySource .ColumnList,1 AS IsCurrent,getdate() AS ValidFromDate, '12/31/2999' AS ValidToDate

    FROM [StagingDataDump] MySource

    INNER JOIN [DataDump] MyTarget

    ON MySource.[PK] = MyTarget.[PK]

    AND MySource.[Col2] = MyTarget.[Col2]

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • That's backward from what I was remembering but it looks like it might do the trick. Let me test it and then give you praise for doing what I couldn't. 😎

  • Thanks Lowell. I needed that kick out of the rut of 'how did I do this before?' to see how to best approach it here.

    I went even further in separating out the steps. I like a methodical and easily understood approach and what I've done makes it easier to troubleshoot issues.

    My basic steps:

    Add a field in the staging table to identify the records as New or Changed.

    Update the field comparing the tables.

    Expire changed records.

    Insert new records.

    It's not as compact as a merge but it's also more precise in knowing what records are doing what.

  • I know this is really old, but I have to rewrite my MERGE script because its taking 5 hours to run. Did you every re-recreate the script and if so, can you share it.  Thank

    Never mind, just asked BING Chat GPT, and I got the required template ...

    Let’s assume we have a source table source_table and a target dimension table dim_table.

    The source_table has columns ID, Attribute1, Attribute2,

    the dim_table has columns ID, Attribute1, Attribute2, StartDate, and EndDate.

    -- Step 1: Close records that are not in the source table

    UPDATE dim_table

    SET EndDate = GETDATE()

    WHERE ID NOT IN (SELECT ID FROM source_table)

    AND EndDate IS NULL

    -- Step 2: Insert new records from the source table

    INSERT INTO dim_table (ID, Attribute1, Attribute2, StartDate, EndDate)

    SELECT ID, Attribute1, Attribute2, GETDATE(), NULL

    FROM source_table

    WHERE ID NOT IN (SELECT ID FROM dim_table WHERE EndDate IS NULL)

    -- Step 3: Close old records and insert changed records as new ones

    DECLARE @ChangedRecords TABLE (ID INT)

    INSERT INTO @ChangedRecords (ID)

    SELECT s.ID

    FROM source_table s

    JOIN dim_table d ON s.ID = d.ID

    WHERE d.EndDate IS NULL AND (s.Attribute1 <> d.Attribute1 OR s.Attribute2 <> d.Attribute2)

    UPDATE dim_table

    SET EndDate = GETDATE()

    WHERE ID IN (SELECT ID FROM @ChangedRecords)

    AND EndDate IS NULL

    INSERT INTO dim_table (ID, Attribute1, Attribute2, StartDate, EndDate)

    SELECT s.ID, s.Attribute1, s.Attribute2, GETDATE(), NULL

    FROM source_table s

    WHERE s.ID IN (SELECT ID FROM @ChangedRecords)

    • This reply was modified 1 year, 2 months ago by  Blade69.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply