Upadating current tables from holding tables

  • Hi guys and gals, i'm new here so please be gentle. I'm not a novice but aint an expert in T-SQL either and was looking for some advice please...

    What I am trying to do is create a Stored Procedure that when called, copies records from my holding tables to my current tables.

    Both the holding tables and current tables are identical.

    My first thought was just to copy the holding table and replace the current table. However, there are literally millions of rows in each table and i'm feeling this could take some time.

    What I wanted to do is copy just the changes (and creations) from the holding tables into the current tables. Luckily any amendment to the holding tables has a 'Created Date' field which is populated with Getdate()

    This is where I get the problem.

    records in the holding table can be both created or updated (but thankfully none are ever deleted)

    What method is best to create a quick and simple procedure that copies all the records with an 'Created Date' greater than 'x' from the holding table and either INSERTS or UPDATES into the current table?

    Each table can have 40 or 50 columns to write out for each INSERT and UPDATE, and this I guess will be time consuming to code, i was hoping there was a INSERT * or UPDATE * alternative...

    Any advice much appreciated

    Some examples to help explain things better

    HOLD TABLE (created date, Primary Key, data item 1, data item 2)

    01/01/2000, PK1, data A, data B

    05/11/2011, PK2, data C, data D

    05/11/2011, PK3, data E, Data F

    CURRENT TABLE (created date, Primary Key, data item 1, data item 2)

    01/01/2000, PK1, data A, data B

    01/01/2005, PK2, data A, data B

    PK2 record would need to be updated

    PK3 record would need to be inserted

    Thanks

  • A MERGE statement is what you're looking for.

    You cannot use anything like *, you have to explicitly specify the column names for both the INSERT and UPDATE parts. You can easily build the column lists by querying the metadata views (INFORMATION_SCHEMA.COLUMNS or sys.columns).

    The basic syntax would be something like this (untested as you haven't provided sample DDL):

    MERGE HOLDTABLE AS SOURCE

    USING CURRENTTABLE AS TARGET

    ON ( SOURCE.myPK = TARGET.myPK )

    WHEN MATCHED AND ( SOURCE.last_updated >= TARGET.Last_Updated )

    THEN

    UPDATE SET

    col1 = SOURCE.col1 ,

    col2 = SOURCE.col2 ,

    col3 = SOURCE.col3,

    ...

    WHEN NOT MATCHED

    THEN INSERT ( col1, col2, col3 )

    VALUES ( SOURCE.col1 ,

    SOURCE.col2 ,

    SOURCE.col3,

    ...

    )

  • Domestos (10/5/2011)


    Hi guys and gals, i'm new here so please be gentle. I'm not a novice but aint an expert in T-SQL either and was looking for some advice please...

    What I am trying to do is create a Stored Procedure that when called, copies records from my holding tables to my current tables.

    Both the holding tables and current tables are identical.

    My first thought was just to copy the holding table and replace the current table. However, there are literally millions of rows in each table and i'm feeling this could take some time.

    What I wanted to do is copy just the changes (and creations) from the holding tables into the current tables. Luckily any amendment to the holding tables has a 'Created Date' field which is populated with Getdate()

    This is where I get the problem.

    records in the holding table can be both created or updated (but thankfully none are ever deleted)

    What method is best to create a quick and simple procedure that copies all the records with an 'Created Date' greater than 'x' from the holding table and either INSERTS or UPDATES into the current table?

    Each table can have 40 or 50 columns to write out for each INSERT and UPDATE, and this I guess will be time consuming to code, i was hoping there was a INSERT * or UPDATE * alternative...

    Any advice much appreciated

    Some examples to help explain things better

    HOLD TABLE (created date, Primary Key, data item 1, data item 2)

    01/01/2000, PK1, data A, data B

    05/11/2011, PK2, data C, data D

    05/11/2011, PK3, data E, Data F

    CURRENT TABLE (created date, Primary Key, data item 1, data item 2)

    01/01/2000, PK1, data A, data B

    01/01/2005, PK2, data A, data B

    PK2 record would need to be updated

    PK3 record would need to be inserted

    Thanks

    The insert would not be too hard. You just need to do a left join or an exists.

    Something like:

    insert into current (createdDate, PrimaryKey, DataItem1, DataItem2)

    select CreatedDate, primarykey, dataItem1, dataItem2

    from Hold h

    left join Current c on h.primarykey = c.primarykey

    where c.primarykey is null

    The update is harder because you would actually have to compare the columns. Perhaps you can assume that if a new record has been inserted that something has changed in which case you just do the update.

  • Thanks HowardW,

    Could you quickly tell me what the

    WHEN MATCHED AND

    and

    WHEN NOT MATCHED

    statements are matching on? all the fields? or just a primary key?

    also do I need a

    [Code="sql"] AND ( SOURCE.last_updated >= TARGET.Last_Updated ) [/code]

    after the NOT MATCHED statement?

  • Domestos (10/5/2011)


    Thanks HowardW,

    Could you quickly tell me what the

    WHEN MATCHED AND

    and

    WHEN NOT MATCHED

    statements are matching on? all the fields? or just a primary key?

    It's using the ON clause. e.g.:

    ON ( SOURCE.myPK = TARGET.myPK )

    also do I need a

    [Code="sql"] AND ( SOURCE.last_updated >= TARGET.Last_Updated ) [/code]

    after the NOT MATCHED statement?

    Nope. The NOT MATCHED portion is only used for inserts, so it will insert any rows where that row does not exist in the target table - there is no TARGET.last_updated to compare against when it doesn't exist.

    I'd recommend having a good look at the MERGE arcicle in BOL (I linked to it above) to make sure you understand it.

  • Great thx, i'll give it a go and let you know how it goes.

  • After some messing about, lots of strange errors and some frantic head scratching i discovered that I was on SQL Server 2005 , and MERGE doesnt work with SQL Server 2005.

    Went back to the drawing board and came up with this solutions to my problem,

    UPDATE [dbo].[ETL_HL7_TEMP_imp_patient_tbl]

    SET [address1] = [SOURCE].address1

    ,[address2] = [SOURCE].address2

    ,[address3] = [SOURCE].address3

    ...

    ,[change_date] = [SOURCE].change_date

    OUTPUT INSERTED.pasid INTO @tblPasid

    FROM dbo.ETL_HL7_TEMP_imp_patient_tbl [TARGET]

    INNER JOIN [dbo].[ETL_HL7_HOLD_imp_patient_tbl] [SOURCE] ON [TARGET].pasid=[SOURCE].pasid

    WHERE [SOURCE].[change_date]>@update_from

    ;

    INSERT INTO [dbo].[ETL_HL7_TEMP_imp_patient_tbl]

    ([address1]

    ,[address2]

    ...

    ,[change_date])

    SELECT [address1]

    ,[address2]

    ,[address3]

    ...

    ,[change_date]

    FROM [dbo].[ETL_HL7_HOLD_imp_patient_tbl]

    WHERE [change_date]>@update_from AND pasid NOT IN (SELECT pasid from @tblPasid)

  • Yes, it's SQL Server 2008 only. In fairness, you did post your question in the SQL Server 2008 forum 🙂

    Thanks for coming back and sharing your solution!

  • HowardW (10/6/2011)


    In fairness, you did post your question in the SQL Server 2008 forum 🙂

    Thanks for coming back and sharing your solution!

    Ohh yer - but then I thought I was on 2008 anyway, but im only on SSMS 2008 :doze:

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply