September 15, 2021 at 8:17 am
Hi All,
We have a current ETL process that takes a delta and uses MERGE to decide whether the data needs to be updated. The current process uses an audit table (effectively a hand-built CDC) to drive the delta but this has proved to be unreliable. Most tables do have a ROWSTAMP field so we are going to use this to drive the delta instead. (applying CDC is not an option unfortunately)
We have about 120 tables to migrate each run, The current process has a separate stored procedure for each table that has a hard-coded list of fields doing a null = null or value - value comparison E.g. (and this is a fairly narrow table...)
AND
(
SRC.[Asset_ID] <> TGT.[Asset_ID] OR (SRC.[Asset_ID] IS NULL AND TGT.[Asset_ID] IS NOT NULL) OR (SRC.[Asset_ID] IS NOT NULL AND TGT.[Asset_ID] IS NULL)
OR SRC.[Attribute_ID] <> TGT.[Attribute_ID] OR (SRC.[Attribute_ID] IS NULL AND TGT.[Attribute_ID] IS NOT NULL) OR (SRC.[Attribute_ID] IS NOT NULL AND TGT.[Attribute_ID] IS NULL)
OR SRC.[IsDeleted_BT] <> TGT.[IsDeleted_BT] OR (SRC.[IsDeleted_BT] IS NULL AND TGT.[IsDeleted_BT] IS NOT NULL) OR (SRC.[IsDeleted_BT] IS NOT NULL AND TGT.[IsDeleted_BT] IS NULL)
OR SRC.[Location_ID] <> TGT.[Location_ID] OR (SRC.[Location_ID] IS NULL AND TGT.[Location_ID] IS NOT NULL) OR (SRC.[Location_ID] IS NOT NULL AND TGT.[Location_ID] IS NULL)
OR SRC.[Position_ID] <> TGT.[Position_ID] OR (SRC.[Position_ID] IS NULL AND TGT.[Position_ID] IS NOT NULL) OR (SRC.[Position_ID] IS NOT NULL AND TGT.[Position_ID] IS NULL)
OR SRC.[Type_ID] <> TGT.[Type_ID] OR (SRC.[Type_ID] IS NULL AND TGT.[Type_ID] IS NOT NULL) OR (SRC.[Type_ID] IS NOT NULL AND TGT.[Type_ID] IS NULL)
OR SRC.[Condition_ID] <> TGT.[Condition_ID] OR (SRC.[Condition_ID] IS NULL AND TGT.[Condition_ID] IS NOT NULL) OR (SRC.[Condition_ID] IS NOT NULL AND TGT.[Condition_ID] IS NULL)
OR SRC.[Condition_AsAt_DT] <> TGT.[Condition_AsAt_DT] OR (SRC.[Condition_AsAt_DT] IS NULL AND TGT.[Condition_AsAt_DT] IS NOT NULL) OR (SRC.[Condition_AsAt_DT] IS NOT NULL AND TGT.[Condition_AsAt_DT] IS NULL)
OR SRC.[Priority_ID] <> TGT.[Priority_ID] OR (SRC.[Priority_ID] IS NULL AND TGT.[Priority_ID] IS NOT NULL) OR (SRC.[Priority_ID] IS NOT NULL AND TGT.[Priority_ID] IS NULL)
OR SRC.[Manufacturer_ID] <> TGT.[Manufacturer_ID] OR (SRC.[Manufacturer_ID] IS NULL AND TGT.[Manufacturer_ID] IS NOT NULL) OR (SRC.[Manufacturer_ID] IS NOT NULL AND TGT.[Manufacturer_ID] IS NULL)
OR SRC.[Pattern_ID] <> TGT.[Pattern_ID] OR (SRC.[Pattern_ID] IS NULL AND TGT.[Pattern_ID] IS NOT NULL) OR (SRC.[Pattern_ID] IS NOT NULL AND TGT.[Pattern_ID] IS NULL)
OR SRC.[Material_ID] <> TGT.[Material_ID] OR (SRC.[Material_ID] IS NULL AND TGT.[Material_ID] IS NOT NULL) OR (SRC.[Material_ID] IS NOT NULL AND TGT.[Material_ID] IS NULL)
OR SRC.[Colour_ID] <> TGT.[Colour_ID] OR (SRC.[Colour_ID] IS NULL AND TGT.[Colour_ID] IS NOT NULL) OR (SRC.[Colour_ID] IS NOT NULL AND TGT.[Colour_ID] IS NULL)
OR SRC.[Comment_Lines_VC] <> TGT.[Comment_Lines_VC] OR (SRC.[Comment_Lines_VC] IS NULL AND TGT.[Comment_Lines_VC] IS NOT NULL) OR (SRC.[Comment_Lines_VC] IS NOT NULL AND TGT.[Comment_Lines_VC] IS NULL)
OR SRC.[Risk_ID] <> TGT.[Risk_ID] OR (SRC.[Risk_ID] IS NULL AND TGT.[Risk_ID] IS NOT NULL) OR (SRC.[Risk_ID] IS NOT NULL AND TGT.[Risk_ID] IS NULL)
OR SRC.[Group_ID] <> TGT.[Group_ID] OR (SRC.[Group_ID] IS NULL AND TGT.[Group_ID] IS NOT NULL) OR (SRC.[Group_ID] IS NOT NULL AND TGT.[Group_ID] IS NULL)
OR SRC.[Image_Reference_VC] <> TGT.[Image_Reference_VC] OR (SRC.[Image_Reference_VC] IS NULL AND TGT.[Image_Reference_VC] IS NOT NULL) OR (SRC.[Image_Reference_VC] IS NOT NULL AND TGT.[Image_Reference_VC] IS NULL)
OR SRC.[Media_Reference_VC] <> TGT.[Media_Reference_VC] OR (SRC.[Media_Reference_VC] IS NULL AND TGT.[Media_Reference_VC] IS NOT NULL) OR (SRC.[Media_Reference_VC] IS NOT NULL AND TGT.[Media_Reference_VC] IS NULL)
OR SRC.[Current_Value_VC] <> TGT.[Current_Value_VC] OR (SRC.[Current_Value_VC] IS NULL AND TGT.[Current_Value_VC] IS NOT NULL) OR (SRC.[Current_Value_VC] IS NOT NULL AND TGT.[Current_Value_VC] IS NULL)
OR SRC.[Quantity_DC] <> TGT.[Quantity_DC] OR (SRC.[Quantity_DC] IS NULL AND TGT.[Quantity_DC] IS NOT NULL) OR (SRC.[Quantity_DC] IS NOT NULL AND TGT.[Quantity_DC] IS NULL)
OR SRC.[Fitted_Renewed_Date_DT] <> TGT.[Fitted_Renewed_Date_DT] OR (SRC.[Fitted_Renewed_Date_DT] IS NULL AND TGT.[Fitted_Renewed_Date_DT] IS NOT NULL) OR (SRC.[Fitted_Renewed_Date_DT] IS NOT NULL AND TGT.[Fitted_Renewed_Date_DT] IS NULL)
OR SRC.[Default_Expected_Life_IN] <> TGT.[Default_Expected_Life_IN] OR (SRC.[Default_Expected_Life_IN] IS NULL AND TGT.[Default_Expected_Life_IN] IS NOT NULL) OR (SRC.[Default_Expected_Life_IN] IS NOT NULL AND TGT.[Default_Expected_Life_IN] IS NULL)
OR SRC.[Expected_Life_IN] <> TGT.[Expected_Life_IN] OR (SRC.[Expected_Life_IN] IS NULL AND TGT.[Expected_Life_IN] IS NOT NULL) OR (SRC.[Expected_Life_IN] IS NOT NULL AND TGT.[Expected_Life_IN] IS NULL)
OR SRC.[Default_Replacement_Type_ID] <> TGT.[Default_Replacement_Type_ID] OR (SRC.[Default_Replacement_Type_ID] IS NULL AND TGT.[Default_Replacement_Type_ID] IS NOT NULL) OR (SRC.[Default_Replacement_Type_ID] IS NOT NULL AND TGT.[Default_Replacement_Type_ID] IS NULL)
OR SRC.[Replacement_Type_ID] <> TGT.[Replacement_Type_ID] OR (SRC.[Replacement_Type_ID] IS NULL AND TGT.[Replacement_Type_ID] IS NOT NULL) OR (SRC.[Replacement_Type_ID] IS NOT NULL AND TGT.[Replacement_Type_ID] IS NULL)
OR SRC.[Default_Replacement_Pattern_ID] <> TGT.[Default_Replacement_Pattern_ID] OR (SRC.[Default_Replacement_Pattern_ID] IS NULL AND TGT.[Default_Replacement_Pattern_ID] IS NOT NULL) OR (SRC.[Default_Replacement_Pattern_ID] IS NOT NULL AND TGT.[Default_Replacement_Pattern_ID] IS NULL)
OR SRC.[Replacement_Pattern_ID] <> TGT.[Replacement_Pattern_ID] OR (SRC.[Replacement_Pattern_ID] IS NULL AND TGT.[Replacement_Pattern_ID] IS NOT NULL) OR (SRC.[Replacement_Pattern_ID] IS NOT NULL AND TGT.[Replacement_Pattern_ID] IS NULL)
OR SRC.[Default_Replacement_Expected_Life_IN] <> TGT.[Default_Replacement_Expected_Life_IN] OR (SRC.[Default_Replacement_Expected_Life_IN] IS NULL AND TGT.[Default_Replacement_Expected_Life_IN] IS NOT NULL) OR (SRC.[Default_Replacement_Expected_Life_IN] IS NOT NULL AND TGT.[Default_Replacement_Expected_Life_IN] IS NULL)
OR SRC.[Replacement_Expected_Life_IN] <> TGT.[Replacement_Expected_Life_IN] OR (SRC.[Replacement_Expected_Life_IN] IS NULL AND TGT.[Replacement_Expected_Life_IN] IS NOT NULL) OR (SRC.[Replacement_Expected_Life_IN] IS NOT NULL AND TGT.[Replacement_Expected_Life_IN] IS NULL)
OR SRC.[Initial_Override_Replacement_Date_DT] <> TGT.[Initial_Override_Replacement_Date_DT] OR (SRC.[Initial_Override_Replacement_Date_DT] IS NULL AND TGT.[Initial_Override_Replacement_Date_DT] IS NOT NULL) OR (SRC.[Initial_Override_Replacement_Date_DT] IS NOT NULL AND TGT.[Initial_Override_Replacement_Date_DT] IS NULL)
OR SRC.[Subsequent_Override_Replacement_Date_DT] <> TGT.[Subsequent_Override_Replacement_Date_DT] OR (SRC.[Subsequent_Override_Replacement_Date_DT] IS NULL AND TGT.[Subsequent_Override_Replacement_Date_DT] IS NOT NULL) OR (SRC.[Subsequent_Override_Replacement_Date_DT] IS NOT NULL AND TGT.[Subsequent_Override_Replacement_Date_DT] IS NULL)
OR SRC.[Default_Replacement_Cost_DC] <> TGT.[Default_Replacement_Cost_DC] OR (SRC.[Default_Replacement_Cost_DC] IS NULL AND TGT.[Default_Replacement_Cost_DC] IS NOT NULL) OR (SRC.[Default_Replacement_Cost_DC] IS NOT NULL AND TGT.[Default_Replacement_Cost_DC] IS NULL)
OR SRC.[Replacement_Cost_DC] <> TGT.[Replacement_Cost_DC] OR (SRC.[Replacement_Cost_DC] IS NULL AND TGT.[Replacement_Cost_DC] IS NOT NULL) OR (SRC.[Replacement_Cost_DC] IS NOT NULL AND TGT.[Replacement_Cost_DC] IS NULL)
OR SRC.[Default_Replacement_Cost_AsAt_DT] <> TGT.[Default_Replacement_Cost_AsAt_DT] OR (SRC.[Default_Replacement_Cost_AsAt_DT] IS NULL AND TGT.[Default_Replacement_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Default_Replacement_Cost_AsAt_DT] IS NOT NULL AND TGT.[Default_Replacement_Cost_AsAt_DT] IS NULL)
OR SRC.[Replacement_Cost_AsAt_DT] <> TGT.[Replacement_Cost_AsAt_DT] OR (SRC.[Replacement_Cost_AsAt_DT] IS NULL AND TGT.[Replacement_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Replacement_Cost_AsAt_DT] IS NOT NULL AND TGT.[Replacement_Cost_AsAt_DT] IS NULL)
OR SRC.[Default_Cyclical_Cost_DC] <> TGT.[Default_Cyclical_Cost_DC] OR (SRC.[Default_Cyclical_Cost_DC] IS NULL AND TGT.[Default_Cyclical_Cost_DC] IS NOT NULL) OR (SRC.[Default_Cyclical_Cost_DC] IS NOT NULL AND TGT.[Default_Cyclical_Cost_DC] IS NULL)
OR SRC.[Cyclical_Cost_DC] <> TGT.[Cyclical_Cost_DC] OR (SRC.[Cyclical_Cost_DC] IS NULL AND TGT.[Cyclical_Cost_DC] IS NOT NULL) OR (SRC.[Cyclical_Cost_DC] IS NOT NULL AND TGT.[Cyclical_Cost_DC] IS NULL)
OR SRC.[Default_Cyclical_Cost_AsAt_DT] <> TGT.[Default_Cyclical_Cost_AsAt_DT] OR (SRC.[Default_Cyclical_Cost_AsAt_DT] IS NULL AND TGT.[Default_Cyclical_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Default_Cyclical_Cost_AsAt_DT] IS NOT NULL AND TGT.[Default_Cyclical_Cost_AsAt_DT] IS NULL)
OR SRC.[Cyclical_Cost_AsAt_DT] <> TGT.[Cyclical_Cost_AsAt_DT] OR (SRC.[Cyclical_Cost_AsAt_DT] IS NULL AND TGT.[Cyclical_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Cyclical_Cost_AsAt_DT] IS NOT NULL AND TGT.[Cyclical_Cost_AsAt_DT] IS NULL)
OR SRC.[Cyclical_Date_DT] <> TGT.[Cyclical_Date_DT] OR (SRC.[Cyclical_Date_DT] IS NULL AND TGT.[Cyclical_Date_DT] IS NOT NULL) OR (SRC.[Cyclical_Date_DT] IS NOT NULL AND TGT.[Cyclical_Date_DT] IS NULL)
OR SRC.[Override_Cyclical_Date_DT] <> TGT.[Override_Cyclical_Date_DT] OR (SRC.[Override_Cyclical_Date_DT] IS NULL AND TGT.[Override_Cyclical_Date_DT] IS NOT NULL) OR (SRC.[Override_Cyclical_Date_DT] IS NOT NULL AND TGT.[Override_Cyclical_Date_DT] IS NULL)
OR SRC.[Default_Cyclical_Period_IN] <> TGT.[Default_Cyclical_Period_IN] OR (SRC.[Default_Cyclical_Period_IN] IS NULL AND TGT.[Default_Cyclical_Period_IN] IS NOT NULL) OR (SRC.[Default_Cyclical_Period_IN] IS NOT NULL AND TGT.[Default_Cyclical_Period_IN] IS NULL)
OR SRC.[Cyclical_Period_IN] <> TGT.[Cyclical_Period_IN] OR (SRC.[Cyclical_Period_IN] IS NULL AND TGT.[Cyclical_Period_IN] IS NOT NULL) OR (SRC.[Cyclical_Period_IN] IS NOT NULL AND TGT.[Cyclical_Period_IN] IS NULL)
OR SRC.[Last_Cyclical_Cost_ID] <> TGT.[Last_Cyclical_Cost_ID] OR (SRC.[Last_Cyclical_Cost_ID] IS NULL AND TGT.[Last_Cyclical_Cost_ID] IS NOT NULL) OR (SRC.[Last_Cyclical_Cost_ID] IS NOT NULL AND TGT.[Last_Cyclical_Cost_ID] IS NULL)
OR SRC.[Last_Planned_Cost_ID] <> TGT.[Last_Planned_Cost_ID] OR (SRC.[Last_Planned_Cost_ID] IS NULL AND TGT.[Last_Planned_Cost_ID] IS NOT NULL) OR (SRC.[Last_Planned_Cost_ID] IS NOT NULL AND TGT.[Last_Planned_Cost_ID] IS NULL)
OR SRC.[User_ID] <> TGT.[User_ID] OR (SRC.[User_ID] IS NULL AND TGT.[User_ID] IS NOT NULL) OR (SRC.[User_ID] IS NOT NULL AND TGT.[User_ID] IS NULL)
OR SRC.[Created_Date_DT] <> TGT.[Created_Date_DT] OR (SRC.[Created_Date_DT] IS NULL AND TGT.[Created_Date_DT] IS NOT NULL) OR (SRC.[Created_Date_DT] IS NOT NULL AND TGT.[Created_Date_DT] IS NULL)
OR SRC.[Feature_ID] <> TGT.[Feature_ID] OR (SRC.[Feature_ID] IS NULL AND TGT.[Feature_ID] IS NOT NULL) OR (SRC.[Feature_ID] IS NOT NULL AND TGT.[Feature_ID] IS NULL)
OR SRC.[Damage_ID] <> TGT.[Damage_ID] OR (SRC.[Damage_ID] IS NULL AND TGT.[Damage_ID] IS NOT NULL) OR (SRC.[Damage_ID] IS NOT NULL AND TGT.[Damage_ID] IS NULL)
OR SRC.[Accessibility_ID] <> TGT.[Accessibility_ID] OR (SRC.[Accessibility_ID] IS NULL AND TGT.[Accessibility_ID] IS NOT NULL) OR (SRC.[Accessibility_ID] IS NOT NULL AND TGT.[Accessibility_ID] IS NULL)
OR SRC.[Access_Restrictions_VC] <> TGT.[Access_Restrictions_VC] OR (SRC.[Access_Restrictions_VC] IS NULL AND TGT.[Access_Restrictions_VC] IS NOT NULL) OR (SRC.[Access_Restrictions_VC] IS NOT NULL AND TGT.[Access_Restrictions_VC] IS NULL)
OR SRC.[Special_Instructions_VC] <> TGT.[Special_Instructions_VC] OR (SRC.[Special_Instructions_VC] IS NULL AND TGT.[Special_Instructions_VC] IS NOT NULL) OR (SRC.[Special_Instructions_VC] IS NOT NULL AND TGT.[Special_Instructions_VC] IS NULL)
OR SRC.[Default_Subsequent_Replacement_Cost_DC] <> TGT.[Default_Subsequent_Replacement_Cost_DC] OR (SRC.[Default_Subsequent_Replacement_Cost_DC] IS NULL AND TGT.[Default_Subsequent_Replacement_Cost_DC] IS NOT NULL) OR (SRC.[Default_Subsequent_Replacement_Cost_DC] IS NOT NULL AND TGT.[Default_Subsequent_Replacement_Cost_DC] IS NULL)
OR SRC.[Default_Subsequent_Replacement_Cost_AsAt_DT] <> TGT.[Default_Subsequent_Replacement_Cost_AsAt_DT] OR (SRC.[Default_Subsequent_Replacement_Cost_AsAt_DT] IS NULL AND TGT.[Default_Subsequent_Replacement_Cost_AsAt_DT] IS NOT NULL) OR (SRC.[Default_Subsequent_Replacement_Cost_AsAt_DT] IS NOT NULL AND TGT.[Default_Subsequent_Replacement_Cost_AsAt_DT] IS NULL)
OR SRC.[Max_Smooth_Date_DT] <> TGT.[Max_Smooth_Date_DT] OR (SRC.[Max_Smooth_Date_DT] IS NULL AND TGT.[Max_Smooth_Date_DT] IS NOT NULL) OR (SRC.[Max_Smooth_Date_DT] IS NOT NULL AND TGT.[Max_Smooth_Date_DT] IS NULL)
)
The business is unwilling to allow me to use a 3rd party SSIS plugins like Konesan's Hash task so I am looking for alternatives that mean we don't have to maintain a complex and labour intensive set of stored procedures. I have two broad strategies which I am considering.
I like 1 but I think it could be a real performance hog as for each row in the ETL process it is effectively going to have to rebuild the dynamic SQL but means the process is more robust because it is not sensitive to changes in schema for the source tables (They would however give a different HASH result so all would get updated on first delta)
We are currently running on-prem VMs on SQL 2017 for the OLTP and Azure hosted VMs on SQL 2016 for the DWH, however there is a strong possibility that the DWH will move to SQL Azure in the near future just in case that changes the proposed solution.
September 15, 2021 at 9:50 am
Not answering your question, but a quick note about your existing SQL, as it can be made considerably more concise, using the following structure:
SELECT *
FROM src
JOIN trg
ON src.Id = trg.Id
WHERE NOT EXISTS
(
SELECT src.col1, src.col2 INTERSECT SELECT trg.col1, trg.col2
);
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
September 15, 2021 at 2:41 pm
#1 is (way) too much overhead.
#2 is better. You want static code that is generated dynamically.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 27, 2021 at 11:12 pm
check the following link(s) if it helps:
1) https://michaeljswart.com/2012/05/t-sql-merge-statement-syntax-generator/
2) https://www.pigeonsql.com/single-post/Dynamic-SQL-Merge-Script-for-Generate-Merge-statement
CDC would have been better as there are only 4 operation code to find which rows are altered viz. Insert, Delete, Before Update and After Update. Definitely, each table needs a separate data flow in SSIS.
Good luck.
=======================================================================
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply