This article is focused on beginners who have already started writing SQL queries and are now diving deep towards more data processing and complex queries in SQL. When we talk about data processing, an important concept that comes to our mind is performing ETL workloads to a data warehouse. ETL is a very complex topic and is out of the scope of this article, however, some of the activities in ETL can be achieved using the MERGE statement in SQL. Let us walk through it and learn how to use it in real life.
The MERGE statement in SQL is a special type of query in SQL Server that is capable to handle inserts, updates, and deletes at the same time. It is sometimes necessary during developing ETL programs that we need to check for the data that already exists in the database or not and then perform an update or insert based on the result returned. In such cases, you can leverage the MERGE statement as it is easier to write the code and also provides enhanced readability.
The above figure illustrates how the MERGE statement works. It is basically a comparison of set operation performed between the source and the destination tables. The MERGE statement can perform the above three checks.
- Not matched by Source - Records found in the destination but not in source.
- Matched Rows - Records found in both source and destination.
- Not matched by Destination - Records found in the source but not in the destination.
Hands-On Demonstration
I think doing some hands-on work with the MERGE statement will explain how it actually works and you will be able to do it yourself. I will create some basic tables and insert data into those. We will try to move data from the source tables and target tables using the MERGE statement.
CREATE TABLE SourceTable( ProductIDINT, ProductNameVARCHAR(50), PriceDECIMAL(9,2) ) GO INSERT INTO SourceTable(ProductID,ProductName, Price) VALUES(1,'Car',100) INSERT INTO SourceTable(ProductID,ProductName, Price) VALUES(2,'Pen',80) INSERT INTO SourceTable(ProductID,ProductName, Price) VALUES(3,'Mobile',50) INSERT INTO SourceTable(ProductID,ProductName, Price) VALUES(4,'Laptop',300) GO CREATE TABLE TargetTable( ProductIDINT, ProductNameVARCHAR(50), PriceDECIMAL(9,2) ) GO INSERT INTO TargetTable(ProductID,ProductName, Price) VALUES(1,'Car',100) INSERT INTO TargetTable(ProductID,ProductName, Price) VALUES(2,'Pen',180) INSERT INTO TargetTable(ProductID,ProductName, Price) VALUES(5,'Washing Machine',50) INSERT INTO TargetTable(ProductID,ProductName, Price) VALUES(6,'Keyboard',300) GO SELECT * FROM SourceTable SELECT * FROM TargetTable
We can see the results from this code below.
Now, we have inserted data into the source and target tables, so let us write the MERGE query to synchronize the data between both the tables.
MERGE TargetTable AS Target USING SourceTableAS Source ON Source.ProductID = Target.ProductID WHEN NOT MATCHED BY Target THEN INSERT (ProductID,ProductName, Price) VALUES (Source.ProductID,Source.ProductName, Source.Price);
It is evident from the below figure that upon executing the MERGE statement, the records with ProductID 3 and 4 have been inserted into the target table since they were not present. We have leveraged the condition when records are not matched by the target to perform an insert.
Now, let us extend the above functionality and write our code to perform updates on the target table when the records match between both the source and the target tables.
MERGE TargetTable AS Target USING SourceTableAS Source ON Source.ProductID = Target.ProductID -- For Inserts WHEN NOT MATCHED BY Target THEN INSERT (ProductID,ProductName, Price) VALUES (Source.ProductID,Source.ProductName, Source.Price) -- For Updates WHEN MATCHED THEN UPDATE SET Target.ProductName= Source.ProductName, Target.Price= Source.Price;
In the image below, we see the record with ProductID 2, which had a price of 180.00, has been updated to 80.00. This value is obtained from the source table and performed as the ProductID between both the tables match with each other.
Finally, let us also include the code to perform actions when records are not matched by the source table.
MERGE TargetTable AS Target USING SourceTableAS Source ON Source.ProductID = Target.ProductID -- For Inserts WHEN NOT MATCHED BY Target THEN INSERT (ProductID,ProductName, Price) VALUES (Source.ProductID,Source.ProductName, Source.Price) -- For Updates WHEN MATCHED THEN UPDATE SET Target.ProductName= Source.ProductName, Target.Price= Source.Price -- For Deletes WHEN NOT MATCHED BY Source THEN DELETE;
I think with this explanation, you have a good bit of understanding about how the SQL MERGE statement works and get started writing your own queries with it.
Next Steps
- Read more about the SQL MERGE statement from the official documentation.