Recently Microsoft introduced a new verision of SQL Server that included the MERGE DML statement, which is easy to use, and performance-wise, very useful. Using this command we can replace multiple DML statements with one statement (Ex: INSERT, UPDATE, or DELETE all in to one)
The situations where this is most often required are?
- Data synchronizing
- Data validation
- Data transformation (DTS - extract, transform, load)
- When Performing INSERT, UPDATE, or DELETE operations on a target table based on the results of a source table
Let's say we have a source object and a destination object. And we need to match all destination data to source object's data. What can we do? There are a few choices
The first is that we can delete all from the destination object and insert all from source object(Seems to be this is the easy way, but it's not best programming practice.) A few things to consider with this approach.
- You will need to face foreign key relationship problems if destination object has these set relationships.Example of you many have two types of customers it's old customer details and new customer details and you are going to delete old customers and insert new ones but if you have relationship with some other table (Ex: Invoice) you can't directly delete old customers and insert new ones.
- Sometimes you may delete millions of records and insert the same amounts of rows; it must be very costly operation because it does involve huge CPU, memory ,more time and after finish the operation you have to reorganize or recreate customer indexes because this operation affect the file fragmentation. Because you may have to perfume DELETE and INSERT both commands.,
- If the source and destination objects are already the same, no need to do any transactions Let's say you have same set of data in two different table its old customer details and new customer details then you may not need to do anything but if you delete and insert above data it's unnecessarily work .
- After verifying source and destination, just need to perform one single insert, update or delete.Example of you may have only one customer details need to change. And other records are totally same.in that case if you delete all and insert them again, you are using database and server resources (CPU, memory) unnecessarily. And it's huge performance issue.
We could implement MERGE DML statements, is the best way to do it. The diagram below shows how this might work.
Merge gives you some additional benefits.
- In earlier versions(SQL SREVRE 2000 and 2005) there was no need of performing separate single SQL batch wise INSERTs, UPDATEs, or DELETEs. Example of, you are writing procedure,
IF (SOURCE RECORDS EXIST IN DESTINATION)
UPDATE
ELSE IF (SOURCE RECORDS NOT EXIST IN DESTINATION)
INSERT
ELSE IF (DESTINATION RECORDS NOT EXIST IN SOURCE)
DELETE
For applying above method you have to validate records and need to execute three separate INSERT, UPDATE or DELETE Statements. But if you apply merge you can execute all above three statements at once.And you can reduce complexity of your code or procedure. - After performing MERGE statements, matching rows are updated and if the rows does not exist, they are inserted and else, rows in destination will be deleted and data is touched only once apart from executing three single statements.
- After finishing the MERGE operation we can monitor updated, deleted or inserted rows.As Example 2, with using "OUTPUT $ACTION" you can get deleted rows and inserted rows separately and it's some kind of idea what does really happened our destination table.
Following example 1 will shows you the basic picture of MERGE dml operation. You can assume "main table" is the source and "audit table" is the destination. Acceding to temporary Data you can see matching rows in main table and audit table (item 2 and 3) are updated and if the row does not exist in audit table(item 1 and 4), they are inserted and else, rows in audit table(item 5 and 6) will be deleted.And easily customize MERGE Syntax your self.
Example 1:
USE AdventureWorks2008
GO
--main table
CREATE TABLE tblItemMaster(itemNumber int, itemDesc varchar(25));
GO
--audit table
CREATE TABLE tblItemMasterAudit(itemNumber int, itemDesc varchar(25));
GO
--insert data to main table
INSERT INTO tblItemMaster VALUES(1,'Item A');
INSERT INTO tblItemMaster VALUES(2,'Item B');
INSERT INTO tblItemMaster VALUES(3,'Item C');
INSERT INTO tblItemMaster VALUES(4,'Item D');
GO
--insert data to audit table
INSERT INTO tblItemMasterAudit VALUES(5,'Item need to delete');
INSERT INTO tblItemMasterAudit VALUES(6,'Item need to delete II');
INSERT INTO tblItemMasterAudit VALUES(2,'Item need to update');
INSERT INTO tblItemMasterAudit VALUES(3,'Item need to update II');
GO
--after implement MERGE function
--insert,update and delete
MERGE tblItemMasterAudit AS A
USING tblItemMaster AS I
ON (A.itemNumber = I.itemNumber)
WHEN MATCHED
THEN UPDATE SET A.itemDesc = I.itemDesc
WHEN NOT MATCHED
THEN INSERT VALUES(I.itemNumber, I.itemDesc)
WHEN SOURCE NOT MATCHED
THEN DELETE;
GO
SELECT * FROM tblItemMaster;
SELECT * FROM tblItemMasterAudit;
This wills also same example like Example 1 but additional we were combined OUTPUT clause to MERGE Command.Iff we add the OUTPUT clause, we return a row for every row in destination that is updated, deleted, or inserted. Each column shows the number of item that was inserted or deleted.And another good thing is this, you and INSERT all OUTPUT clauses returns records to physical or temp table and check it one by one manually. That's why this is good method to audit data( as Example 2) because you can group them it using action_desc column and it will returns no of rows deleted, inserted or updated.
Example 2:
USE AdventureWorks2008
GO
--after implement MERGE function with output
--insert,update and delete
MERGE tblItemMasterAudit AS A
USING tblItemMaster AS I
ON (A.itemNumber = I.itemNumber)
WHEN MATCHED
THEN UPDATE SET A.itemDesc = I.itemDesc
WHEN NOT MATCHED
THEN INSERT VALUES(I.itemNumber, I.itemDesc)
WHEN SOURCE NOT MATCHED
THEN DELETE
OUTPUT $ACTION AS [ACTION], Inserted.itemNumber AS insertitemNumber,
Inserted.itemDesc AS insertitemDesc,
deleted.itemNumber AS deleteitemNumber,
deleted.itemDesc AS deleteitemDesc;
GO --after implement merge function with output assign to temp table
--insert,update and delete
--create table variable
DECLARE @outPutTableVar TABLE
(
action_desc varchar(50),
insertitemNumber int,
insertitemDesc varchar(100),
deleteitemNumber int,
deleteitemDesc varchar(100)
); MERGE tblItemMasterAudit AS A
USING tblItemMaster AS I
ON (A.itemNumber = I.itemNumber)
WHEN MATCHED
THEN UPDATE SET A.itemDesc = I.itemDesc
WHEN NOT MATCHED
THEN INSERT VALUES(I.itemNumber, I.itemDesc)
WHEN SOURCE NOT MATCHED
THEN DELETE
OUTPUT $ACTION AS [ACTION] , Inserted.itemNumber AS insertitemNumber,
Inserted.itemDesc AS insertitemDesc,
deleted.itemNumber AS deleteitemNumber,
deleted.itemDesc AS deleteitemDesc INTO @outPutTableVar;
--get records to table variable
SELECT * FROM @outPutTableVar;
GO
Let's say some situation you may need to do only "UPDATE, INSERT" and no need to delete any records as Example 3. And this logic also you can apply to MERGE command.same as Example 4, 5 will explain "UPDATE, DELETES" and "INSERT, DELETE" statements combined with MERGE command. Finely you can see the virtual result of the destination table (Example 3 Picture)
Example 3:
- with only update and insert
--update,insert MERGE tblItemMasterAudit AS A USING tblItemMaster AS I ON (A.itemNumber = I.itemNumber) WHEN MATCHED THEN UPDATE SET A.itemDesc = I.itemDesc WHEN NOT MATCHED THEN INSERT VALUES(I.itemNumber, I.itemDesc); GO SELECT * FROM tblItemMaster; SELECT * FROM tblItemMasterAudit; GO
- with only update and delete
Example 4:
--update,delete MERGE tblItemMasterAudit AS A USING tblItemMaster AS I ON (A.itemNumber = I.itemNumber) WHEN MATCHED THEN UPDATE SET A.itemDesc = I.itemDesc WHEN SOURCE NOT MATCHED THEN DELETE; GO SELECT * FROM tblItemMaster; SELECT * FROM tblItemMasterAudit; GO
- with only insert and delete
Example 5:
--insert,delete MERGE tblItemMasterAudit AS A USING tblItemMaster AS I ON (A.itemNumber = I.itemNumber) WHEN NOT MATCHED THEN INSERT VALUES(I.itemNumber, I.itemDesc) WHEN SOURCE NOT MATCHED THEN DELETE; GO SELECT * FROM tblItemMaster; SELECT * FROM tblItemMasterAudit; GO
Following example 6 will shows you how to apply one single UPDATE statement combined with MERGE command. And advantages of this additionally you may not need to check "if records exist then update else end" like process. Same as example 7, 8 explain single transaction of INSERT and DELETE combined with MERGE command,
- with only update
Example 6:
--only update MERGE tblItemMasterAudit AS A USING tblItemMaster AS I ON (A.itemNumber = I.itemNumber) WHEN MATCHED THEN UPDATE SET A.itemDesc = I.itemDesc; GO SELECT * FROM tblItemMaster; SELECT * FROM tblItemMasterAudit; GO
- with only insert
Example 7:
--only insert MERGE tblItemMasterAudit AS A USING tblItemMaster AS I ON (A.itemNumber = I.itemNumber) WHEN NOT MATCHED THEN INSERT VALUES(I.itemNumber, I.itemDesc); GO SELECT * FROM tblItemMaster; SELECT * FROM tblItemMasterAudit; GO
- with only delete
Example 8:
--only delete MERGE tblItemMasterAudit AS A USING tblItemMaster AS I ON (A.itemNumber = I.itemNumber) WHEN SOURCE NOT MATCHED THEN DELETE; GO SELECT * FROM tblItemMaster; SELECT * FROM tblItemMasterAudit; GO
MERGE dml statement capitalized with TOP clause & AND operator,
- with TOP clause
Example 9:
--With using TOP clause MERGE TOP (2) tblItemMasterAudit AS A USING tblItemMaster AS I ON (A.itemNumber = I.itemNumber) WHEN MATCHED THEN UPDATE SET A.itemDesc = I.itemDesc WHEN NOT MATCHED THEN INSERT VALUES(I.itemNumber, I.itemDesc) WHEN SOURCE NOT MATCHED THEN DELETE; GO SELECT * FROM tblItemMaster; SELECT * FROM tblItemMasterAudit;
- with AND operator
Example 10:
--With using AND operator MERGE tblItemMasterAudit AS A USING tblItemMaster AS I ON (A.itemNumber = I.itemNumber) WHEN MATCHED THEN UPDATE SET A.itemDesc = I.itemDesc WHEN NOT MATCHED THEN INSERT VALUES(I.itemNumber, I.itemDesc) WHEN SOURCE NOT MATCHED AND (A.itemNumber>5) THEN DELETE; GO SELECT * FROM tblItemMaster; SELECT * FROM tblItemMasterAudit;
Limitations
- MERGE DML operation only implements with single target table or view.Example of you can't insert, delete or update multiple tables using MERGE statement at once.
- Source table or MERGE source must has to be without duplicated records.
Conclusion
Microsoft done huge investment to T-SQL in SQL SERVER 2008 as result of that they introduce MERGE DML Statement. Using this we can combine INSERT, UPDATE, or DELETE operations on a destination table Based on results of a source table. After finishing the execution, we can audit all the transactions completed on the rows as above Example 2 with using OUTPUT clause. And if you are writing huge procedure including lots of IF ELSE statements and validating always RECORDS EXIST OR NOT then this is the time change it and Apply MERGE command.