MERGE STATEMENT - what can be my biggest problem with it?

Question

MERGE STATEMENT - what can be my biggest problem with it?

VoldemarG

Hall of Fame

Points: 3883
More actions
September 29, 2019 at 12:26 am

#3684356

Why would I definitely not want to use MERGE statements - in what situations/scenarios? - and resort to old way of multiple updates/inserts.
I looked at a couple of sites like this: (and it looks vague... other than 'Merge statement can cause massive deadlocking' (but even then, when it may and when it may not cause deadlocking, how can i predict?..)
https://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/
What kind of worst issue have you experienced with using MERGE statement?
THANKS!
Likes to play Chess

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Steve Collins SSCrazy Points: 2157 More actions · Answer 1

Merge statements are good for "make these rows look like those rows" situations. There are plenty of other uses too. Unlike UPDATE, DELETE, etc. MERGE is not a single atomic operator. The three things I would say are, 1) always use SET XACT_ABORT ON, and 2) always use a common table expression to locate records in the target table (there's no WHERE clause in a merge statement), and 3) make sure if there are triggers on the target table they only execute if rows are actually inserted/deleted.

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

heb1014 Hall of Fame Points: 3790 More actions · Answer 2

The only problem I've had with merge is when more than 1 row in the source matches the target. Example:

I have a source queuing up records in stage as records are updated.
The source fills up stage faster than my process to load into the target such that an entity record has more than 1 row in stage.

If I issue a merge without handling the points mentioned, I get an error (below). The way I handle it is to use the ROW_NUMBER function. I partition on entity Id and sort by insert date. That way I always process the oldest row in stage for each entity record.

I hope that makes sense.

The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.

VoldemarG Hall of Fame Points: 3883 More actions · Answer 3

Thanks! So if there is more than 1 record matching in SOURCE then MERGE picks a random one to insert?

Do you use ROW_NUMBER inside of CTE in the Source?

Likes to play Chess

heb1014 Hall of Fame Points: 3790 More actions · Answer 4

If there are more than matches, SQL Server throws an error.

Here is what I do:

The rows load into stage. At this point I don't care how many rows an individual source record has. Each row represents a change to the source record.
In stage, there is a column named LoadGUID (uniqueidentifier). The column defaults to NULL.
When my loading process executes, the first step is to "grab" a batch of rows from stage by assigning a LoadGUID (code below).
Next, the LoadGUID is passed to the appropriated entity load procedure and the batch of rows is processed.

update a set a.LoadGUID = @LoadGUID
from (
select *, row_number() over (partition by CandidateId order by StageLoadDateTime asc) as RowNum
from Stage.DimCandidates 
where LoadGUID is null
) a
where a.RowNum = 1;

blakemcneill SSC Enthusiast Points: 162 More actions · Answer 5

It always made me nervous to use MERGE since MS put a disclaimer on the KB page for MERGE (https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017).

I work in a OLTP shop. We have had issues where someone creates a complex MERGE statement that runs hundreds of times per second and it caused a lot of blocking as well as consumes a large number of resources on the server.

While I like the idea of MERGE, it's still safer to use normal DML statements IMO.

Jonathan AC Roberts SSCoach Points: 18354 More actions · Answer 6

I've found performance to be the biggest problem with a MERGE. If you can replace a merge with an update followed by an insert it will give about 60% better performance. Microsoft even describe this on the BOL for MERGE: https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017

Performance Tip: The conditional behavior described for the MERGE statement works best when the two tables have a complex mixture of matching characteristics. For example, inserting a row if it doesn't exist, or updating a row if it matches. When simply updating one table based on the rows of another table, improve the performance and scalability with basic INSERT, UPDATE, and DELETE statements.