October 28, 2013 at 10:06 am
Gatekeeper (10/28/2013)
Good write-up. I used to love MERGE when it was first introduced. I was bit on a few bugs but worked through them. However, Andy wrote a nice write-up describing open issues with MERGE. I've known/ran into a few of them on his list which has further pushed me away from implementing any new MERGE code.
I've never taken the time to quantify all the open Connect items on MERGE so thanks for the link to Aaron's good article on the subject.
Even before seeing Aaron's list of faults, I had decided that MERGE wasn't worth the possible headaches especially with what it does in triggers. It just doesn't take that long to write a 3 part "merge" or a 2 part "upsert" and since I like to test each part, it's not likely that I'll ever use MERGE in SQL Server. That's not a recommendation to you folks. That's just my personal opinion of MERGE.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 28, 2013 at 6:16 pm
Phil Parkin (10/27/2013)
Thanks for the concise article Dwain - an interesting read as usual.
Hey Phil! Good to hear from you and thanks for the compliment.
Haven't seen you out and about on the forums much of late. Glad to see you are still with us.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 28, 2013 at 6:17 pm
twin.devil (10/28/2013)
dwain.c (10/27/2013)
guy.stephens (10/27/2013)
Much as we love the Merge statement in my workplace, we seem to be running into deadlocking problems that we haven't had when using the UPDATE/INSERT .. LEFT JOIN approach.We haven't investigated too deeply but are tending back to UPDATE/INSERT on large datasets. Would be glad to hear from anyone else who has experienced this.
I personally have not experienced this issue, however with a 28% performance hit it wouldn't surprise me.
I would say that if you are really expecting to INSERT/UPDATE 100s of 1000s of rows, it might be a good idea to stick to the traditional.
I guess the key words here are "it depends."
+1
As for the article it is a nice one and good for someone to understand the working of MERGE statement, and especially for the decide whether to with MERGE or Not.
Thanks for the +1. I would say that it is important to remember it is just a guideline that may not be applicable to all cases and situations. Test, test and test again, to be sure where your case or situation falls within the narrow guidelines that I ran across.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 28, 2013 at 6:19 pm
andrew.fenna (10/28/2013)
This can be wrote as:--- MERGE
MERGE #Target t
USING (@ID, @Value
) s (ID, Value)
ON s.ID = t.ID
WHEN MATCHED
THEN UPDATE
SET Value = s.Value
WHEN NOT MATCHED
THEN INSERT
( ID, Value )
VALUES
( s.ID
, s.Value
);
Andrew - I think it would be problematic to draw any conclusions off of a MERGE constructed this way as at most 1 row is in the source. Hard to garner any performance statistics without a lot of rows being affected.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 28, 2013 at 6:22 pm
Mark-101232 (10/28/2013)
Interesting article!Paul White posted some useful information about MERGE performance
Mark - Thanks for the highly relevant thread. Paul's analysis is clearly informative, as his analyses usually are.
I'm surprised I missed it since it came out only a few months ago, considering how much I troll the forums (does that make me a forum troll? :-P), but I think it may have come out when I was tied up with a project in Papua New Guinea.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 28, 2013 at 6:25 pm
tim_edwards (10/28/2013)
Great article!Anyone else getting a "The resource could not be found" error when trying to download the Test Harness files?
Just in case you or anyone are having this issue, I've attached the .sql files again here.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 28, 2013 at 6:29 pm
asdf-146317 (10/28/2013)
Usually merge is modifying a tiny portion of the target rows. I do not think your test case is a valid real-world example since it would only happen one time. Subsequent merges on changes would be a more accurate test.I ran your test harness, only moving the location of the source and target tables outside of tempdb, and was not able to reproduce your results. Merge performed better in a normally logged database.
In response to yours and an earlier post, I'll try running my test harness using permanent user tables. I will also try it on SQL 2008, whereas the original tests were run on SQL 2012.
In the meantime, would you mind posting your results in a comparable format to what I did, and also tell us what your testing machine specs were? It might be interesting to see if there is some commonality to the cases where performance is a wash.
This is something I'll try to do tonight (in about 12 hours or so), so please bear with me.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 28, 2013 at 8:09 pm
Had some deadlocks and massive performance hits with large datasets that involved varchar(MAX) fields..
Added a plan guide with OPTION (MERGE JOIN) and the performance went back to good..
Plan is to change them over to IUD's in the near future, but I still like merge in many situations.
October 28, 2013 at 8:12 pm
A question about your results...
What were the hardware specs of the system you ran the tests on? What kind of CPU? How much RAM? Size, speed and number of HDDs in the disk sub-system/RAID?
Some of my customers have database servers that are processor and memory heavy, but are under-powered in the disk sub-system. I'd like to have an idea how my programmer switching to use MERGE would affect them. A lot of those customers unfortunately don't have a DBAs that I can coordinate tests with or allow remote access into their systems in order for me to perform benchmarks tests either.
EDIT:
Nevermind. I just saw the line at the bottom of the article that shows the hardware specs used. Thanks for such an informative article!
October 29, 2013 at 2:48 am
Nice article Dwain, thanks.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
October 29, 2013 at 3:12 am
Very good article and nicely explained analysis.
October 29, 2013 at 3:33 am
Koen and sqlnaive - Thank you sirs for the feedback.
Now on to rerunning my test harness with a couple of variations.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 29, 2013 at 5:29 am
It seems that there is quite a bit more variation than I thought. I have 2 machines available for testing:
1. The Lenovo reported at the end of the article, which has installed SQL 2012 and SQL 2008R2 (both 64 bit versions).
2. A Dell Inspiron Core i5 2.4 GHz with 8GB of memory running Windows 7 (64 bit) and SQL 2008R2 (64 bit).
Focusing on comparing MERGE to the UPDATE/INSERT case and ignoring the others.
Using the original Lenovo machine (4GB memory)
And non-temporary tables on SQL 2012:
CPU Reads Writes Duration
MERGE 4618 4515337 5226 6772
INSERT/UPDATE 3541 3075083 7019 3956
MERGE 4633 4515221 5214 5117
INSERT/UPDATE 3994 3075081 7007 4726
MERGE 4587 4515222 4949 6627
INSERT/UPDATE 3760 3075081 7010 4119
MERGE 4664 4515221 5212 5384
INSERT/UPDATE 3650 3075081 7009 4040
MERGE 4493 4515220 5213 5025
INSERT/UPDATE 3573 3075081 7008 3851
Timings show that MERGE consumed 24% more CPU and 40% more elapsed time.
Using temporary tables on SQL 2008R2.
CPU Reads Writes Duration
MERGE 5757 4525802 5193 10041
INSERT/UPDATE 4009 2922120 4404 4401
MERGE 5757 4525760 5173 6206
INSERT/UPDATE 4009 2922120 4405 4430
MERGE 5647 4525752 5202 6218
INSERT/UPDATE 4103 2922124 4404 4523
MERGE 5601 4525757 5136 6292
INSERT/UPDATE 4040 2922120 4404 4320
MERGE 5803 4525755 5162 6149
INSERT/UPDATE 3853 2922121 4404 4607
Timings show that MERGE consumed 43% more CPU and 57% more elapsed time.
Using non-temporary tables on SQL 2008R2.
CPU Reads Writes Duration
MERGE 6271 4527210 2855 6995
INSERT/UPDATE 4867 2928630 4408 5497
MERGE 6271 4529746 2610 6961
INSERT/UPDATE 4899 2931221 4396 5601
MERGE 6302 4529750 3157 7421
INSERT/UPDATE 4914 2931220 4406 5601
MERGE 6303 4529750 2029 6792
INSERT/UPDATE 5023 3087491 5814 5603
MERGE 6334 4529750 2603 6782
INSERT/UPDATE 4992 3087517 7016 7647
Timings show that MERGE consumed 27% more CPU and 17% more elapsed time.
Using the Dell machine (8GB memory)
Using temporary tables on SQL 2008R2.
CPU Reads Writes Duration
MERGE 4664 4028115 4620 5708
INSERT/UPDATE 3994 2922546 4008 5499
MERGE 4929 4028112 4857 6022
INSERT/UPDATE 3978 2922545 3990 5318
MERGE 4805 4029027 3800 6371
INSERT/UPDATE 4555 3080953 3991 6696
MERGE 4836 4028115 4769 5790
INSERT/UPDATE 4587 3080958 3966 6579
MERGE 5101 4028114 4686 6411
INSERT/UPDATE 4197 2922545 3993 6085
Timings show that MERGE consumed 13% more CPU and elapsed time was about the same.
Using non-temporary tables on SQL 2008R2.
CPU Reads Writes Duration
MERGE 6739 4029813 2955 14291
INSERT/UPDATE 5944 2926033 5648 23672
MERGE 4820 4029885 2902 9253
INSERT/UPDATE 4150 2926035 4393 15135
MERGE 4836 4029881 2633 11900
INSERT/UPDATE 4212 2926024 4398 12546
MERGE 4821 4029882 2669 5795
INSERT/UPDATE 4119 2926024 4398 14905
MERGE 4961 4029886 2613 5733
INSERT/UPDATE 4274 2926021 4404 5869
Timings show that MERGE consumed 15% more CPU but elapsed time was about 35% improved (although with a significant amount of variation).
Sorry if I mislead anybody but it appears MERGE may not be so bad if you aren't too memory constrained. It seems what I said is true particularly when using MERGE, which is to do some testing under your particular environmental conditions before drawing any serious conclusions.
Edit: Added the attachment, which is Test Harness #1 revised to use non-temporary tables.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 29, 2013 at 6:31 am
adnan.korkmaz (10/28/2013)
Nice comparison.On the other hand, it could be better to see physical table performance comparison instead of / in addition to temporary tables.
There's unlikely to be much difference.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
October 29, 2013 at 6:39 am
Thanks Dwain for another informative and well-written article. It wouldn't surprise me if the discussion provoked a significant rewrite with more use cases, based on posts so far.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
Viewing 15 posts - 16 through 30 (of 53 total)
You must be logged in to reply to this topic. Login to reply