The MERGE statement compares source and target data, and then inserts into, updates, and deletes from the target table, all in a single statement. This statement was introduced in SQL Server 2008. I liked it, because it allows you to replace a set of multiple queries with just one single query. And while a statement with that many options necessarily has a more complex syntax, I still believe that, in most cases, a single MERGE statement is easier to read, write, and maintain, than a combination of at least an INSERT and an UPDATE, often a DELETE, and sometimes first a SELECT into a temporary table if the source is complex.
MERGE had issues
Sadly, it turned out that there were issues with the MERGE statement. Too many issues. Aaron Bertrand compiled a list, and that list is to this day often pointed to as a reason to not use MERGE. (Which, by the way, was not the conclusion Aaron himself wrote!)
Two years ago, Michael J Swart decided to take a new look at that list, to see how many of those issues were still relevant at that time. He found that several were fixed. He rejected some that were merely feedback on documentation. And he dismissed items that were not actually about MERGE. And then he still had quite a list. Enough for him to end hist post with the advice to better just avoid MERGE altogether.
Following a recent discussion with some MVPs, I decided to take a renewed look into this list. Partly because more bugs might have been fixed with SQL Server 2022 CU7. And partly because, just because someone (no matter how smart) says something, it isn’t automatically so. (And partly because I just wanted an excuse to spend a few hours diving into the details of execution plans – but don’t tell anyone that, or they might think I’m weird).
I looked at each of the issues that Michael mentions, and after that I will also address two more generic concerns from Aaron’s post, and one that was voiced by some of the people in the discussion we had.
Is MERGE safe now?
I found only one bug that I consider to be absolutely critical. And one that is serious. Is that bad? Yes, of course it is. But since the circumstances of those bugs are known, they can be avoided.
Based on my findings, I no longer support the “avoid MERGE” mantra that I, and many others, have been echoing the past years. My new generic helicopter view advice now is “avoid using the DELETE action in MERGE, and be careful when MERGE has a temporal table as its target”. (Not quite as catchy, I know – but far more accurate!)
Read on as I address each of the issues from Michael’s post (and a few others) in turn, and I hope you will agree.
Issue #1: Simple recovery
Summary: MERGE can fail with a fatal ‘assertion error’ message when run in a database in simple recovery. A workaround is to enable trace flag 692.
Verification: There are no repro steps given in the feedback item. The text includes a link to another item, but that link is broken and a keyword search for words found in that link comes up blank. I was not able to reproduce this, Based on the fact that I could not find other reports of this, I have to assume that this either only happens under very unusual circumstances, or that it is fixed but Microsoft forgot to update the feedback item.
Conclusion: This is either already fixed, or requires circumstances that are extremely rare; the effect is a failed statement but no incorrect results; and there is a known workaround. I do not agree with the suggestion to avoid using MERGE under simple recovery based on this report. Something about a child and bathwater applies here.
My verdict: Rare bugs such as these should of course be investigated and fixed by Microsoft, But they should not be used to discourage usage of a good feature.
Issue #2: CDC enabled tables
(Note that this item is not visible in Michael’s blog post, due to a missing > in the HTML. But you can see it if you right-click and select “View source”)
Summary: When CDC is enabled, rows updated by MERGE may be logged as delete / insert rather than before update / after update.
Verification: There are no repro steps given in the feedback item, but I was able to create a script (see here) that reliably demonstrates this issue. However, I could also show, in the same script, that the exact same thing happens when an UPDATE statement is used. And that the problem disappears, for both MERGE and UPDATE, when the estimated number of rows affected is reduced far enough that the optimizer chooses a narrow update plan instead of a wide plan
I have not searched in the documentation to see whether converting an update to a delete / insert in the CDC logging is documented as accepted behaviour or whether it is indeed a bug.
Conclusion: This issue may or may not be a bug, but it is definitely not specific to MERGE. Avoiding MERGE won’t help you. Rewrite to use UPDATE, and you still get the same effect.
My verdict: This issue affects all data modifications with a wide update plan. It should not be used to discourage usage of MERGE.
Issue #3: Memory optimized tables.
(Note that this issue is listed in Michael’s post with an incorrect link, due to an HTML error. The link below is the correct one)
Summary: MERGE does not accept memory-optimized tables as a target.
Verification: Very easy to verify. Just read the documentation. Here and here, Microsoft explicitly says that memory-optimized tables cannot be the target of a MERGE. And here, you can see that MERGE is not in the list of supported features for natively compiled stored procedures.
If you do try it anyway, you get a very clear, easy to understand error message, saying that the feature is not supported.
Conclusion: This is not a bug, but a feature request. An understandable one, and one I support, although the real issues with MERGE should be fixed first. However, I consider it disingenuous to include this documented limitation in a list of risks for using MERGE.
My verdict: A feature request should not be used to discourage usage of MERGE. That’s like advising people to avoid backup compression because it isn’t supported on Standard Edition, or to avoid Always Encrypted because it doesn’t support DISTINCT queries.
Issue #4: Fulltext indexes
Summary: When a MERGE statements updates a text column that is fulltext-indexed, and it is part of a partitioned table, and that table has a nonclustered primary key, then (and only then!) the fulltext index does not get updated.
Verification: The feedback item mentions that the bug was reproduced on three versions of SQL Server: 2012, 2014, and 2016. However, there are no feedback steps given, so I was unable to check whether it affects newer versions as well.
I have tried to reproduce this on SQL Server 2022, by creating a partitioned table with a nonclustered primary key and a full-text indexed column, and then modifying the data with a MERGE statement. I tried this both with and without also creating a clustered index on the table. In all cases, the results of a fulltext search changed to reflect the modification after a few seconds. And that few second delay is normal, since processing of changed in fulltext indexes is a background process.
Conclusion: Perhaps this issue has been fixed. Or perhaps there are even more criteria than the three listed above, so that the bug only surfaces under very rare circumstances. Either way, the blanket suggestion to avoid using MERGE when you have fulltext indexes goes way too far. Remember that at least the three criteria mentioned above must be met: fulltext indexes, partitioned table, and nonclustered primary key. So if you want to err on the side of caution, then the advice should be to make sure that your partitioned tables with fulltext indexes do not have a nonclustered primary key if you use them as the target of a MERGE. And even then, you are probably safe, since I was unable to reproduce the bug.
My verdict: The required circumstances for the bug are already a pretty rare combination, and we can be sure that either there are even more conditions before it kicks in, or it has just been fixed without marking the bug report as closed. I do not believe that this is a valid reason to argue against the use of MERGE. Most partitioned tables with a fulltext index will have a clustered primary key anyway, and if not, then the primary key can probably be changed to become the clustered index.
Issue #5: Columnstore indexes
Summary: When a table in tempdb that has a columnstore index is simultaneously the target of a MERGE and the insert-target of its OUTPUT clause, the statement may fail with an assertion error.
Verification: There is a full repro script in this forum post. Run it in tempdb, and it is immediately obvious that the bug does indeed still exist. However, run the same script in a user database, and it is just as obvious that it runs fine there, despite the weirdly odd syntax used in the MERGE statement.
Conclusion: If you really need to run a MERGE statement with an OUTPUT clause that inserts the results of that OUTPUT into the same table that is the target table, then either make sure the target table has no columnstore index, or run the code in any database other than tempdb.
My verdict: The issue is real, but it is an extreme fringe case that can be very easily worked around. I don’t consider this a good reason to advise discourage usage of MERGE (even with columnstore indexes).
Issue #6: Partitioned tables
Link to (old) Connect item. (Uses Wayback machine, so active links in the text don’t work)
Summary: When the target of MERGE is a partitioned table that has just one single partition, the statement may fail with an assertion error.
Verification: The active links in the Wayback machine don’t work and the original Connect site has long gone, so there is no way to access the details of the Connect item. Even if feedback steps were provided, we cannot access them.
I have not tried to reproduce this issue, since I don’t see it as a relevant issue anyway. The whole point of partitioned tables is to have more than one partition. If a table has just a single partition, then you should not define that table as partitioned.
(Note: When I discussed this with a few people before writing this blog, someone commented that single-partition partitioned tables may be required to enable quick data update through partition switching, using the ALTER TABLE … SWITCH syntax. However, this is not correct. ALTER TABLE … SWITCH works without issues regardless of whether source and/or target are partitioned or not.
Conclusion: I have tried to reproduce this on SQL Server 2022, by creating a partitioned table with just a single partition, and then modifying its data with a MERGE statement. The statement executed without assertion error, and the data was modified as expected. So it is likely that this bug has been fixed. Although there is of course the theoretic possibility that the repro required more conditioned to be ticked before the error happens.
But even in that case, the effect is a failed statement, but no incorrect results. There is also an easy workaround that I would recommend even without this issue: don’t use a partitioned table if you’re going to have just a single partition anyway. So once more, I cannot support Michael’s suggestion (avoid using MERGE with partitioned tables), since this bug is probably fixed, and when not, does not cause wring results, and has two easy workarounds: change the target table to be non-partitioned, or create an empty second partition.
My verdict: Rare bugs such as these should of course be investigated and fixed by Microsoft. Perhaps that has already happened. Either way, they should not be used to discourage usage of a good feature.
Issue #7: Unique (filtered) indexes
Summary: It seems that Michael misread this item when he compiled his list. He includes it as an issue with unique filtered indexes. However, the item refers to what it calls a similar issue with unique filtered indexes, that has since been fixed, but claims that they found a case where this issue happens with nonfiltered unique indexes as well.
(The link to Paul White’s post no longer works because Paul relocated his blog; the new location is here).
Verification: The feedback item does not include repro steps. My own attempts to reproduce this have so far failed.
Analysis: The fix for the original issue, with unique filtered indexes, was to completely disable the option for the optimizer to have a Clustered Index Merge or Table Merge operator target additional nonclustered indexes as well. This forces the optimizer to always use a wide update plan for the target of the MERGE, which avoids this issue.
Since the report explicitly mentions that this happens when the optimizer chooses a narrow instead of a wide update strategy, I assume that that fix solved this feedback item as well. This is further supported by the observation that there are no new reports of this happening.
Conclusion: Unless someone can provide me with a repro, I assume that this is fixed, but Microsoft forgot to mark the feedback item as fixed.
My verdict: This bug is fixed, since SQL Server 2019. It is not a reason to discourage usage of MERGE.
Issue #8: Temporal tables
Summary: When the target of a MERGE is a temporal table, and the associated history table has a nonclustered index defined on it, you may get error: “Attempting to set a non-NULL-able column’s value to NULL”.
Verification: The second feedback item includes a link to this forum discussion. All the way down, in the last comment (April 2, 2020), Erland Sommarskog provides a clean repro script that I used to demonstrate that this bug still exists in SQL Server 2022, CU7.
Analysis: While playing with Erland’s repro script, I noticed something interesting. If you remove (or don’t create) index pk_ix, the MERGE statements runs just fine. If you do that and request the execution plan plus run-time statistics, the top-left of the execution plan for the MERGE statement looks like this:
The Clustered Index Merge on the right targets the actual table, but the Clustered Index Insert targets the history table. Its task is to store old versions of rows that were updated by the MERGE statement. (See also here). But in this case, the actual table has just been created empty, so the MERGE only inserted new rows. And yet the data in the execution plan with run-time statistics shows that this Clustered Index Insert received and returned over a thousand rows.
Why were those rows not inserted? I have to be honest: I don’t know. What I assume is that the Clustered Index Insert has a property, in the internal representation of the execution plan, that makes it skip rows that represent inserts (ActionColumn1008 = 4), so that only old data for updated rows (ActionColumn1008 = 1) is inserted in the history table. But that property is not included in the execution plan XML, and hence also not exposed in any way in the graphical execution plan.
Going back to the original code, with the additional nonclustered index that makes the query fail, the top left of the execution plan (without runtime statistics this time, because the query fails) looks like this:
As you can see, the optimizer chose a wide update plan for the clustered and nonclustered indexes on the history table. And this fails. The in my opinion most likely explanation for this error is that the optimizer failed to add the same invisible filter on new rows to the (nonclustered) Index Insert that it did create on the Clustered Index Insert. But since that property is not exposed in the execution plan, not even in the XML, there is no way to verify this.
Conclusion: Temporal tables are an important feature in SQL Server. There is nothing in the documentation to suggest that MERGE should not be used on them. So I expect this to work. This issue is clearly a bug, and should in my opinion be fixed, and the fix should be backported to supported versions.
My verdict: Until this bug is fixed, MERGE can not safely be used when the target is a temporal table, unless you make sure to never create a nonclustered index on the history table.
Issue #9: DELETE operations on tables used in indexed views
Summary: When the target of a MERGE is used in an indexed view, and the MERGE includes a DELETE action, then the indexed view might not be updated to reflect the deleted rows.
Verification: The feedback item includes a link to this repro script. I copied it to test it on my local SQL Server 2022 instance, and I got the same incorrect results that are indicated in the script. This bug has not been fixed yet.
Analysis: The MERGE statement in the repro script combines an UPDATE and a DELETE action. When I modify the script to do only the DELETE action, the execution plan for the MERGE looks like this:
The relevant part of this execution plan is the second input to the Sequence operator. Where the top input determines results and updates the target table, this second input then reuses the same results (using a Table Spool) to change the materialized data in the indexed view as needed. The optimizer generates the execution plan this way because it recognizes that the MERGE has the potential to make modifications that affect the indexed view, and hence it ensures that in that case the corresponding changes are made. And indeed, the rest of the script shows that in this case, all results are correct.
Next, I modified the MERGE statement to only do the UPDATE action. Now we get this execution plan:
In this case, there are no actions to modify the data in the indexed view. And that makes sense. The indexed view only uses id and v, and the UPDATE action of the MERGE only affects column v2, so the data in the indexed view will never be affected. The optimizer recognizes this, and correctly avoids needless extra work to maintain an indexed view that cannot change as a result of this statement.
But the point of a MERGE statement is of course to combine these actions, as is the case in the original repro script. So what does the execution plan look like in this case?
And there we have the culprit! The optimizer fails to include the operators that are needed to maintain the data in the indexed view. And that is strange. As we saw when we used just the DELETE action, the optimizer does know that deleting rows from the table can affect data in the indexed view. But as soon as the same statement also has an UPDATE action, the optimizer seems to forget that.
I also tested some other modifications to the query, to see what happens when a MERGE combines only an INSERT and DELETE action, or when it combines INSERT, UPDATE, and DELETE actions. In all those cases, the operators to maintain the indexed view were present in the execution plan, and the indexed view was correctly maintained.
Conclusion: This is a very serious bug. When the bug actually happens, you will get no indication that the data in your database has become inconsistent. And it is a type of inconsistency that is NOT detected by running DBCC CHECKDB, so the corruption in your data could go unnoticed for days, weeks, months, or even forever. If you do notice the corruption, it is easy to fix it – just drop and recreated the index on the indexed view. But until the corruption is noticed, any query that uses the indexed view as one of its inputs might return incorrect results.
And as Michael states, just avoiding MERGE with a DELETE that targets a table that is used in an indexed view is not a safe enough strategy. Because next week, your co-worker might see a potential performance improvement or another reason to create a new indexed view, and if they are not aware that you have shipped code that uses MERGE with a DELETE action, they will create that indexed view.
My verdict: Until this bug is fixed, it is not safe to use MERGE with an UPDATE and a DELETE action, but no INSERT action. While there are no issues when the target table is not used in an indexed view, such a view might be created next week, making the code a potential hazard.
Issue #10: Concurrency issues
No feedback item found.
Summary: Under high concurrency, MERGE statements are subject to potential race conditions.
Verification: I found a script to reproduce the issue in this blog that Dan Guzman posted in 2009. Despite the age of that post, the repro script still works, and I was able to get the race condition by running the second version of his script from four windows (just as Dan had done), although it did not happen on every attempt. This means that the issue is still relevant today.
Analysis: It is not hard to understand the root cause of the race condition when we look at the execution plan for the MERGE statement:
The Clustered Index Seek on the right reads the existing row (if any) from dbo.Foo2. Because default locking options apply, it will use a shared lock that only exists for the duration of the read. As soon as the row is in working memory, ready to be returned to the parent operator, the lock is released. The rows is then passed through a few operators before the Clustered Index Merge on the right either updates or inserts it.
The race condition happens when two connections run the same code at almost the same time, when the row does not yet exist. The sequence of events is:
- Clustered Index Seek in connection 1 takes locks, attempts to read the row, finds nothing, then releases the locks.
- Clustered Index Seek in connection 2 takes locks, attempts to read the row, finds nothing, then releases the locks.
- Clustered Index Merge in connection 1 tries to insert the row and succeeds.
- Clustered Index Merge in connection 2 tries to insert the row and fails, with a duplicate key violation.
The kneejerk reaction to blame MERGE is a bad one in this case. If you tell people that this is an issue with MERGE, they will fix it by replacing the MERGE code with two separate statements, an UPDATE and an INSERT. They might be coded the way Dan Guzman did in this blog post from 2007 (with first an EXISTS test and then either an UPDATE or an INSERT). But as Dan himself already found in 2009, that version of the code is even more susceptible to race conditions! So hopefully the code will instead be rewritten without IF EXISTS, using just these two statements:
UPDATE dbo.Foo2 SET UpdateSpid = @@SPID, UpdateTime = SYSDATETIME () WHERE EXISTS (SELECT * FROM (SELECT @ID AS ID) AS f WHERE f.ID = Foo2.ID); INSERT dbo.Foo2 (ID, InsertSpid, InsertTime) SELECT new_foo.ID, @@SPID, SYSDATETIME () FROM (SELECT @ID AS ID) AS new_foo WHERE NOT EXISTS (SELECT * FROM dbo.Foo2 AS f WHERE f.ID = new_foo.ID);
But if you put those in the original stored procedure and then rerun the code to test for race conditions, you will, once more, get primary key violations. And no wonder. Because this is the execution plan for the INSERT statement:
Indeed. The same pattern where first (far right of the execution plan) a Clustered Index Seek attempts to read the row, and then, on the left, the row is inserted – this time with a Clustered Index Insert, but otherwise the plan is mostly the same. And has the same potential race issue, because the Clustered Index Seek releases its locks as soon as the row is read.
At this point, many will say: Ah yes, but you should of course have used the HOLDLOCK hint to ensure that the INSERT statement holds its locks until it is finished, instead of releasing them a bit too early. And yes, I agree. That is indeed the solution. But why apply that solution after needlessly ripping out MERGE and replacing it with more complex and slower code? Why not simply add the HOLDLOCK hint to the MERGE? That is an easier change that also completely eliminates the race condition here.
Conclusion: The potential for race conditions is real. But it is not specific to MERGE. The exact same potential exists on all data modification statements, because the execution plan will always first have to read existing data, then do some logic to determine new values, and then do the modification. When using default locking, that always leaves a small window where the row to be affected is unlocked and can hence be modified by concurrent transactions. Instead of blaspheming MERGE, which will cause developers to modify their code to something else that is at least as susceptible to the same race, we should make sure that everyone understands locking, and knows how to prevent race conditions on all data modifications.
My verdict: It feels unfair, to me, to blame MERGE for something that is not specific to MERGE at all, and that arguably is even worse in code that avoids using MERGE.
Issue #11: Potential issues with triggers
No feedback item found.
Summary: A MERGE statement can cause the same trigger to execute multiple times, and triggers might have an unexpected value in @@ROWCOUNT.
Verification: A very simple script to show the issue can be found here. This is still valid on SQL Server 2022.
However, this is not a bug, but by design. Everyone is of course free to disagree with these design choices, just as everyone is free to disagree with the choice to run triggers just once if an INSERT, UPDATE, or DELETE affects multiple rows, instead of running the trigger once per affected row as some other databases do. But disagreeing with a design choice is not an excuse for not learning how to work with it. We have (hopefully!) all learned to code triggers to handle all affected rows. It is similarly simply a requirement of the job to learn how MERGE interacts with triggers when we want to write T-SQL code.
Conclusion: Someone who does not know the product well enough might indeed write trigger code that doesn’t work as expected. But that has always been the case, as is clear from the many people who have, throughout the years, created triggers that only work correctly if just a single row is affected. I do not see this as an issue with MERGE, but instead as one of the things a developer needs to be taught before they write their first trigger.
My verdict: Bugs caused by not knowing how a feature works are in my eyes not the fault of that feature. If trained developers don’t know basic things about the functionality of the system they work on, than the training needs to be reconsidered. Avoiding MERGE because you don’t understand how it affects triggers is like always driving your car in second gear because you can’t be bothered to learn how to use the clutch.
Issue #12: Complex syntax
No feedback item found.
Summary: Some people complain that the syntax for MERGE is too complex, and they cannot remember it.
Analysis: The syntax that SQL Server uses is based on the syntax for MERGE as it is specified in the ANSI standard, with some added sugar to add more options than ANSI has defined. I personally am happy with that choice. Following standards is important.
Whether this is too complex is probably a personal matter. However, I am going to be very blunt now, if you do indeed consider the MERGE syntax too complex, then my response is “Well, boo-hoo-hoo. Cry some more. The CREATE DATABASE statement has more options and you managed to learn that as well. This is your job. If it would be easy, everyone would be doing it.”
My verdict: Sorry (okay, perhaps not even sorry). But I have problems even taking this issue serious. The first time I saw an SQL statement with a correlated subquery in the WHERE clause, I had no clue what I was looking at. The first time I saw a CROSS APPLY, I had to look several times before I understood it. I had to spend some time in the documentation to learn how a recursive CTE works, and even after more than 20 years of experience, I still have to look up what each value for the style parameter of CONVERT means.
So yes, you will need to invest some time before you can write a MERGE by head. And you might need to use the internet if you inherit code with a MERGE that uses options you have not seen before. Welcome to having a job.
I have looked at everything I could find as being alleged issues with MERGE. I found that some of them have been fixed, some of them are not really issues with MERGE, and some of them are so obscure that they should not be of concern.
But I did also find a few real issues. The most important one is that it is dangerous to use a MERGE with an UPDATE as well as a DELETE action. If an indexed view exists on the target table, or if one is created in the future, then this might cause inconsistencies. The other very real issue is that you can get breaking errors if the target of MERGE is a temporal table.
I will also mention the potential race condition. This is an issue that affects all data modification statements, not only MERGE. But it’s still worth calling out, because too many of us tend to forget to really consider whether their code works as expected under very high concurrency.
My final conclusion is different from what Aaron Bertrand and Michael J Swart have concluded in the past. I would not suggest to just avoid MERGE. I also don’t say it can just be blindly used.
Do not use MERGE with a DELETE action. Do not use MERGE to target a temporal table. Those two warnings are important, because those are the actual bugs that I could find. But if you do not target a temporal table and do not use the DELETE action, then, as far as I can tell, it is safe to use MERGE.