Clustered Index Merge
The Clustered Index Merge operator is used when rows in a clustered index need to be changed, and the required changes can be a combination of inserted, updates, and deletes. Unlike the specialized operators (Clustered Index Insert, Clustered Index Update, and Clustered Index Delete) that can simultaneously modify data in one or more nonclustered indexes as well, the Clustered Index Merge can only target a single index, which has to be the clustered index. The rows to be affected, as well as the required action and the new values of all columns in the index, are produced by the operator’s child subtree. After performing a modification, the operator returns data from that row to its parent; this data can be both the original or the new data.
The Clustered Index Merge operator is almost identical in functionality to the Clustered Index Update operator when it has its Action Column property set. One key difference, though, is that a Clustered Index Insert can optionally merge data into more than one index, when multiple objects are specified in its Object property. Even though in that case the operator targets both clustered and nonclustered indexes, it is still represented as a Clustered Index Merge in the execution plan. Since SQL Server 2016 SP2-CU5, SQL Server 2017 RTM-CU13, and SQL Server 2019, the optimizer no longer generates plans that include more than one object in the Object property.
A Clustered Index Merge operator that targets a clustered columnstore index is represented in SSMS (and most other tools) as a Columnstore Index Merge operator in the graphical execution plan. To match the representation of the execution plan that most people use, I’ll focus the descriptions below on how Columnstore Index Merge operates when the Storage property is ‘RowStore’, and call out relevant differences on the separate Columnstore Index Merge page.
Visual appearance in execution plans
Depending on the tool being used, a Clustered Index Merge operator is displayed in a graphical execution plan as shown below:
SSMS and ADS
Paste The Plan
The basic algorithm for the Clustered Index Merge operator is as shown below:
Before modifying data, the Clustered Index Merge operator will always acquire the necessary locks (or verify that it already has them). Which locks exactly are taken depends on the isolation level of the transaction, as well as the action column (if any). It is never possible to completely eliminate all locking on data modifications.
This operator does not release the locks it takes, nor does any other operator in the same execution plan. Locks for data modifications are always held until the end of the transaction.
A full discussion of locking strategies used for the various transaction isolation levels is beyond the scope of this website.
The Clustered Index Merge operator uses the input columns listed in its Action Column property to specify, per individual row, what type of modification is needed (insert, update, or delete).
The known values and their meaning for the action column (for any operator, not just for Clustered Index Merge) are:
- 1: Update
- 3: Delete
- 4: Insert
It is currently unknown whether there are more possible values.
When the action column indicates that a row needs to be inserted, the Clustered Index Merge operator uses the Predicate property to find the values for all columns in that row. The Clustered Index Merge operator does not actively check any constraints. Primary keys and unique constraints are implicitly checked because the storage engine throws an error as soon as data is written to a unique index that would cause a duplicate entry. For other constraints, the optimizer adds additional operators to gather relevant data, then checks the validity of the data in an Assert operator.
When the action column indicates that a row needs to be updated, the Clustered Index Merge operator uses the values provided for the index key columns in the input to locate the row. If then uses the Predicate property to find the new values to assign to the data in that row. The Clustered Index Merge always does an update in place; the row will never move to another location in the index because the index columns are not changed. If a query can change the indexed columns of a clustered index, then the optimizer uses a Split operator to convert these updates to equivalent insert and delete operations (optionally combined with a Collapse operator for better performance).
The optimizer ensures that foreign key and check constraints are checked by other operators in the execution plan. An Assert operator is typically used to force a rollback of the operation if a constraint would be violated.
When the action column indicates that a row needs to be deleted, the Clustered Index Merge operator uses the values provided for the index key columns in the input to locate the row, and then deletes it from the index. The Predicate property is ignored for rows that have to be deleted. No constraints are checked by the operator; the optimizer ensures that foreign key and check constraints are checked by other operators in the execution plan. An Assert operator is typically used to force a rollback of the operation if a constraint would be violated.
In older versions of SQL Server (see above for the exact list), the Object property of the Clustered Index Merge could hold one or more values. The first was always a clustered index. If there were more, then the rest were all nonclustered indexes on the same table, and rows modified in the clustered index were “simultaneously” modified in those nonclustered indexes as well. This is typical for a narrow update plan.
As explained above, the key values of for the extra indexes in the Object property do not need to be in the input. The Clustered Index Merge operator can fetch these from the data in the clustered index when it reads the row to be modified.
The option to update multiple nonclustered indexes in addition to the clustered index in the same Clustered Index Merge was presumably removed in order to fix a bug. It is unknown whether this bug fix only prevents the optimizer from producing such plans, or whether the functionality to process multiple objects has actually been removed from the operator’s code.
Like any operator, Clustered Index Merge is invoked by calling its GetNext() method. And like any other operator, it responds by returning a row, with the columns listed in the Output List property. These columns can be values from the input data, new values of the columns in the clustered index, or old (before update) values of those columns. The latter can be recognized in the execution plan by their name: the same name as the column, with “_OLD” appended.
The output rows can be used by parent operators for additional logic, such as constraint checking or propagating the changes to other indexes, indexed views, etc.
If the parent operator does not need any data, the Output List property is empty and an empty row is returned for each row deleted.
The properties below are specific to the Clustered Index Merge, or have a specific meaning when appearing on it. For all other properties, see Common properties. Properties that are included on the Common properties page but are also included below for their specific meaning for the Clustered Index Merge operator are marked with a *.
|ActionColumn||This property lists the name of the column from the input that determines for each input row whether it updates index data in place, inserts new data in the index, or deletes an existing row from the index.|
|DMLRequestSort||When set to true, the modification might qualify for minimal logging if additional conditions apply. Details of those additional conditions and the requirements for DMLRequestSort to be true can be found here.|
|Object||This first object listed is the clustered index that the Clustered Index Merge operator will modify, using four part naming (database, schema, table, index). On older versions of SQL Server (see main text), one or more additional values may optionally be supplied; these are nonclustered indexes on the same table. They, too, are specified using four part naming.
The subproperties of the Object property represent the four name parts separately, but also include two additional properties:
|Partitioned||When this property is present and set to true, the target of the Clustered Index Merge is a partitioned index.|
|Predicate||Maps columns from the input stream to the columns in the index indicated by the Object property, or sets these columns to hardcoded values or variables from the query text.
For input rows that are to be processed as a delete (as determined by the data in the column listed in the Action Column property), only the columns that are part of the index key are used. (Plus the uniqueifier column if the index is not declared as unique).
This table below lists the behavior of the implicit properties for the Clustered Index Merge operator.
|Batch Mode enabled||The Clustered Index Merge operator supports row mode execution only.|
|Blocking||The Clustered Index Merge operator is non-blocking.|
|Memory requirement||The Clustered Index Merge operator does not have any special memory requirement.|
|Order-preserving||The Clustered Index Merge operator is fully order-preserving.|
|Parallelism aware||The Clustered Index Merge operator does not support parallelism. It can only be used in a serial plan, or in a serial section of a parallel plan.|
|Segment aware||The Clustered Index Merge operator is not segment aware.|