Clustered Index Update
The Clustered Index Update operator is typically used to update rows in a clustered index, but can also insert and delete rows in that same index. Optionally (especially in a narrow update plan) a Clustered Index Update can update one or more nonclustered indexes as well. The rows to be affected, as well as the required action and the new values of all columns in the index, can be produced by the operator’s child subtree, or can be defined in the operator’s properties. After updating a row, the operator returns data from that row to its parent; this data can be both the original or the new data.
The Clustered Index Update operator is in most ways identical in functionality to the (nonclustered) Index Update operator. One key difference, though, is that a Clustered Index Update can optionally update data in more than one index, when multiple objects are specified in its Object property. Even though in that case the operator targets both clustered and nonclustered indexes, it is still represented as a Clustered Index Update in the execution plan.
A Clustered Index Update operator that targets a clustered columnstore index and no additional nonclustered indexes is represented in SSMS (and most other tools) as a Columnstore Index Update operator in the graphical execution plan. To match the representation of the execution plan that most people use, I’ll focus the descriptions below on how Clustered Index Update operates when the Storage property is ‘RowStore’, and call out relevant differences on the separate Columnstore Index Update page.
Visual appearance in execution plans
Depending on the tool being used, a Clustered Index Update operator is displayed in a graphical execution plan as shown below:
|SQL Server Management Studio||Azure Data Studio||Plan Explorer|
|(version 17.4 and up)||(until version 17.3)|
The basic algorithm for the Clustered Index Update operator is as shown below:
Read first / next row
The Clustered Index Update operator has two different methods of reading the rows to be modified.
If the operator has a child operator and the Seek Predicate property is absent, then rows to be modified are found by executing GetNext() on the child operator. The returned row will at least contain all the columns of the target clustered index (including the uniquifier if any). Those key columns can be used to identify the row in the clustered index, read it (using the same logic as a Clustered Index Seek in singleton seek mode) if needed to find key values of other nonclustered indexes to modify and of columns in the Output List property, and then update it.
If the operator has no child operator and the Seek Predicate property is provided, then that Seek Predicate property is used to read all rows to be modified, using the same logic as the Clustered Index Seek operator (as singleton seek if the Seek Predicate is guaranteed to identify at most one row; or as range seek otherwise).
Before modifying data, the Clustered Index Update operator will always acquire the necessary locks (or verify that it already has them). Which locks exactly are taken depends on the isolation level of the transaction, as well as the action column (if any). It is never possible to completely eliminate all locking on data modifications.
This operator does not release the locks it takes, nor does any other operator in the same execution plan. Locks for data modifications are always held until the end of the transaction.
A full discussion of locking strategies used for the various transaction isolation levels is beyond the scope of this website.
The Clustered Index Update operator may optionally have its Action Column property set. If so, then this lists the name of one of the columns in the input data that will be interpreted as specifying, per individual row, what type of modification is needed (insert, update, or delete).
The known values and their meaning for the action column (for any operator, not just for Clustered Index Update) are:
- 1: Update
- 3: Delete
- 4: Insert
It is currently unknown whether there are more possible values.
If the Action Column property is not provided, then all input rows will be processed as updates to an existing row.
As far as currently known, it is not possible for a Clustered Index Update operator to have both its Action Column property provided and also have more than one index listed in the Object property.
When the action column indicates that a row needs to be inserted, the Clustered Index Update operator uses the Predicate property to find the values for all columns in that row. The Clustered Index Update operator does not actively check any constraints. Primary keys and unique constraints are implicitly checked because the storage engine throws an error as soon as data is written to a unique index that would cause a duplicate entry. For other constraints, the optimizer adds additional operators to gather relevant data, then checks the validity of the data in an Assert operator.
When the action column is absent or indicates that a row needs to be updated, the Clustered Index Update operator uses the values provided for the index key columns in the Predicate property to locate the row. If the target index is not declared as unique, then the clustered index key columns are considered part of the index key for this index. The remaining columns in the Predicate property then provide the new values to assign to the other columns in that index row. This means that the Clustered Index Update always does an update in place; the row will never move to another location in the index because the index columns are not changed. If a query can change the indexed columns of a clustered index, then the optimizer uses a Split operator to convert these updates to equivalent insert and delete operations (optionally combined with a Collapse operator for better performance).
The optimizer ensures that foreign key and check constraints are checked by other operators in the execution plan. An Assert operator is typically used to force a rollback of the operation if a constraint would be violated.
When the action column indicates that a row needs to be deleted, the Clustered Index Update operator uses the values provided for the index key columns in the Predicate property to locate the row, and then deletes it from the index. If the target index is not declared as unique, then the uniqueifier column is considered part of the index key for this index. The data for non-key columns in the Predicate property is ignored for rows that have to be deleted. No constraints are checked by the operator; the optimizer ensures that foreign key and check constraints are checked by other operators in the execution plan. An Assert operator is typically used to force a rollback of the operation if a constraint would be violated.
The Object property of the Clustered Index Update can hold one or more values. The first is always a clustered index. If there are more, then the rest are all nonclustered indexes on the same table, and rows modified in the clustered index are “simultaneously” modified in those nonclustered indexes as well. This is typical for a narrow update plan.
As explained above, the key values of for the extra indexes in the Object property do not need to be in the input. The Clustered Index Update operator can fetch these from the data in the clustered index when it reads the row to be modifed.
As far as currently known, the Object property can only hold more than one value if the Action Column property is not included.
Like any operator, Clustered Index Update is invoked by calling its GetNext() method. And like any other operator, it responds by returning a row, with the columns listed in the Output List property. These columns can be values from the input data, new values of the columns in the clustered index, or old (before update) values of those columns. The latter can be recognized in the execution plan by their name: the same name as the column, with “_OLD” appended.
The output rows can be used by parent operators for additional logic, such as constraint checking or propagating the changes to other indexes, indexed views, etc.
If the parent operator does not need any data, the Output List property is empty and an empty row is returned for each row deleted.
The properties below are specific to the Clustered Index Update operator, or have a specific meaning when appearing on it. For all other properties, see Common properties. Properties that are included on the Common properties page but are also included below for their specific meaning for the Clustered Index Update operator are marked with a *.
|ActionColumn||When this property is present, it lists the name of the column from the input that determines for each input row whether it updates index data in place, inserts new data in the index, or deletes an existing row from the index.
When this property is missing, all input rows are considered to be an update in place.
As far as currently known, this property can’t be included if the Object property includes more than one value.
|DMLRequestSort||When set to true, the update operation might qualify for minimal logging if additional conditions apply. Details of those additional conditions and the requirements for DMLRequestSort to be true can be found here.|
|Object||The first object listed is the clustered index that the Clustered Index Update operator will modify, using four part naming (database, schema, table, index). Optionally, one or more additional values may be supplied; these are nonclustered indexes on the same table. They, too, are specified using four part naming.
The subproperties of the Object property represent the four name parts separately, but also include two additional properties:
|Partitioned||When this property is present and set to true, the target of the Clustered Index Update is a partitioned index.|
|Predicate||Maps columns from the input stream to the columns in the index indicated by the Object property, or sets these columns to hardcoded values or variables from the query text.
For input rows that are to be processed as a delete (as determined by the data in the column listed in the Action Column property), only the columns that are part of the index key are used. (Plus the uniqueifier column if the index is not declared as unique).
|Seek Predicate||A collection object that contains a single Seek Keys element, used to navigate the index and locate the rows to be updated. A full description of the Seek Keys element can be found on the Index Seek page.
If the Seek Predicate property is present, the Clustered Index Update will always be a leaf node in the execution plan. If it is not, then the operator will always have a child operator.
|WithOrderedPrefetch||This property is present (and set to true) when prefetching is requested. This requires the input data to be ordered in the order of the index. Exact details of prefetching in the context of a Clustered Index Update operator are unknown at this time.
The assumed effect of prefetching is that the operator issues an asynchronous read request to the storage system, so that read-ahead reading can be used to more efficiently find the rows to be deleted. Rows are still processed in the order in which the operator reads them, but they will already be in the buffer pool as a result of the read ahead prefetching.
When this property is not present, no prefetching is used.
This table below lists the behavior of the implicit properties for the Clustered Index Update operator.
|Batch Mode enabled||The Clustered Index Update operator supports row mode execution only.|
|Blocking||The Clustered Index Update operator is non-blocking.|
|Memory requirement||The Clustered Index Update operator does not have any special memory requirement.|
|Order-preserving||The Clustered Index Update operator is fully order-preserving.|
|Parallelism aware||The Clustered Index Update operator does not support parallelism. It can only be used in a serial plan, or in a serial section of a parallel plan.|
|Segment aware||The Clustered Index Update operator is not segment aware.|