RID Lookup
Introduction
The RID Lookup operator offers the same logical functionality within the execution plan as the Key Lookup operator. But where Key Lookup is used for tables that have a clustered index, RID Lookup is instead used when a table is “heap” (table without clustered index). It is used when another operator (usually an Index Seek, sometimes an Index Scan, rarely a combination of two or more of these or other operators) is used to find rows that need to be processed, but the index used does not include all columns needed for the query. The RID Lookup operator is then used to fetch the remaining columns from the heap structure where the table data is stored.
A RID Lookup operator will always be found on the inner input of a Nested Loops operator. It will be executed once for each row found. Unlike a Key Lookup, the RID lookup does not search for an index value; the search value passed in will be a so-called “row identifier” (abbreviated RID), consisting of file number, page number within the file, and slot number within the page, which allows for direct access to the row. However, if the has ever been relocated within the heap (due to an update where the new data took more space then was available on the page), then the indicated RID will lead to a forwarding pointer, which then reveals the current RID of the actual row.
Visual appearance in execution plans
Depending on the tool being used, a RID Lookup operator is displayed in a graphical execution plan as shown below:
SSMS and ADS |
Legacy SSMS |
Plan Explorer |
Paste The Plan |
The icon design is a bit strange, especially when compared to the icon design of the Key Lookup operator, and the design of other operators that depict either an action on a heap, or an action on a clustered index. My theory is that the icons for Key Lookup and RID Lookup were accidentally switched when they were introduced in SQL Server 2005, and then Microsoft didn’t want to change it later for fear of causing even more confusion.
Algorithm
The basic algorithm for the RID Lookup operator is as shown below:
Note that, depending on the transaction isolation level that is in effect for the table being read, locks may be taken and released as rows are accessed. This is part of the responsibility of the storage engine, and a detailed discussion of this functionality is beyond the scope of this website.
Locate Row
The RID Lookup operator, like the Key Lookup operator, uses a Seek Keys specification within a Seek Predicates property to locate the row. The name of this property is not totally accurate, since the RID Lookup operator doesn’t perform an actual seek operation. But the property does have the same logical function in RID Lookup as it does in Key Lookup, which might explain the name choice.
The Seek Keys specification of a RID Lookup operator is always an equality specification on the Row identifier, which is passed in (through a Nested Loops operator) from an Index Seek or Index Scan on a nonclustered index. This is represented as a column in the Output List property of those operators, but since it is an internal column, it does not have a normal column name. Instead, SQL Server assigns their name. The assigned name consists of the mnemonic prefix “Bmk” (for “bookmark”, because this operator was called a “bookmark lookup” before SQL Server 2005), followed by a unique 4-digit number.
Forwarded rows
Due to how a heap works, the storage location indicated by the Seek Keys specification can either be the row itself, or a forwarding pointer (which has he exact same structure as a Row identifier: file number, page number, and slot number) to the location where the row itself is stored. In the latter case, a second read operation is needed to retrieve the actual row.
Heaps do not use chains of forwarded row. If a slot contains a forwarding pointer, then that pointer always locates the row itself.
Row not found
Because the Row identifier passed into the RID Lookup operator has just been retrieved from a nonclustered index on the table, the row should normally always exist. However, under read uncommitted isolation level, this is not guaranteed; there are scenarios where one transaction might access a nonclustered index and read the row identifier from a row that is about to be deleted by another transaction, and then when the RID Lookup tries to read the data from the heap it has been deleted there.
In the flowchart I have indicated that in this case no row is returned, but that is an assumption. It is also possible that this might result in a runtime error, in which case the flowchart would be incorrect.
Predicate
The RID Lookup operator supports the optional Predicate property, just as the Key Lookup does. A Predicate property will typically be used in a RID Lookup when a query has a filter that can be effectively used by a nonclustered index, but also has filters on columns not included in that nonclustered index.
The operator uses the Row identifier in the Seek Keys specification to find the indicated row, reads it, but then first checks the Predicate property to determine whether to return the row, or return an end of data signal.
Operator properties
The properties below are specific to the RID Lookup operator, or have a specific meaning when appearing on it. For all other properties, see Common properties. Properties that are included on the Common properties page but are also included below for their specific meaning for the RID Lookup operator are marked with a *.
Property name | Description |
---|---|
Defined Values * | For a RID Lookup, this property lists the columns read from the index and returned to the calling operator. It is therefore the same as the Output List property. |
Estimated Number of Rows * | Normally this is an estimation of how many rows the operator will return per execution. Has to be multiplied to Estimated Number of Executions to get the estimated total number of rows (which can then be compared to the Actual Number of Rows). Determined during query optimization. Renamed in Management Studio 18.5 to Estimated Number of Rows Per Execution. Specifically for the RID Lookup operator, if a Predicate property is present then this number represents the estimated total of rows for all executions instead of the estimated number of rows per execution. |
Estimated Number of Rows for All Executions * | This property does not actually exist in the execution plan. It was introduced in Management Studio 18.5, as the result of multiplying Estimated Number of Rows Per Execution and Estimated Number of Executions; the result can easily be directly compared to the Actual Number of Rows for All Executions. Specifically for the RID Lookup operator, if a Predicate property is present then the Estimated Number of Rows Per Execution, in spite of its name, actually represents the estimated total of rows for all executions instead of the estimated number of rows per execution. This property will then still be computed by multiplying that number by the Estimate Number of Executions, resulting in a useless number. |
Estimated Number of Rows Per Execution * | Normally this is an estimation of how many rows the operator will return per execution. Has to be multiplied to Estimated Number of Executions to get the estimated total number of rows (which can then be compared to the Actual Number of Rows for All Executions). Determined during query optimization. In Management Studio 18.4 and older and in Azure Data Studio, this property is called Estimated Number of Rows. Specifically for the RID Lookup operator, if a Predicate property is present then, despite the name, this number actually really represents the estimated total of rows for all executions instead of the estimated number of rows per execution. |
Forced Index | This property is always false for a RID Lookup operator. |
ForceScan | This property is set to true if the query used a FORCESCAN hint to force the use of a scan operator even if the optimizer would rather use a seek operator. On a RID operator, this is always false. |
ForceSeek | This property is set to true if the query used a FORCESEEK hint to force the use of a seek operator even if the optimizer would rather use a scan operator. On a RID operator, this is always false. |
Lookup | Always set to true. |
NoExpandHint | This property is set to true if the NOEXPAND hint was used in the query to force the optimizer to use indexes on an indexed view. |
Number of Rows Read | This is the number of rows that was read by the operator as it was navigating the index. In a parallel execution plan, this property shows a breakdown of rows on each individual thread. The difference between this property and the Actual Number of Rows property represents the number of rows that was read but not returned due to the Predicate property. This property is not included for a Key Lookup on a columnstore index. Only available in execution plan plus run-time statistics. When the number of rows read is equal to zero, this property is omitted. |
Object | This property lists the table that the RID Lookup reads from, using three part naming (database, schema, table) and optionally followed by a table alias. |
Ordered | This property is always equal to True for a RID Lookup. |
Predicate | When present, this property defines a logical test to be applied to all rows that are read by the RID Lookup. Only rows for which the predicate evaluates to True are returned. When possible, the RID Lookup operator will push this predicate into the storage engine to prevent extra roundtrips between the operator and the storage engine. Note that a Predicate on a RID Lookup does not reduce the amount of rows that is touched by the operator. The difference between the Actual Number of Rows and the Number of Rows Read property (or their estimated counterparts) shows how many rows were read but not returned. This can be used to gauge how useful an index would be that better supports the search predicates. Also note that when a Predicate property is used, the Estimated Number of Rows, Estimated Number of Rows Per Execution, and Estimated Number of Rows for All Executions properties will be wrong, as detailed above. |
Scan Direction | This property is always set to FORWARD on a RID Lookup. It is not used. |
Seek Predicates | This property lists the Seek Keys property that is used to specify the row identifier, consisting of file number, page number within the file, and slot number within the page, of the row to be read. See the main text for more information. |
Storage | This property determines the type of index being navigated. (Note that this could also be determined by using the DMVs based on the Object property). For a RID Lookup, this can only be RowStore. |
Table Cardinality | This property shows the number of rows in the index’s table when the plan was last compiled. |
Implicit properties
This table below lists the behavior of the implicit properties for the RID Lookup operator.
Property name | Description |
---|---|
Batch Mode enabled | The RID Lookup operator supports row mode execution only. |
Blocking | The RID Lookup operator is non-blocking. |
Memory requirement | The RID Lookup operator does not have any special memory requirement. |
Order-preserving | The RID Lookup operator is fully order-preserving, since it can only return a single row per execution. |
Parallelism aware | The RID Lookup operator is not parallelism aware. |
Segment aware | The RID Lookup operator is not segment aware. |