Scalar Functions
Introduction
Many operators can use scalar expressions in their properties. The best example of this is the Defined Values property of a Compute Scalar operator, but there are other properties and other operators where you can find expressions as well.
Those scalar expressions can use simple operands such as + or *, but in many cases they can also call various functions. Most of those functions are simply the same scalar functions that T-SQL supports; these are all described in Books Online.
Internal scalar functions
But the optimizer also has a set of scalar functions available that are not allowed within T-SQL code, but that it can use in properties of some operators. The table below lists all these functions, with a description of their input parameters and their result.
Function name | Parameters | Description |
---|---|---|
APPROX_COUNT_DISTINCT_CONVERT | expression | Decodes expression, which encodes multiple values into a single column, into its parts; then applies the “Count” logic of the hyperloglog algorithm. Note that expression is the output of the APPROX_COUNT_DISTINCT_ACCUM aggregate function. Introduced in SQL Server 2019. |
APPROX_PERCENTILE_CONT | first percentile expression | Decodes expression, which is the output of the APPROX_PERCENTILE_ACCUM aggregate function. (See also: KLL sketch). Adds all weights to find the number of elements, then finds the values of the elements in positions FLOOR (total weight * percentile) and CEILING(total weight * percentile). Returns the weighted average of those values, where the weight is based on the distance between (total weight * percentile) and their positions in the KLL sketch. The first parameter is an integer value; its function is currently unknown. Introduced in SQL Server 2022. |
APPROX_PERCENTILE_DISC | first percentile expression | Decodes expression, which is the output of the APPROX_PERCENTILE_ACCUM aggregate function. (See also: KLL sketch). Adds all weights to find the number of elements, then returns the value of the element in position CEILING(total weight * percentile). The first parameter is an integer value; its function is currently unknown. Introduced in SQL Server 2022. |
BmkToPage | RID_value | Evaluates the RID_value as a Row identifier (a combination of page number, file number, and slot number of a row in a heap – see also RID Lookup), and returns just the page number, as an integer. |
CONVERT_IMPLICIT | data_type expression third | Converts expression from its current data type to data_type. The third parameter is an integer value; its function is currently unknown; I suspect that it is similar to the style parameter of the CONVERT function. |
ConvertSVToXml | column_set second | Converts column_set from the internal storage format used for sparse columns to XML. The second parameter is an integer value; its function is currently unknown. |
ConvertXmlToRESV | xml second | Converts xml from XML to the internal storage format used for sparse columns. The second parameter is an integer value; its function is currently unknown. |
edge_id_from_parts | object_id graph_id | Returns the JSON representation of an edge in an edge table, as identified by object_id (the OBJECT_ID of the edge table) and graph_id (the internal graph_id of the edge). Introduced in SQL Server 2017. |
FETCH_RANGE | expression | In the Fetch Query subplan for a cursor, the FETCH_RANGE() function returns a 1-based counter to indicate which row from the cursor to retrieve, by position within the result set. The function of the expression parameter is unclear, it has so far only been observed to be 0. |
ForceOrder | expression | Unknown; probably used to ensure that expression is evaluated without reordering elements for performance gain. |
getancestor | node_id second | Used in XQuery evaluation to find the ancestor node of a node in an XML fragment. The node_id parameter represents a node in the internal node table of an XML fragment. The value returned is the node_id of the ancestor of node_id. The second parameter is an integer value; its function is currently unknown; I suspect that it is used to specify the number of levels to go up in the hierarchy of the XML fragment. |
getdescendantlimit | node_id | Used for XQuery evaluation to find the highest numbered descendant node of a node in an XML fragment. The node_id parameter represents a node in the internal node table of an XML fragment. The value returned is the highest numbered node_id that is a descendant of node_id. |
GetGIFromNID | nid_value second | Retrieves the name of a node from the nid value in the internal node table of an XML fragment. The second parameter is an integer value; its function is currently unknown |
getgraphidentity | object_id second third | Provides the next value to insert in the internal graph_id column of the node or edge table identified by the object_id parameter. The function of the second and third parameters is currently unknown. Introduced in SQL Server 2017. |
Getidentity | object id second third | Provides the next value to insert in the IDENTITY column of the table identified by the object_id parameter. The function of the second and third parameters is currently unknown. |
getparent | node_id | Used for XQuery evaluation to find the parent node of a node in an XML fragment. The node_id parameter represents a node in the internal node table of an XML fragment. The value returned is the node_id of the parent of node_id. |
GetRangeThroughConvert (not properly displayed in Management Studio) | begin_expression end_expression seek_flags | This function is not properly displayed in Management Studio. When the Defined Values property includes a GetRangeThroughConvert function, it shows as empty. This function converts a range specified in one data type to a range in another data type; this is typically done to allow the use of an Index Seek operator when the data types of the indexed column and the value to find do not match. The seek_flags parameter is a set of flags, encoded in an integer value, to fine-tune the exact behavior of the seek operator. See Dynamic seek range for more information. GetRangeThroughConvert returns a vector, a set of three values. Two of these are the lower and upper bound of the range in the target data type; the third appears to be always equal to the seek_flags parameter. |
GetRangeWithMismatchedTypes (not properly displayed in Management Studio) | begin_expression end_expression seek_flags | This function is not properly displayed in Management Studio. When the Defined Values property includes a GetRangeWithMismatchedTypes function, it shows as empty. This function converts a range specified in one data type to a range in another data type; this is typically done to allow the use of an Index Seek operator when the data types of the indexed column and the value to find do not match. The seek_flags parameter is a set of flags, encoded in an integer value, to fine-tune the exact behavior of the seek operator. See Dynamic seek range for more information. GetRangeWithMismatchedTypes returns a vector, a set of three values. Two of these are the lower and upper bound of the range in the target data type; the third appears to be always equal to the seek_flags parameter. |
GetSparseColumnValue | column_set column_no | Extracts the value of a single sparse column from column_set, which is stored in the internal storage used for sparse columns. The column_no parameter is the column_id of the column to extract (as stored in sys.columns). |
GetURIFromNID | nid_value second | Retrieves the full URI (namespace plus node name_ of a node from the nid value in the internal node table of an XML fragment. The second parameter is an integer value; its function is currently unknown |
graph_id_from_edge_id | edge_id | Extracts the internal graph_id from the edge_id, where edge_id is a JSON representation of an edge in an edge table. Introduced in SQL Server 2017. |
graph_id_from_internal_graph_data | graph_data | Extracts the internal graph_id from graph_data, which represents a single node or edge in internal graph data format. The internal graph data format combines object_id (an int) and graph_id (a bigint) in a single value; the representation and data type of this internal format are unknown. Introduced in SQL Server 2017. |
graph_id_from_node_id | node_id | Extracts the internal graph_id from the node_id, where node_id is a JSON representation of a node in a node table. Introduced in SQL Server 2017. |
internal_graph_data_from_node_id | node_id | Converts node_id from its JSON representation to the internal graph data format. The internal graph data format combines object_id (an int) and graph_id (a bigint) in a single value; the representation and data type of this internal format are unknown. Introduced in SQL Server 2017. |
IsFalseOrNull | expression | Returns true when expression evaluates to either false or unknown, and returns false when expression evaluates to true. As far as currently known, this scalar function can only be used in Predicate properties. |
LikeRangeEnd | expression | Returns the high end of a range that encloses all values that can match a LIKE expression predicate, so that matching rows can be found using an Index Seek or other seek operator. |
LikeRangeInfo | expression | Returns a set of flags, encoded in an integer value, to define certain seek properties for an Index Seek or other seek operator to find all rows in a range defined by a LIKE expression predicate. See Dynamic seek range for more information. |
LikeRangeStart | expression | Returns the low end of a range that encloses all values that can match a LIKE expression predicate, so that matching rows can be found using an Index Seek or other seek operator. |
manufactureconstructordpath | first node_id third | Used in XQuery evaluation; it is currently not known what exactly this function does; the return value appears to be a node_id (an identifier of a node in an internal node table of an XML fragment). The first parameter is a binary value; its function is currently unknown. I suspect that it holds binary encoded flags to control the behavior of the function. The node_id parameter represents a node in the internal node table of an XML fragment. The third parameter is an integer value; its function is currently unknown. |
manufacturesubnodeordpath | node_id_1 node_id_2 node_id_3 | Used in XQuery evaluation; it is currently not known what exactly this function does; the data is returns is typically passed to a UDX operator. The three parameters node_id_1, node_id_2, and node_id_3 each represent a node in the internal node table of an XML fragment. |
node_id_from_parts | object_id graph_id | Returns the JSON representation of a node in a node table, as identified by object_id (the OBJECT_ID of the node table) and graph_id (the internal graph_id of the node). Introduced in SQL Server 2017. |
object_id_from_edge_id | edge_id | Extracts the object_id of an edge table from the edge_id, where edge_id is a JSON representation of an edge in that edge table. Introduced in SQL Server 2017. |
object_id_from_internal_graph_data | graph_data | Extracts the object_id of a node or edge table from graph_data, which represents a single node or edge in internal graph data format. The internal graph data format combines object_id (an int) and graph_id (a bigint) in a single value; the representation and data type of this internal format are unknown. Introduced in SQL Server 2017. |
object_id_from_node_id | node_id | Extracts the object_id of a node table from the node_id, where node_id is a JSON representation of an node in that node table. Introduced in SQL Server 2017. |
PROBE | opt_bitmap expression | This function computes the hash of expression and then checks whether the bit for that hash is set or reset in opt_bitmap (which is a reference to a bitmap as created elsewhere in the execution plan, by a Bitmap, Batch Hash Table Build, Hash Match, or Adaptive Join operator). As far as currently known, this scalar function can only be used in Predicate properties. |
RaiseIfNullInsert | expression | This function is used in insert operators to force an error is a NULL is inserted in a column that doesn’t allow NULL. If expression is NULL, the function raises a run-time error that aborts execution and rolls back the transaction. If expression is not NULL, then the function returns expression, unchanged and in its own data type. |
RaiseIfNullUpdate | expression | This function is used in update operators to force an error is a column that doesn’t allow NULL would be updated to the NULL value. If expression is NULL, the function raises a run-time error that aborts execution and rolls back the transaction. If expression is not NULL, then the function returns expression, unchanged and in its own data type. |
RangePartitionNew | expression type boundary_1 boundary_2 … boundary_n | This function converts expression to a partition number. Where the T-SQL function $PARTITION expects to be given a partition function as one of its inputs, this scalar function has all the relevant data in its parameters. The type parameter determines whether a value that is exactly equal to a boundary value belongs in the partition to its left (type = 0) or to its right (type = 1). All remaining parameters are the boundary values for each partition. |
remapbits | expression position_1 position_2 … position_n | Returns an integer that is built from the specified bits taken from expression. So the right-most bit (bit 0) would be copied from the bit in position_1 of expression, the next bit from the bit in position_2, and so no. If not all 32 bits are specified, then any remaining bits are set to 0. |
showplanxmldecompress | plan second | The plan parameter is the internal representation of an execution plan, such as for instance found in the internal system table sys.plan_persist_plan, but converted to the image (!) data type. It returns the XML representation of that execution plan, in a thus far unknown data type. The function of the second parameter is not known at this time, I have so far only observed it to be 0. |
sysmaxdatetime | len | The highest supported datetime2 value, based on len decimal places for the seconds part. The data type is always datetime2(7). So sysmaxdatetime(3) would be 9999-12-31 23:59:59.9990000. |
systrandatetime | The date and time the current transaction (which may be an implicit transaction) started, in UTC, returned as datetime2(7). |
Not properly displayed
In the table above, a few functions are marked as “(not properly displayed in Management Studio)”. These functions have (so far) only been seen in the Defined Values property of a Compute Scalar operator. When this happens, the function can be seen from the actual XML of the execution plan, but Management Studio is unable to properly display it. In most cases, it responds by simply ignoring the function and pretending it doesn’t exist.
Other tools may have their own limitations on which functions they can and cannot display.
Missing any function?
There is no official documentation (that I know of) of these internal scalar functions. The list above has been compiled based on what I have observed in execution plans that I have seen. It is very likely that this list is not complete yet.
You can help me complete this list! If you encounter an execution plan where Compute Scalar uses a keyword not listed here, please save it as a .sqlplan file and send it to me.
Thanks!