Remote Scan
Introduction
The Remote Scan operator reads data from an ‘object’ on a remote server. This means it first reads the first row (for whatever definition of ‘first’ is valid for the object), then the next row on the second call, until the last row, or until the parent operator stops asking for more rows.
Cases where Remote Scan are used are, based on my observations so far, are:
- Using OPENROWSET to read all data from a table or view at a remote server.
- Using OPENQUERY to execute a query at a remote server and receive the results.
- Using the BULK INSERT statement or OPENROWSET(BULK, …) to read data from a file.
- Using INSERT INTO … EXECUTE (‘query text‘) AT server to execute a query at a remote server and store the results in a local table.
If you find other used cases where Remote Scan is used, please let me know!
Visual appearance in execution plans
Depending on the tool being used, a Remote Scan operator is displayed in a graphical execution plan as shown below:
SSMS and ADS |
Legacy SSMS |
Plan Explorer |
Paste The Plan |
Algorithm
According to this answer on SQLServerCentral.com, a remote scan “usually” copies the data to “local memory (tempdb). I have not found any confirmation for this statement on any other sites. Also, execution plans that use a Remote Scan operator do not have an increased Memory Grant, nor do they report any I/O on a tempdb worktable. Finally, when I create a query that reuses the results of an OPENQUERY expression multiple times, the optimizer adds a Table Spool operator to store the results in tempdb instead of having to execute the remote query again, which would not be needed if the Remote Scan already stores those results itself. As such, my assumption is that the statement at SQLServerCentral.com is incorrect. The algorithm described here does not show this alleged “usual” functionality.
The basic algorithm for the Remote Query operator is as shown below:
Note that this flowchart is a simplification. It doesn’t show that execution stops whenever a row is returned, and resumes where it was upon next being called.
Send query
The Send query action sends a statement to be executed (as stored in the Remote Object property) to the remote server where it has to be executed (the Remote Source). There are a few special cases, depending on what function resulted in the Remote Scan in the execution plan.
For OPENQUERY, both the Remote Source and the Remote Object are included, as specified in the arguments to OPENQUERY.
For OPENROWSET, the Remote Source is included, as either the query text or the object name passed into the OPENROWSET function. The Remote Object is set to “<UNKNOWN>”. I assume that the actual information (provider name, and data source, user id, and password or connection string) is stored in the internal execution plan, but they are not included in the export to XML.
For BULK INSERT and OPENROWSET(BULK, …), the Remote Source property is missing and the Remote Object is set to “STREAM”. None of the details (such as file path, format file, or bulk options) are exposed in the graphical execution plan or in the execution plan XML. I assume that they are stored in the actual internal representation of the execution plan, but not included in the export to XML.
For INSERT INTO … EXECUTE (‘query text‘) AT server, the execution is actually split across two execution plans. The first execution plan is only exposed in an execution plan with runtime statistics, and consists of just the Remote Scan operator, with the Remote Source property missing and the Remote Object set to “STREAM”. I assume that the target server and the query to execute are actually stored in the internal execution plan, but they are not included in the export to XML. The second execution plan, which is visible in both execution plan with runtime statistics and execution plan only, uses the Parameter Table Scan operator to fetch the data that was returned from the remote execution, so that it can be inserted into the target table.
Read first / next row
After sending the query to be executed to the target server, Remote Scan fetches the result rows from the target server one by one, and then returns them to its parent operator.
Operator properties
The properties below are specific to the Remote Scan operator, or have a specific meaning when appearing on it. For all other properties, see Common properties. Properties that are included on the Common properties page but are also included below for their specific meaning for the Remote Scan operator are marked with a *.
Property name | Description |
---|---|
Estimated Number of Rows * | For the Remote Scan operator, the Estimated Number of Rows property is always equal to 10,000. |
Estimated Number of Rows for All Executions * | For the Remote Scan operator, the Estimated Number of Rows for All Executions property is always equal to 10,000 multiplied by the Estimated Number of Executions. |
Estimated Number of Rows Per Execution * | For the Remote Scan operator, the Estimated Number of Rows Per Execution property is always equal to 10,000. |
Remote Query | The text of the query that gets sent to the Remote Source for execution. This can be a query or the name of a stored procedure. When Remote Scan is used for BULK INSERT, OPENROWSET(BULK, …), or INSERT INTO … EXECUTE ('query text') AT server, this property is not exposed in the execution plan. |
Remote Source | The name of the server where the Remote Query needs to be executed. This property only shows the actual remote server name when Remote Scan is used for OPENQUERY. For OPENROWSET, it shows “ |
Implicit properties
This table below lists the behavior of the implicit properties for the Remote Scan operator.
Property name | Description |
---|---|
Batch Mode enabled | The Remote Scan operator supports row mode execution only. |
Blocking | The Remote Scan operator is non-blocking. |
Memory requirement | The Remote Scan operator does not have any special memory requirement. |
Order-preserving | Even though the Remote Scan operator is fully order-preserving, the query optimizer considers its results as unordered, because it does not have information about how the Remote Source executes the Remote Query. |
Parallelism aware | The Remote Scan operator can only be used in serial execution plans. |
Segment aware | The Remote Scan operator is not segment aware. |