Merge Join
Introduction
The Merge Join operator is one of four operators that join data from two input streams into a single combined output stream. As such, it has two inputs, called the left and right input. In a graphical execution plan, the left input is displayed on the top.
Merge Join is the most effective of all join operators. However, it requires all input data to be sorted by the join columns. Often this means that a Merge Join can’t be used without adding extra Sort operators. These extra sorts increase the total plan cost. In such cases, the optimizer tends to choose other join operators instead. The alternatives are Nested Loops (ideal for joining a small data stream with a cheap input), Hash Match (most effective for joining large unsorted sets), and Adaptive Join (which can be used when the optimizer finds viable plans for both Nested Loops and Hash Match and wants to postpone the final choice until run time).
The Merge Join operator supports all ten logical join operations: inner join; left, right, and full outer join; left and right semi and anti semi join; as well as concatenation and union. The algorithm requires at least one equality-based join predicate.
Visual appearance in execution plans
Depending on the tool being used, a Merge Join operator is displayed in a graphical execution plan as shown below:
SSMS and ADS |
Legacy SSMS |
Plan Explorer |
Paste The Plan |
Algorithm
The basic algorithm for the Merge Join operator is as shown below:
Note that this flowchart is a simplification. It doesn’t show that execution stops whenever a row is returned, and resumes where it was upon next being called. It also does not show some of the extra logic required to handle many to many joins.
The flowchart shows how the operator processes the data from two inputs at the same time. It compares the values from the two inputs, handles the data as appropriate, and then advances the input with the lowest value. This ensures that matching rows are always processed simultaneously.
In this post you can see some nice (slightly simplified) animations showing the algorithm in action. Thanks, Bert!
Variations by join type
The flowchart above shows the basic outline of Merge Join processing. Depending on the requested join type (the logical operation), some of the steps will change.
Inner Join
For an inner join, “Handle matching rows” returns the combined data to the calling operator. The “Done?” test for an inner join results in yes if either of the inputs is depleted. Execution only continues as long as both inputs still have data.
“Handle unmatched left row” and “Handle unmatched right row” do nothing for the inner join operation. Because of that, the “Left row matched?” test is irrelevant, and there is no need to track whether the left row was matched. It is unknown whether this operator is actually implemented to test the logical operation and then skip this step, or whether Microsoft chose to simply run these actions in all cases.
Left Outer Join
The left outer join operation returns the same combined data from “Handle matching rows” as an inner join. However, “Handle unmatched left row” now also returns data, from the left input and with null values for the columns from the right input. “Handle unmatched right row” does nothing.
After reading the next row from the left or right input, the “Done?” test results in yes if the left input is depleted. Execution continues as long as the left input has data, even if the right input is already fully processed.
Right Outer Join
The right outer join operation returns the same combined data from “Handle matching rows” as an inner join. However, “Handle unmatched right row” now also returns data, from the right input and with null values for the columns from the left input. “Handle unmatched left row” does nothing; its associated logic has no effect and may or may not be skipped during processing.
After reading the next row from the right input, the “Done?” test results in yes if the right input is depleted. Execution continues as long as the right input has data, even if the left input is already fully processed.
Full Outer Join
The full outer join operation returns the same combined data from “Handle matching rows” as an inner join. However, both “Handle unmatched left row” and “Handle unmatched right row” now also return data, from the unmatched input and with null values for the columns from the other input.
After reading the next row from the left or right input (depending on where the row was returned), the “Done?” test results in yes if both inputs are depleted. Execution continues until both inputs are fully processed.
Left Semi Join
The left semi join operation returns data from “Handle matching rows”, but only when the left row was not yet marked as matched. If it is already marked as matched, then this is a second or later match for the same left row, and returning it again would cause a duplicate. The data returned will be from the left input only.
“Handle unmatched left row” and “Handle unmatched right row” do nothing in this case. As with the other logical operations, the logic associated with testing whether the left row was already matched has no effect and may or may not be skipped during processing.
The “Done?” test for a left semi join results in yes if either of the inputs is depleted. Execution only continues as long as both inputs still have data.
Left Semi Join (probed)
The Merge Join operator does not have an explicit property to show when a left semi join actually executes as a probed left semi join. The only indication is the presence of a column in the Output List that does not come from the left input; this column is typically named Exprnnnn (where nnnn is a 4-digit number that is unique within the execution plan). There is no Defined Values property for this new column.
For a probed left semi join, “Handle matching rows” returns data from the left input, with the Probe column set to true, but only when the left row was not yet marked as matched. If it is already marked as matched, then this is a second or later match for the same left row, and returning it again would cause a duplicate. “Handle unmatched left row” also returns the data from the left input, but now with the Probe column set to NULL.
“Handle unmatched right row” does nothing. The “Done?” test for a probed left semi join results in yes if the left input is depleted. Execution continues as long as the left input has data, even if the right input is already fully processed.
Left Anti Semi Join
For a left anti semi join , “Handle matching rows” does nothing, other than marking the left row as matched. This effectively causes the operator to skip first all rows from the right input with the same values in the join columns.
“Handle unmatched left row” simply returns the row to the caller. “Handle unmatched right row” does nothing at all. The “Done?” test for a left anti semi join results in yes if the left input is depleted. Execution continues as long as the left input has data, even if the right input is already fully processed.
Right Semi Join
The right semi join operation returns data from “Handle matching rows”. The data returned will be from the right input only.
“Handle unmatched left row” and “Handle unmatched right row” do nothing in this case. As with the other logical operations, the logic associated with testing and tracking matches to the left row has no effect and may or may not be skipped during processing.
The “Done?” test for a right semi join results in yes if either of the inputs is depleted. Execution only continues as long as both inputs still have data.
Right Anti Semi Join
For a right anti semi join , “Handle matching rows” does nothing. The “Handle unmatched right row” then is where a row is returned to the caller.
“Handle unmatched left row” does nothing; the logic associated with testing and tracking matches has no effect and may or may not be skipped during processing. The “Done?” test for a right anti semi join results in yes if the right input is depleted. Execution continues as long as the right input has data, even if the left input is already fully processed.
Concatenation
It is rare to see Merge Join (Concatenation) in an execution plan. The Concatenation operator is a cheaper way to achieve the same result, but it doesn’t preserve order. When the concatenated results are required to be ordered, and both inputs are already sorted in that order, then Merge Join (Concatenation) is more effective than combining a Concatenation operator and a Sort operator to restore the correct order.
For a concatenation, both “Handle unmatched right row” and “Handle unmatched left row” return the unmatched row. “Handle matching rows” in this case returns a row from the right input, and does not mark the left row as matched. Because of this, the “Left row matched” test will always result in no. It is, as always, not clear whether this test is still performed or whether it is simply skipped for this logical operation. (Note that it is also possible that the left row does get marked as matched in “Handle matching rows”, but that for this operation, the “Left row matched?” test is simply skipped so that the left row is always returned when that branch is entered).
The “Done?” test only results in yes if both inputs are depleted; execution continues until both inputs are fully processed.
Union
A Merge Join (Union) is often a more effective way to obtain the union of two input sets than using a normal Concatenation and then adding e.g. a Sort (Distinct Sort) to remove the duplicates. This is especially the case if the inputs are already sorted and guaranteed to have no duplicates. If one or both of the inputs can have duplicates, then the optimizer has to add extra steps to remove these before inputting the data into the Merge Join operator. The union logic of Merge Join removes duplicates between the two sets, but does not remove duplicates within either of the sets. It is not uncommon to see a Merge Join (Union) with a Stream Aggregate on both inputs.
For a union, both “Handle unmatched right row” and “Handle unmatched left row” return the unmatched row. “Handle matching rows” in this case returns a row from the right input. Other than for concatenation, it does mark the left row as matched.
The “Done?” test only results in yes if both inputs are depleted; execution continues until both inputs are fully processed.
One to many
For the “real” join types (all logical join types except Concatenation and Union), the algorithm above only works correctly if the left input has no duplicates in the join columns. This is called a one to many join (even when the right input has no duplicates either). The optimizer uses (trusted) constraints and logic of plan elements (e.g. aggregations) to determine whether one of the inputs is guaranteed to have no duplicates. If it finds one such input, it arranges that input to be on the left side and marks the Merge Join as one to many.
Many to many
If the left input might have duplicates, the join is considered many to many. For a many to many merge join, a worktable is used in tempdb to store values from the right input that need to be used multiple times. This is avoided in some cases where it is not really needed. To achieve this, a few alterations are made to the algorithm.
In “Handle matching rows”, the operator will first read the next row from the right input and store this in the “peek-ahead buffer”, a temporary holding area in memory. It then compares the values in the current row from the right input and the “next” row in the peek-ahead buffer. What happens next depends on whether or not they are the same.
Unique row
When the values in the join columns of the current row from the right input and the peek-ahead buffer are not equal, the current row is unique in the right input. The operator returns the combined results. Once it re-gains control, it proceeds with “Left: GetNext()” instead of “Right: GetNext()”. If this row has the same values in the join columns as the previous row, it will again end up in “Handle matching rows”; the next row already is in the peek-ahead buffer and the values are still different so another match (between that left row and the still current right row) is returned. This repeats until the values in the left input’s join columns change. At that point, “Compare key values” enters the “Left > Right” branch. In this branch, a “Right row matched?” test is then added, to distinguish between right rows that had one or more matching left rows, and unmatched right rows. After that, the “Right: GetNext()” does not fetch its next row from the right input, but from the peek-ahead buffer.
Duplicated rows
When the values in the join columns of the current row from the right input and the peek-ahead buffer are the same, these two (and possibly more) rows in the right input are duplicates on the join columns. In this case, the Merge Join stores both the current row from the right input and the row in the peek-ahead buffer in a worktable (in tempdb), then continues to request rows from the right input and add them to the worktable. This continues until a row is returned with a different value in the join columns, which is then stored in the peek-ahead buffer. At this point, all rows from the right input with the same join value are processed and stored in the worktable, and one extra row from the right input is stored in the peek-ahead buffer.
The algorithm then reads the first row from the worktable, produces the combined row of this row and the matching row from the left input, and returns this. Upon regaining control, “Right: GetNext()” reads the next row from the worktable, then moves back to the top. If there are no more rows in the worktable, then the operator reads the next row from the left input, reads the first row from the worktable again, and moves back to the top.
If the new left row has the same values in the join columns as the previous row, it will again end up in “Handle matching rows”. All matching rows are still in the worktable, so the above process is repeated: rows are read one by one from the worktable and matching rows are returned, and then the next left row is requested. This repeats until the values in the left input’s join columns change. This would normally direct the algorithm to run GetNext() on the right input. In this case that has already been done and that row is stored in the peek-ahead buffer. So instead of calling its child operator, the Merge Join now clears out the worktable, promotes the peek-ahead buffer to be the current row, and then resumes from the top.
Unmatched right rows
For join types that need to return unmatched right rows (Right Outer Join, Full Outer Join, and Right Anti Semi Join), the operator adds one extra column to the worktable to track, for each row from the right input, whether it has already been matched. This column is set to false when rows are added to the worktable, and changed to true when a match between a left row and a row from the worktable is found.
After a row is read from the left input with different values in the join columns, one extra pass is made over the data in the worktable to find all rows that have not been marked as read; these can then be returned.
Note that this additional logic is only actually needed when the operator has at least one additional predicate in the Residual property, because otherwise all rows in the worktable are by definition matched against the first (and any other) left row with the same values in the join columns, but there is no logic to skip this extra work if there is no Residual.
Operator properties
The properties below are specific to the Merge Join operator, or have a specific meaning when appearing on it. For all other properties, see Common properties. Properties that are included on the Common properties page but are also included below for their specific meaning for the Merge Join operator are marked with a *.
Property name | Description |
---|---|
Defined Values * | Observed on Concatenation and Union only. This property lists, for each output column, the two corresponding input columns. |
Logical Operation * | The requested logical join type. Possible values are Inner Join, Left Outer Join, Right Outer Join, Full Outer Join, Left Semi Join, Left Anti Semi Join, Right Semi Join, Right Anti Semi Join, Concatenation, and Union. |
Many to Many | Used on all logical operations except Concatenation and Union. See the main text for further details. |
Output List * | If the Pass Through property is present, the Output List property may include a column named Passnnnn (where nnnn is a 4-digit number that is unique within the execution plan). This is a bit column that is set to 1 if the condition in the Pass Through property was met, and set to 0 otherwise. This Passnnnn column is not included in the Defined Values property. |
Pass Through | When this property is present, it specifies a condition that is evaluated for each row from the upper input. If the condition is met, that row is passed unchanged with NULL values for columns from the lower input, regardless of whether it has matching data. This property has so far only been observed for the Left Semi Join operation. |
Residual | Used on all logical operations except Concatenation and Union. Specifies the full logical condition that has to be met for a combination of rows from the left and right input to be considered a match. The Residual property always includes all elements of the Where (join columns) property, but may also include additional predicates. |
Where (join columns) | Used on all logical operations except Concatenation and Union. Specifies the part of the join condition that is used to drive the Merge Join algorithm. Additional predicates that are applied after finding a match on this property are described in the Residual property. The Where (join columns) property is broken down into two subproperties: Inner Side Join columns (for the right input) and Outer Side Join columns (for the left input). The names of these properties are a confusing reference to the input names of the Nested Loops operator. For Concatenation and Union, the Merge Join algorithm is always based on comparing all columns from both inputs. The Where (join columns) property is not listed in this case. |
Implicit properties
This table below lists the behavior of the implicit properties for the Merge Join operator.
Property name | Description |
---|---|
Batch Mode enabled | The Merge Join operator supports row mode execution only. |
Blocking | Merge Join is a non-blocking operator. Technically, there is some (small) amount of blocking in the case of a many to many Merge Join: when duplicates are detected in the right input, they are all loaded into the worktable before the operator can start to return data. |
Memory requirement | The Merge Join operator does not have any special memory requirement. |
Order-preserving | The Merge Join operator is in most cases fully order-preserving for both the left and the right input. For a many to many Merge Join, the order of the left input is fully preserved; the order of the right input is only partially conserved. The operator’s output is ordered by all the join columns, but the order of rows from the right input that have the same values in these columns can be affected. When the Logical Operation property is Left Outer Join, then the Merge Join preserves the order of the right input, but not the left input, because NULL values are supplied for right-retained rows. Similarly, for a Right Outer Join the left input preserves order but the right input does not, and for a Full Outer Join, the order of neither input is preserved. |
Parallelism aware | The Merge Join operator is not parallelism aware. |
Segment aware | The Merge Join operator is not segment aware. |