Rows read versus rows returned
The visual of the arrows in an execution plan strongly suggests that they represent the flow of rows from one operator to another. And hence, the width of that arrow strongly appears to be an indication of how many rows are passed between the connected operators. That used to be always the case. But unfortunately, this changed in December 2017 (or earlier, but that was when I first noticed the change).
The root cause is the addition of a new property, returned by Scan and Seek operators: Number of Rows Read (as well as its estimated counterpart: Estimated Number of Rows to be Read). Now don’t get me wrong, those properties are awesome and I’m super happy that they were added to execution plans. When a filter condition is pushed into a scan or seek operator as a Predicate property, I want to be able to see how effective it is, and comparing Actual Number of Rows to Number of Rows Read (or Estimated Number of Rows to Estimated Number of Rows to be Read) helps me assess just that. Which in turn can help me decide whether I should change my indexing, or try to rewrite the query. But I digress.
The problem with these properties is that, as of the 17.4 update of SQL Server Management Studio, the rendering logic for execution plans was changed to base arrow width on Number of Rows Read when available, instead of basing it on Actual Number of Rows as was the case before. When I first noticed this I thought it was a bug and reported it as such, but Microsoft closed it with a comment suggesting that this was a deliberate decision, as a way to visualize the amount of work done by the operator.
I understand the desire to make operators that do a lot of work stand out. I understand that, for Scan and Seek operators with a Predicate property, there is no incoming arrow that can be rendered with extra width to show more rows being read and processed. But I disagree with Microsoft’s choice to then use the outgoing arrow for this indication instead. It’s far too easy to be misled into thinking that a wide arrow between a Scan and its parent operator represents a lot of rows being returned to, and processed by that operator. And even if I am not misled, then I now still have no good visual indication of the amount of work that DID get returned to the parent operator.
And as if all of this is not confusing enough already, the same situation is not handled the same way in all cases. In an execution plan plus run-time statistics (aka “actual execution plan”), the Number of Rows Read property is used for arrow width when available. But in an execution plan only (aka “estimated execution plan”), the arrow width is still based on the Estimated Number of Rows, not on the Estimated Number of Rows to be Read. Which, by the way, does have my preference! But it’s not consistent, and makes it very hard to have an instinctive interpretation of execution plans based on the visual representation only.
Using the arrow between a scan or seek operator and its parent to represent the amount of rows read and processed by the scan or seek, and not the amount of rows passed to and processed by that parent, is confusing. I understand why Microsoft wants to give visibility to operators that process a lot of rows, but this is not the way.
In SentryOne’s Plan Explorer, scan and seek operators that read more data than they return are called out by putting a warning symbol on that operator. Why not use that same method in Management Studio? Or, if that’s not desired, why not add an extra icon that represents the storage subsystem, so that SSMS can now draw an arrow to represent how many rows were read from storage by the operator, without having to abuse an arrow that should be used for the amount of rows it returns?
I have created a new feedback suggestion with this request. Please vote for it if you agree!