This month (May 2020), I will post several blog posts, all about the same theme: row counts in execution plans, their representation as arrow width; how to use it, and how it can confuse you.
Of all the properties you find in execution plans, the Actual Number of Rows and Estimated Number of Rows properties are perhaps the most useful and most used, and definitely in the top three. They can be used for many things, and are extremely valuable. But there are some gotcha’s that a lot of people are unaware of.
In this first post, I will highlight a few of the useful things that can be done by looking at Actual Number of Rows and Estimated Number of Rows. Most of this is pretty basic and well known, though perhaps not all of it. In the follow-up posts later this month, I’ll look at various cases where these properties, or their representation in the user interface, can confuse or mislead us.
Visual representation of Actual and Estimated Number of Rows
Because both the Actual Number of Rows and the Estimated Number of Rows are very valuable for many use cases, Microsoft has built their primary tool for viewing execution plans (SQL Server Management Studio) to highlight the value of at least one of these two properties, in a visual way, when you look at the graphical representation of an execution plan.
Look at the arrows that show which operator (is called by and) returns rows to which operator, and that therefore can be thought of as representing the data stream. In most execution plans, they are not all the same width. Management Studio uses the width of the arrow as a quick, simple visual representation of the value of Actual Number of Rows (in an execution plan plus run-time statistics, aka “execution plan plus” or “actual execution plan”), or of the value of the Estimated Number of Rows in an execution plan only (aka “estimated execution plan”).
While this does of course not help you to know the exact value of these properties, it does allow you to immediately see whether an operator passes many rows to its parent, or just a few. If you want to have more detailed information, you can hover your mouse over these arrows and you will see a popup with a subset of the properties of the operator to the right; the subset chosen are the properties related to the amount of data (estimated to be) produced and returned.
Locating busy areas in the execution plan
Take a look at the execution plan above. While a bit more complex than most execution plans in demo and sample code, it’s still relatively simple: two relatively small branches, combined by a Concatenation operator, and then some aggregation done. Real execution plans of real queries are of course never this simple, but I’ll use this example to make my point.
Let’s assume that this query is taking too many resources, or running too slow. And I need to find the root cause, so we can try to fix it. Where should I start? The answer is that, given this execution plan, I would always start at the lower branch. And I can see that without even looking at any numbers, without diving into the properties of any operator, purely by a quick glance at the visual representation. All because the arrows in that lower part are much thicker, a clear indication of much more work being done by those operators than by the operators in any other part of the execution plan.
This is of course not a guarantee. It’s always possible that there is a section in the execution plan that does not process a lot of rows, but still takes a lot of resources. But those are the exceptions. So as long as I keep in mind that I can’t 100% rely on it, I will almost always start by investigating the areas with the wider arrows. Most of the time that means I can zoom right in on the problem areas. The cases where I later found I started in the wrong area are the exceptions.
Do keep in mind that the arrow width is based on the Actual Number of Rows in an execution plan plus run-time statistics only. In an execution plan only (as well as in a live execution plan), the lines represent the Estimated Number of Rows, which is not always correct.
Finding bad cardinality estimations
In an execution plan plus run-time statistics, as well as in an execution plan with live statistics (aka “live execution plan”), operators will report both the Actual Number of Rows and the Estimated Number of Rows properties. You can compare these, for every operator, and look for cases where they are very different.
Of course, since the Estimated Number of Rows is an estimation, you should not expect it to be completely correct. And you should not worry if it’s wrong. As long as it’s not wrong by too much. How much is “too much” is a question I’m not going to answer here, as there’s a lot more to that question than meets the eye at first. What is relevant for this post is that it’s important to know which estimates were totally out of whack.
We’re always able to find that information by looking at the properties of each operator, either through the popup (as shown in the picture to the right) or through the full properties window. In newer versions of Management Studio, an alternative way to quickly assess the difference between Actual Number of Rows and Estimated Number of Rows is to look at the completion percentages underneath. They were first indicated only in live execution plans, when they were introduced; but in a recent version of Management Studio they were also added to the execution plan plus. As you see in the highlighted fragment to the left (which is for the same operator as the property popup above), both the Actual Number of Rows and the Estimated Number of Rows are indicated, as well as their ratio. In a live execution plan, while the query is still executing, these numbers can still change and can hence be thought of as a progress indicator. In an execution plan plus, this is the final number. The big difference is an indication that the Estimated Number of Rows was much too high in this case, and this knowledge might help us to identify why the query runs too slow.
Identifying parallel skew
The third and last example of simple and effective usage of (in this case) the Actual Number of Rows property is to check whether performance in an execution plan perhaps has suffered from parallel skew. This term refers to a situation where an execution plan, or a part of it, runs in parallel, using multiple threads to each do a part of the work. Ideally, you want each of the threads to do roughly the same amount of work. Or rather, you want each thread to have its work completed at the same moment, which usually happens if the threads do the same amount of work.
In an execution plan plus run-time statistics, you can open the full properties window of any operator. If that operator runs in a parallel section of the execution plan, you will see a little plus sign to the left of the Actual Number of Rows property. And if you then click that plus sign, the property will expand to show how many rows were produced by that operator on each of the active threads.
In the screenshot here, you can see that in this case thread 5 process a lot less rows than the other threads. And while I have seen bigger differences than this, it might still be useful to investigate what caused this difference. It is of course possible that some of the rows in the input data require more effort to process, in this operator or in another operator in the same parallel section. If those “more expensive” rows all end up on thread 5, then that could be a valid explanation why it processed less rows and still all threads finished at the same time. But it’s also possible that there is some other reason why this thread received far less rows to process, and in that case the result might be that this thread spent an excessive amount of time waiting for the other threads to finish their work. Once you know that, you can see if that can be fixed, resulting in a better distribution of work, less resources wasted, and faster execution of the query.
The Actual Number of Rows property in an execution plan plus run-time statistics (as well as in a live execution plan), as well as the Estimated Number of Rows property in every execution plan, are invaluable tools. They can be used to identify the busiest sections of an execution plan, to find where bad cardinality estimations might have resulted in suboptimal plan choices, and where speed of parallel processing may have been hampered by skew. The visual representation of these properties, in the form of narrow or wide arrows, helps to visualize this information and draw attention to it.
But life is not always perfect. In my next blog posts, I will call out a few cases where either the reported Actual and Estimated Number of Rows, or their visual indication, can be confusing, misleading, or even wrong. Knowing these issues will help you work your way around it, and can prevent confusion, so stay tuned!