A few weeks ago I made a rather embarrassing discovery. I realised I had forgotten to include one of the most basic teachings about execution plans in my video training course. The subject is too important to leave out. It is too broad to add to an existing video. So I decided to add a seventh chapter to the (free!!) basic level of block 1.
When I write, present, talk, or record about execution plans, I focus on how operators interact with each other and how that results in the desired results. Understanding performance issues comes automatically once you know how the plan works, and because you know how the plan works you can be far more certain as compared to tuning strategies that don’t use the execution plan or only inspect fragments of it.
In order to understand those interactions, I focus on what operators do and how they do it; and this goes hand in hand with having a through understanding of what properties each operator has and what they mean. So that’s where I usually focus. The operators and the properties. And when teaching I like to also direct attention to issues that affect how properties are shown in the execution plan, because if some weird issue causes the value shown to be incorrect, you can waste a lot of time if you still trust that value.
But I seem to have a blind spot. Some things are so obvious to me that I can easily forget that they are not obvious to everyone.
Percentages in execution plans
A lot of information you see when looking at execution plan is shown as a percentage. And those percentages are often the first thing that draws our attention. So it’s really important to know what those values are. And yet, I had forgotten to include coverage for percentages in my training videos. And you’ll also not find those percentages mentioned in the Execution Plan Reference. Why? Simple. They don’t exist.
It is here that I imagine a short silence. Followed by a huge outburst. “What? Of course they exist! I look at them every day!”
So, let’s take a step back. If you look at an execution plan, in Management Studio, Azure Data Studio, Plan Explorer, or any other tool, what you see is not the execution plan. You see a representation of the execution plan. In reality, the tool receives a bunch of XML from SQL Server, and then converts it into the pretty pictures and neatly organized property lists we see. It is the XML that I usually refer to when I talk about the “actual” execution plan. (Although to be totally honest, even that is already created of a transformation of how the execution plan is stored internally in the engine; that format is undocumented and not accessible to us mere mortals).
The actual execution plan, the XML that the tool received from the engine when requesting the execution plan, does not store these percentages. Trust me. Or not: most tools provide ways to see the actual XML version of the execution plan so you can inspect it and verify that these percentages are, indeed, not included.
What you see when you look at these percentages on your screen is the result of a computation done in the tool. Done with the purpose to make the information easier to understand. Which is of course great, and I am totally happy that these tools provide that service.
However, it does require that people looking at and using these percentages know what they are based on, and know the limitations and issues of the underlying properties.
There are four main issues with percentages that I keep seeing confusion about. Cost of an execution plan relative to the batch. Cost of an operator relative to the plan. Rows returned in a live execution plan. And rows returned in an execution plan plus run-time statistics.
The video I recorded covers all these four scenarios. It’s slightly under twenty minutes and I recommend that everybody watches it. Please click here to access the basic level of block 1 of the training videos, and then select the seventh video.
If you don’t have twenty minutes right now, then at least read the rest of this blog, where I will touch on the most relevant issues for each of those areas.
Cost of a plan relative to the batch
In any execution plan, whether with or without run-time statistics, Management Studio (as well as many other tools) shows a number marked as “Query cost (relative to the batch)” above each execution plan. When tuning a batch, stored procedure, or any other code unit with multiple queries, people tend to use this to quickly direct their attention. “This query is only 2% cost, no need to look at it. This one is 10%, hmm, keep it in mind. Oh wait, here’s one that uses 78%. That’s where I’ll focus”.
I’ve also seen this used as a quick measure of effectiveness for tuning. Create a single batch with two queries: the original one and the (hopefully) optimized one. Look at the execution plans. If the optimized version is less than 50% we improved performance.
When using these numbers, always keep in mind what they are based on: the Estimated Subtree Cost of the top left operator, or in other words the estimated total cost of the entire execution plan. And yes, it’s always the estimated cost. Never the actual cost. Simply because the execution plan does not have a property for the actual cost. Microsoft never built anything into the execution plan to compute and return the actual cost. All we have is the estimate.
Which means that this number is wrong. Estimates are wrong, by definition. In an ideal world, they are just a tiny bit wrong. But in an ideal world, our queries run fast. If you need to look at execution plans, you are not in an ideal world. If your query runs slow, totally incorrect cardinality estimates may well be the root cause of it. Well, the estimated query cost is based on those incorrect cardinality estimates. And the percentage is based on that estimated query cost. You are, effectively, looking at a big steaming pile of nonsense based on errors. And yet you still trust that to determine where to spend your energy?
And even when the cardinality estimates are close to perfect, you still should be aware that the computed cost is just an attempt to capture something too complex for a single metric as a single metric. When your hardware is different from the hardware the costing algorithms are based on (“Nick’s machine”), it’s bound to fail.
Cost of an operator relative to the plan
This second issue is closely related to the previous one. The screenshot above is from Management Studio, but other tools show the same percentages, just in other ways. In fact, some tools even use extra colour highlighting to draw immediate attention to high percentages.
“This execution plan looks really complex. But I see that the Sort takes 9% and the Concatenation even 11%. Let’s focus there first.”
Like the conclusions drawn from query cost relative to the batch, the conclusions based on operator cost relative to the query are also no more reliable than the underlying data. Which in this case, again, is the Estimated cost of each operator. The screenshot above is from an execution plan only. But in an execution plan plus run-time statistics, the same values would be displayed. Simply because the actual execution plan XML does not include any actual values. Your tool cannot work based on data it does not have.
(Plan Explorer is the notable exception here, it has a process called “recosting” that attempts to reconstruct actual cost based on measurements done while running the query; but that process comes with its own issues and caveats).
Rows returned in a live execution plan
A live execution plan shows, underneath each operator, several constantly updating metrics. Such as how long the operator has been active, and how many rows were already returned versus the expected amount.
Those last two numbers are also presented as a percentage. So in the screenshot above, the highlighted 39% is simply the result of comparing the 188 rows returned so far to the operator’s Estimated Number of Rows, which is 473.
It’s tempting to look at the operator and conclude it has completed 39% of its work. But as with all metrics based on estimates, that’s a little off the truth. The only thing we can conclude for sure is that it has completed 39% of its estimated work. That estimate might be wrong.
And that is even visible in the screenshot, in the second highlighted operator. The Clustered Index Scan operator is shown as being at 2055%. And no, that does not mean it decided to do the same work a few times extra, just for fun. It just means its cardinality estimate was wrong. It was expected to return just 9 rows. But we are in reality at 185 already, and still counting!
Rows returned in the execution plan plus run-time statistics
Depending on your tool, you might see similar percentages as in a live execution plan when you look at the execution plan plus run-time statistics. But of course, since the execution plan plus is only returned when the query is completed, these can never be mistaken as an indicator of progression. Otherwise they’re still the same though: rows returned as a percentage of the number of rows that the optimizer expected.
Since the query has completed, a percentage that is far above or far below can be safely taken as an indication of a large difference between the estimated and actual number of rows. Since bad estimates can often be a root cause of suboptimal plan choices and slow queries, this is very useful, and it’s definitely nice to have this very quick and easy way to find bad estimates.
Of course, you still do need to be aware of situations where the Estimated Number of Rows of an operator is mis-reported in the execution plan XML. But these are not in scope for this post.
Showing data as a percentage is useful, because it makes it very simple and easy for us humans to see the bigger picture. But we should never forget that these percentages are nothing more than the result of applying maths to the actual source data. If we don’t know what the source data is and what limitations and issues it has, then percentages become dangerous and misleading instruments.
The seventh video in the basic level of block 1 of the SQLServerFast Execution Plan Video Training explains this in detail. You can watch it for free, no strings attached. And you can watch the other six videos in this block as well. The more you know about execution plans, the more effective your query tuning will be!
Thank you Hugo. I was watching your “SQLBits 2021 Execution plans … where do I start?” where you mentioned that cost base percentiles is just an pre execution estimate for whatever execution plan is chosen. Thanks for including in the series