I recently received an email from a reader of this blog. Specifically of the plansplaining posts. He wrote that he enjoyed reading it, but he still had some questions. Those questions made me realize that I made a mistake when I started this series. I dove in at the deep end. I assumed a lot of prior knowledge and experience with reading execution plans, but not every reader has that knowledge and that experience.
So I decided that for this one post, I will move from the deep end of the pool to the shallow water. In this post I will cover the basics. Note that this post is not titled “plansplaining” – that moniker remains reserved for the deep end! This post, though, will start with the true basics. I am guessing that most readers will be familiar with the first parts, but I am still including them just to make sure I cater to everyone. Throughout this post the level will gradually move up, and at the end you will be ready to swim for yourself.
Getting an execution plan
Most of the time when you are working on code that needs tuning, you will have the code in a query window in SQL Server Management Studio (SSMS). There are two buttons on the toolbar that you can use to get an execution plan; each has a corresponding item in the query menu and a keyboard shortcut.
When you select the “Display Estimated Execution Plan” option, the text in the query window (or a subset if you have highlighted a part of that window’s contents) will be submitted to the query optimizer to create an execution plan which will then be displayed in graphical format. The query will not actually be executed. This can be a good option to quickly look at an execution plan for a very long-running query, or for queries that would modify data.
When you select the “Include Actual Execution Plan” option, you toggle this option on or off. When it is on, you instruct SSMS to capture actual execution plans for every query executed from that window, and display them in graphical format. This option provides more information than the estimated execution plan, but you will have to wait for the query to finish.
Note that the terms “Estimated” and “Actual” execution plan are actually misleading; both represent the same plan, but an “actual” plan includes additional information that is gathered while the query executes (such as how many rows were actually processed in each stage of the plan).
If you want to follow along, now would be a good time to open a copy of SSMS and connect it to a test instance that has a copy of the AdventureWorks sample database loaded. Open a query window and paste in this simple query:
SELECT p.LastName + N', ' + p.FirstName AS FullName
FROM Person.Person AS p
INNER JOIN Person.BusinessEntityAddress AS bea
ON bea.BusinessEntityID = p.BusinessEntityID
INNER JOIN Person.Address AS a
ON a.AddressID = bea.AddressID
ORDER BY p.LastName, a.City;
Now toggle the “Include Actual Execution Plan” option on and then run the query. If your SSMS is set to the default option of returning results in a grid, you will now find a total of three tabs in the results pane, as shown below. The “Results” tab is the part that normal people are interested in, and the “Messages” tab can contain interesting information such as error messages or how many rows were returned. However, in this post (and, actually, in almost all content you will find on this site), we are mostly interested in the third tab, marked “Execution Plan”. Here is what I see when I try this:
In case you have never gone through these steps before, congratulations are due: you have just requested and gotten your very first execution plan. On the other hand, if none of the above was a surprise, you could have skipped this section. Let’s see if the next one brings anything new for you.
In the execution plan above, you see two basic elements. The first is the various graphical icons, each with two or three short lines of text below them. The second basic element is the arrows; I’ll discuss them shortly, but for now I’ll focus on the icons.
Each icon represents what is called an operator. Every operator is similar to a sub-program: some code inside a larger program that is invoked when needed to perform a specific task. SQL Server execution plans have a rather limited set of distinct operators – fewer than 70 last time I counted, and most of them are actually very rare; there are only about 15 or 20 that you need to know to understand 99% of all execution plans.
Each operator performs its own specific task. When the optimizer creates an execution plan, it can use as many or as few copies of each operator as it sees fit. The execution plan above contains seven operators: two copies of the Index Scan operator, and one copy each of Sort, Hash Match, Merge Join, Clustered Index Scan, and Compute Scalar. In case you wonder why I didn’t include the icon named SELECT in this list, the reason is that all execution plans have one such icon at the top left that is not really an operator; it is the gateway between the internals of the execution plan and the externals of the client submitting a query and receiving results.
Every operator can be called in three ways (or to phrase it technically, each operator supports three method calls). The Initialize and Close calls are hardly ever relevant for understanding how an execution plan works, so in this post I will focus only on the GetNext call. This call asks the operator to produce a single row of data and return it. After returning a row, control is passed back to the caller. The operator does maintain state, so when the next GetNext call comes in it “wakes up” and resumes where it left off. When the operator has no more rows to return it returns an “end of data” signal; after that the GetNext method cannot be called again, unless the operator is first reset by a new Initialize call.
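To make this protocol concrete, here is a small sketch in Python. SQL Server is of course not implemented this way; the class and method names below are just illustrative stand-ins for the Initialize/GetNext/Close pattern described above, with a hypothetical Scan operator feeding a Sort operator:

```python
# A toy illustration of the pull-based operator protocol.
# Not SQL Server code; just the Initialize/GetNext/Close pattern.

END_OF_DATA = object()  # sentinel for the "end of data" signal

class Scan:
    """Returns rows from an in-memory table, one per GetNext call."""
    def __init__(self, rows):
        self.rows = rows
    def initialize(self):
        self.pos = 0          # the operator's saved state
    def get_next(self):
        if self.pos >= len(self.rows):
            return END_OF_DATA
        row = self.rows[self.pos]
        self.pos += 1         # remember where we left off for the next call
        return row
    def close(self):
        pass

class Sort:
    """Drains its entire child input on the first GetNext call,
    sorts it, then returns the sorted rows one at a time."""
    def __init__(self, child, key):
        self.child = child
        self.key = key
    def initialize(self):
        self.child.initialize()
        self.sorted_rows = None
        self.pos = 0
    def get_next(self):
        if self.sorted_rows is None:
            # First call: pull everything from the child, then sort.
            rows = []
            while (row := self.child.get_next()) is not END_OF_DATA:
                rows.append(row)
            self.sorted_rows = sorted(rows, key=self.key)
        if self.pos >= len(self.sorted_rows):
            return END_OF_DATA
        row = self.sorted_rows[self.pos]
        self.pos += 1
        return row
    def close(self):
        self.child.close()

# The "client" pulls rows from the root operator until end of data.
plan = Sort(Scan([("Smith",), ("Adams",), ("Jones",)]), key=lambda r: r[0])
plan.initialize()
results = []
while (row := plan.get_next()) is not END_OF_DATA:
    results.append(row)
plan.close()
print(results)  # [('Adams',), ('Jones',), ('Smith',)]
```

Note how Sort has to consume all of its child’s rows before it can return even one row of its own – exactly the behavior of a real Sort operator in an execution plan.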
Every operator has a long list of properties. Some of those properties control exactly how the operator behaves (for instance, a Sort operator will always sort its input data, but the properties define by what column(s) the data will be sorted). Other properties provide estimations of how many rows the operator is expected to produce, or a count (made when executing the query) of how many rows it actually produced.
Accessing the properties of an operator in an execution plan is easy. There are even two methods. The fastest is to hover your mouse over an operator – after a second or so a yellow popup window will appear, displaying some (but not all!) of the operator’s properties. The alternative, which allows access to all properties, is to right-click an operator and select “Properties” from the context menu. (If you leave the properties window open at all times, you can bypass the right-clicking and just left-click an operator once to give it focus.)
The picture above shows the two versions of looking at the properties of, in this case, the Compute Scalar operator in the plan above. In the properties window (shown on the right) you have the ability to expand or collapse some nodes. In the screenshot the Defined Values property is shown in collapsed state, but I expanded the Output List property by one level to show some more details. If you compare the two pictures, you can see that this Output List property is not visible on the left-hand side, but the Defined Values property is. This is not configurable; Microsoft decides which properties to include in the pop-up. This can vary from version to version. I have also seen situations where, depending on the size of the information and the available screen size, the same property gets included in one case and omitted in another case. The full properties window takes a few extra clicks to get to, but has the advantage of always exposing all available information.
After talking about the icons, representing operators, it is now time to shift our attention to the arrows. What do they represent? Most people will say that they represent how data flows between operators. And while that is true, the real answer is actually a bit more complex than that. It is really important to always keep in mind that an operator doesn’t act by itself, it only does something when a GetNext call activates it. In other words, an execution plan is “pull-based”, not “push-based”. When people think of data “flowing”, the image tends to shift towards a push-based model, where operators produce rows all the time and push them to other operators, instead of doing nothing until asked to do something.
Let’s look at an example. In the execution plan above, there is an arrow pointing from Hash Match to Sort. To understand the actual logic of this plan, we must read this arrow in reverse order: as the Sort operator calling the GetNext method of Hash Match whenever it needs a row. The Hash Match operator responds by doing some work and then returning a row; this row then travels in the direction of the arrow. The arrows are drawn this way because in many cases, the key to understanding the logic of an execution plan is to follow the data. But you should always keep in mind that this is not an autonomous steady stream of data. The data movement is activated by the recipient of the data requesting a row when it needs one. Not by the producer creating rows whether they are needed or not.
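The practical consequence of this pull-based model is that a child operator does no work beyond what its parent asks for. A tiny sketch (again in Python, with made-up operator names; a Top operator does not appear in the plan above, but it makes the effect easy to see):

```python
# Demonstration that rows are produced only when the parent asks for them.
# The "scan" and "top" names are made up for this illustration.

calls = []  # log of the work the scan actually performs

def scan(rows):
    for row in rows:
        calls.append(f"scan produced {row}")
        yield row

def top(child, n):
    # Pulls only n rows from its child, then simply stops asking.
    for _ in range(n):
        yield next(child)

source = scan([1, 2, 3, 4, 5])
result = list(top(source, 2))

print(result)  # [1, 2]
print(calls)   # the scan produced only the two rows that were requested
```

Because the parent stopped calling for rows after the second one, the scan never touched rows 3 through 5 – the data did not “flow” on its own.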
Anyone who has sufficient background in math will recognize the graphical representation of an execution plan as a graph. Don’t worry if you never learned graph theory in high school (or if you forgot about it after graduating). You don’t need to understand this in order to work with execution plans. I only bring this up because, due to this mathematical background, operators in execution plans are often called nodes. The arrows establish a hierarchy between nodes. Looking again at the example above, the arrow pointing from Hash Match to Sort defines Sort as the parent of Hash Match, and conversely Hash Match is a child operator of Sort. As you see, some operators (Hash Match and Merge Join in the example above) have more than one child, where some others (e.g. Index Scan and Clustered Index Scan) have no children at all. Most operators always have exactly one child. And every operator always has exactly one parent, except for the operator at the top left. This operator is called the root node (or simply root) of the execution plan.
Data flow properties
Just as you can inspect the properties of operators by either hovering your mouse over them or by opening the properties window with that operator in focus, you can do the same with the arrows that represent the data flows. (In this case you will always see the exact same set of properties in the popup and in the properties window.) However, this can be slightly misleading. You see, the source of each execution plan is a large chunk of XML (which in turn is a representation of the internal structures SQL Server uses). And in this XML, there is nothing that represents these arrows. The hierarchy of the operators is represented in the XML by how the corresponding elements are nested. SSMS parses this data and then draws it as arrows, which is obviously far easier for us humans to understand.
But if the arrows are not actually included in the execution plan XML, where then does SSMS get the properties you see when you hover your mouse? The answer is simple: the properties you see on an arrow are a subset of the properties you see on the operator where the arrow originates. Hover your mouse over the arrow between Sort and Hash Match and you will see the popup shown below and to the left. It exposes only the actual and estimated number of rows, estimated row size, and estimated data size (which is not a true property; it is the result of multiplying estimated number of rows by estimated row size, which SSMS shows as a convenience to the user). Hover your mouse over the Hash Match operator itself and you will see the popup at the right. All three “real” properties you see on the left are included there as well, plus many more. There is no need to ever look at the properties of a data flow (apart perhaps from ease of access if you only need these specific properties, or if you don’t want to do a simple multiplication yourself).
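You can see this for yourself by looking at the plan XML directly (right-click the graphical plan and choose “Show Execution Plan XML”). The sketch below parses a hand-written, heavily simplified fragment in the style of showplan XML – the real XML lives in a Microsoft namespace and carries many more attributes, and the numbers here are made up:

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily simplified fragment in the style of showplan XML.
# Real plan XML uses the Microsoft showplan namespace and has many more
# attributes; the row count and row size below are invented for this demo.
plan_xml = """
<RelOp PhysicalOp="Sort" EstimateRows="1000" AvgRowSize="83">
  <RelOp PhysicalOp="Hash Match" EstimateRows="1000" AvgRowSize="83"/>
</RelOp>
"""

parent = ET.fromstring(plan_xml)

# There is no "arrow" element anywhere: the hierarchy follows from nesting.
# The nested RelOp is the child; SSMS draws an arrow from child to parent.
child = parent.find("RelOp")
print(child.get("PhysicalOp"), "->", parent.get("PhysicalOp"))

# The properties SSMS shows on that arrow come from the child operator.
# "Estimated Data Size" is not stored anywhere; SSMS simply multiplies:
est_data_size = float(child.get("EstimateRows")) * float(child.get("AvgRowSize"))
print(est_data_size)  # 83000.0 (bytes)
```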
It is also important to point out that the properties that SSMS shows do not include what I personally consider to be the most important properties of data flows: what data exactly is in the rows that the child operator passes to its parent, and where it is coming from. Have you ever had a manager request you to “produce a report showing a breakdown of sales by region” without any additional specification? Trying to understand an execution plan without knowing exactly what data the operators pass to each other is like trying to build that report without asking the manager which metrics should be included and by which measures the report should be broken down. It’s doomed to fail.
Data flow contents
Looking at a data flow, I always want to know two things: what data is included in the rows that the child operator returns to its parent, and where does this data come from. Let’s start by showing what data is included. This is easy to find if you know where to look. Let’s look at an example. The picture below is the property window for one of the two Index Scan operators (the bottom-most one, to be precise). There are a lot of interesting properties here that provide additional insight into what this operator does, but for the purpose of understanding the data it returns to the Compute Scalar operator that calls it we need to look only at the Output List property. I have expanded it in the window and highlighted it in the screenshot.
Every operator in every execution plan has an Output List property. This property lists all the columns that the operator returns. In this case, you see that three columns are returned: the BusinessEntityID, FirstName, and LastName columns, all from table Person.Person in database AdventureWorks2017. This is important information. I know from seeing the Index Scan operator and the table name Person below it in the graphical representation that this operator is reading and returning data from the Person table. But by looking at the properties, I now know that it is not passing the entire row as it is stored in the table, or even in the index. No, the operator picks only three columns from however many are available, and only passes those values. The rest of the data available in the index is, apparently, not needed for this execution plan.
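In the plan XML, the Output List property appears as an OutputList element containing one ColumnReference element per column. As a rough sketch (the fragment below is hand-written and trimmed; real ColumnReference elements also carry Database, Schema, and Table attributes, and live in the showplan namespace):

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sketch of how an Output List appears in plan XML.
fragment = """
<RelOp PhysicalOp="Index Scan">
  <OutputList>
    <ColumnReference Column="BusinessEntityID"/>
    <ColumnReference Column="FirstName"/>
    <ColumnReference Column="LastName"/>
  </OutputList>
</RelOp>
"""

op = ET.fromstring(fragment)
output_list = [c.get("Column") for c in op.find("OutputList")]
print(output_list)  # the only columns this operator passes to its parent
```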
The origin of data
As mentioned before, I do not only want to know what data is passed in each data flow. I also want to know where all that data comes from. In the case of the Index Scan above, this is pretty obvious. The operator doesn’t call any child operators and it does read from an index on the Person table, so obviously all three returned columns hold data that is read from the Person table (or rather its index). That is also pretty easy to conclude from just looking at the column names. Things get more interesting in some other operators. The picture below represents the properties of the Compute Scalar operator, which requests rows from the aforementioned Index Scan and returns rows to the Hash Match operator.
As you can see from the Output List property, this operator also returns rows with three columns each. But they are not the same columns as what Index Scan returns. Two columns are familiar: BusinessEntityID and LastName (for brevity I omit the full qualification of the column names here). The FirstName column that was passed into this operator is not passed on, but we do see a new column which is named Expr1003. These are the three columns that Compute Scalar passes to Hash Match. The first two columns originate from Index Scan and are simply passed unchanged. But Expr1003 is new. It does not come from Index Scan so it has to be created by Compute Scalar itself. This is confirmed by looking at another property: Defined Values. Unfortunately, SQL Server does not use this property as consistently as I would like. There are operators that include columns in a Defined Values property that are not computed in the operator but taken from somewhere else. There are operators that include columns in the Defined Values but do not bother to include a definition (usually because the definition is obvious if you understand the operator, but it would still be nice for consistency – and for those who do not understand the operator yet – to include the definition anyway). And there are operators that compute new columns and add them in the Output List but do not include a Defined Values property at all.
I have already selected this property in the screenshot above, to show that for properties with a long text as their content SSMS will show an ellipsis button you can click to open a new window with the full content. When I click that button, I see this window open:
With all the parentheses and brackets, this can be hard to parse. I sometimes copy & paste it in an editor and remove brackets and parentheses until I see the light. In this case, though, all I need is some patience and perseverance until I recognize this as an overly complicated representation of the expression for the FullName column in my query: p.LastName + N', ' + p.FirstName .
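If you do this often, the bracket-stripping can be automated with a few lines of code. The sketch below is a rough heuristic, not a real parser, applied to made-up text in the style of a Defined Values expression:

```python
import re

def simplify(expr):
    """Strip square brackets, and parentheses that wrap a single term,
    to make showplan expression text easier to read.
    A rough heuristic, not a real T-SQL parser."""
    expr = expr.replace("[", "").replace("]", "")
    # Remove parentheses around a single identifier, e.g. (Expr1003)
    expr = re.sub(r"\((\w[\w.]*)\)", r"\1", expr)
    return expr

# Made-up examples in the style of Defined Values expression text:
print(simplify("[p].[LastName]+N', '+[p].[FirstName]"))  # p.LastName+N', '+p.FirstName
print(simplify("([Expr1003])"))                          # Expr1003
```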
The optimizer could of course have called this column FullName already at this stage in the execution plan. But that is not how the optimizer works. Whenever a new column is introduced that is not directly taken from a table or index, it is given an artificial name. These names are always made up of a mnemonic prefix (“Expr”, for “expression”, is the most common but not the only prefix you will see) and a four-digit number. Numbering always starts at 1001, but sometimes numbers are assigned to expressions in intermediate phases of optimization and then later removed. Don’t despair if you see Expr1001 and Expr1003 but no Expr1002. This is normal; you are not overlooking anything.
Now I know everything I need to understand what is going on in this section of the execution plan. The Index Scan delivers rows to Compute Scalar with three columns. The BusinessEntityID column is not used by Compute Scalar, but passed unchanged, so I guess this column is needed elsewhere in the execution plan. The FirstName and LastName columns are both used in the expression to compute the Expr1003 column that matches my FullName column in the results. LastName is then also passed to the Hash Match operator so this column is apparently needed in at least one more location in the execution plan. However, the FirstName column was used for this computation only; it is not needed anymore so it is removed from the rows that are being passed between the operators.
Do not fall for the idea that a column named Exprnnnn in the Output List property of an operator is always generated by that operator. That is not the case. A column with such a name is computed somewhere, but not necessarily in that same operator. For instance, we have already seen that the Hash Match operator receives rows that include the Expr1003 column from Compute Scalar; the Output List of that operator shows that Expr1003 is passed on, unchanged, to Sort. And Sort then passes it to SELECT, the gateway to the client. Which makes sense, of course: we already established that Expr1003 corresponds directly to one of the expressions in the SELECT of our query, so of course all operators have to pass this column to their parent in order for the data to reach the client.
The intention of this post was to highlight some of the basic things you need to look at when you want to read an execution plan. I have not explained all the details of the specific execution plan; that was not the intention. It only serves as an example to highlight how elements in an execution plan interact.
The key takeaways for this post are:
- The icons in an execution plan represent operators. Each operator is a miniature program that performs a very specific task. The properties of each operator expose a wealth of information on what exactly the operator does. It is almost always impossible to understand an execution plan without looking at the properties of the operators.
- The arrows in an execution plan represent data flows. Data moves from one operator (the child operator) to another operator (the parent operator) in the direction of the arrow. However, be aware that data is not constantly and autonomously streaming. The actual process is that the parent operator calls its child operator when it needs a row, and only then will the child operator do what is needed to produce and return a row. It is possible in some execution plans for an operator to never be called; in such cases that operator does no work at all.
- In order to understand how the operators in an execution plan interact, it is insufficient to look at which operators are included in the plan, and how they call each other (or in other words, how data flows through the operators). You will also need to look at exactly which columns are used in each data flow, and where these columns come from. Each operator exposes all the columns it passes to its parent in the Output List property. If you encounter columns in that list that are not included in the Output List of the operator’s children, then those columns must be computed by the operator itself; in most cases you can find details on how they are computed in the Defined Values property.
This post may have been a bit basic for some of my readers. But for others, it might be just what they needed to make more sense of my other posts. However, it is not always easy to judge whether an explanation is spot-on, too basic, or too advanced. So here is what you can do. Use the form below to leave a reply. If you are still struggling with the concepts, tell me where you are stuck and I will try to help. Do not hesitate for fear of being laughed at. I can guarantee you that I will not ever laugh at someone for asking basic questions. I am very much aware, and so should you be, that I once started knowing nothing. I have gone through phases where I had nobody to ask questions to, and I still shudder when I think of the things I did in those days. I also had periods where I did have the option to ask questions, and learn. The only reason that I am where I am today in my career is that I never hesitated to ask questions, but also never took the answers for granted. The best way to learn is to ask a question, get several (different is ideal!) answers, then set up some sample tables in your playground database and run some code to verify whether the answers are correct. And if they are, then change the code, observe the effect of the change, try to understand how and why the change had that effect, and ask more questions if you can’t.
The first step for all learning is to ask questions!