Thinking about this topic, I remembered two moments in my career where I really was stuck … and then got unstuck in two very different ways.
The unconscious solution
The first event I want to share took place a long time ago. It was early in my career, my first programming job in fact. My job was to maintain and improve some programs, written in BASIC, for the administration of a laboratory.
I no longer remember what the issue was that had me (figuratively) banging my head against the wall. But I do remember the circumstances. It was a Friday, I was working on a program and I simply did not see any solution to the issue I was facing. I really wanted this done. I was eager to get this program done before the weekend, so that I could start on a fresh assignment on Monday. But I simply did not see any way how to solve my problem.
With hindsight, it is probably a good thing I didn’t own a car yet. If I had, I might have been sitting there, struggling to come up with something, until who knows how late. But now I had to leave the office in time to take the last bus home.
On the bus I was probably still mauling the issue in my head. But once home there was distraction. I cooked a meal, ate, flicked my favorite records on the record player, and played some computer games. I went to visit my parents on Saturday and Sunday. All this time I had so much distraction that I never spent one extra second contemplating the problem I had been facing that Friday.
And then on Monday I went back to the office, asked myself “now where was I?”, then remember that I had been very much stuck the Friday before. Anticipating yet another day of banging my head against, the wall, keyboard, and any other available object, I opened the program that I had been working on and the assignment document to remind me what it was I was about to do. I looked at the code. Looked at the assignment. Looked at the code again. And then I laughed at myself for getting stuck at this, because there was such an obvious, simple, elegant solution to this. I started typing, tested the code, and was done in less than half an hour.
I later realized that my mind apparently just needed to relax. When I was consciously trying to force myself to find a solution, my mind blocked. When I started relaxing and doing other things, I gave my subconscious mind the freedom to ponder the problem, without pressure or stress. I do not know whether the solution was born when I was playing games and listening to music, when I was sleeping, or when I was spending time with my parents. But I do know that it was there, ready for the taking, when I was back in the office.
This taught me an important lesson. Software developers (and probably most other professions too, but I’m not speaking for them) have their own version of writer’s block. If I run into that condition, then it doesn’t help to try to force a solution. It is far more efficient to just put the task to rest, pick up other tasks (or some time off), and let my subconscious mind do the hard work.
But taking time off until a solution presents itself is not a magic bullet that can be always used, as I experienced many years later. I was well into my 40’s and I had switched to database development. At that time I was working on a system to facilitate the implementation of rule-based derivation and verification rules. All rules were entered in a database and then the system would make sure that when data changed, derived data would automatically be updated and verifications were checked.
The problem I was facing was caused by the lack of restrictions. It was perfectly fine for users to specify rules where new information was derived from other derived information. My task was to design and write the code to optimize the order in which derivations were executed.
I had already spent multiple weeks on this, and I was getting nowhere. Whatever I tried, I kept getting incorrect results. Rules that should be evaluated were skipped, or were executed before the input data was derived, or sets of rules started running in endless loops. No matter what I did, I was unable to find the root cause. Whenever I found a problem I would issue a fix, only to get two new problems.
Taking my mind off the problem clearly didn’t work. There had been multiple weekends. I had worked on and completed various other tasks. I still ran into that same brick wall every time I returned to this issue.
Solving this was a lot of hard work. What I ended up doing was ditching the computer and the algorithm and resorting to pen and paper instead (well, actually an MS word document). I came up with various examples of what should be valid input to the algorithm. I tried to come up with the ideal order of events myself. And then I finally realized why I was never able to find a solution to the problem – it is because there was none. Even some fairly basic scenarios already caused loops in the ordering, with no way to automated find even a fairly okay order of processing the loop. The basic problem was in the foundation of the tool we were building. The assumption had always been that it should be possible to build this by tracking data that had changed and then processing the affected columns one at a time, in proper order. After several weeks of painstakingly working through countless examples, I was able to prove that this assumption was wrong.
I must admit that I was close to throwing the towel at that time. But I didn’t. I now knew the problem. I knew we had built on the wrong assumption. So why not change the assumption, build a new foundation, and take it from there?
Because I already had put in so much effort, I knew exactly where the old assumption failed. In this specific system, the choice to base the processing order on “processing one column at a time” was wrong. Instead, the processing should be based on “performing one process at a time”, and now I suddenly had a much easier solution to my original problem. Finding the correct order for processes was easy. Yes, there were still cases where I ended up finding a loop but that was always because the input had a loop (did I already say that there were hardly any restrictions on the input?) – so that was okay.
But the entire foundation of how we tracked the information, how the processing was orchestrated, would have to change. That was a lot of work and I did not even want to start it before making sure that it would actually have the intended results. So I went back to my Word document, described how the architecture should look like, then stepped through the processing for all the examples I had used to show that all of them would result in the correct result, in the most efficient processing order.
In the end, I spent several months (!!) analyzing the problem, pondering solutions, discussing options with my coworker, whiteboarding ideas, rejecting or refining as needed, prototyping, and mainly being very frustrated before I finally had the project rolling again. It is not often that one gets to invest so much time in a problem. It is luckily also not often that one needs this much time to find a solution. Let’s just say that this was a rather unique project.
This example did teach me that relaxing doesn’t solve every issue. Sometimes it just takes effort. Hard work, and lots of time are sometimes the only way to overcome an obstacle. And when that obstacle needs to be overcome, putting in the work is your only option.
For our profession, we need a mix of knowledge, experience, and creativity. Especially the latter can’t be forced. So if you feel that you run into a brick wall and make no progress, it can help to take your mind of the problem for a while.
Go fix another problem. Go home at the end of the day, spend time doing things you like. If time permits, move the issue to next week and go have a great weekend. Sometimes, just taking some time off is sufficient. While we are doing other things, some subconscious areas of our brains are still working on the problem. And these areas often operate far more efficient without the rigid force of needing a solution now.
But not every problem has an easy solution. Sometimes, a problem appears to be incredibly hard because … well, because it IS incredibly hard. At those times, there is no substitute for hard work. We just have to roll up our sleeves, pour an extra cup of coffee and start grinding.
This is part eight of the plansplaining series. Each of these posts takes an execution plan with an interesting pattern, and details exactly how that plan works.
In this post we look at a query that, given the known limitations of the available join operators, appears to be impossible to execute. But it’s a legal query so it should run. And it does. But how?
The query below can be executed in any version of the AdventureWorks sample database. Don’t bother understanding the logic, there is none. It is merely constructed to show how SQL Server handles what appears to be an impossible situation.
If you look at the descriptions of the various join operators in the Execution Plan Reference, you will see that this query poses the optimizer for what appears to be an insolvable problem: none of the join operators can be used for this query!
The Nested Loops algorithm can only be used for inner join, left outer join, left semi join, or left anti semi join. A right outer join or right [anti] semi join in the query can still use Nested Loops if the optimizer can reverse the inputs. But this query calls for a full outer join, which is simply not supported by Nested Loops.
Both the Merge Join and the Hash Match algorithms require at least one equality predicate in the join criteria. This query has just a single join criterium and it is an inequality. So these algorithms are also not eligible for this query. And Adaptive Join (not yet described in the Execution Plan Reference) chooses between the Nested Loops and Hash Match algorithm at runtime, so it suffers from the restrictions of both.
The query is legal syntax and should run. And it does indeed. Let’s look at the execution plan to find out how the optimizer solved this conundrum. (As usual, I have edited the NodeID numbers of each operator into the screenshot for easy reference).
Apparently, trying to make SQL Server do the impossible merely results in making SQL Server do a lot of extra work. Instead of two scans and a single join, as we would expect in most cases of a single two-table join query, we now see four scans, two joins, and some additional extra work. Time to break it down and see what’s going on!
The concatenation operator has a quite simple operation. It has two or more inputs and a single output. All it does it request rows from its first input and pass them unchanged; then if the first input is exhausted repeat the same for the second input; and so on until it has processed all of its inputs. In short, it has the same function in an execution plan that UNION ALL has in a T-SQL query, except that UNION ALL always concatenates the results of two queries, and Concatenation can concatenate as many inputs as needed.
In this case there are exactly two inputs, so we can immediately see that the execution plan consists of two independent sub-trees, and their results are concatenated and then returned to the client.
The top input
The first input to the Concatenation operator consists of three operators: Index Scan #2 which reads columns DepartmentID and Name from the table with alias d1; Clustered Index Scan #3 which reads columns DepartmentID and GroupName from the table with alias d2; and Nested Loops #1 to join them together, using logical join operation Left Outer Join.
I am not going to spend much time on this segment of the execution plan. It does exactly what it appears to do: read the two input tables and produce a left outer join result of the data. If you change FULL OUTER JOIN in the original query to LEFT OUTER JOIN, you will get an execution plan that looks exactly like this segment.
The observing reader might already have a clue at this time what is going on. Just think about the definition of what a full outer join should return – basically, the matching rows joined as normal (same as an inner join), plus the unmatched rows from the first table (with nulls for columns from the second table), plus the unmatched rows from the second table (with nulls for columns from the first table). A left outer join, as implemented here, returns exactly the first two items from this list.
The bottom input
Since the first input for Concatenation #0 already returns two of the three items from the list above, the second input should be expected to return the rest: rows from the second table (d2) that have no match in the first (d1). That sounds like a NOT EXISTS in T-SQL terminology, or a left anti semi join in execution plan terms.
Looking at the second input, you can indeed see elements of this expected left anti semi join, but there are a few extra operators as well. So let’s break this down in its parts.
Clustered Index Scan #6
This operator reads DepartmentID and GroupName from the table with alias d2.(This can be seen by reviewing its properties, specifically the Output List property – I decided to not include screenshots of all property lists in this post, merely the more interesting ones).
No big surprises here.
Nested Loops #5
As you can see in the properties, this operator uses logical operation Left Anti Semi Join. This means that it looks for rows in the second input that match a specific condition; if it finds a match the row from the first input is discarded and the scan of the second input is aborted; if the scan completes without finding a match the row from the first input is passed. This is equivalent to using the NOT EXISTS keyword in a query.
The properties of this operator reveal that there is no Outer References property. This implies that the inner (bottom) input is “static” – it returns the same set of rows for every row from the outer (top) input. For each of these rows the Predicate property is evaluated to determine whether the row is a match of not.
This predicate reads (simplified) d2.DepartmentID > Expr1002. This is similar to the predicate in our query but d1.DepartmentID has been replaced by Expr1002. Why?
Clustered Index Scan #8
As expected, this operator reads the table with alias d1, returning only the DepartmentID column (which makes sense because for this part of the query the Name column from the d1 table is not needed).
The optimizer has set the Ordered property to false, so the storage engine is free to return the rows in any order it sees fit. (The same is the case for all other scan operators in this execution plan but here it is a bit more relevant as we will see later).
Stream Aggregate #7
The properties of this operator (no screenshot shown) reveal that there is no Group By property. As explained on the Stream Aggregate page in the Execution Plan Reference, this means that it functions as a scalar aggregate. It returns a single row with the aggregation result of its entire input.
What type of aggregation is used can be seed in the Defined Values property (see screenshot). As you can see, this is where the Expr1002 column we saw referenced in Nested Loops #5 originates from. Simplified by removing unneeded four-part naming, parentheses, and brackets, the expression reads: “Expr1002 = MIN(d1.DepartmentID)”.
Nested Loops #5 (again)
Now that we have seen both inputs and the properties of the join operator itself we can bring these elements together to understand how it works.
All rows from table d2 are read. For each of these, the bottom input returns a single row with the lowest value of d1.DepartmentID, which is then compared to d2.DepartmentID to determine whether or not the row from d2 should be returned.
Since this is a left anti semi join, with a predicate d2.DepartmentID > d1.Department, the transformation to d2.DepartmentID > MIN(d1.DepartmentID) is indeed correct; it does not change the results in any way. It only affects performance. For the query as written, the bottom input of Nested Loops #5 (which executes 16 times) reads 16 rows, finds the minimum value, and then compares that single value to d2.DepartmentID in the join operator. Without the Stream Aggregate operator, it would instead read the same 16 rows, return all of them to the join operator and compare them there. Apparently, the optimizer still estimates that aggregating the data first is cheaper. (I personally doubt this – after all, without the aggregation the lower input would actually stop executing as soon as a match is found; in this case that would almost always be after reading the very first row. But even in other cases it is reasonable to assume that there is at least a non-zero chance of having to read less rows if you compare each row and stop after finding a match, as opposed to first reading and aggregating all rows and only then doing the comparison).
Compute Scalar #4
The final operator to discuss is the Compute Scalar. Looking at its properties, it is easy to see what it does: it simply adds an extra column to each row returned from Nested Loops #5; this column is called (simplified) d1.Name and is set to null.
This makes sense. The Nested Loops operator returns only a single column, d2.GroupName. But Concatenate expects all input data streams to return the same amount of columns, so the equivalent of the d1.Name column must be added before the stream can flow into the Concatenate.
Bringing it together
As already mentioned above, the Concatenation operator effectively combines the results of two independent queries. The first, shown on top in the execution plan, performs a left outer join between the two inputs. This returns two of the three elements that the query asks for: al matching rows (joined as normal), and all unmatched rows from the first table. The bottom input of the query then finds only the unmatched rows from the second table, which the Concatenation operator then adds to the results.
Effectively, the execution plan can be thought of as implementing this query, which is semantically equivalent to the original query though more complex:
If you execute the query above and look at its execution plan, you will see that this query does indeed have the exact same execution plan as our original query.
It may appear wasteful to do two joins and four scans (plus the extra aggregation and concatenation) instead of a single join and two scans. But as explained in the introduction, I forced the optimizer to find a workaround because I submitted a query that simply could not be executed as a single join between two tables, due to limitations in the available algorithms. So in that sense, this is really the best the optimizer could do.
That being said, there certainly is room for improvement. I already mentioned above that I doubt whether aggregating all rows to do just a single comparison in the Nested Loops operator is actually more effective then returning individual rows – especially because a Nested Loops with logical operation Left Anti Semi Join can stop calling rows from the inner input after the first match is found.
There are two other optimizations possible that in this case were probably not done because of the small table size (16 rows for both input tables), but that I have actually seen in execution plans for similar queries on much larger tables. The first is based on the observation that Nested Loops #5 uses a static inner input. This means that it will be the same for each execution – so why compute it over and over again? Why not add a Table Spool operator so that the actual computation of the MIN(d1.DepartmentID) is done only once, storing the result stored in a worktable which can then produce it again every execution without having to recompute it? And also, since DepartmentID is the clustered index key, why not change the Ordered property on Clustered Index Scan #8 to true and add a Top operator, so that only the first row (which by definition stores the lowest DepartmentID value) is returned?
However, my biggest disappointment is that the optimizer even chose to use a static inner input at all. I really would have expected to see a dynamic inner input on Nested Loops #5. The Outer References property would have been set to d2.DepartmentID. And the entire inner input would have been replaced by a single Clustered Index Seek operator on table d1, with its Seek Predicates property set to d2.DepartmentID > d1.DepartmentID. Each execution of this inner input would then navigate the index to the first matching row and either return nothing (if no match exists) or return the first match – and then not be called again so any other matching rows are not returned.
Long story short, I would have preferred to get the execution plan that it is actually produced for the (equivalent) query below:
All join operators have their specific limitations. Due to these, it is possible to write a query that at first glance appears impossible to run. But as long as it is a syntactically correct query, SQL Server has to be able to run it.
In this plan we looked at the hoops SQL Server has to jump to in one of such cases, where the inequality-based join predicate forces a Nested Loops join, but the query ask for a join type that is not supported by Nested Loops. We saw that in this case the query effectively gets rewritten to an equivalent but more complex query that can use supported operators.
I am still open for suggestions on topics to cover in this series. If you see an unusual and interesting pattern in an execution plan, let me know so I can consider using it in a future post!
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.