Plansplaining part 29 – Introduction to cursor processing

Plansplaining part 29 – Introduction to cursor processing

This is already the 29th part of the plansplaining series, where I look in detail at interesting examples of execution plans in SQL Server. With this part, I’ll kick off another mini-series, this time on cursor processing

Don’t use cursors!

And yes, I know the mantra. Do not use cursors. They are slow. There is (almost) always a faster set-based alternative available. So why would I even waste blog space on cursors, when the only smart thing to do is to rip them out and replace them with a set-based alternative?

Well, there are, in fact, many reasons. The “almost” above suggests that there are still cases where row by row processing is in fact the most efficient method. There are cases where we use a cursor, even though set-based is slower, because we need to call a stored procedure for each row returned, and that stored procedure is too complex to be changed to process an entire set at once. Or, perhaps, you just inherited existing code that uses a cursor, and you need to fix the immediate performance issues now, so you cannot afford the time investment to rewrite the row by row logic to set-based logic.

Cursors are a supported feature in SQL Server. They have there place, albeit a niche place. So it is important to understand how they work, and if you care about performance, then that means understanding their execution plans.

Options, options, options

When you look up the syntax for the DECLARE CURSOR statement, you are met with a staggering amount of options. You might be tempted to close the page immediately and just accept all defaults. Please resist that temptation. Some of the options affect supported functionality, some accept performance, … and the defaults are absolutely not the most effective choices in most cases!

ISO syntax or T-SQL extended syntax

It starts right at the top with a choice between two competing syntaxes. The ISO syntax, as the name suggests, follows the ISO standard. The T-SQL extended syntax offers many more options, which of course means more complexity, but also more control. I personally always prefer to use the T-SQL extended syntax, because of the extra control it offers. I would only use the ISO syntax if I have to write code that targets multiple platforms, and there is insufficient justification to have custom code for this specific code block.

The ISO syntax basically offers a total of eight different cursor types. With or without the INSENSITIVE keyword, with or without the SCROLL keyword, and for either READ ONLY or UPDATE. (The FOR UPDATE OF (column, column, …) option is basically the same as a simple FOR UPDATE, except that you’re not allowed to update any columns not listed here; this is a check that is done when compiling the code, not at run time, so it does not affect the execution plans in any way). However, INSENSITIVE is not compatible with FOR UPDATE, so that leaves six supported combinations. Each of those six combinations is internally mapped to the equivalent set of options in the T-SQL extended syntax, as follows:

ISO syntax Extended T-SQL syntax
INSENSITIVE SCROLL READ ONLY SCROLL STATIC READ_ONLY
INSENSITIVE READ ONLY FORWARD_ONLY STATIC READ_ONLY
SCROLL READ ONLY SCROLL KEYSET READ_ONLY
SCROLL FOR UPDATE SCROLL KEYSET OPTIMISTIC
READ ONLY FORWARD_ONLY FAST_FORWARD READ_ONLY
FOR UPDATE FORWARD_ONLY DYNAMIC OPTIMISTIC

Since every possible combination of options in the ISO syntax maps to a set of options in the extended T-SQL syntax, but the latter also adds more options and combinations, I will only describe the various extended T-SQL options in the rest of this series. Just remember that if you encounter code that uses the ISO syntax, you can check the table above to find the equivalent options in the extended T-SQL syntax.

Before diving into the execution plans for cursors, let’s first quickly walk through all the various options and see what they mean. I will walk through the various options in the same order as they are listed here, but note that these keywords can be specified in any order you like, as long as they are all between the CURSOR and FOR keywords.

LOCAL or GLOBAL

This keyword determines the scope (visibility) of the cursor. LOCAL means it is only visible in the current scope, similar to for example variables or temporary tables. GLOBAL means that it remains visible outside the current scope, similar to global temporary tables. This should be used if you for instance want to declare the cursor in a stored procedure, but then fetch data from it in the calling procedure. If you do not specify either of these keywords, then the default is determined by the CURSOR_DEFAULT database option.

This keyword has no effect on the execution plan, so I will not mention it again in this series.

FORWARD_ONLY or SCROLL

A FORWARD_ONLY cursor can, as the name suggests, only scroll forward through the cursor. You always read the first row first, then the second row, and so on, until the last row has been read. A SCROLL cursor is more flexible, here you can read rows in the cursor in any order, jump from one place to another, etcetera. In practice, this means that s SCROLL cursor allows the full syntax of the FETCH statement to be used, whereas a FORWARD_ONLY cursor allows only FETCH NEXT.

If you specify neither of these options, then the cursor defaults to SCROLL if the cursor is specified as STATIC, KEYSET, or DYNAMIC. In all other cases, the default is FORWARD_ONLY.

STATIC, KEYSET, DYNAMIC, or FAST_FORWARD

This is probably the most important set of options, since it determines the very core of how the cursor will behave.

STATIC means that all subsequent FETCH statements will return data as it was when the cursor was opened. The underlying data is not locked, so concurrent updates are not blocked. But they will not be visible when reading from the cursor.

DYNAMIC is the reverse. Every change that is made to the underlying data, be it an insert, an update, or a delete, will affect the data that is fetched.

KEYSET is the alternative that sits between the prior two. Which rows are included in the cursor will not change after the cursor is opened, but what data they show can still change. In other words, updates are visible, but inserts are not. And when a row that is included in the cursor is deleted from the underlying tables before it is fetched, then the attempt to fetch it will result in a @@FETCH_STATUS of -2.

Finally, FAST_FORWARD is described in the documentation as “a FORWARD_ONLY, READ_ONLY cursor with performance optimizations enabled”. What that means exactly will hopefully become more clear once we look at the execution plans for this type of cursor.

If none of these keywords is supplied, then the cursor type defaults to DYNAMIC.

READ_ONLY, SCROLL_LOCKS, or OPTIMISTIC

These options determine whether it is allowed to update the current cursor row by using the WHERE CURRENT OF cursorname clause in an UPDATE or DELETE statement. Note that this option has no effect at all on regular INSERT, UPDATE, DELETE, and MERGE statements, they can always be used, even within the cursor code, and even when it would affect the same row that WHERE CURRENT OF would specify.

When READ_ONLY is specified, then it is simply not permitted to use UPDATE or DELETE statements with WHERE CURRENT OF. This is the default for STATIC and FAST_FORWARD cursors.

OPTIMISTIC means that modifications with WHERE CURRENT OF are allowed, but SQL Server does not lock the row. When other processes change the underlying data after it is fetched but before it is modified, then the modification attempt will fail. This is the default option for DYNAMIC and KEYSET cursors, unless the SELECT statement in the cursor declaration does not support updates.

Finally, SCROLL_LOCKS also allows modifications, and even guarantees that they will succeed. In order to do so, rows are locked when they are fetched, and those locks are held until the next row is fetched. This does of course impact concurrency.

TYPE_WARNING

This option should, in my opinion, always be specified. At least during development of code.

There are many possible combinations of the previously described options, but not every combination is supported, and the list of supported and unsupported combinations is too long to memorize. Additionally, some combinations might or might not be supported, based on the exact query used, and on indexes that exist on the underlying tables. Some unsupported combinations will result in an error, but many other combinations will make SQL Server simply change some of the options so that it can still execute your code. And this will be done without warning … unless you include the TYPE_WARNING keyword. In that case, SQL Server will tell you it changed the type, by sending this message to the output window:

The created cursor is not of the requested type.

Admittedly, not the most helpful of error messages. It does not tell you what type it did create, nor why. But at least you know that there is a problem, that you can investigate and fix.

FOR UPDATE [ OF column_name [, column_name ] … ]

If you specify this option, the cursor allows updates. This is incompatible with the READ_ONLY option described before; you’ll get an error if you combine them. If you do not specify this option, then the cursor will still allow updates if SCROLL_LOCKS or OPTIMISTIC is specified, but it will be read only when you specify READ_ONLY.

If FOR UPDATE is specified, then the optional column list restricts which columns can be affected by any data modification through the cursor (with the WHERE CURRENT OF clause). This is checked when the code is compiled, and it does not affect the execution plans, so we will not look at this option in the rest of the series.

Note that this option does not in any way prevent you from updating cursor rows, even the row just fetched, through a conventional statement that for instance uses the key values in the WHERE clause. It only affects statements with WHERE CURRENT OF.

Two types of plan

It is (or should be) well known that SQL Server has two types of execution plan: execution plan only, and execution plan with run-time statistics (also, misleadingly, called estimated and actual execution plan). Specifically with cursors, these two types of execution plan expose very different behavior.

To illustrate this, I’m going to use the following sample code. It is not very realistic, typical cursor code uses a loop to process all rows in the cursor, but this works fine for illustration purposes.

DECLARE MyCursor CURSOR 
                 GLOBAL
                 FORWARD_ONLY 
                 STATIC 
                 READ_ONLY 
                 TYPE_WARNING
FOR
SELECT     soh.SalesOrderID,
           soh.OrderDate
FROM       Sales.SalesOrderHeader AS soh
WHERE      soh.SalesOrderID  BETWEEN 69401 AND 69410;

OPEN MyCursor;
FETCH NEXT FROM MyCursor;
FETCH NEXT FROM MyCursor;
CLOSE MyCursor;
DEALLOCATE MyCursor;

Execution plan only

If you request the execution plan only (“estimated execution plan”) for this entire batch, then the results should look like the screenshot below:

The execution plans for the sample cursor code

(Click on the picture to enlarge)

The DECLARE statement is the only statement that results in a “real” execution plan, with operators calling other operators, of the type that does the actual work of retrieving and manipulating data. All other cursor-related statements, OPEN, FETCH, CLOSE, and DEALLOCTE, all result in a “plan” that is just a single icon, the so-called Catchall icon, that is used in execution plans when no specific icon for the operator exists. And if you look at the execution plan XML, you will see that these four, but also the Snapshot icon in the first plan, are each a representation of a <StmtCursor> node in the XML.

Execution plan with run-time statistics

The output changes dramatically if you actually execute the code, with the option to include the execution plan plus run-time statistics enabled. Some statements now have no plan at all. Others do have a plan, but a bit different from what we saw before.

If you follow along, I suggest stepping through the code one statement at a time, so you can very clearly see which statement produces which plan (and which statements produce no plan at all). For this blog, I have executed the entire batch at once, so that I could capture it all in a single screenshot:

(EDIT: Stepping through the code one statement at a time only works with the GLOBAL scope option, so I changed the example above. With LOCAL scope, the cursor gets closed and deallocated as soon as the submitted batch ends).

The execution plans plus run-time statistics for the sample cursor code

(Click on the picture to enlarge)

In this case, there was no execution plan at all for the DECLARE CURSOR statement. The OPEN statement no longer gets a Catchall OPEN CURSOR operator, but a full execution plan, that looks exactly the same as the subtree of the Population Query icon in the first execution plan we saw. And then, for each of the two FETCH statements, we also see a real execution plan, and this one is a copy of what was below and to the right of the Fetch Query icon in the first execution plan! Finally, there is once more no plan at all for the CLOSE and DEALLOCATE statements.

How this fits together

So wrapping back to the execution plan only, what we saw here is that the DECLARE CURSUR statement results in a single execution plan, that is in fact a combination of two plans. One of these plans, the “population query”, is to be used when the cursor is opened; the other plan, the “fetch query”, is used to fetch data from the cursor. These plans are compiled based on the DECLARE CURSOR statement, which is why they appear at that spot in the execution plan only. But DECLARE CURSOR does not execute anything, which is why the execution plan with run-time statistics does not show anything for this statement.

The OPEN statement itself does not need its own plan. It uses the “population query” plan, that was compiled when the cursor was declared. So the execution plan only shows only that the OPEN statement is here, whereas the execution plan plus run-time statistics shows the execution plan that was executed as a result of the OPEN statement.

The FETCH statement works the same, but now, obviously, with the “fetch query” plan. The CLOSE and DEALLOCATE statements never need an execution plan, they are very simple metadata-only operations, which is why they are only represented with a Catchall in the execution plan only, and not at all in the execution plan plus run-time statistics.

In the next parts, we will look at the exact function of the population query and fetch query, how they interact, and how they change as you modify the options to change the cursor type.

Summary

This first post about execution plans for cursors actually hardly touched on execution plans. I wanted to first make sure that everyone has sufficient understanding of the various options you can specify for cursors, since my experience is that many people never look into all those options. Or start looking, and then turn away because of the complexity. I hope my short description made things easier to understand.

I also showed that cursors use a combination of two execution plans, compiled for the DECLARE CURSOR statement, but actually executed for the OPEN and FETCH statements. These are called the population query and the fetch query. These execution plans, and the changes in those plans as you modify the cursor options, is what makes the magic happen. But that will be for the next episode.

As always, please let me know if there is any execution plan related topic you want to see explained in a future plansplaining post and I’ll add it to my to-do list!

Plansplaining
Lots of events
T-SQL Tuesday 174 – Job interview questions

Related Posts

6 Comments. Leave new

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.

Close