In my previous post, I explained the basic of parameter sniffing, and then built on that to explain the lesser known mechanics of variable sniffing and cardinality sniffing. However, I also sneakily inserted a few comments on misconceptions about the performance impact of parameter sniffing, with the promise to explain in more detail later. Well … it is later now, so here is the promised explanation!
Parameter sniffing’s bad rep
Lots of people in the SQL Server field will claim that parameter sniffing is a bad feature. That is not the case. Just think about it for a moment – the feature has been in the product since at least SQL Server 2005, probably even earlier. If it were truly a bad feature, then surely just removing the feature would have been a very simple way to improve the product. But Microsoft has done no such thing, and whatever horror stories on Microsoft you believe, they have not become one of the world’s biggest companies by foolishly refusing to take opportunities to improve their software for almost no investment.
However, there is no denying that parameter sniffing can cause terrible performance issues. I will explain by using a slightly modified version of the example that my good friend Grant Fritchey likes to use. (Note that I used the AdventureWorks2012 sample database, but other versions of AdventureWorks should expose the same behaviour).
CREATE INDEX ix_Addess_City
ON Person.[Address] (City);
CREATE PROC dbo.GetAddressByCity
sp.[Name] AS StateProvinceName,
FROM Person.[Address] AS a
INNER JOIN Person.StateProvince AS sp
ON sp.StateProvinceID = a.StateProvinceID
WHERE a.City = @City;
EXEC dbo.GetAddressByCity @City = ‘Mentor’;
SET STATISTICS IO ON;
EXEC dbo.GetAddressByCity @City = ‘London’;
SET STATISTICS IO OFF;
DROP PROC dbo.GetAddressByCity;
DROP INDEX ix_Addess_City ON Person.[Address];
Since the tables used in this example are fairly small, you will not actually notice any slowness. But if you look at the output generated by the SET STATISTICS IO option, you should see that there actually is a serious issue with this query. The second execution of the stored procedure takes 871 logical reads on the Address table, and 868 logical reads on the StateProvince table – insane numbers when you realize that the entire Address table is stored in just 216 pages, and StateProvince is even just 3 pages in size!
If you look at the execution plan, you will start to see why this is the case:
An index seek is used to find Address rows matching the supplied parameter, which are then retrieved in full by the index seek, and then a clustered index seek in StateProvince is used to do the join. This type of plan is very good when the filter is expected to be very selective, which is the case for Mentor (with just one matching row), the value that was sniffed when the plan was compiled during the first call. For the second call, the same plan was then reused, but because the filter on London (with 435 matches) is far less selective the plan is now actually quite bad.
The biggest issue with such cases of bad performance caused by parameter sniffing is that they tend to be hard to troubleshoot. The performance of the GetAddressByCity stored procedure depends on the value that happens to be passed in on the first execution after the previous plan cache entry was invalidated or removed. But most of the common reasons for plan invalidation are beyond our control, so to the user of the system it can appear that the performance of this procedure changes at “random” moments. And obviously, once the DBA is ready to respond to the ticket a new recompile might already have occurred and the performance might be find again.
Another very common cause for bad parameter sniffing is the use of a single generic procedure to return rows based on a series of parameters that are all considered optional for the search. A simple example would be a procedure dbo.GetAddressByCityOrPostalCode that uses two parameters @City and @PostalCode, and that has a query with a WHERE clause such as “WHERE (City = @City OR @City IS NULL) AND (PostalCode = @ PostalCode OR @ PostalCode IS NULL)”. When this stored procedure is first called with @ PostalCode = ‘W10 6BL’ and @City = NULL, a plan will be compiled and stored in plan cache that is optimal for this combination of parameters. If a later execution uses @PostalCode = NULL and @City = ‘Mentor’, then the results returned will still be correct but the performance will likely be bad, because the query is still executed using a plan that is optimized for a search by PostalCode only. For instance, it could use a plan that uses an Index Seek in an index on PostalCode to find ows matching the second part of the WHERE (which would find all rows on the second execution), then use a Key Lookup to find the rest of the data and then filter down on City = ‘Mentor’. For this type of stored procedure you really want multiple plans stored and the one that is best for the current parameters used, but that’s just not how SQL Server works.
Bad parameter sniffing in a production system can be hard to troubleshoot. The problem only materializes when a procedure that is susceptible to bad parameter sniffing is recompiled, and the parameter values sniffed during that recompile happen to cause a plan that is bad for many other values. This is especially rare when the issue is caused by skewed data distribution, because the really bad performance tends to be caused by a recompile for a rare outlier value, and these are not often used anyway. However, when it does hit you, it often hits you hard. When I see a database application that usually performs well but that at unexpected and unreproducible moments suddenly starts to perform bad (and then sometimes reverts to good performance later, again for no apparent reason), I always have bad parameter sniffing high on my list of likely root causes. The seemingly random switch from good to bad performance can be due to a recompile with a value that causes a bad plan. These occur at semi random moments because there are lots of reasons that can trigger a recompile (metadata change, (auto) update of statistics, memory pressure), and some of them are beyond our control.
The fanboys (and girls) are wrong
Many of the experts in the SQL Server community try to combat the misconception that parameter sniffing is bad. But they unfortunately use the wrong arguments. They say that yes, there are cases of “bad” parameter sniffing that you will have to work around (which is indeed correct), but they then also claim that apart from these exceptions parameter sniffing is “usually” or even “always” good. I have heard and read such claims from some very smart and experienced people that I personally hold in high esteem, such as Brent Ozar, Grant Fritchey, Gail Shaw, and many others. And I myself have also held and spread this belief for a long time. Until recently when I was trying to find a few examples to actually illustrate the benefit of parameter sniffing – and failed to find any!
I then realized that from a parameter sniffing point of view, there are just two types of queries: those that are affected by parameter sniffing and those that are not. A lot of queries fall in the “not affected” category. I already gave some examples above:
1. If a query filters on the primary key, the estimated rowcount when based on a sniffed value will be 1, but the estimated rowcount for the same query without sniffing (eg when using it as an ad hoc query with a variable) will also be 1 – so we’d get the same plan without sniffing as we do with sniffing.
2. If a query filters on a key with a fairly even data distribution, for instance with every value occurring somewhere between 90 and 110 times, then the estimate for a sniffed value can be any value between 90 and 110, and the estimate that we would get without sniffing, based on the average number of rows per value, would probably be around 100. It is extremely unlikely that such small changes in the estimate would cause a big difference in the execution plan. And in the rare cases where they do have an effect, see the below discussion on queries that are affected.
The only way for a query to fall in the “affected” category is when the column has a skewed data distribution. (Or, rare, to have values at a frequency around the threshold value between two alternative plans). In that case, there will be at least two values (A and B) for which a different plan is optimal. And here’s the rub. When the value A is sniffed, the plan is cached, and the procedure is then executed with B, the cached plan will be sub-optimal for that value. And vice versa. So the parameter sniffing in this case will always be bad. (You might argue that it is possible that the plan for A and the plan for B might both be better than the plan for the generic average rows per value; I’d have to agree that in theory this might be possible but I have never ran into, nor been able to construct, such a query).
Good parameter sniffing
Now that I have shown how parameter sniffing is usually irrelevant and sometimes bad, you might wonder why the feature exists at all. The answer is that there are some query patterns where parameter sniffing does actually help – and one of them is common enough that I would never recommend removing this feature from the product.
Lots of tables in lots of databases collect data over time. Orders keep being added in sales systems, banks keep processing transactions, and sensor data continues to poor into a table of readings. Those tables almost always have a column that timestamps the row (e.g. OrderDate, TransactionDate, MeasurementDate). And you will often find stored procedures that allow the user to select a recent subset of data by passing in a @ThresholdDate parameter – so that the results can be limited to e.g. the last hour, the last week, or the last 15 days. So the query in the stored procedure will filter on for instance WHERE OrderDate >= @ThresholdDate.
This is the type of stored procedure that will usually benefit hugely from parameter sniffing. The predicate uses inequality. Without sniffing the value, that type of filter is one of the hardest to get right for the cardinality estimator – the value can be November 3rd, 1848, or the Year 2525, or anything in between – and so the number of matching rows can be anything from the entire table to nothing. The cardinality estimator simply uses an estimate of 30% matching rows for this case. If the stored procedure is typically called to fetch just the last few days of data out of a table containing eight years of history, then that estimate is going to be horribly wrong.
This is one case where parameter sniffing really shines. It does not matter whether the sniffed value will be for a one-hour period, a one-day period, or a two-week period; in all these cases the plan based on that value will probably be the same, and definitely better than the plan you’d get for a 30% estimate. And if the cached plan is for two weeks and you next run the query for one hour, you’d still get the better performance. This single case is so common, and can have such huge impact on performance, that this use case on itself is already sufficient for me to defend the process of parameter sniffing – in spite of the occasions where it hurts, and the many cases where it’s irrelevant.
Another case of good parameter sniffing, far more esoteric, is if you have just the right combination of both a skewed data distribution and a skewed query pattern. Let’s return to Grant Fritchey’s example based on cities. This data distribution is skewed because there simply are far more people living in a city such as New York or Tokyo then there are in small villages such as Hum or Adamstown, so a stored procedure that filters on City tends to be subject to bad parameter sniffing. But what if you are working for a university and the researchers are working on a project involving only the inhabitants of very small villages? In that case, when the optimizer sniffs a value it will always be a small village and only a few rows will be returned, and again the plan based on any possible sniffed value is always going to be better for every value that is typically based, then a non-sniffed plan would be.
When good turns bad
You have to be aware, though, that good parameter sniffing, bad parameter sniffing, and irrelevant parameter sniffing, are all the same mechanism. The parameter sniffing remains the same, it’s the circumstances that make it go good or bad. And that’s something to keep in mind especially with code that relies on good parameter sniffing.
Let’s look once more at the Orders table with eight years of history, that is only ever queried for the most recent day, week or two week period. What happens if, one day, someone does actually submit a report to do some trend analysis on the last 5 years? The best and most likely result, in this case, would be that the cached plan, based on a previously sniffed parameter for a short period, is used. In that case, the report would be unbearably slow – the plan for the short period probably involves Nested Loop joins and Index Seeks, and running a few billion rows through that can take hours, or even days. The submitter of that query would suffer from bad parameter sniffing. But a worse scenario is also possible. It the previous plan has just been invalidated, a new plan will be created based on a sniffed value of five years ago, and now the plan would probably change to use scans. All other calls to the stored procedure will now start to use this plan as well, so that all those requests for short-term periods now run longer. The increased number of table scans will also start to affect overall system performance. The buffer cache fills up whenever the table is scanned so all other activity on the system has to reread from disk instead of run from cache. In short, the overall performance of the system as a whole would slow down, and it might take a long time to find the actual cause – and all that time, you would be losing sales because the web shop is now too slow for your impatient customers.
Even in a “good parameter sniffing” scenario, you should still be aware of the potential of bad parameter sniffing occurring, and take appropriate action to try to prevent it.
Dealing with bad parameter sniffing
If parameter sniffing is sometimes good, sometimes bad, and most often irrelevant, then the obvious question is: “how can I get the good without having to accept the bad as well”? And the good news is that this is possible. There are a lot of ways that allow you to deal with parameter sniffing, and I will briefly describe the options. (Note that most of them are described in much more detail elsewhere on the internet).
The most brutal way of preventing bad parameter sniffing is to disable it completely. I do not like this method. To me, it sounds like torching down your house to prevent it from being burgled. But if you insist, you can: activating trace flag 4136 will disable all parameter sniffing (good, bad, and irrelevant) on the entire instance. And yes, it is documented and supported.
A popular technique that is mostly found in old articles is to use local variables – this used to be one of the best options before SQL Server 2008. Declare a variable in the stored procedure for each parameter, assign it the value of the parameter, and then use that variable instead of the parameter in the rest of the code. The result is that SQL Server has to optimize the queries based on the variable, which it cannot sniff, so you always get a plan based on the generic data distribution rather than any sniffed value. I recently saw a recommendation somewhere to always do this for every parameter in every stored procedure – which has the same effect as using trace flag 4136, but with a lot more work. Seeing that recommendation was one of the things that triggered me to write this post. If you still need to support SQL Server 2005 or older, and if you only use it for specific parameters in specific stored procedures that are known or expected to get bad parameter sniffing, then I guess that using the local variable method can be okay. Using it in newer versions, or using it as a generic guideline for all parameters and stored procedure, is not correct.
Since SQL Server 2008, we can use OPTION (OPTIMIZE FOR (@parameter UNKNOWN)) to force SQL Server to disregard any sniffed value and optimize the query as if the value for @parameter is not known. This has the same effect as using a local variable, except it does away with the extra overhead and you can limit it to just a single query in a multi-statement stored procedure. For that reason, I prefer this method over local variables. This method works very well in situations where the plan based on generic statistics is good enough for all possible cases. These tend to be cases where irrelevant and bad parameter sniffing can occur. In situations with good parameter sniffing, the generic plan you get with this option tends to be not good enough.
Already since SQL Server 2005, you could use OPTION (OPTIMIZE FOR (@parameter_name = ‘value’)) to force SQL Server to optimize as if the value provided was sniffed. The risk of this method is that people tend to forget to maintain the value in the hint. For example in the case of retrieving the latest rows from the Orders table, this hint with a value of October 20, 2016 might work very well at this time – but if I forget to regularly update the code with a newer value, then that same hint will actually start to hurt performance in a terrible way after a few years, a year, or perhaps already after a few months. (And unfortunately, I have seen code in similar situations where a hint like this, for a datetime column, had not been changed for over three years!)
Perhaps the best solution, depending on scenario, is to use multiple stored procedures, with one master to determine which one to execute. This can work for scenarios such as the Orders table (with one procedure to search in only recent data and another to search for longer time periods, and a master calling the right one), or for generic search procedures with a limited number of possibilities (with two optional search arguments you have four possible combinations and you can create four procedures; with five parameters you have 32 possible combinations and you do not want to create that many almost similar procedures – but you could check how many combinations are actually really used). It will not work in situations with skewed data, unless you happen to know all the outlier values.
Another possible solution that works especially well for the optional search case is to use OPTION (RECOMPILE). As shown in my previous post, this enables SQL Server to sniff the variable. And since this is a different process (because the optimizer does not have to cater for later reuse of the same plan with different values), this can give you tremendous benefits, even up to the optimizer removing entire joins from the plan if they are not needed for the current set of values. But there is a price to be paid: every time the stored procedure executes the query has to be compiled, which is a relatively expensive process. Depending on how often the procedure executes and how complex the query is, just the CPU and memory cost of the recompilations could cause serious issues to the overall performance of your server. My recommendation for this hint is to definitely use it where it makes sense, and equally definitely not use it anywhere else.
And finally, specifically for the procedure with optional search arguments, you could change the stored procedure to use dynamic SQL. This can be very dangerous when not done appropriately, but if you really know what you are doing it can be a viable alternative. You would have to construct a query string that contains only the predicates (and preferably also only the joins) that are needed based on the parameters passed it. You should not hardcode the parameter values themselves in the query string, nor allow any other form of user input to find its way in the parameter string. So the conditions in the dynamically generated query string would still be of the form WHERE Column1 = @Param1 AND Column3 = @Param3 (if only @Param1 and @Param3 have a value), and you should then use sp_executesql to execute the SQL and pass in the correct parameter values. For the parameter sniffing process, using sp_executesql is exactly the same as calling a stored procedure, but now with the query string instead of the procedure name as the “owner” of the plan. The values will be sniffed, a plan will be created and stored in cache, and the plan will be reused if the same query string is executed again – which in this case is okay, because that means the same combination of set and not set search parameters was used. For a stored procedure with five optional parameters, this might eventually result in all 32 possible permutations of the query being in the plan cache with a plan that is optimal for that permutation. But without you having to write all 32 possible versions of the query. However, be aware of the danger of SQL injection, and be aware that there may be issues with permission settings if you use dynamic SQL. I recommend Erland Sommmarskog’s excellent articles on dynamic search conditions and on using dynamic SQL if you consider using this method in your production code.
How about the other sniffings?
In my previous post, I described three forms of sniffing. In this post on good, bad, or irrelevant, I have so far only focused on parameter sniffing (which includes, as briefly mentioned above, the use of sp_executesql). So what about variable sniffing and cardinality sniffing?
To start with the latter: since cardinality sniffing is really just a special case of parameter sniffing, all I wrote above applies equally to cardinality sniffing. It can be good (in some cases), it can be bad (in other cases), and it will in most cases probably have no impact. When it is bad, then OPTION (RECOMPILE) is probably going to be your only useful instrument, and if that causes too much compilation overhead you might have to rearchitect your solution to avoid this problem. If you are in charge of a database that uses table-valued parameters, and after reading this post you realize that you are benefiting or suffering from cardinality sniffing, then I’d love to hear from you in the comments section!
For variable sniffing, none of the above applies. Since variable sniffing only occurs when you use OPTION (RECOMPILE), the plan will always be optimized for the current values, and it will never be reused. There is no possibility at all of bad variable sniffing happening, since the whole concept of bad sniffing is that a plan created for one value is reused for another value, which simply never happens with variable sniffing. In other words, variable sniffing will only ever be irrelevant or good, never bad – but it will also always be associated with the cost of OPTION (RECOMPILE), since that is the only condition that triggers it.
In the previous post, I described three forms of sniffing: the relatively well-known parameter sniffing, the lesser known variable sniffing, and the hitherto undocumented cardinality sniffing. In this post I tried to prove that both the doomsayers who claim that parameter sniffing is the Absolute Evil (TM), and the fanboys who appear to be in total love with parameter sniffing, are incorrect.
Parameter sniffing is actually usually totally irrelevant. But it still has a reason to exist, because in at least one of the cases where it does make a difference, it can provide a huge performance benefit. That does mean, though, that we will also have to accept a few cases where it can hurt.
Because the pain caused by bad parameter sniffing can be immense, I also walked through some of the more common fixes for these problems, describing their benefit and cost and explaining when you would or would not use them. I then also explained how bad, good, and irrelevant sniffing applies to variable and cardinality sniffing.
I hope that these posts help you understand what sniffing is, how it can help or hurt, and how you can fix the latter without losing the former.