In my previous post, I demonstrated how the presence of a filtered index can get in the way of successful parameterization, possibly resulting in too many query compilations and procedure cache bloat. I suggest reading that post first, because I will pick up right where I left off.
Use the Force, Luke
If you read the previous post very carefully, you will have noticed that I wrote that if a plan is unstable, it is “unsafe for simple parameterization” – which suggests that SQL Server will not have such inhibitions when using forced parameterization. Let’s have a look. Assuming you still have the indexes I created in the previous blog post, just execute the script below. It changes the parameterization setting for AdventureWorks2012 to forced, clears out the plan cache (remember not to do this on a production system!), executes the query that would not successfully parameterize when the database was set to simple parameterization, and inspects the plan cache. You will see that this time, the query was indeed parameterized. If you also add code to watch the “Unsafe Auto-Params/sec” counter, you will see no change in its value.
USE AdventureWorks2012;
go
ALTER DATABASE AdventureWorks2012
SET PARAMETERIZATION FORCED;
go
DBCC FREEPROCCACHE;
go
SELECT SUM(OrderQty)
FROM Sales.SalesOrderDetail
WHERE ProductID = 706;
go
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT cp.objtype AS ObjectType,
cp.usecounts AS UseCount,
st.[text] AS QueryText,
cp.plan_handle AS PlanHandle,
stmt.value('@ParameterizedPlanHandle',
'varchar(64)') AS ParameterizedPlanHandle,
qp.query_plan AS [Execution Plan]
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY qp.query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS batch(stmt)
WHERE qp.dbid = DB_ID()
AND st.[text] NOT LIKE '%ExcludeMe%';
go
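If you want to watch the “Unsafe Auto-Params/sec” counter from T-SQL, a minimal sketch (the counter is exposed through sys.dm_os_performance_counters; like most of these counters it is cumulative since server start, so take a snapshot before and after the test query and compare):

```sql
-- Snapshot the unsafe auto-parameterization counter; run this before and
-- after the test query and compare the two cntr_value snapshots.
SELECT [object_name], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Unsafe Auto-Params/sec';
```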
At first sight, this may appear to be a very simple and elegant solution for the problem. Just set all databases to forced parameterization and then sit back and relax, waiting for the inevitable flood of happy emails from end users who wish to thank you for the tremendous performance boost you just gave them.
But wait. When has solving a problem in SQL Server ever been this simple? There must be a catch, right? But where?
A new use case
In order to see the problem, I will use a different database – so let’s first clean up the mess we made in AdventureWorks2012 and restore it to its original state, so that other bloggers can use it for their demos without my stuff getting in the way:
ALTER DATABASE AdventureWorks2012
SET PARAMETERIZATION SIMPLE;
DROP INDEX ix_ProductID_Over_1000 ON Sales.SalesOrderDetail;
DROP INDEX ix_ProductID_Incl_OrderQty ON Sales.SalesOrderDetail;
A very common use case of filtered indexes is in combination with “soft deletes” – data that is no longer valid is not physically removed from the table, but kept (often to maintain a history of events). A special column, usually called “is_deleted”, is used to mark this information as historic. Since the majority of queries are interested only in current data, you will find the predicate “is_deleted = 0” in almost every query – which makes this an ideal candidate for a filtered index. And since we are now under the impression that filtered indexes really require forced parameterization, we will change that setting right away. Here is the script to create this scenario in a sample database (it may take some time to run!):
USE tempdb;
go
IF EXISTS (SELECT * FROM sys.databases WHERE name = 'FilterDemo')
BEGIN;
ALTER DATABASE FilterDemo SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DROP DATABASE FilterDemo;
END;
CREATE DATABASE FilterDemo;
ALTER DATABASE FilterDemo SET PARAMETERIZATION FORCED;
go
USE FilterDemo;
go
CREATE TABLE dbo.AllThings
(ThingID int NOT NULL,
ThingName varchar(20) NOT NULL,
is_deleted bit NOT NULL DEFAULT (0),
LotsOfStuff char(2000) DEFAULT ('Placeholder'),
CONSTRAINT PK_AllThings
PRIMARY KEY (ThingID)
);
DECLARE @i int = 0;
WHILE @i < 5000
BEGIN;
SET @i += 1;
INSERT dbo.AllThings
(ThingID,
ThingName,
is_deleted)
VALUES (@i,
CHAR(65 + RAND() * 26) + CHAR(65 + RAND() * 26) + CHAR(65 + RAND() * 26), -- Random name
CASE WHEN RAND() < 0.99 THEN 1 ELSE 0 END) -- Most things are deleted
END;
CREATE INDEX ix_Name_NotDeleted
ON dbo.AllThings (ThingName)
INCLUDE (is_deleted) -- Always include the columns you filter on!
WHERE is_deleted = 0;
go
Once the script is done, let’s get a quick count of the number of things that have not been deleted. I used a formula that results in 99% of the things being deleted, but there is a random factor involved. The actual number of non-deleted things should be close to 50. It was 58 on my system, which I found by running this query:
SET STATISTICS IO ON;
go
SELECT COUNT(*)
FROM dbo.AllThings
WHERE is_deleted = 0;
But if you switch to the messages tab, you will see a very disturbing number. I expect SQL Server to execute this query by simply scanning the filtered index, as this index contains exactly the rows it needs to count. But the output from STATISTICS IO shows that a total of 1,256 logical reads were made. Over twelve hundred reads to count just 58 rows? How is that possible? Let’s take a look at the execution plan to find out:
As you see, the filtered index is not used at all; the optimizer chose to scan the clustered index instead, wading through all 5,000 “things” in my table to find just those 58 that were not deleted.
The reason for this is simple. I have enabled forced parameterization. So I told SQL Server that, no matter the consequences, it should always replace constant values with parameters. So the plan that the optimizer was forced to compile was not for the query I typed, but for this query instead:
DECLARE @1 int = 0;
SELECT COUNT(*)
FROM dbo.AllThings
WHERE is_deleted = @1;
And the optimizer has to produce a plan that will always return correct results, for any possible value of the parameter. Of course, when this query is executed with parameter value 1, scanning the clustered index is the only possible way to return correct results, so the plan choice that was made was indeed the only possible choice. By enabling forced parameterization, we have effectively prevented the optimizer from using any filtered index at all (except through views).
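The parenthetical about views deserves a short illustration. Forced parameterization does not replace literals that appear inside a view definition, so a view that bakes the filter predicate into its definition keeps the filtered index usable; a minimal sketch (the view name is my own invention):

```sql
-- The literal 0 is part of the view definition, so forced parameterization
-- leaves it alone, and ix_Name_NotDeleted remains usable for this query
-- (ThingID is the clustered key, so the nonclustered index covers it too).
CREATE VIEW dbo.CurrentThings
AS
SELECT ThingID, ThingName
FROM dbo.AllThings
WHERE is_deleted = 0;
go
SELECT COUNT(*) FROM dbo.CurrentThings;
```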
More force?
In a case such as this, where we know that we will always use is_deleted = 0 in the query predicate, it can become very tempting to find ways to convince the optimizer to choose the query plan we want without giving up on the forced parameterization plan. Let’s first see what we can achieve by parameterizing this query ourselves and applying the OPTIMIZE FOR hint:
DECLARE @1 int = 0;
SELECT COUNT(*)
FROM dbo.AllThings
WHERE is_deleted = @1
OPTION (OPTIMIZE FOR (@1 = 0));
Okay, I admit, it was a rather desperate attempt and I didn’t really expect much of it. The OPTIMIZE FOR hint tells the optimizer that I want a plan that gives the best performance for that particular value, but the results still have to be correct for other values. So the optimizer will still consider what would happen if I supply other values, and will reject the filtered index because of that.
So, back to the original query, and now with more force. I know that this query will always perform better when using the filtered index – so if the optimizer fails to see that, I will just force it. Applying an index hint does just that. It may be called a hint, but it is a directive; if you hint an index, it WILL be used, no matter what. So this should help, right?
SELECT COUNT(*)
FROM dbo.AllThings WITH (INDEX = ix_Name_NotDeleted)
WHERE is_deleted = 0;
No, it does not help. The result is an error message (error 8622: the query processor could not produce a query plan because of the hints defined in the query). We have now given the optimizer a totally impossible task. We told it to parameterize the query, no matter what, so it did. It now has to produce an execution plan for the parameterized query, and that plan will be reused with different values. You and I know that the value will never actually be different, but SQL Server does not, and it still has to guarantee correct results. But then we also told SQL Server that it really has to use an index that does not include some of the rows that, for some values of the parameter, may have to be returned. Like I said – an impossible task, and SQL Server responds just like you and I would, by throwing up its hands in despair and giving up.
The simple solution
Luckily, there is a solution to this problem. It’s simple – simple parameterization, to be precise. Just set the parameterization option back to its default setting of simple, and you will get much better behavior.
ALTER DATABASE FilterDemo
SET PARAMETERIZATION SIMPLE;
SET STATISTICS IO ON;
go
SELECT COUNT(*)
FROM dbo.AllThings
WHERE is_deleted = 0;
Now the query takes just two logical reads. And the execution plan looks as expected: a scan of the filtered index, that’s all.
Conclusion
When you look only at the plan cache, forced parameterization may look like manna from heaven. But when you look further, you will see that setting the parameterization option to forced is probably not a good idea at all. It may appear to solve some issues, but you get bigger issues in return – filtered indexes that might boost performance tremendously are ignored, and hinting them can even cause errors. Do you know all the code that is running on your system? Are you sure that none of your application developers has ever added an index hint? Do you want to find out the hard way?
Simple parameterization in combination with filtered indexes may not always play well with the plan cache. But I would think very long and hard, and do a close inspection of all objects in the database and all queries used before even considering switching to forced parameterization.
1 Comment
Hi Hugo,
As an aside, adding OPTION (RECOMPILE) allows the filtered index to be used even where parameterization is forced (there is no possibility of plan reuse).
Paul
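To make Paul’s tip concrete, a sketch of what that would look like: with OPTION (RECOMPILE) the statement is compiled fresh on every execution and the resulting plan is never cached for reuse, so the optimizer can safely use the literal value – and with it, the filtered index – even under forced parameterization.

```sql
SELECT COUNT(*)
FROM dbo.AllThings
WHERE is_deleted = 0
OPTION (RECOMPILE);
```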