The Problem
Recently our development team added a new stored procedure, and rightly suggested we add an index to support one section of that procedure. The column to be indexed is a bit, and it is NULL or false 99.94% of the time (in an 18M row table). I felt a filtered index would be best.
On a non-prod box I added the new filtered index, and then tested the related snippet from the stored procedure to verify that I had the filtered index defined correctly. In this article we can use AdventureWorks2012 to demonstrate all of this, using the SalesOrderDetail table and integer column SpecialOfferID (note: you may not see the same query plans as I do until later):
--add filtered index CREATE INDEX IX_SalesOrderDetail_SpecialOfferID_9 ON Sales.SalesOrderDetail(ProductID) INCLUDE (SalesOrderID, SalesOrderDetailID) WHERE SpecialOfferID = 9; --query snippet SELECT SalesOrderID, SalesOrderDetailID FROM AdventureWorks2012.Sales.SalesOrderDetail WHERE SpecialOfferID = 9;
When I checked the query plan, the snippet did not use my new filtered index, it used a clustered index scan!
I tried various perturbations of the CREATE INDEX statement, no change. I read many articles/posts (mostly by Paul White it seems) regarding filtered indexes, problems, bugs, workarounds...no luck.
Almost by accident I tried this:
DECLARE @Results TABLE ( SalesOrderID INT NOT NULL, SalesOrderDetailID INT NOT NULL ); INSERT @Results (SalesOrderID, SalesOrderDetailID) SELECT SalesOrderID, SalesOrderDetailID FROM AdventureWorks2012.Sales.SalesOrderDetail WHERE SpecialOfferID = 9;
and the query plan showed the filtered index was used! Perplexing.
I tried same with a #temp table but this just yielded another clustered index scan. I tried forcing the filtered index with a hint and was rewarded with an error:
SELECT SalesOrderID, SalesOrderDetailID FROM AdventureWorks2012.Sales.SalesOrderDetail WITH (INDEX(IX_SalesOrderDetail_SpecialOfferID_9)) WHERE SpecialOfferID = 9; Msg 8622, Level 16, State 1, Line 2 Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.
Search for Root Causes
At this point I figured I would reach out to SQLServerCentral community and ask for help. I scripted out a table with mocked-up data that I could use to repro this. But I could not repro the problem! I tried making a higher fidelity mockup of the table but still could not repro the problem.
I asked a colleague if he had any ideas. Gave him my queries. He could not repro the problem either. Using the same exact data, mind you. When he ran the query with the index hint, it worked! At some point we thought maybe there was something odd with my session. We both ran DBCC USEROPTIONS and compared results--no difference.
Finally we noticed that we were using different database contexts. My colleague was running the queries from master (default DB for his login), while I had changed into the DB where the data was.
I began comparing database options. Maybe we had a DB compatibility mode issue? No. Nothing jumped out at first (there are lots of DB options). I wrote a script to iterate through all of our DBs and generate query plans. Only a handful of our DBs generated plans with clustered index scans. With this list it was not so hard to find the DB option behind all this: is_parameterization_forced. This was set to 0 for the DBs where the filtered index was used, and 1 for the DBs where clustered index scan was chosen. The corresponding ALTER DB syntax for this option is PARAMETERIZATION { SIMPLE | FORCED }.
Root Cause Identified
I researched this some more, starting with "why do some of my DBs have this setting and some not?" Turns out this was set 6 or 7 years ago, based on a recommendations from a performance consultation. The DBs where PARAMETERIZATION is not set were created later, and this setting had never made its way into our DB standards.
Anyway...I finally understood why the filtered index was not used, and why I saw an error when I attempted to force that index. When my database context was a DB with PARAMETERIZATION FORCED, the optimizer first replaced the literal "9" in my query with a placeholder parameter before attempting to optimize the query. Note: this is completely evident in the top query plan above, the predicate shows @0 instead of my "9", I just didn't notice. Since my filtered index was only good for 9 this index could not be chosen for this more general query, and next best option was a clustered index scan.
This article, https://technet.microsoft.com/en-US/library/ms175037%28v=SQL.105%29.aspx, has a list of exceptions, cases where literals will not be converted to parameters before compilation. One exception is "statements that reference variables," which explains why my @tablevar example used the filtered index.
Knowing all this, I ran one last test, encapsulating the query in a stored procedure. This worked fine, and the proc uses the filtered index no matter what DB I'm in:
CREATE PROC dbo.TEST1 AS SELECT SalesOrderID, SalesOrderDetailID FROM AdventureWorks2012.Sales.SalesOrderDetail WHERE SpecialOfferID = 9; GO EXEC dbo.TEST1;
How to Repro
If you've been unable to reproduce my results with the AdventueWorks2012 query above, I bet it's because of your current database context. See if you have any DBs with PARAMETERIZATION FORCED:
SELECT name, is_parameterization_forced FROM sys.databases ORDER BY name;
If you find one, USE that DB and then repeat the above query--you should see clustered index scans now. If you don't find one, just create a new test DB and ALTER that DB, setting option PARAMETERIZATION to FORCED. Then USE the new DB and run the query.
Takeaway
I wrote this article partly just to share my story, but more importantly to say "we need to be aware of this DB setting, because it changes the way things work!" Here are two situations where I think this will matter:
- Users running large ad-hoc queries that rely on very specific literal values. I think, in general, you want the database context for such queries to always be a DB with PARAMETERIZATION SIMPLE. This way the optimizer can use those literal values and the corresponding statistics to arrive at an optimal query plan. If these large ad-hoc queries are instead run from a database context with PARAMETERIZATION FORCED, then all those literals will be parameterized away before the optimizer chooses a query plan, and the resulting plan will inevitably be worse. Maybe dramatically worse. I am certain I have users who have suffered from running queries in the wrong DB context.
- Developers/DBAs wanting to check query plans of stored procedures. I commonly work with snippets for large stored procedures, checking query plans along the way. I realize now that this can only work 100% when the current database context is a DB with PARAMETERIZATION SIMPLE. When checking query plans for snippets (not compiled procedures) and the current database context is a DB with PARAMETERIZATION FORCED then the resulting query plans are going to be worse/different/wrong.
Frankly I've never paid that much attention to my current database context. I will going forward.
Note: this was on SQL 2012, but I expect this applies equally to SQL 2008 or SQL 2014.