Let’s say you have a very large table on a SQL Server 2012 Standard Edition instance. This means: old cardinality estimator and no partitioning. The table has an ever-increasing DATETIME column, and it contains 5 years of data (it has to, due to regulations). Auto-update statistics kicks in only every 4 days or so, even with trace flag 2371 enabled. The problem is that users usually query the table for the last day, and only rarely need to access older data. Since auto-update statistics uses a very small sample rate for very large tables, the result is not very accurate, and the most recent rows aren’t represented in the histogram at all (the classic ascending key problem). The bottom line is that most of the time you get a poor execution plan, because the optimizer estimates very few rows for the last day, while in fact there are many. What can you do?
You can disable auto-update statistics and replace it with your own job that runs daily and updates statistics with full scan. This will definitely improve accuracy, but since the table is very large, it’s a very heavy operation that you can’t afford to run on your mission-critical 24×7 database. And we already agreed that you can’t use partitioning or the wonderful incremental statistics feature (introduced in SQL Server 2014). But there is good news – you can use filtered statistics.
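Just to illustrate what that manual route looks like (the table name dbo.Transactions is a placeholder, not from the original scenario), a minimal sketch would be:

```sql
-- Hypothetical example: turn off auto-update statistics for this table
-- and refresh all of its statistics with a full scan instead.
-- On a table with years of data, this full scan is exactly the heavy
-- operation described above.
EXEC sys.sp_autostats @tblname = N'dbo.Transactions', @flagc = 'OFF';

UPDATE STATISTICS dbo.Transactions WITH FULLSCAN;
```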
You can create a daily job that creates statistics only for the last day. You’ll need to use dynamic SQL for that, but it’s not a big deal. Since the filter covers just the last day, you can create the statistics with full scan, and it will be super-fast and super-accurate. Don’t forget to drop the statistics from the previous day in each job execution. You also need to rewrite your queries to use dynamic SQL (or OPTION (RECOMPILE)), so that the optimizer sees literal date values and can verify that the filtered statistics covers your query, as shown in the sketch below. This part can be a pain, I know. It is much easier in newer versions of SQL Server (2016 and above), so if you needed more motivation to upgrade, there you go…
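Here’s a minimal sketch of what that daily job might look like. The table name (dbo.Transactions), the column name (TransactionDateTime) and the statistics name are placeholders – adjust them to your schema. The date boundary has to be embedded as a literal, because the filter predicate of CREATE STATISTICS only accepts constants, which is why dynamic SQL is needed:

```sql
DECLARE
    @StatName SYSNAME       = N'stat_Transactions_LastDay',  -- hypothetical name
    @FromDate DATETIME      = DATEADD(DAY, -1, CAST(CAST(GETDATE() AS DATE) AS DATETIME)),  -- adjust to your definition of "last day"
    @Command  NVARCHAR(MAX);

-- Drop the filtered statistics created by the previous execution of the job, if it exists
IF EXISTS
(
    SELECT NULL
    FROM sys.stats
    WHERE [object_id] = OBJECT_ID(N'dbo.Transactions')
      AND [name]      = @StatName
)
BEGIN
    SET @Command = N'DROP STATISTICS dbo.Transactions.' + QUOTENAME(@StatName) + N';';
    EXEC sys.sp_executesql @Command;
END;

-- Recreate the statistics for the last day only, with a full scan.
-- The filter covers only a tiny slice of the table, so it stays fast and accurate.
SET @Command =
    N'CREATE STATISTICS ' + QUOTENAME(@StatName) + N'
      ON dbo.Transactions (TransactionDateTime)
      WHERE TransactionDateTime >= ''' + CONVERT(NVARCHAR(8), @FromDate, 112) + N'''
      WITH FULLSCAN;';
EXEC sys.sp_executesql @Command;
```

On the query side, something along these lines (again, with placeholder names) lets the optimizer match the query against the filtered statistics:

```sql
DECLARE @FromDateTime DATETIME = DATEADD(DAY, -1, GETDATE());

-- OPTION (RECOMPILE) embeds the variable value as a literal at compile time,
-- so the optimizer can verify that the filtered statistics covers the queried range.
SELECT COUNT(*)
FROM dbo.Transactions
WHERE TransactionDateTime >= @FromDateTime
OPTION (RECOMPILE);
```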
As soon as you have that mechanism in place, you’ll start to see good execution plans with very accurate row estimates. And the users will be happy again. Try it and see for yourself…