A few weeks ago I had a customer engagement, where the customer wanted to implement
Nested Partitioned Views. Before we go into the details of this approach and if it
could work, I want to give you an overview about the requirements, and why the customer
wanted to have Nested Partitioned Views.
The main idea was that the customer had to implement their SQL Server database for
both Standard and Enterprise Edition. With the Enterprise Edition the customer used
a combination of Partitioned Views and Partitioned Tables, as you can see it from
the following picture.
Some of you might ask, why the heck you want to implement such a scenario. The answer
is easy: Partitioned Views itself have some disadvantages, and on the other hand Partitioned
Tables also have some disadvantages. But when you combine then you will get the benefits
from BOTH worlds. The following list describes the pros and cons of Partitioned Views
in SQL Server.
- Pros
- Available on any Edition of SQL Server
- Each table has its own better Statistics
- Index Rebuilds of any table is an ONLINE operation (when done on Enterprise Edition)
- Each table can be indexed independently
- Operational Data
- Historical Data
- Cons
- Lots of tables to administer
- Indexes must be created on individual tables
- Check Constraints are needed for Table Elimination
- Gaps and overlapping values are possible
The following list describes the pros and cons of Partitioned Tables in SQL Server.
- Pros
- Only one table to administer
- Gaps and overlapping values are NOT possible
- Completely transparent
- Cons
- Available only on the Enterprise Edition of SQL Server
- Table-Level Statistics
- Less accurate on larger Partitioned Tables
- Filtered Statistics can help here…
- Partition Level Index Rebuilds are OFFLINE operations
- Only the whole Partitioned Table can be rebuild ONLINE
- Supports Partitioning only over a single column
- Persisted Computed Columns are needed
As you can see from this list, there are a lot of disadvantages of Partitioned Tables,
especially in the area of maintenance (less accurate Statistics, Partition Level Index
Rebuilds). As you have seen in the previous figure, you can achieve a very powerful
solution, when you combine Partitioned Views with Partitioned Tables.
Because the customer also had to support the database on the Standard Edition of SQL
Server (where Partitioned Tables are not available), they just wanted to substitute
the Partitioned Tables with Partitioned Views – so in the final solution you just
have Partitioned Views inside Partitioned Views – just Nested Partitioned Views. The
following picture illustrates this concept.
So the question was now, if this could be done with SQL Server? Of course, you can
make it this way, but what's about performance? SQL Server can use with a single Partitioned
View Table Elimination when each participating table has a corresponding
CHECK constraint defined. Therefore SQL Server only has to query the necessary tables
– it's almost the same concept as Partition Elimination with Partitioned
Tables. But would Table Elimination work with Nested Partitioned Views? Because there
is no chance to define a CHECK constraint on the inner Partitioned Views…
Before I give you the ultimate answer on that question, I want to walk through a complete
scenario, how you can setup and implement Nested Partitioned Views as illustrated
in the last picture. In the first step we are creating a new database, and adding
2 new file groups to it, where the data of 2007 and 2008 is stored.
USE master
GO
-- Create
a new database
CREATE DATABASE VLDB ON PRIMARY
(
NAME = N'VLDB',
FILENAME = N'C:\Program
Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\VLDB.mdf',
SIZE = 5072KB,
FILEGROWTH = 1024KB
)
LOG ON
(
NAME = N'VLDB_log',
FILENAME = N'C:\Program
Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\VLDB_log.ldf',
SIZE = 2048000KB, --
Initialize the log with 2GB, this gives us 16 VLFs
FILEGROWTH = 10%
)
GO
-- Create
a new file group for the 2007 sales data
ALTER DATABASE VLDB
ADD FILEGROUP Sales2007FG
GO
-- Add
a new file to the previous created file group
ALTER DATABASE VLDB
ADD FILE
(
NAME = 'Sales2007_Data',
FILENAME = 'C:\Program
Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\Sales2007_Data.ndf',
SIZE = 300,
FILEGROWTH = 10%
)
TO FILEGROUP Sales2007FG
GO
-- Create
a new file group for the 2008 sales data
ALTER DATABASE VLDB
ADD FILEGROUP Sales2008FG
GO
-- Add
a new file to the previous created file group
ALTER DATABASE VLDB
ADD FILE
(
NAME = 'Sales2008_Data',
FILENAME = 'C:\Program
Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\Sales2008_Data.ndf',
SIZE = 300,
FILEGROWTH = 10%
)
TO FILEGROUP Sales2008FG
GO
In each file group we create a Partitioned View which subdivides the data into 4 separate
tables for each quarter (Q1, Q2, Q3, Q4). The following listing shows how to setup
the Partitioned View for the 2007 sales data.
-- Table
for 2007 Q1
CREATE TABLE [Sales2007_Q1]
(
DateKey DATETIME NOT NULL
CONSTRAINT Sales2007Q1 --
The CHECK constraint is still needed to do Partition Elimination for the Partitioned
View
CHECK (DateKey >= '20070101' AND DateKey < '20070401'),
OnlineSalesKey INT NOT NULL,
SalesOrderNumber NVARCHAR(20) NOT NULL,
SalesAmount MONEY NOT NULL
)
GO
-- Add
a primary key clustered
ALTER TABLE [Sales2007_Q1]
ADD CONSTRAINT PK_Sales2007Q1 PRIMARY KEY CLUSTERED
(
DateKey,
OnlineSalesKey
)
GO
-- Load
data into the table for the year 2007/Q1
INSERT INTO [Sales2007_Q1] (DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount)
SELECT DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount FROM ContosoRetailDW.dbo.FactOnlineSales
WHERE DateKey >= '20070101' AND DateKey < '20070401'
GO
-- Table
for 2007 Q2
CREATE TABLE [Sales2007_Q2]
(
DateKey DATETIME NOT NULL
CONSTRAINT Sales2007Q2 --
The CHECK constraint is still needed to do Partition Elimination for the Partitioned
View
CHECK (DateKey >= '20070401' AND DateKey < '20070701'),
OnlineSalesKey INT NOT NULL,
SalesOrderNumber NVARCHAR(20) NOT NULL,
SalesAmount MONEY NOT NULL
)
GO
-- Add
a primary key clustered
ALTER TABLE [Sales2007_Q2]
ADD CONSTRAINT PK_Sales2007Q2 PRIMARY KEY CLUSTERED
(
DateKey,
OnlineSalesKey
)
GO
-- Load
data into the table for the year 2007/Q2
INSERT INTO [Sales2007_Q2] (DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount)
SELECT DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount FROM ContosoRetailDW.dbo.FactOnlineSales
WHERE DateKey >= '20070401' AND DateKey < '20070701'
GO
-- Table
for 2007 Q3
CREATE TABLE [Sales2007_Q3]
(
DateKey DATETIME NOT NULL
CONSTRAINT Sales2007Q3 --
The CHECK constraint is still needed to do Partition Elimination for the Partitioned
View
CHECK (DateKey >= '20070701' AND DateKey < '20071001'),
OnlineSalesKey INT NOT NULL,
SalesOrderNumber NVARCHAR(20) NOT NULL,
SalesAmount MONEY NOT NULL
)
GO
-- Add
a primary key clustered
ALTER TABLE [Sales2007_Q3]
ADD CONSTRAINT PK_Sales2007Q3 PRIMARY KEY CLUSTERED
(
DateKey,
OnlineSalesKey
)
GO
-- Load
data into the table for the year 2007/Q3
INSERT INTO [Sales2007_Q3] (DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount)
SELECT DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount FROM ContosoRetailDW.dbo.FactOnlineSales
WHERE DateKey >= '20070701' AND DateKey < '20071001'
GO
-- Table
for 2007 Q4
CREATE TABLE [Sales2007_Q4]
(
DateKey DATETIME NOT NULL
CONSTRAINT Sales2007Q4 --
The CHECK constraint is still needed to do Partition Elimination for the Partitioned
View
CHECK (DateKey >= '20071001' AND DateKey < '20080101'),
OnlineSalesKey INT NOT NULL,
SalesOrderNumber NVARCHAR(20) NOT NULL,
SalesAmount MONEY NOT NULL
)
GO
-- Add
a primary key clustered
ALTER TABLE [Sales2007_Q4]
ADD CONSTRAINT PK_Sales2007Q4 PRIMARY KEY CLUSTERED
(
DateKey,
OnlineSalesKey
)
GO
-- Load
data into the table for the year 2007/Q4
INSERT INTO [Sales2007_Q4] (DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount)
SELECT DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount FROM ContosoRetailDW.dbo.FactOnlineSales
WHERE DateKey >= '20071001' AND DateKey < '20080101'
GO
-- Create
the Partitioned View for 2007
CREATE VIEW Sales2007
AS
SELECT * FROM Sales2007_Q1
UNION ALL
SELECT * FROM Sales2007_Q2
UNION ALL
SELECT * FROM Sales2007_Q3
UNION ALL
SELECT * FROM Sales2007_Q4
GO
As you can see from the listing, each individual table has a corresponding CHECK constraint,
so that SQL Server can eliminate the tables which don't have to be accessed, when
we query our Partitioned View. The next listing shows how to setup the Partitioned
View for the 2008 sales data.
-- Table
for 2008 Q1
CREATE TABLE [Sales2008_Q1]
(
DateKey DATETIME NOT NULL
CONSTRAINT Sales2008Q1 --
The CHECK constraint is still needed to do Partition Elimination for the Partitioned
View
CHECK (DateKey >= '20080101' AND DateKey < '20080401'),
OnlineSalesKey INT NOT NULL,
SalesOrderNumber NVARCHAR(20) NOT NULL,
SalesAmount MONEY NOT NULL
)
GO
-- Add
a primary key clustered
ALTER TABLE [Sales2008_Q1]
ADD CONSTRAINT PK_Sales2008Q1 PRIMARY KEY CLUSTERED
(
DateKey,
OnlineSalesKey
)
GO
-- Load
data into the table for the year 2007/Q1
INSERT INTO [Sales2008_Q1] (DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount)
SELECT DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount FROM ContosoRetailDW.dbo.FactOnlineSales
WHERE DateKey >= '20080101' AND DateKey < '20080401'
GO
-- Table
for 2008 Q2
CREATE TABLE [Sales2008_Q2]
(
DateKey DATETIME NOT NULL
CONSTRAINT Sales2008Q2 --
The CHECK constraint is still needed to do Partition Elimination for the Partitioned
View
CHECK (DateKey >= '20080401' AND DateKey < '20080701'),
OnlineSalesKey INT NOT NULL,
SalesOrderNumber NVARCHAR(20) NOT NULL,
SalesAmount MONEY NOT NULL
)
GO
-- Add
a primary key clustered
ALTER TABLE [Sales2008_Q2]
ADD CONSTRAINT PK_Sales2008Q2 PRIMARY KEY CLUSTERED
(
DateKey,
OnlineSalesKey
)
GO
-- Load
data into the table for the year 2008/Q2
INSERT INTO [Sales2008_Q2] (DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount)
SELECT DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount FROM ContosoRetailDW.dbo.FactOnlineSales
WHERE DateKey >= '20080401' AND DateKey < '20080701'
GO
-- Table
for 2008 Q3
CREATE TABLE [Sales2008_Q3]
(
DateKey DATETIME NOT NULL
CONSTRAINT Sales2008Q3 --
The CHECK constraint is still needed to do Partition Elimination for the Partitioned
View
CHECK (DateKey >= '20080701' AND DateKey < '20081001'),
OnlineSalesKey INT NOT NULL,
SalesOrderNumber NVARCHAR(20) NOT NULL,
SalesAmount MONEY NOT NULL
)
GO
-- Add
a primary key clustered
ALTER TABLE [Sales2008_Q3]
ADD CONSTRAINT PK_Sales2008Q3 PRIMARY KEY CLUSTERED
(
DateKey,
OnlineSalesKey
)
GO
-- Load
data into the table for the year 2008/Q3
INSERT INTO [Sales2008_Q3] (DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount)
SELECT DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount FROM ContosoRetailDW.dbo.FactOnlineSales
WHERE DateKey >= '20080701' AND DateKey < '20081001'
GO
-- Table
for 2008 Q4
CREATE TABLE [Sales2008_Q4]
(
DateKey DATETIME NOT NULL
CONSTRAINT Sales2008Q4 --
The CHECK constraint is still needed to do Partition Elimination for the Partitioned
View
CHECK (DateKey >= '20081001' AND DateKey < '20090101'),
OnlineSalesKey INT NOT NULL,
SalesOrderNumber NVARCHAR(20) NOT NULL,
SalesAmount MONEY NOT NULL
)
GO
-- Add
a primary key clustered
ALTER TABLE [Sales2008_Q4]
ADD CONSTRAINT PK_Sales2008Q4 PRIMARY KEY CLUSTERED
(
DateKey,
OnlineSalesKey
)
GO
-- Load
data into the table for the year 2008/Q4
INSERT INTO [Sales2008_Q4] (DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount)
SELECT DateKey, OnlineSalesKey, SalesOrderNumber, SalesAmount FROM ContosoRetailDW.dbo.FactOnlineSales
WHERE DateKey >= '20081001' AND DateKey < '20090101'
GO
-- Create
the Partitioned View for 2008
CREATE VIEW Sales2008
AS
SELECT * FROM Sales2008_Q1
UNION ALL
SELECT * FROM Sales2008_Q2
UNION ALL
SELECT * FROM Sales2008_Q3
UNION ALL
SELECT * FROM Sales2008_Q4
GO
After we have now setup the Partitioned Views for 2007 and 2008, we finally create
another Partitioned View which just unions the data of the other 2 Partitioned Views
together:
-- Create
a Partitioned View on top of the 2 Partitioned Views
CREATE VIEW Sales
AS
SELECT * FROM Sales2007
UNION ALL
SELECT * FROM Sales2008
GO
So now we have created our Nested Partitioned View: Partitioned Views within Partitioned
Views. The final thing that we now have to check is, if SQL Server is able to do Table
Elimination with our Nested Partitioned View. When we talk about Table Elimination
(or Partition Elimination with Partitioned Table) we have to differentiate between
2 types of elimination:
- Static Elimination
- Dynamic Elimination
Static Elimination means that you provide within your query a static
value. So during the compilation of the Execution Plan the Query Optimizer already
knows which data we are accessing. In that case SQL Server generates an Execution
Plan, which only references the relevant data. Look at the following query:
-- Shows
static Table Elimination across the Partitioned View.
-- This
SELECT accesses the following table:
-- =>
Sales2007_Q3
SELECT * FROM Sales
WHERE DateKey >= '20070721' AND DateKey <= '20070725'
GO
We are providing here hard coded values, so the Query Optimizer can sniff them, and
generates an Execution Plan, which only accesses the Sales2007_Q3 table.
So Static Elimination works without any problems with Nested Partitioned Views, as
you can see it in the following Execution Plan:
Static Elimination is fine, if you are doing demos etc., but in a production database
when you are calling your query with different values through parameters, SQL Server
has to do Dynamic Elimination. Dynamic Elimination means that the
Query Optimizer has to generate an Execution Plan, which can't take into account during
the compilation which tables you are accessing, because those tables are just determined
during runtime – therefore the name Dynamic Elimination. So let's imagine we are executing
the following parameterized query:
DECLARE @from DATETIME = '20070721'
DECLARE @to DATETIME = '20070725'
SELECT * FROM Sales
WHERE DateKey >= @from AND DateKey <= @to
GO
In that case SQL Server can't determine which data is accessed, because the referenced
variables are runtime constructs, they are not available during the compilation phase
of our query. The above query produces the following Execution Plan:
When you look at the first step on that plan, you might think that SQL Server has
to seek into each table to get the corresponding records. The Query Optimizer estimates
each Clustered Index Seek with around 12%. But that's only the half-truth
of that Execution Plan! The real power lies in the Filter Operator which
comes before the Clustered Index Seek. When you look at the Tooltip Window of the
Filter Operator inside SQL Server Management Studio, you can see that the operator
has a so-called Startup Expression Predicate:
As you can see from the picture, the Startup Expression Predicate has
the same value as the CHECK constraint on the table itself. So that Filter Operator
only calls the Clustered Index Seek operator on the underlying table
when the Startup Expression Predicate is evaluated to true! You can also cross-check
this with the property Actual Executions on the Clustered
Index Seek Operators:
In our case SQL Server only executed the Clustered Index Seek Operator on the Sales2007_Q3 table.
On all other tables the Filter Operator prevented through the Startup Expression Predicate
the execution of the Clustered Index Seeks! That's a really powerful concept for Dynamic
Table Elimination inside Partitioned Views.
As you can see from this example, Nested Partitioned Views are working without any
problems in SQL Server, and on the other hand SQL Server uses a Startup Expression
Predicate inside a Filter Operator to drive Dynamic Table Elimination when you work
with Partitioned Views – it doesn't matter here if those Views are nested or not.
In that case it is really important that you check the whole Execution Plan to get
a better understanding what SQL Server is actually doing for your query. When you
just look on that plan (without going into the details), you might think that SQL
Server has to access every table, but that's not really true!
A big Thank You goes to Paul White (Weblog, Twitter)
who helped me to understand that behavior and implementation strategy of SQL Server.
You can find the whole sample of Nested Partitioned Views here for
download.
Thanks for reading
-Klaus