Partitioned tables and indexes have to divide their stuff into small specified chunks of data which might be spread across more than one file groups/data files in a database.
In versions prior to SQL Server 2005 we were implement the horizontal partition using slicing the one large table into more than one small physical tables by implementing of database check constraint (that’s known as range partition) and then we combine that small physical tables into one partitioned updatable view. But in SQL Server 2005 we have an ability to partition a table using table/index partitioning feature.
Why partition? What was the point for your company/client?
Partitioned tables and indexes are partitioned horizontally so that groups of rows are mapped into individual partitions. DML operations performed on the data are executed as if the whole table or index is a single entity.
In our example we will create a partitioned Tasks table with a huge number of rows, where we will implement the solution to overcome the performance issues using partitioning technique.
First we need to execute this code snippet to create a table and dump 10 million rows in the Tasks table with TaskId and TaskName columns using common type expression technique
WITH Numbers AS ( SELECT 0 as Num UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 ) SELECT (mil.Num * 100000) + (hThou.Num * 100000) + (tThou.Num * 10000) + (thou.Num * 1000) + (hund.Num * 100) + (tens.Num * 10) + ones.Num AS TaskId, 'Task Name' AS TaskName INTO Tasks FROM Numbers AS ones CROSS JOIN Numbers AS tens CROSS JOIN Numbers AS hund CROSS JOIN Numbers AS thou CROSS JOIN Numbers AS tThou CROSS JOIN Numbers AS hThouCROSS JOIN Numbers AS mil
Steps to partitioning the Tasks table:
The CREATE PARTITION FUNCTION is a first step that maps the rows of a table or index into partitions, based on the values of one or more columns. The following partition function will partition a Tasks table into four partitions.
CREATE PARTITION FUNCTION fnk_tasks (INT) AS RANGE RIGHT FOR VALUES (1000000, 2000000, 5000000)
The following is shows how a table that uses this partition function on partitioning column TaskId would be partitioned.
Partition-I (TaskId >= 1 AND TaskId < 1000000)
Partition-II (TaskId >= 1000000 AND TaskId < 2000000)
Partition-III (TaskId >= 2000000 AND TaskId < 5000000)
Partition-IV (TaskId >= 5000000)
Now creates a partition scheme in the current database that maps the partitions of a partitioned table or index to file groups, which associates partition function to file groups, determining physical layout of data.
CREATE PARTITION SCHEME ups_Tasks AS PARTITION fnd_Tasks TO (DataFile1, DataFile2, DataFile3, DataFile4)
The following listing shows how the table partitioning spans over multiple data files. Partitions of a table that uses partition function fnk_Tasks on partitioning column TaskId.
Partition-I
(TaskId >= 1 AND TaskId < 1000000) [reside in DataFile1]
Partition-II
(TaskId >= 1000000 AND TaskId < 2000000) [reside in DataFile2]
Partition-III
(TaskId >= 2000000 AND TaskId < 5000000) [reside in DataFile3]
Partition-IV
(TaskId >= 5000000) [reside in DataFile4]
Now we create a partitioned table based on already created partition function and partition scheme objects:
CREATE TABLE Tasks (TaskId INT, TaskName VARCHAR(20)) ON ups_Tasks(TaskId)
Once you start performing operations on the target table (such as data additions, modifications, or queries), the database engine will transparently redirect them to appropriate partitions.
The following data access statement only finds the requested rows from first partition of Tasks table:
SELECT TaskId, TaskName FROM Tasks
WHERE TaskId BETWEEN
10000 AND 20000
The
following DML UPDATE statement updates the specific rows just only exploring
the second table partition:
UPDATE Tasks SET TaskName = 'Task Modified' WHERE
TaskId = 1500000
And the following DML DELETE statement deletes specific rows just only exploring the third table partition:
DELETE Tasks WHERE TaskId IN ( SELECT TaskId FROM Tasks WHERE TaskId BETWEEN 2500000 AND 2500010
The above mentioned queries are just simple example queries to explain how table partitions are worked, the actual queries for Tasks and their related tables are very long and complex by nature. The important thing is here why I need to use partitioning feature of SQL Server2005 on Tasks table in our development DB, this table have approximately 10+ million rows and in past (SQL Server2000) we were using data archiving for performance improvements but in this way we were facing some hectic consideration to maintain all independent table partitions and all Tasks related queries. Now using cool partitioning feature SQL Server 2005 internally manage how SQL Server engine search rows using a particular partition instead of whole table.
For further detail please see Microsoft SQL Server Books Online or following Microsoft MSDN link on table/indexes partition:http://msdn2.microsoft.com/en-us/library/ms345146.aspx
Conclusion
Table-based partitioning features in SQL Server 2005 provide flexibility and performance to simplify the creation and maintenance of partitioned tables. Trace the progression of capabilities from logically and manually partitioning tables to the latest partitioning features, and find out why, when, and how to design, implement, and maintain partitioned tables using SQL Server 2005.
SQL Server 2005 offers a way to easily and consistently manage large tables and indexes through partitioning, which allows you to manage subsets of your data outside of the active table. This provides simplified management, increased performance, and abstracted application logic, as the partitioning scheme is entirely transparent to the application.