If the partitioning MERGE command is executed for the wrong range after a SWITCH OUT of historic data, data movement between file groups may be necessary before or during the next index rebuild. The SQLCMD lab setup script at the end of this article:
- Creates a test database and 2 partitioned tables, one table is range right, the other range left. Each contains 6 partitions, the first and last partitions are empty. The 4 middle partitions in each table contain data in the ranges between 1 to 999, 1,000 to 1,999, 2,000 to 2,999, 3,000 to 3,999.
- Inserts 4,000 rows of test data
- Data in the lowest populated partition range (1 – 999) is switched out, partition 2 in each table
- The merge command is executed but at the wrong boundary for the RANGE RIGHT table (1001) causing a move of data between file groups.
- Stop the script before running the MERGE final INDEX REBUILD’s, download the SQL Server Debug symbols and create an extended events session to capture the call stack.
Immediately after the MERGE / before the final INDEX REBUILD, the system ‘Data Space’ views for the RANGE RIGHT table show where the data will be (FG2), not where it actually still is (FG3). The data seems to have moved between file groups instantly but I think the operation has been deferred until the index rebuild.
The lower file group (FG2) is retained by the MERGE command when the partition scheme is RANGE RIGHT. The data previously resided in File Group 3 so a move is required before or during the next index rebuild.
Extended Events and the Call Stack
An extended event session was started before the MERGE and final index rebuilds (RANGE RIGHT & RANGE LEFT) and the call stack analysed for various wait types including WRITE_COMPLETION and IO_COMPLETION. There were no noticeable, consistent bucketized call stack differences for any of the wait types I tried. There were differences during ‘file-written’ events, both tables included the call stack below although it was invoked more for the RANGE RIGHT table with it’s associated data movement between file groups
XeSqlPkg::file_written::Publish+1ba [ @ 0+0x0 FireWriteEvent+16c [ @ 0+0x0
FCB::AsyncWrite+1bc [ @ 0+0x0
SQLServerLogMgr::FlushLC+7b4 [ @ 0+0x0
SQLServerLogMgr::AppendLogRequest+1a2 [ @ 0+0x0
SQLServerLogMgr::ReserveAndAppend+74 [ @ 0+0x0
XdesRMReadWrite::GenerateLogRec+5ac [ @ 0+0x0
PageRef::FormatFull+1a3 [ @ 0+0x0
CBulkAllocator::FinishCurrentBuf+97 [ @ 0+0x0
CBulkAllocator::AllocateLinkedAndFormattedLeafPage+f2 [ @ 0+0x0
CIndBuild::AllocateNextIndexPage+15 [ @ 0+0x0
CIndBuild::InsertRow+bf2 [ @ 0+0x0
RowsetCreateIndex::InsertIndexRow+39e8 [ @ 0+0x0
CValRowCrtIdx::SetDataX+1d6 [ @ 0+0x0 0x000007FE070F6FE7
CQScanIndexNew::NonOptimizedGetAndInsertRows+f3 [ @ 0+0x0
CQScanIndexNew::GetRow+31 [ @ 0+0x0
CQScanNLJoinTrivialNew::GetRow+126 [ @ 0+0x0
CQueryScan::GetRow+81 [ @ 0+0x0
CXStmtQuery::ErsqExecuteQuery+36a [ @ 0+0x0
CXStmtDML::XretDMLExecute+2f3 [ @ 0+0x0
CXStmtDML::XretExecute+ad [ @ 0+0x0
CMsqlExecContext::ExecuteStmts<1,1>+400 [ @ 0+0x0
CMsqlExecContext::FExecute+a33 [ @ 0+0x0
The call stack below only appeared during INDEX REBUILD of the RANGE RIGHT table.
XeSqlPkg::file_written::Publish+1ba [ @ 0+0x0 FireWriteEvent+16c [ @ 0+0x0
FCB::AsyncWrite+1bc [ @ 0+0x0
SQLServerLogMgr::UpdateFileHdr+29b [ @ 0+0x0
SQLServerLogMgr::FlushLC+97f [ @ 0+0x0
SQLServerLogMgr::AppendLogRequest+1a2 [ @ 0+0x0
SQLServerLogMgr::ReserveAndAppend+74 [ @ 0+0x0
PageRef::ModifyBitsNonTransactional+3e4 [ @ 0+0x0 ChangeGhostPageState+116 [ @ 0+0x0 ChangeGhostPageState+129 [ @ 0+0x0
PageRef::MarkPfsGhostState+24 [ @ 0+0x0
IndexPageRef::MarkGhost+135e [ @ 0+0x0
BTreeRow::DeleteRecord+1713 [ @ 0+0x0
IndexDataSetSession::DeleteRow+296 [ @ 0+0x0
RowsetNewSS::DeleteRows+4e [ @ 0+0x0
CMEDScan::DeleteRow+82 [ @ 0+0x0
CMEDCatKatmaiIndex::DropRowset+2df [ @ 0+0x0
VisibleHoBt::DropHoBt+357 [ @ 0+0x0 SEDropRowsetInternal+68f [ @ 0+0x0
DDLAgent::SEDropRowsets+4b [ @ 0+0x0
CIndexDDL::DropRowset+72 [ @ 0+0x0
CIndexDDL::DropAllRowsets+ae [ @ 0+0x0
CIndexDDL::DropIndexEntryAndRowsets+1c0 [ @ 0+0x0
CIndexDDL::ConstructIndicesOffline+539 [ @ 0+0x0
Conclusion
This was hard work and a steep learning curve but worthwhile. Installing the debug symbols in a development environment allows low level diagnostics that help by putting a context to events. Despite the cryptic nature of the results, I think there is value in it.
References
Ideas for a test lab - http://www.brentozar.com/archive/2013/01/sql-server-table-partitioning-tutorial-videos-and-scripts/
Clarify range left/range right partitioning - http://www.sqlskills.com/blogs/kimberly/clarifying-left-and-right-in-the-defintion-of-a-partition-function-in-sql-server-2005/
Extended Events Session & Debug Symbols - http://www.sqlskills.com/blogs/paul/determine-causes-particular-wait-type/
Partitioning Query, modified to include scheme to qualify tables and an outer join to file groups so empty ones are included - http://davidpeterhansen.com/view-partitions-in-sql-server/
Lab Setup
The T-SQL code below is the partitioning lab setup script used to generate the events, not the XE session to capture the call stack though. That and the process for downloading the SQL Server debug symbols are in Paul Randal's blog in this documents references. His XE session predicate captured only WRITE_COMPLETION wait types, that was changed here and the call stack was captured during 'file-written' events, and just for the SPID running the final INDEX REBUILD commands.
----------------------------------------------------------------------------------- -- PartitionLabSetup_20140428.sql -- Creates a test database (workspace) -- Adds file groups and files -- Creates partition functions and schema's -- Creates and populates 2 partitioned tables (PartitionedRight & PartitionedLeft) -- Creates target tables for switching out -- Performs SWITCH & MERGE RANGE ----------------------------------------------------------------------------------- USE [master] GO ----------------------------------------------------------------------------------- -- Create Test Database ----------------------------------------------------------------------------------- :SETVAR DatabaseName "workspace" :SETVAR TableNameRight "PartitionedRight" :SETVAR TableNameLeft "PartitionedLeft" :SETVAR DataDrive "D:\SQL\Data\" :SETVAR LogDrive "D:\SQL\Logs\" -- Drop if exists and create Database IF DATABASEPROPERTYEX(N'$(databasename)','Status') IS NOT NULL BEGIN ALTER DATABASE $(DatabaseName) SET SINGLE_USER WITH ROLLBACK IMMEDIATE DROP DATABASE $(DatabaseName) END GO CREATE DATABASE $(DatabaseName) ON ( NAME = $(DatabaseName)_data, FILENAME = N'$(DataDrive)$(DatabaseName)_data.mdf', SIZE = 10, MAXSIZE = 500, FILEGROWTH = 5 ) LOG ON ( NAME = $(DatabaseName)_log, FILENAME = N'$(LogDrive)$(DatabaseName).ldf', SIZE = 5MB, MAXSIZE = 5000MB, FILEGROWTH = 5MB ) ; GO DECLARE @msg varchar(200) = 'Run Status'; SET @msg = 'Database $(DatabaseName) has been created'; RAISERROR (@msg,0,0); ----------------------------------------------------------------------------------- -- Add file groups and files ----------------------------------------------------------------------------------- DECLARE @nSQL NVARCHAR(2000) ; DECLARE @x INT = 1; -- Range Right WHILE @x <= 6 BEGIN SELECT @nSQL = 'ALTER DATABASE $(DatabaseName) ADD FILEGROUP $(TableNameRight)_fg' + RTRIM(CAST(@x AS CHAR(5))) + '; ALTER DATABASE $(DatabaseName) ADD FILE ( NAME= ''$(TableNameRight)_f' + CAST(@x AS CHAR(5)) + ''', FILENAME = ''$(DataDrive)\$(TableNameRight)_f' + RTRIM(CAST(@x AS CHAR(5))) + '.ndf'' ) TO FILEGROUP $(TableNameRight)_fg' + RTRIM(CAST(@x AS CHAR(5))) + ';' EXEC sp_executeSQL @nSQL; SET @x = @x + 1; END -- Range Left SET @x = 1 WHILE @x <= 6 BEGIN SELECT @nSQL = 'ALTER DATABASE $(DatabaseName) ADD FILEGROUP $(TableNameLeft)_fg' + RTRIM(CAST(@x AS CHAR(5))) + '; ALTER DATABASE $(DatabaseName) ADD FILE ( NAME= ''$(TableNameLeft)_f' + CAST(@x AS CHAR(5)) + ''', FILENAME = ''$(DataDrive)\$(TableNameLeft)_f' + RTRIM(CAST(@x AS CHAR(5))) + '.ndf'' ) TO FILEGROUP $(TableNameLeft)_fg' + RTRIM(CAST(@x AS CHAR(5))) + ';' EXEC sp_executeSQL @nSQL; SET @x = @x + 1; END SET @msg = 'File Groups have been created.'; RAISERROR (@msg,0,0); ----------------------------------------------------------------------------------- -- Create partition functions and schemes ----------------------------------------------------------------------------------- USE $(DatabaseName); CREATE PARTITION FUNCTION $(TableNameRight)_func (int) AS RANGE RIGHT FOR VALUES ( 1, 1001, 2001, 3001, 4001 ); CREATE PARTITION FUNCTION $(TableNameLeft)_func (int) AS RANGE LEFT FOR VALUES ( 0, 1000, 2000, 3000, 4000 ); CREATE PARTITION SCHEME $(TableNameRight)_scheme AS PARTITION $(TableNameRight)_func TO ( $(TableNameRight)_fg1, $(TableNameRight)_fg2, $(TableNameRight)_fg3, $(TableNameRight)_fg4, $(TableNameRight)_fg5, $(TableNameRight)_fg6 ); CREATE PARTITION SCHEME $(TableNameLeft)_scheme AS PARTITION $(TableNameLeft)_func TO ( $(TableNameLeft)_fg1, $(TableNameLeft)_fg2, $(TableNameLeft)_fg3, $(TableNameLeft)_fg4, $(TableNameLeft)_fg5, $(TableNameLeft)_fg6 ); ----------------------------------------------------------------------------------- -- Create and populate test tables ----------------------------------------------------------------------------------- -- Range Right CREATE TABLE [dbo].$(TableNameRight)( [Partition_PK] [int] IDENTITY(1,1)NOT NULL, [CreateDate] [datetime] NULL, [CreateServer] [nvarchar](50) NULL, [RandomNbr] [int] NULL, CONSTRAINT [PK_$(TableNameRight)] PRIMARY KEY CLUSTERED ( [Partition_PK] ASC ) ON $(TableNameRight)_scheme(Partition_PK) ) ON $(TableNameRight)_scheme(Partition_PK); ALTER TABLE [dbo].$(TableNameRight) ADD CONSTRAINT [DF_$(TableNameRight)_CreateDate] DEFAULT (getdate()) FOR [CreateDate] ALTER TABLE [dbo].$(TableNameRight) ADD CONSTRAINT [DF_$(TableNameRight)_CreateServer] DEFAULT (@@servername) FOR [CreateServer] -- Range Left CREATE TABLE [dbo].$(TableNameLeft)( [Partition_PK] [int] IDENTITY(1,1) NOT NULL, [CreateDate] [datetime] NULL, [CreateServer] [nvarchar](50) NULL, [RandomNbr] [int] NULL, CONSTRAINT [PK_$(TableNameLeft)] PRIMARY KEY CLUSTERED ( [Partition_PK] ASC ) ON $(TableNameLeft)_scheme(Partition_PK) ) ON $(TableNameLeft)_scheme(Partition_PK); ALTER TABLE [dbo].$(TableNameLeft) ADD CONSTRAINT [DF_$(TableNameLeft)_CreateDate] DEFAULT (getdate()) FOR [CreateDate] ALTER TABLE [dbo].$(TableNameLeft) ADD CONSTRAINT [DF_$(TableNameLeft)_CreateServer] DEFAULT (@@servername) FOR [CreateServer] GO -- Populate test tables with different test data SET NOCOUNT ON; INSERT INTO [dbo].$(TableNameRight) (RandomNbr) VALUES (ROUND((RAND() * 100),0)) ; GO 4000 -- Make test easier using row counts to cross reference data movement between file groups DELETE FROM [dbo].$(TableNameRight) WHERE Partition_PK IN (50,1050,1051,2050,2051,2052,3050,3051,3052,3053) SET IDENTITY_INSERT [dbo].$(TableNameLeft) ON INSERT INTO [dbo].$(TableNameLeft) (Partition_PK,RandomNbr) SELECT Partition_PK, RandomNbr FROM [dbo].$(TableNameRight); GO -- Display Results Procedure CREATE PROCEDURE dbo.DisplayTestFileGroups AS SELECT -- GETDATE() AS CaptureDate fg.name AS FileGroupName ,x1.PartitionNumber ,x1.destination_data_space_id ,x1.rows ,x1.boundary_value_on_right AS BoundaryRight ,x1.BoundaryValue ,x1.first_page ,x1.total_pages ,x1.used_pages --,df.name AS DataFileName --,df.physical_name --,x1.* FROM sys.filegroups fg INNER JOIN sys.database_files df ON fg.data_space_id = df.data_space_id LEFT OUTER JOIN ( SELECT o.name ObjectName ,s.name AS SchemeName ,p.partition_number PartitionNumber ,p.rows ,i.name IndexName ,au.total_pages ,au.used_pages ,CASE au.first_page WHEN NULL THEN '' ELSE CONVERT (VARCHAR(6), CONVERT (INT, SUBSTRING (au.first_page, 6, 1) + SUBSTRING (au.first_page, 5, 1))) + ':' + CONVERT (VARCHAR(20), CONVERT (INT, SUBSTRING (au.first_page, 4, 1) + SUBSTRING (au.first_page, 3, 1) + SUBSTRING (au.first_page, 2, 1) + SUBSTRING (au.first_page, 1, 1))) END AS first_page ,ps.name PartitionSchemeName ,pf.name PartitionFunctionName ,pf.boundary_value_on_right ,rv.value BoundaryValue ,dds.data_space_id AS destination_data_space_id FROM sys.partitions p INNER JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id INNER JOIN sys.objects o ON p.object_id = o.object_id INNER JOIN sys.schemas s ON s.schema_id = o.schema_id INNER JOIN sys.system_internals_allocation_units au ON p.partition_id = au.container_id INNER JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id INNER JOIN sys.partition_functions pf ON pf.function_id = ps.function_id INNER JOIN sys.destination_data_spaces dds ON dds.partition_scheme_id = ps.data_space_id AND dds.destination_id = p.partition_number LEFT OUTER JOIN sys.partition_range_values rv ON pf.function_id = rv.function_id AND p.partition_number = rv.boundary_id WHERE i.index_id < 2 AND p.index_id < 2 ) x1 ON x1.destination_data_space_id = fg.data_space_id WHERE fg.name LIKE '%_fg%' ORDER BY fg.name GO -- Show partitions and file group details EXEC DisplayTestFileGroups; GO ----------------------------------------------------------------------------------- -- Target tables for SWITCH of old archive data. IF OBJECT_ID('dbo.$(TableNameRight)Out') IS NOT NULL DROP TABLE [dbo].[$(TableNameRight)Out] CREATE TABLE [dbo].[$(TableNameRight)Out]( [Partition_PK] [int] NOT NULL, [CreateDate] [datetime] NULL, [CreateServer] [nvarchar](50) NULL, [RandomNbr] [int] NULL, CONSTRAINT [PK_$(TableNameRight)Out] PRIMARY KEY CLUSTERED ( [Partition_PK] ASC ) ) ON $(TableNameRight)_fg2; IF OBJECT_ID('dbo.$(TableNameLeft)Out') IS NOT NULL DROP TABLE [dbo].[$(TableNameLeft)Out] CREATE TABLE [dbo].[$(TableNameLeft)Out]( [Partition_PK] [int] NOT NULL, [CreateDate] [datetime] NULL, [CreateServer] [nvarchar](50) NULL, [RandomNbr] [int] NULL, CONSTRAINT [PK_$(TableNameLeft)Out] PRIMARY KEY CLUSTERED ( [Partition_PK] ASC ) ) ON $(TableNameLeft)_fg2; ----------------------------------------------------------------------------------- -- Switch out partitions ---------------------------------------------------------------------------------- ALTER TABLE dbo.$(TableNameRight) SWITCH PARTITION 2 TO dbo.$(TableNameRight)Out; ALTER TABLE dbo.$(TableNameLeft) SWITCH PARTITION 2 TO dbo.$(TableNameLeft)Out; EXEC DisplayTestFileGroups; ALTER INDEX ALL ON $(TableNameRight) REBUILD; ALTER INDEX ALL ON $(TableNameLeft) REBUILD; CHECKPOINT; DBCC DROPCLEANBUFFERS; ---------------------------------------------------------------------------------- -- Merge 2 empty partitions at start of ranges -- Start the 'Investigate Waits' XE session before these final index rebuilds ---------------------------------------------------------------------------------- :SETVAR TableNameRight "PartitionedRight" ALTER PARTITION FUNCTION $(TableNameRight)_func() MERGE RANGE (1001); :SETVAR TableNameRight "PartitionedRight" ALTER INDEX ALL ON $(TableNameRight) REBUILD; :SETVAR TableNameLeft "PartitionedLeft" ALTER PARTITION FUNCTION $(TableNameLeft)_func() MERGE RANGE (0); :SETVAR TableNameLeft "PartitionedLeft" ALTER INDEX ALL ON $(TableNameLeft) REBUILD; EXEC DisplayTestFileGroups;