Recently we faced an issue in our UAT environment. In our application, one table has more than 50 million records. It has one identity column with an int data type, ten default constraints, and one clustered index. The table is heavily used by our application, and frequent DML operations are performed against it. Every night, when a user runs a calculation operation, a bulk set of rows is deleted. As we know, an identity value is not reset by a delete operation, so as new data was added during the day, the identity value kept increasing. Eventually it reached the maximum value an int can hold (2,147,483,647), and insert operations started failing.
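Before picking an approach, it helps to know how much headroom is left. The query below is a minimal sketch (not part of our original runbook) that lists int identity columns and how far each one is from the int limit:

-- Sketch: list int identity columns and the remaining headroom before they
-- hit the int maximum of 2,147,483,647.
SELECT t.name AS table_name,
       c.name AS column_name,
       CAST(ic.last_value AS bigint) AS last_identity_value,
       2147483647 - CAST(ic.last_value AS bigint) AS values_remaining
FROM sys.identity_columns AS ic
JOIN sys.tables  AS t ON t.object_id = ic.object_id
JOIN sys.columns AS c ON c.object_id = ic.object_id
                     AND c.column_id = ic.column_id
WHERE TYPE_NAME(ic.system_type_id) = 'int'
ORDER BY values_remaining ASC;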
Considering the amount of data in the table and the allowed downtime, we found two approaches suitable for our environment to change the column from an int to a bigint:
- Perform an insert operation from a source table to a destination table in multiple batches, issuing frequent checkpoints so that the log does not fill up, and complete a manual switch of the tables.
- Use an SSIS package to perform a bulk insert and complete a manual switch of the tables.
Approach One
Step 1: Change the database recovery model from full to simple so that the log file does not fill up.
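A minimal sketch of this step, assuming the database is named DemoDatabase as in the rest of this post (remember to switch back to full and take a full backup once the migration is done, so the log backup chain can restart):

USE [master]
GO
-- Switch to simple recovery for the duration of the migration.
ALTER DATABASE [DemoDatabase] SET RECOVERY SIMPLE
GO
-- Verify the change.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'DemoDatabase'
GO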
Step 2: Script out the definition of the existing table. (Below is the table definition of the source table.)
USE [DemoDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DemoTable_Source](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [AuditEventID] [int] NOT NULL,
    [TimeRecorded] [datetime] NOT NULL,
    [UserName] [varchar](50) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DemoTable_Source] ADD CONSTRAINT [DF_DemoTable_Source_AuditEventID] DEFAULT ((0)) FOR [AuditEventID]
GO
ALTER TABLE [dbo].[DemoTable_Source] ADD CONSTRAINT [DF_DemoTable_Source_TimeRecorded] DEFAULT (getdate()) FOR [TimeRecorded]
GO
Step 3: Make changes in the extracted script: rename the indexes and constraints, and update the data type of the identity column from int to bigint.
USE [DemoDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DemoTable_Destination](
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [AuditEventID] [int] NOT NULL,
    [TimeRecorded] [datetime] NOT NULL,
    [UserName] [varchar](50) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DemoTable_Destination] ADD CONSTRAINT [DF_DemoTable_Destination_AuditEventID] DEFAULT ((0)) FOR [AuditEventID]
GO
ALTER TABLE [dbo].[DemoTable_Destination] ADD CONSTRAINT [DF_DemoTable_Destination_TimeRecorded] DEFAULT (getdate()) FOR [TimeRecorded]
GO
Step 4: Insert rows from the source table into the destination table in batches, running frequent checkpoints.
SET IDENTITY_INSERT dbo.DemoTable_Destination ON

DECLARE @x BIGINT
DECLARE @BatchSize INT
DECLARE @Message VARCHAR(500)

SET @x = 0
SET @BatchSize = 100000

WHILE @x < 2500000 -- Set to at least the maximum ID in the source table
BEGIN
    -- Copy one ID range per iteration; a range predicate avoids skipping or
    -- duplicating rows when there are gaps in the identity values.
    INSERT INTO dbo.DemoTable_Destination (ID, AuditEventID, TimeRecorded, UserName)
    SELECT ID, AuditEventID, TimeRecorded, UserName
    FROM dbo.DemoTable_Source WITH (NOLOCK)
    WHERE ID > @x AND ID <= @x + @BatchSize

    SET @x = @x + @BatchSize

    -- Report progress without waiting for the batch to finish.
    SET @Message = CAST(@x AS VARCHAR(20))
    RAISERROR (@Message, 10, 1) WITH NOWAIT

    CHECKPOINT
END

SET IDENTITY_INSERT dbo.DemoTable_Destination OFF
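Before switching the tables, it is worth confirming that every row made it across. A simple sanity check (a sketch, not part of the original runbook):

-- Row counts in source and destination should match before the switch.
SELECT (SELECT COUNT_BIG(*) FROM dbo.DemoTable_Source)      AS source_rows,
       (SELECT COUNT_BIG(*) FROM dbo.DemoTable_Destination) AS destination_rows;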
Step 5: Rename the tables and constraints, then drop the old table.
/* Rename tables */
EXEC sys.sp_rename N'DemoTable_Source', N'DemoTable_Source_Old'
EXEC sys.sp_rename N'DemoTable_Destination', N'DemoTable_Source'

/* Rename constraints */
EXEC sys.sp_rename N'DF_DemoTable_Source_AuditEventID', N'DF_DemoTable_Source_AuditEventID_Old'
EXEC sys.sp_rename N'DF_DemoTable_Destination_AuditEventID', N'DF_DemoTable_Source_AuditEventID'
EXEC sys.sp_rename N'DF_DemoTable_Source_TimeRecorded', N'DF_DemoTable_Source_TimeRecorded_Old'
EXEC sys.sp_rename N'DF_DemoTable_Destination_TimeRecorded', N'DF_DemoTable_Source_TimeRecorded'

/* Drop the old table and its constraints */
ALTER TABLE DemoTable_Source_Old DROP CONSTRAINT DF_DemoTable_Source_AuditEventID_Old
ALTER TABLE DemoTable_Source_Old DROP CONSTRAINT DF_DemoTable_Source_TimeRecorded_Old
DROP TABLE DemoTable_Source_Old
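An optional check after the switch (a sketch) confirms that the renamed table now carries the renamed default constraints:

-- List the default constraints attached to the switched-in table.
SELECT dc.name AS constraint_name,
       c.name  AS column_name
FROM sys.default_constraints AS dc
JOIN sys.columns AS c ON c.object_id = dc.parent_object_id
                     AND c.column_id = dc.parent_column_id
WHERE dc.parent_object_id = OBJECT_ID('dbo.DemoTable_Source');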
Approach Two
Step 1: Script out the definition of the existing table. (Below is the table definition of the source table.)
USE [DemoDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DemoTable_Source](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [AuditEventID] [int] NOT NULL,
    [TimeRecorded] [datetime] NOT NULL,
    [UserName] [varchar](50) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DemoTable_Source] ADD CONSTRAINT [DF_DemoTable_Source_AuditEventID] DEFAULT ((0)) FOR [AuditEventID]
GO
ALTER TABLE [dbo].[DemoTable_Source] ADD CONSTRAINT [DF_DemoTable_Source_TimeRecorded] DEFAULT (getdate()) FOR [TimeRecorded]
GO
Step 2: Make changes in the extracted script: rename the indexes and constraints, and update the data type of the identity column from int to bigint.
USE [DemoDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DemoTable_Destination](
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [AuditEventID] [int] NOT NULL,
    [TimeRecorded] [datetime] NOT NULL,
    [UserName] [varchar](50) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DemoTable_Destination] ADD CONSTRAINT [DF_DemoTable_Destination_AuditEventID] DEFAULT ((0)) FOR [AuditEventID]
GO
ALTER TABLE [dbo].[DemoTable_Destination] ADD CONSTRAINT [DF_DemoTable_Destination_TimeRecorded] DEFAULT (getdate()) FOR [TimeRecorded]
GO
Step 3: Create an SSIS package to perform the bulk insert. Below is a screenshot of the data flow task of the SSIS package.
Below is a description of the data flow and the tasks it performs.
- An ADO.NET source extracts the data from the "DemoTable_Source" table.
- An int is a four-byte signed integer and a bigint is an eight-byte signed integer, so we have to add a data conversion step for the ID column (see the sketch after this list).
- The converted data from DemoTable_Source is then transferred to the DemoTable_Destination table.
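As an aside, the same conversion can also be pushed into the source query instead of a separate Data Conversion transformation. The sketch below assumes the package is configured to preserve the original identity values on the destination:

-- Alternative to a Data Conversion transformation: cast ID to bigint in the
-- ADO.NET source query itself.
SELECT CAST(ID AS bigint) AS ID,
       AuditEventID,
       TimeRecorded,
       UserName
FROM dbo.DemoTable_Source WITH (NOLOCK);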
Step 4: Rename the tables and constraints, and drop the old table and its constraints.
/* Rename tables */
EXEC sys.sp_rename N'DemoTable_Source', N'DemoTable_Source_Old'
EXEC sys.sp_rename N'DemoTable_Destination', N'DemoTable_Source'

/* Rename constraints */
EXEC sys.sp_rename N'DF_DemoTable_Source_AuditEventID', N'DF_DemoTable_Source_AuditEventID_Old'
EXEC sys.sp_rename N'DF_DemoTable_Destination_AuditEventID', N'DF_DemoTable_Source_AuditEventID'
EXEC sys.sp_rename N'DF_DemoTable_Source_TimeRecorded', N'DF_DemoTable_Source_TimeRecorded_Old'
EXEC sys.sp_rename N'DF_DemoTable_Destination_TimeRecorded', N'DF_DemoTable_Source_TimeRecorded'

/* Drop the old table and its constraints */
ALTER TABLE DemoTable_Source_Old DROP CONSTRAINT DF_DemoTable_Source_AuditEventID_Old
ALTER TABLE DemoTable_Source_Old DROP CONSTRAINT DF_DemoTable_Source_TimeRecorded_Old
DROP TABLE DemoTable_Source_Old
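Whichever approach is used, it is worth confirming that the current identity value on the switched-in table lines up with the copied data. A sketch:

-- Report (without changing) the current identity value on the new table.
DBCC CHECKIDENT ('dbo.DemoTable_Source', NORESEED);
-- If it is lower than the maximum ID in the table, it can be reseeded, e.g.:
-- DBCC CHECKIDENT ('dbo.DemoTable_Source', RESEED, <max ID in the table>);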
Conclusion
With the first approach, because we had dropped the indexes before the bulk insert, the data was not inserted in the correct order. Ensuring the data was inserted properly would have required a sort operation, which was not advisable with 50 million records. We therefore decided to use the second approach, and it worked well: even though we did not drop the indexes, the entire operation was completed in 5 hours.