Problem
Your SQL Server production database must be available 7x24 but it contains millions of obsolete rows which hurt performance. How do you delete these rows without the lock contention and business impact of a standard delete? In this tip we look at how you can delete massive number of rows in small minimal lock batches.
Solution
Quick OLTP is critical to a business’s success so DBAs need to mindful of data contention caused by application cleanup jobs. One approach is to select a huge number of keys in one pass then break the cleanup into small batches with commits and sleep windows to minimize contention. I recommend enabling the cleanup in a SQL job and run at low system usage time.
This is accomplished with the T-SQL code below. You would customize the definition of obsolete rows (I call deadheads) to your shop. In this case a sale without money and no contact is obsolete. Historical cleanup is a common use of this routine. This version works with multiple tables in this case a parent table and its two child tables. The column SalesID relates the tables.
Two layers of temporary tables are used: the outer table to contain all the obsolete primary keys for this run; the inner table to contain a subset of the same. The number of keys placed in the temporary tables is best determined by altering the number of rows until the highest rows deleted per minute is reached. I started with 1,000 rows in the outer table and 100 in the inner table. I slowly scaled up to 500,000 rows in the outer table and 5,000 rows in the inner table. 25,316 deletions per minute was the peak value. Beyond 500,000 rows in the outer table the deletion per minute value declined on my server.
Since the application tables are defined with primary keys using the identity property the rows are inserted with ever increasing primary key values making it easy to pick up the oldest rows via SELECT top. Your case may be based upon a date column so alter the SELECT top 500000 accordingly.
The code loops through the 500,000 row outer table loading 5,000 keys values into the inner table. Those 5,000 rows are then deleted from the application tables and outer table. The last step is to truncating the inner table. This loop continues until all 500,000 outer table rows are deleted. I find keeping the inner table small (5000) minimizes lock contention but you will need to adjust this number to your shop.
If desired use the WAITFOR command to pause processing and allow other applications to process without contending with cleanup.
At any point the job can be cancelled resulting in a brief rollback only for the last inner table set of keys. Committed deletes are not rolled back.
The job can be restarted from the top without issue and can be run as many times a day as needed. I run this job two times a night on tables needing a quick cleanup then pull back to once a week to maintain optimal performance.
The Code
USE [msdb] GO /****** Object: StoredProcedure [dbo].[usp_ObsoleteRowCleanup] Script Date: 12/18/2011 14:00:38 ******/SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO /******************************************************************** File Name: usp_ ObsoleteRowCleanup.sql Applies to: SQL Server 2005/8 Purpose: To cleanup obsolete rows. Prerequisite: None. Assumption: Removal of obsolete rows improves performance Parameters: None Returns: 0 = Succesful, 1 = Errors. Author: Edward A. Polley Spec -- Definition of deadhead. Select COUNT(*) from SalesLead L, Transaction T WHERE L.SalesID = T.SalesID AND L.ContactId IS NULL AND T.MoneyIn = 0 ********************************************************************/CREATE PROCEDURE [dbo].[usp_ ObsoleteRowCleanup] --Will delete first 500000 deadheads --Delete is from child tables: SalesZip and Transaction then parent SalesLead -- AS SET NOCOUNT ON SET QUOTED_IDENTIFIER OFF DECLARE @DeadHeadCnt int, @Toprow int BEGIN PRINT 'Building Outer Temp Tbl ' PRINT getdate() CREATE TABLE #OuterTemp(SalesID bigint) -- hold all deadhead keys CREATE INDEX IX_OuterTemp_SalesId on #OuterTemp (SalesID) CREATE TABLE #InnerTemp(SalesID bigint) -- hold subset of deadhead keys CREATE INDEX IX_InnerTemp_SalesID on #InnerTemp (SalesID) INSERT INTO #OuterTemp -- select first 500,000 keys Select TOP 500000 L.SalesID from YourDB.dbo.SalesLead L (Nolock), YourDB.dbo.Transaction (Nolock) T WHERE L.SalesID = T.SalesID AND L.LeadContactId IS NULL -–substitute your condition AND T.MoneyIn = 0 OPTION (MAXDOP 8); -- allow parallel processing -- mainline Select @DeadHeadCnt = count(*)From #OuterTemp PRINT 'Number of Dead Heads' PRINT @DeadHeadCnt While @DeadHeadCnt > 0 BEGIN PRINT 'Begins deletes ' PRINT getdate() PRINT 'Top row ' SELECT @toprow = min(SalesID) From #OuterTemp -- ever incr key PRINT @toprow BEGIN TRANSACTION INSERT INTO #InnerTemp -- experiment with this value Select TOP 5000 SalesID from #OuterTemp -- Delete from application tables by processing small intertemp DELETE from YourDB.dbo.SalesZIP where SalesID in (Select SalesID from #InnerTemp (nolock) ); --innertemp DELETE from YourDB.dbo.Transaction where SalesID in (Select SalesID from #InnerTemp (nolock) ) ; DELETE from YourDB.dbo.SalesLead WHERE SalesID in (Select SalesID from #InnerTemp (nolock) ); --- Delete from outer table DELETE from #OuterTemp WHERE SalesID in (Select SalesID from #InnerTemp (nolock) ) ; Select @DeadHeadCnt = count(*)From #OuterTemp -- reset loop cnt TRUNCATE TABLE #InnerTemp -- cleanup working table COMMIT --WAITFOR DELAY '00:00:01' -- adjust as needed PRINT 'End Deletes' END -- Destroy all temporary tables. IF OBJECT_ID('tempdb..#OuterTemp') IS NOT NULL DROP TABLE #OuterTemp IF OBJECT_ID('tempdb..#INNERTEMP') IS NOT NULL DROP TABLE #InnerTemp END GO