Batch delete TSQL

  • Hi,

    I am trying to 'tune' one of our applications housekeeping scripts, which runs against dozens of database tables every night. The details of the tables to be maintained are stored in a master table with the names, number of days data to be kept, and a 'flag' (there are other fields, but they don't seem to be referenced by the query). Each night, a stored procedure runs that deletes data from each of the tables listed in the master table, based on the number-of-days-data-to-be-kept column, and whether the flag columns has the right value in it.

    This is my attempted re-hash of the original procedure (which I've attached to try and keep the length of the post down):

    USE [mydb]

    GO

    /****** Object: StoredProcedure [dbo].[USP_DeleteFromStoreTables] Script Date: 11/15/2011 12:02:28 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    ALTER PROCEDURE [dbo].[USP_Housekeeping]

    AS

    SET NOCOUNT ON

    DECLARE @TableName VARCHAR(100), @TableOwner VARCHAR(20), @DateColumn VARCHAR(50), @sql VARCHAR(1000), @DaysStored VARCHAR(100)

    DECLARE @IDINT

    DECLARE @MaxIDINT

    DECLARE @dbNameNVARCHAR(50)

    DECLARE @ICNTINT

    SELECT ROW_NUMBER() OVER(ORDER BY [TableName]) as [ID],

    st.TABLE_SCHEMA,

    st.TABLE_NAME,

    CONVERT(VARCHAR(50), DATEADD(DD,'-' + t.DaysStored, GETDATE()),102) AS DaysStored,

    T.DateColumn

    INTO #Store_Duration

    FROM dbo.MASTERSTORE T

    INNER JOIN INFORMATION_SCHEMA.TABLES st ON T.TableName = st.TABLE_NAME

    WHERE st.TABLE_NAME = T.TableName

    SELECT @MaxID = COUNT(ID) FROM #Store_Duration

    SET @ICNT = 1

    WHILE @ICNT <= @MaxID

    BEGIN

    SELECT @Tableowner = [TABLE_SCHEMA], @TableName = [TABLE_NAME], @DaysStored = [DaysStored], @DateColumn = [DateColumn]

    FROM #Store_duration WHERE ID = @ICNT

    if exists(select SC.* from sysobjects SO

    INNER JOIN syscolumns SC

    ON SO.id = SC.id

    where SO.Name = @TableName

    and SC.Name = @DateColumn)

    BEGIN

    SET @sql = 'DELETE FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < ' + @DaysStored

    print (@SQL)

    --EXEC (@SQL)

    END

    SET ROWCOUNT 0

    SET @ICNT = @ICNT + 1

    END

    DROP TABLE #Store_Duration

    This is the table definition for MASTERSTORE, which tracks the tables to be maintained and the criteria used for deleting the data

    USE [mydb]

    GO

    /****** Object: Table [dbo].[T_MASTER_STORE_DURATION] Script Date: 11/16/2011 10:02:27 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[MASTERSTORE](

    [TableName] [varchar](100) NOT NULL,

    [DaysStored] [int] NOT NULL,

    [DateColumn] [varchar](50) NOT NULL

    PRIMARY KEY CLUSTERED

    (

    [TableName] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    Here's some data for the MASTERSTORE table:

    INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable1',5, 'load_date')

    INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable2',5, 'load_date')

    INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable3',5, 'load_date')

    INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable4',35, 'load_date')

    INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable5',99, 'Load_Date')

    INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable6',9999, 'Load_Date')

    Typically, the MASTERSTORE table has about 175 rows in it. All of the tables being referenced from the MASTERSTORE table have different structures. The only commonality between them is the 'load_date' column (as far as I know). Some of the underlying tables have only a couple of hundred rows, and others may have millions.

    I'm not allowed to change anything other than the stored procedure that's iterating through all the tables and deleting rows based on a given data age and type.

    1) I would like to have some logic that deletes rows from the underlying tables in batches. I can't get it to work within the dynamic SQL.

    2) Please feel free to make any improvements/criticism.... I'm still learning

    Thanks for taking the time to look, and if you need any more information, please let me know.

    Andrew

  • Please post the code about the non working batch delete.

  • Hi,

    This is one of many attempts... It generates valid SQL okay, but it doesn't seem (based on the printed output) to be looping through the table until all qualifying records have been deleted.

    USE [mydb]

    GO

    /****** Object: StoredProcedure [dbo].[USP_DeleteFromStoreTables] Script Date: 11/15/2011 12:02:28 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    ALTER PROCEDURE [dbo].[USP_Houseke]

    AS

    SET NOCOUNT ON

    DECLARE @TableName VARCHAR(100), @TableOwner VARCHAR(20), @DateColumn VARCHAR(50), @sql VARCHAR(1000), @DaysStored VARCHAR(100)

    DECLARE @IDINT

    DECLARE @MaxIDINT

    DECLARE @dbNameNVARCHAR(50)

    DECLARE @ICNTINT

    SELECT ROW_NUMBER() OVER(ORDER BY [TableName]) as [ID],

    st.TABLE_SCHEMA,

    st.TABLE_NAME,

    CONVERT(VARCHAR(50), DATEADD(DD,'-' + t.DaysStored, GETDATE()),102) AS DaysStored,

    T.DateColumn

    INTO #Store_Duration

    FROM dbo.MASTERSTORE T

    INNER JOIN INFORMATION_SCHEMA.TABLES st ON T.TableName = st.TABLE_NAME

    WHERE st.TABLE_NAME = T.TableName

    SELECT @MaxID = COUNT(ID) FROM #Store_Duration

    SET @ICNT = 1

    WHILE @ICNT <= @MaxID

    BEGIN

    SELECT @Tableowner = [TABLE_SCHEMA], @TableName = [TABLE_NAME], @DaysStored = [DaysStored], @DateColumn = [DateColumn]

    FROM #Store_duration WHERE ID = @ICNT

    if exists(select SC.* from sysobjects SO

    INNER JOIN syscolumns SC

    ON SO.id = SC.id

    where SO.Name = @TableName

    and SC.Name = @DateColumn)

    BEGIN

    DECLARE @rowcount int

    SET @rowcount = 1

    DECLARE @batchsize int

    SET @batchsize = 5000

    WHILE (@rowcount > 0)

    BEGIN

    SET @sql = 'DELETE TOP ' + CONVERT(VARCHAR(10), @batchsize) + ' FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < ' + @DaysStored

    PRINT (@SQL)

    --delete top @batchsize from table where foo=1

    SET @rowcount = @@ROWCOUNT

    END

    END

    SET ROWCOUNT 0

    SET @ICNT = @ICNT + 1

    END

    DROP TABLE #Store_Duration

  • That whole section needs to be a single dynamic sql call.

    DECLARE @rowcount int

    SET @rowcount = 1

    DECLARE @batchsize int

    SET @batchsize = 5000

    WHILE (@rowcount > 0)

    BEGIN

    SET @sql = 'DELETE TOP ' + CONVERT(VARCHAR(10), @batchsize) + ' FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < ' + @DaysStored

    PRINT (@SQL)

    --delete top @batchsize from table where foo=1

    SET @rowcount = @@ROWCOUNT

    END

    P.S. You might want to try a little more than 5000 rows. Most systems are fine with a lot more.

    See how it goes but increasing the tlog backups frequency during a purge like that is not a bad idea either.

  • Hi,

    Point taken about batch size - i'll change this.

    So, I've come up with this. How's it look? Anything stand out as being rubbish? It seems to go round the tables okay.

    USE [MYDB]

    GO

    /****** Object: StoredProcedure [dbo].[USP_DeleteFromStoreTables] Script Date: 11/15/2011 12:02:28 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    ALTER PROCEDURE [dbo].[USP_Houseke]

    AS

    SET NOCOUNT ON

    DECLARE @TableName VARCHAR(100), @TableOwner VARCHAR(20), @DateColumn VARCHAR(50), @sql VARCHAR(1000), @DaysStored VARCHAR(100)

    DECLARE @IDINT

    DECLARE @MaxIDINT

    DECLARE @dbNameNVARCHAR(50)

    DECLARE @ICNTINT

    SELECT ROW_NUMBER() OVER(ORDER BY [TableName]) as [ID],

    st.TABLE_SCHEMA,

    st.TABLE_NAME,

    CONVERT(VARCHAR(50), DATEADD(DD,'-' + t.DaysStored, GETDATE()),102) AS DaysStored,

    T.DateColumn

    INTO #Store_Duration

    FROM dbo.MASTERSTORE T

    INNER JOIN INFORMATION_SCHEMA.TABLES st ON T.TableName = st.TABLE_NAME

    WHERE st.TABLE_NAME = T.TableName

    SELECT @MaxID = COUNT(ID) FROM #Store_Duration

    SET @ICNT = 1

    WHILE @ICNT <= @MaxID

    BEGIN

    SELECT @Tableowner = [TABLE_SCHEMA], @TableName = [TABLE_NAME], @DaysStored = [DaysStored], @DateColumn = [DateColumn]

    FROM #Store_duration WHERE ID = @ICNT

    if exists(select SC.* from sysobjects SO

    INNER JOIN syscolumns SC

    ON SO.id = SC.id

    where SO.Name = @TableName

    and SC.Name = @DateColumn)

    BEGIN

    SET @sql = '

    DECLARE @rowcount int

    SET @rowcount = 1

    DECLARE @batchsize int

    SET @batchsize = 5000

    WHILE (@rowcount > 0)

    BEGIN

    DELETE TOP (@batchsize) FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < '''+ @DaysStored + '''

    SET @rowcount = @@ROWCOUNT

    END'

    PRINT (@sql)

    /*

    DECLARE @rowcount int

    SET @rowcount = 1

    DECLARE @batchsize int

    SET @batchsize = 5000

    WHILE (@rowcount > 0)

    BEGIN

    SET @sql = 'DELETE TOP ' + CONVERT(VARCHAR(10), @batchsize) + ' FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < ' + @DaysStored

    PRINT (@SQL)

    SET @rowcount = @@ROWCOUNT

    END

    */

    END

    --SET ROWCOUNT 0

    SET @ICNT = @ICNT + 1

    END

    DROP TABLE #Store_Duration

    Appreciate the help

  • Looks ok but it's easy to overlook something.

    Only 1 way to tell if it works!

    1 guess what it is :hehe:

  • let someone else run it and pretend you never had anything to do with it?

  • adb2303 (11/16/2011)


    let someone else run it and pretend you never had anything to do with it?

    Sorry, all out of guesses! 😀

  • thanks for your help

  • So the test runs were successful?

  • on my sandpit database it looks to be. I'll have to wait for the app support team to tell me how much trade data I've inadvertently deleted before celebrating though....

  • adb2303 (11/16/2011)


    on my sandpit database it looks to be. I'll have to wait for the app support team to tell me how much trade data I've inadvertently deleted before celebrating though....

    It should be easy enough to export the setup table into excel, recalulate the delete date there and then compare with your sql statements.

    Quick test, easy piece of mind.

  • Also you could take a copy of the bigger tables from prod and start testing with batches side and log growth.

    Those are the only 2 things that can bite you in the arse.

  • good ideas - I'll start on that now...

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply