November 16, 2011 at 3:30 am
Hi,
I am trying to 'tune' one of our applications housekeeping scripts, which runs against dozens of database tables every night. The details of the tables to be maintained are stored in a master table with the names, number of days data to be kept, and a 'flag' (there are other fields, but they don't seem to be referenced by the query). Each night, a stored procedure runs that deletes data from each of the tables listed in the master table, based on the number-of-days-data-to-be-kept column, and whether the flag columns has the right value in it.
This is my attempted re-hash of the original procedure (which I've attached to try and keep the length of the post down):
USE [mydb]
GO
/****** Object: StoredProcedure [dbo].[USP_DeleteFromStoreTables] Script Date: 11/15/2011 12:02:28 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[USP_Housekeeping]
AS
SET NOCOUNT ON
DECLARE @TableName VARCHAR(100), @TableOwner VARCHAR(20), @DateColumn VARCHAR(50), @sql VARCHAR(1000), @DaysStored VARCHAR(100)
DECLARE @IDINT
DECLARE @MaxIDINT
DECLARE @dbNameNVARCHAR(50)
DECLARE @ICNTINT
SELECT ROW_NUMBER() OVER(ORDER BY [TableName]) as [ID],
st.TABLE_SCHEMA,
st.TABLE_NAME,
CONVERT(VARCHAR(50), DATEADD(DD,'-' + t.DaysStored, GETDATE()),102) AS DaysStored,
T.DateColumn
INTO #Store_Duration
FROM dbo.MASTERSTORE T
INNER JOIN INFORMATION_SCHEMA.TABLES st ON T.TableName = st.TABLE_NAME
WHERE st.TABLE_NAME = T.TableName
SELECT @MaxID = COUNT(ID) FROM #Store_Duration
SET @ICNT = 1
WHILE @ICNT <= @MaxID
BEGIN
SELECT @Tableowner = [TABLE_SCHEMA], @TableName = [TABLE_NAME], @DaysStored = [DaysStored], @DateColumn = [DateColumn]
FROM #Store_duration WHERE ID = @ICNT
if exists(select SC.* from sysobjects SO
INNER JOIN syscolumns SC
ON SO.id = SC.id
where SO.Name = @TableName
and SC.Name = @DateColumn)
BEGIN
SET @sql = 'DELETE FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < ' + @DaysStored
print (@SQL)
--EXEC (@SQL)
END
SET ROWCOUNT 0
SET @ICNT = @ICNT + 1
END
DROP TABLE #Store_Duration
This is the table definition for MASTERSTORE, which tracks the tables to be maintained and the criteria used for deleting the data
USE [mydb]
GO
/****** Object: Table [dbo].[T_MASTER_STORE_DURATION] Script Date: 11/16/2011 10:02:27 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[MASTERSTORE](
[TableName] [varchar](100) NOT NULL,
[DaysStored] [int] NOT NULL,
[DateColumn] [varchar](50) NOT NULL
PRIMARY KEY CLUSTERED
(
[TableName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
Here's some data for the MASTERSTORE table:
INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable1',5, 'load_date')
INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable2',5, 'load_date')
INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable3',5, 'load_date')
INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable4',35, 'load_date')
INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable5',99, 'Load_Date')
INSERT INTO MASTERSTORE (TableName, DaysStored, DateColumn) VALUES ('MyTable6',9999, 'Load_Date')
Typically, the MASTERSTORE table has about 175 rows in it. All of the tables being referenced from the MASTERSTORE table have different structures. The only commonality between them is the 'load_date' column (as far as I know). Some of the underlying tables have only a couple of hundred rows, and others may have millions.
I'm not allowed to change anything other than the stored procedure that's iterating through all the tables and deleting rows based on a given data age and type.
1) I would like to have some logic that deletes rows from the underlying tables in batches. I can't get it to work within the dynamic SQL.
2) Please feel free to make any improvements/criticism.... I'm still learning
Thanks for taking the time to look, and if you need any more information, please let me know.
Andrew
November 16, 2011 at 5:00 am
Please post the code about the non working batch delete.
November 16, 2011 at 5:24 am
Hi,
This is one of many attempts... It generates valid SQL okay, but it doesn't seem (based on the printed output) to be looping through the table until all qualifying records have been deleted.
USE [mydb]
GO
/****** Object: StoredProcedure [dbo].[USP_DeleteFromStoreTables] Script Date: 11/15/2011 12:02:28 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[USP_Houseke]
AS
SET NOCOUNT ON
DECLARE @TableName VARCHAR(100), @TableOwner VARCHAR(20), @DateColumn VARCHAR(50), @sql VARCHAR(1000), @DaysStored VARCHAR(100)
DECLARE @IDINT
DECLARE @MaxIDINT
DECLARE @dbNameNVARCHAR(50)
DECLARE @ICNTINT
SELECT ROW_NUMBER() OVER(ORDER BY [TableName]) as [ID],
st.TABLE_SCHEMA,
st.TABLE_NAME,
CONVERT(VARCHAR(50), DATEADD(DD,'-' + t.DaysStored, GETDATE()),102) AS DaysStored,
T.DateColumn
INTO #Store_Duration
FROM dbo.MASTERSTORE T
INNER JOIN INFORMATION_SCHEMA.TABLES st ON T.TableName = st.TABLE_NAME
WHERE st.TABLE_NAME = T.TableName
SELECT @MaxID = COUNT(ID) FROM #Store_Duration
SET @ICNT = 1
WHILE @ICNT <= @MaxID
BEGIN
SELECT @Tableowner = [TABLE_SCHEMA], @TableName = [TABLE_NAME], @DaysStored = [DaysStored], @DateColumn = [DateColumn]
FROM #Store_duration WHERE ID = @ICNT
if exists(select SC.* from sysobjects SO
INNER JOIN syscolumns SC
ON SO.id = SC.id
where SO.Name = @TableName
and SC.Name = @DateColumn)
BEGIN
DECLARE @rowcount int
SET @rowcount = 1
DECLARE @batchsize int
SET @batchsize = 5000
WHILE (@rowcount > 0)
BEGIN
SET @sql = 'DELETE TOP ' + CONVERT(VARCHAR(10), @batchsize) + ' FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < ' + @DaysStored
PRINT (@SQL)
--delete top @batchsize from table where foo=1
SET @rowcount = @@ROWCOUNT
END
END
SET ROWCOUNT 0
SET @ICNT = @ICNT + 1
END
DROP TABLE #Store_Duration
November 16, 2011 at 5:33 am
That whole section needs to be a single dynamic sql call.
DECLARE @rowcount int
SET @rowcount = 1
DECLARE @batchsize int
SET @batchsize = 5000
WHILE (@rowcount > 0)
BEGIN
SET @sql = 'DELETE TOP ' + CONVERT(VARCHAR(10), @batchsize) + ' FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < ' + @DaysStored
PRINT (@SQL)
--delete top @batchsize from table where foo=1
SET @rowcount = @@ROWCOUNT
END
P.S. You might want to try a little more than 5000 rows. Most systems are fine with a lot more.
See how it goes but increasing the tlog backups frequency during a purge like that is not a bad idea either.
November 16, 2011 at 5:53 am
Hi,
Point taken about batch size - i'll change this.
So, I've come up with this. How's it look? Anything stand out as being rubbish? It seems to go round the tables okay.
USE [MYDB]
GO
/****** Object: StoredProcedure [dbo].[USP_DeleteFromStoreTables] Script Date: 11/15/2011 12:02:28 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[USP_Houseke]
AS
SET NOCOUNT ON
DECLARE @TableName VARCHAR(100), @TableOwner VARCHAR(20), @DateColumn VARCHAR(50), @sql VARCHAR(1000), @DaysStored VARCHAR(100)
DECLARE @IDINT
DECLARE @MaxIDINT
DECLARE @dbNameNVARCHAR(50)
DECLARE @ICNTINT
SELECT ROW_NUMBER() OVER(ORDER BY [TableName]) as [ID],
st.TABLE_SCHEMA,
st.TABLE_NAME,
CONVERT(VARCHAR(50), DATEADD(DD,'-' + t.DaysStored, GETDATE()),102) AS DaysStored,
T.DateColumn
INTO #Store_Duration
FROM dbo.MASTERSTORE T
INNER JOIN INFORMATION_SCHEMA.TABLES st ON T.TableName = st.TABLE_NAME
WHERE st.TABLE_NAME = T.TableName
SELECT @MaxID = COUNT(ID) FROM #Store_Duration
SET @ICNT = 1
WHILE @ICNT <= @MaxID
BEGIN
SELECT @Tableowner = [TABLE_SCHEMA], @TableName = [TABLE_NAME], @DaysStored = [DaysStored], @DateColumn = [DateColumn]
FROM #Store_duration WHERE ID = @ICNT
if exists(select SC.* from sysobjects SO
INNER JOIN syscolumns SC
ON SO.id = SC.id
where SO.Name = @TableName
and SC.Name = @DateColumn)
BEGIN
SET @sql = '
DECLARE @rowcount int
SET @rowcount = 1
DECLARE @batchsize int
SET @batchsize = 5000
WHILE (@rowcount > 0)
BEGIN
DELETE TOP (@batchsize) FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < '''+ @DaysStored + '''
SET @rowcount = @@ROWCOUNT
END'
PRINT (@sql)
/*
DECLARE @rowcount int
SET @rowcount = 1
DECLARE @batchsize int
SET @batchsize = 5000
WHILE (@rowcount > 0)
BEGIN
SET @sql = 'DELETE TOP ' + CONVERT(VARCHAR(10), @batchsize) + ' FROM [' + @TableOwner + '].[' + @TableName + '] WHERE [' + @DateColumn + '] < ' + @DaysStored
PRINT (@SQL)
SET @rowcount = @@ROWCOUNT
END
*/
END
--SET ROWCOUNT 0
SET @ICNT = @ICNT + 1
END
DROP TABLE #Store_Duration
Appreciate the help
November 16, 2011 at 6:01 am
Looks ok but it's easy to overlook something.
Only 1 way to tell if it works!
1 guess what it is :hehe:
November 16, 2011 at 6:38 am
let someone else run it and pretend you never had anything to do with it?
November 16, 2011 at 6:43 am
adb2303 (11/16/2011)
let someone else run it and pretend you never had anything to do with it?
Sorry, all out of guesses! 😀
November 16, 2011 at 6:46 am
thanks for your help
November 16, 2011 at 6:46 am
So the test runs were successful?
November 16, 2011 at 6:56 am
on my sandpit database it looks to be. I'll have to wait for the app support team to tell me how much trade data I've inadvertently deleted before celebrating though....
November 16, 2011 at 6:59 am
adb2303 (11/16/2011)
on my sandpit database it looks to be. I'll have to wait for the app support team to tell me how much trade data I've inadvertently deleted before celebrating though....
It should be easy enough to export the setup table into excel, recalulate the delete date there and then compare with your sql statements.
Quick test, easy piece of mind.
November 16, 2011 at 7:00 am
Also you could take a copy of the bigger tables from prod and start testing with batches side and log growth.
Those are the only 2 things that can bite you in the arse.
November 16, 2011 at 7:28 am
good ideas - I'll start on that now...
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply