March 5, 2014 at 4:49 am
This is my current stored procedure; it is taking too long to finish the operation.
Any suggestions to improve this?
There are around 25 _TESTDATA tables, each with around 8 million records. I want to delete the records that are older than 30 days.
CREATE PROCEDURE sp_delete_by_date
    @pdate AS int,
    @nrp bit = 1
AS
SET DEADLOCK_PRIORITY LOW
SET IMPLICIT_TRANSACTIONS ON

DECLARE @i tinyint
DECLARE @Table_Name varchar(50)
-- @pdate is int, so it must be cast to varchar before string concatenation;
-- EXEC() only accepts variables and literals, not expressions, so cast it up front
DECLARE @pdate_str varchar(12)
SET @pdate_str = CAST(@pdate AS varchar(12))

-- collect the names of the child _TESTDATA tables
CREATE TABLE #MC (
    ID int IDENTITY(1,1) NOT NULL,
    Table_Name varchar(50) NOT NULL)

INSERT INTO #MC
SELECT object_name(fkeyid) FROM sysforeignkeys
WHERE object_name(fkeyid) LIKE '%_TESTDATA'
  AND object_name(fkeyid) NOT LIKE 'TMP_%'

-- delete from each child table, committing per table
SET @i = 1
WHILE @i <= (SELECT MAX(ID) FROM #MC)
BEGIN
    SELECT @Table_Name = Table_Name FROM #MC WHERE ID = @i
    EXEC ('DELETE FROM ' + @Table_Name + ' WHERE RUN_START_TIME <= ' + @pdate_str)
    COMMIT TRAN
    SET @i = @i + 1
END

IF @nrp <> 1
BEGIN
    WAITFOR DELAY '00:00:00.800'
END

DELETE TESTDATA WHERE RUN_START_TIME <= @pdate
COMMIT TRAN

IF @nrp <> 1
BEGIN
    WAITFOR DELAY '00:00:00.200'
END

DELETE TR WHERE RUN_END_TIME <= @pdate
COMMIT TRAN

DROP TABLE #MC
COMMIT TRAN
Do we really need a COMMIT after every DELETE statement, or can we have it as one common statement, something like:
IF @@ERROR != 0
BEGIN
ROLLBACK TRANSACTION
RETURN
END
ELSE
BEGIN
COMMIT TRANSACTION
END
March 5, 2014 at 5:40 am
If you commit after every row, it will run slower, but it also allows you to manage your log size. If it's one big commit, it's one big transaction that must be maintained in the log.
In terms of making this faster, I'd look at deleting batches of rows, perhaps in the 10,000-100,000 size range to speed things up. That often gives you a good balance between one large transaction and 8mm small ones.
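Something along these lines, as a rough sketch (SQL 2005 or later; the 50,000 batch size and the @pdate cutoff are illustrative and worth tuning on your system):
DECLARE @pdate int, @BatchSize int
SET @pdate = 20140203   -- illustrative int cutoff, matching the procedure's @pdate type
SET @BatchSize = 50000  -- tune in the 10,000-100,000 range

WHILE 1 = 1
BEGIN
    -- each DELETE commits on its own, so no single giant transaction builds up in the log
    DELETE TOP (@BatchSize) FROM TESTDATA
    WHERE RUN_START_TIME <= @pdate

    IF @@ROWCOUNT < @BatchSize BREAK -- last partial batch; nothing older remains
END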
March 5, 2014 at 6:02 am
So Steve, you are suggesting something like:
EXEC ('DELETE TOP (10000) FROM ' + @Table_Name + ' WHERE RUN_START_TIME <= ' + @pdate_str)
COMMIT TRAN
DELETE TOP (10000) TESTDATA WHERE RUN_START_TIME <= @pdate
COMMIT TRAN
DELETE TOP (10000) TR WHERE RUN_END_TIME <= @pdate
COMMIT TRAN
In case of a lower version of SQL Server, say SQL Server 2000, how can we manage it?
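Would SET ROWCOUNT be the way to do it there? Something like this sketch (SET ROWCOUNT was the pre-2005 way to cap the rows a DELETE touches):
SET ROWCOUNT 10000 -- cap every following data-modification statement at 10,000 rows

WHILE 1 = 1
BEGIN
    DELETE FROM TESTDATA WHERE RUN_START_TIME <= @pdate
    IF @@ROWCOUNT = 0 BREAK -- nothing left under the cutoff
END

SET ROWCOUNT 0 -- always reset, or later statements stay capped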
March 5, 2014 at 6:36 am
yuvipoy,
we have similar requirements for our real-time systems. We need to delete millions of rows and not cause locking that would stop transactions. I wrote a delete loop that works for us.
--Start by getting the records to delete
SELECT ID
INTO #DeleteKeys
FROM TESTDATA WITH (NOLOCK)
WHERE RUN_START_TIME <= @pdate

--loop thru small batches to avoid excessive locking
CREATE TABLE #DeleteBatch (ID int)
DECLARE @RowCount int
SET @RowCount = 1000

WHILE @RowCount = 1000
BEGIN
    -- pull the next batch of keys out of the work list
    DELETE TOP (1000) FROM #DeleteKeys
    OUTPUT DELETED.ID INTO #DeleteBatch

    SET @RowCount = @@ROWCOUNT

    -- delete that batch from the base table
    DELETE TD
    FROM TESTDATA TD
    INNER JOIN #DeleteBatch DB
        ON TD.ID = DB.ID

    TRUNCATE TABLE #DeleteBatch
END
March 5, 2014 at 11:25 pm
Hi EricEyster,
After having the batch operation, is the delete operation faster?
How much time did you gain after rewriting your query?
In case of a lower version of SQL Server, say SQL Server 2000, how can we manage it?
March 6, 2014 at 6:00 am
yuvipoy (3/5/2014)
Hi EricEyster, after having the batch operation, is the delete operation faster?
How much time did you gain after rewriting your query?
In case of a lower version of SQL Server, say SQL Server 2000, how can we manage it?
You should see at least a 10x performance improvement versus single record delete. It will vary depending on how big you make the batch size for the delete. We have to decrease the size on some systems and that slows the delete. Others, we can go as high as 100,000 and get great performance.
If you are using SQL 2000, then you do not have the OUTPUT clause. I updated the code to work without it. It does require an extra step, but it should not drastically change the performance on a reasonably sized delete:
declare @pdate datetime
set @pdate = getdate() - 7

--Start by getting the records to delete
SELECT ID
INTO #DeleteKeys
FROM TESTDATA WITH (NOLOCK)
WHERE RUN_START_TIME <= @pdate

--loop thru small batches to avoid excessive locking
create table #DeleteBatch (ID int)
declare @RowCount int
SET @RowCount = 1000

WHILE @RowCount = 1000
BEGIN
    -- SQL 2000 syntax: no parentheses after TOP
    INSERT INTO #DeleteBatch
    SELECT TOP 1000 ID
    FROM #DeleteKeys

    SET @RowCount = @@ROWCOUNT

    -- remove this batch's keys from the work list
    DELETE DK
    FROM #DeleteKeys DK
    JOIN #DeleteBatch DB
        ON DK.ID = DB.ID

    -- delete the batch from the base table
    DELETE TD
    FROM TESTDATA TD
    INNER JOIN #DeleteBatch DB
        ON TD.ID = DB.ID

    TRUNCATE TABLE #DeleteBatch
END
March 10, 2014 at 2:57 pm
Sorry for the delay. I typically do what Eric has written here, even back in SQL 2000/7, but I might run 3-4 batches manually at different sizes and time them to understand what the best size might be for my system.
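For example, something as simple as this sketch gives you a per-batch timing to compare candidate sizes (the 25,000 is just one candidate; it assumes @pdate is declared as in the earlier posts):
DECLARE @t datetime
SET @t = GETDATE()

DELETE TOP (25000) FROM TESTDATA
WHERE RUN_START_TIME <= @pdate

-- @@ROWCOUNT here still reflects the DELETE above
SELECT DATEDIFF(ms, @t, GETDATE()) AS batch_ms, @@ROWCOUNT AS rows_deleted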
March 17, 2014 at 6:29 am
declare @pdate datetime
set @pdate = getdate()-7
--Start by getting the records to delete
SELECT ID
INTO #DeleteKeys
FROM TESTDATA WITH (NOLOCK)
WHERE RUN_START_TIME <=@pdate
If I have to refer to another table for the delete, will this query be the better choice:
New Code 1:
declare @pdate datetime
set @pdate = getdate()-7
--Start by getting the records to delete
SELECT a.ID
INTO #DeleteKeys
FROM TESTDATA as a WITH (NOLOCK)
WHERE a.id not in(Select b.id from another_table as b )
and a.RUN_START_TIME <=@pdate
or this query?
New Code 2:
declare @pdate datetime
set @pdate = getdate()-7
--Start by getting the records to delete
SELECT a.ID
INTO #DeleteKeys
FROM TESTDATA as a WITH (NOLOCK)
WHERE not exists(Select b.id from another_table as b where a.id =b.id )
and a.RUN_START_TIME <=@pdate
NOT EXISTS vs. NOT IN.
The id field in both tables is a NOT NULL column.
March 17, 2014 at 6:40 am
yuvipoy (3/17/2014)
NOT EXISTS vs. NOT IN. The id field in both tables is a NOT NULL column.
No difference.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 17, 2014 at 7:12 am
Table A has 100k records.
Table B has 8,000k records.
When I delete from Table B where a Table A record exists, the query takes a long time; that's why I came across NOT EXISTS vs. NOT IN.
How about a LEFT OUTER JOIN, something like the sketch below?
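A sketch of the LEFT JOIN form, with the same tables and assumptions as the two queries above (b.id IS NULL replaces the NOT IN/NOT EXISTS test):
SELECT a.ID
INTO #DeleteKeys
FROM TESTDATA AS a WITH (NOLOCK)
LEFT OUTER JOIN another_table AS b
    ON a.id = b.id
WHERE b.id IS NULL -- keep only rows with no match in another_table
  AND a.RUN_START_TIME <= @pdate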
Thanks!
March 17, 2014 at 7:16 am
yuvipoy (3/17/2014)
When I delete from Table B where a Table A record exists, the query takes a long time; that's why I came across NOT EXISTS vs. NOT IN. How about a LEFT OUTER JOIN?
Very slightly slower than not in/not exists in my tests.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 17, 2014 at 7:37 am
Is there a better way to delete the records while checking against another table?
March 20, 2014 at 7:24 am
EricEyster (3/6/2014)
You should see at least a 10x performance improvement versus single record delete. It will vary depending on how big you make the batch size for the delete. We have to decrease the size on some systems and that slows the delete. Others, we can go as high as 100,000 and get great performance.
If you are using SQL 2000, then you do not have the OUTPUT clause. I updated the code to work without it.
I tried your code, but it is taking more time than the normal delete I am doing at present.
My column is a uniqueidentifier, not an INT as you created in the #Delete tables.
Joining on the uniqueidentifier to delete some 50k records out of 5 million is taking more than 15 minutes and locking the main table.
The uniqueidentifier column is a nonclustered primary key, and the WHERE-condition column is the clustered one.
When I run sp_lock and look at the SPID of the delete transaction's session, it shows many EXT, TAB, and RID locks, some 300 records for the transaction.
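Would indexing the temp tables help? A sketch of what I mean, assuming the key column is a uniqueidentifier named ID:
-- after the SELECT ... INTO #DeleteKeys, index the keys so each batch join can seek
CREATE CLUSTERED INDEX IX_DeleteKeys ON #DeleteKeys (ID)

-- and declare the batch table with the real column type (uniqueidentifier, not int)
CREATE TABLE #DeleteBatch (ID uniqueidentifier NOT NULL PRIMARY KEY)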
March 27, 2014 at 7:55 am
The delete operation takes more or less the same time; I don't see any improvement when I use the batch delete to remove some 1 million records.
Actually, the approach is taking more time than the current one.