April 11, 2005 at 3:28 pm
I need to delete millions of rows older than x days from a table. I broke this up into smaller delete batches, like the following.
SET ROWCOUNT 1000 -- limit each DELETE to 1000 rows
WHILE 1 = 1
BEGIN
delete from events
where datediff(day, EVENT_DATE, getdate()) > 10
IF @@ROWCOUNT = 0
BREAK
END
SET ROWCOUNT 0 -- restore the default (no limit)
Would this lock the table at all and affect INSERT Statements or SELECT Statements? Thanks.
April 11, 2005 at 4:51 pm
Only for the duration of each individual delete. That is, DML from other connections will pause and then continue after each pass of the loop.
April 12, 2005 at 1:10 pm
You're on the right track, pop, but I've found that the "Set RowCount" option causes unnecessary row locking anyway, so you're back in the same boat with this approach. Here's what I use instead (this is just an example, for clarity of purpose only, so don't ding me on names).
declare @SeedDelete Table (PrimaryKeyID Int Not Null Primary Key)
While 1 = 1
BEGIN
-- grab the keys for the next batch of rows to delete
Insert @SeedDelete
select top 1000 PrimaryKeyID
From [Table]
where Status = 'Needs Delete'
if @@RowCount = 0 Break
-- delete only the rows whose keys were captured above
delete t
From @SeedDelete s
JOIN [Table] t on s.PrimaryKeyID = t.PrimaryKeyID
-- empty the table variable for the next pass
delete @SeedDelete
END
Signature is NULL
April 12, 2005 at 1:40 pm
Calvin is absolutely correct: SET ROWCOUNT should be avoided, especially for modifying statements. Even though it has the same effect as TOP (as in the example Calvin uses), SET ROWCOUNT is implemented differently and is not recognised by the optimizer in the same way. This means it might 'kick in' later in the execution, causing SQL Server to still take a lot of locks.
Also, in SQL Server 2005 SET ROWCOUNT is deprecated for modifying statements, meaning it will have no effect on them in the next version of SQL Server.
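For reference, a TOP-based version of the batched delete might look something like the sketch below (this assumes SQL Server 2005, where DELETE TOP (n) is supported; the events table and EVENT_DATE column are taken from the original post):
WHILE 1 = 1
BEGIN
-- delete the next batch of at most 1000 qualifying rows
DELETE TOP (1000) FROM events
WHERE datediff(day, EVENT_DATE, getdate()) > 10
IF @@ROWCOUNT = 0 BREAK
END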
April 13, 2005 at 6:02 am
To back up to Pop's original question: Yes and yes.
As written, it's essentially the same as deleting the millions of rows in one go. It's very likely that the table will be locked for the duration of the entire delete; inserts will be affected, and possibly selects as well (unless they are dirty reads).
What Pop is trying to do is to chunk it up, but you need a few more bits thrown in the loop.
Each delete must be within its own transaction. As written, all of it is one giant implicit tran, so locks and resources would be held for the duration. After each delete, if all went well, you should commit. When you do, locks are released, and other users have an opportunity to 'sneak by'.
It can also be beneficial to manage the log in between each delete iteration. If log size is an issue, it must be backed up in between in order to be truncated and keep the file size down. For this to be possible there can be no open transactions either. Just look out, if you choose to dump with no_log, that it doesn't invalidate the 'ordinary' backup strategy.
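To illustrate, the in-between log management could look something like this (a minimal sketch only; the database name EventsDB and the backup path are made-up names, and the TRUNCATE_ONLY option is the no_log-style dump that breaks the ordinary log backup chain):
-- after each committed batch, back up the log so it can be truncated
BACKUP LOG EventsDB TO DISK = 'D:\Backup\EventsDB_log.trn'
-- or, if point-in-time recovery doesn't matter, truncate only:
-- BACKUP LOG EventsDB WITH TRUNCATE_ONLY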
/Kenneth
April 13, 2005 at 8:55 am
Kenneth, could you elaborate on the following?
>Each delete must be within it's own transaction. As written, all is one giant implicit tran, thus locks and resources would be held for the duration.
I tried with the example below, and did not get that result. Am I doing something wrong? While the batch is running I switch to a second QA window and execute sp_lock, and I do not see any locks building up.
---
if object_id('orders') is not null
drop table orders
go
select * into orders from northwind.dbo.orders
go
/* Uncomment transaction handling code to run in single transaction */
-- begin tran
SET ROWCOUNT 80
WHILE 1 = 1
BEGIN
delete from orders where datediff(d, orderdate, '1998-02-01') > 10
IF @@ROWCOUNT = 0
BREAK
waitfor delay '00:00:01.000'
END
SET ROWCOUNT 0
-- commit tran
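For reference, from the second window I check the locks for just the looping session (the SPID value 55 below is only an example; it would be whatever SELECT @@SPID returns in the batch window):
-- show only the locks held by the session running the delete loop
EXEC sp_lock @spid1 = 55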
April 14, 2005 at 7:31 am
Chris, I stand corrected. Apparently it isn't quite as bad as I first thought when wrapping the delete in rowcount. There are still TAB X locks taken, but they're pretty hard to catch, as they seem to be released pretty quickly as well...
Did some very unscientific testing on a 9 million row table, deleting every third row...
I go have a beer now
/Kenneth
April 14, 2005 at 8:08 am
Ok, thanks for the verification.
April 14, 2005 at 8:25 am
Good discussion, guys! Thanks for all the replies. So, I guess I found what I was looking for, and I needed to explicitly commit the tran.
DECLARE @x INT
SET @x = 1
SET ROWCOUNT 1000
WHILE @x > 0
BEGIN
BEGIN TRAN
delete from events
where datediff(day, EVENT_DATE, getdate()) > 7
SET @x = @@rowcount
waitfor delay '00:00:01.000'
COMMIT TRAN
END
Thanks again.
-Pat
April 14, 2005 at 9:31 am
I would move the delay statement after the COMMIT; otherwise you will be holding the locks for too long!
...
WHILE @x > 0
BEGIN
BEGIN TRAN
delete from events
where datediff(day, EVENT_DATE, getdate()) > 7
SET @x = @@rowcount
COMMIT TRAN
waitfor delay '00:00:01.000'
END
hth
* Noel
April 14, 2005 at 9:34 am
I think that's meant to give the log time to truncate, but I guess it probably truncates after the COMMIT anyway, so maybe you're right.
April 14, 2005 at 9:37 am
Yes, that's the purpose of the delay, but you have an explicit transaction open that you should commit ASAP.
* Noel
April 14, 2005 at 9:41 am
Whoops! I read your reply too quickly and misread it. Sorry. Yes, you're right, and thanks for the tip.
-Pat
April 14, 2005 at 12:42 pm
Also, I'm assuming the same principle could be used for large INSERTS as well, right?
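Something like this is what I have in mind for the insert side (the staging table events_staging and the EVENT_ID column are just placeholder names for this example; EVENT_DATE is from the table above):
DECLARE @x INT
SET @x = 1
WHILE @x > 0
BEGIN
BEGIN TRAN
-- move the next batch of staged rows that are not yet in the target table
INSERT INTO events (EVENT_ID, EVENT_DATE)
SELECT TOP 1000 s.EVENT_ID, s.EVENT_DATE
FROM events_staging s
WHERE NOT EXISTS (SELECT 1 FROM events e WHERE e.EVENT_ID = s.EVENT_ID)
SET @x = @@ROWCOUNT
COMMIT TRAN
WAITFOR DELAY '00:00:01.000'
END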