August 29, 2007 at 10:29 am
Hi,
I need to delete about 10 million rows from a 30-column table containing 30 million rows, using the following code.
The DELETE takes more than 25 mins. Is there anything I can do to speed it up?
Thanks
BEGIN TRAN
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

DECLARE @start_id int
DECLARE @end_id int
DECLARE @total int

SET @start_id = (select min(id) from table)
SET @end_id = @start_id + 1000000
SET @total = (select max(id) from table)

WHILE @end_id <= @total
BEGIN
    DELETE table with(rowlock)
    WHERE ID between @start_id and @end_id and [month] < '01/01/2006'
    OPTION (MAXDOP 1)

    set @start_id = @end_id + 1
    set @end_id = @start_id + 1000000
END

COMMIT TRAN
August 29, 2007 at 10:45 am
Couple of things to check:
What indexes are on the table? Indexes have to be maintained as rows are deleted, which slows the process down. An option to consider would be to drop the indexes and re-create them after the deletes (a sketch follows below).
Are there foreign keys with delete constraints?
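For example (all object and index names below are placeholders, not from the original post), dropping a nonclustered index for the duration of the delete and re-creating it afterwards could look something like this on SQL 2000:

DROP INDEX tblBigTable.IX_tblBigTable_month

-- ... run the batched DELETE here ...

CREATE INDEX IX_tblBigTable_month ON tblBigTable ([month])

The foreign-key question can be checked with EXEC sp_fkeys @pktable_name = 'tblBigTable', which lists the foreign keys that reference the table.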
August 29, 2007 at 11:01 am
Thanks for your reply.
The table has a clustered index. Will dropping it speed up the DELETE?
No FK constraints on the table.
August 29, 2007 at 12:41 pm
Leave the clustered index. You may need to add an index on the month column.
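For instance, reusing the placeholder names from the code in the first post, a nonclustered index to support the delete's search on the date column might look like:

CREATE INDEX IX_month ON [table] ([month])

Whether it pays off is something to test: it can help the optimizer find the pre-2006 rows in each chunk, but the index itself also has to be maintained while the rows are deleted.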
Also, the log size will grow. This might give you some more insight:
http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=65&messageid=222360
August 29, 2007 at 1:48 pm
Have you considered making smaller transactions? The incremental deletes you're doing get committed only at the very end of the transaction. Something like:
DECLARE @start_id int
DECLARE @end_id int
DECLARE @total int

SET @start_id = (select min(id) from table)
SET @end_id = @start_id + 1000000
SET @total = (select max(id) from table)

WHILE @end_id <= @total
BEGIN
    BEGIN TRAN
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

    DELETE table with(rowlock)
    WHERE ID between @start_id and @end_id and [month] < '01/01/2006'
    OPTION (MAXDOP 1)

    set @start_id = @end_id + 1
    set @end_id = @start_id + 1000000

    COMMIT TRAN
END
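One note to add (mine, not part of the original reply): committing per chunk only keeps the log small if the freed space can actually be reused, which depends on the recovery model. Something along these lines at the end of each iteration may be worth trying:

    COMMIT TRAN
    CHECKPOINT   -- simple recovery model: lets committed log space be reused before the next chunk
    -- (under full recovery, periodic BACKUP LOG calls would be needed instead)
END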
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
August 29, 2007 at 3:31 pm
In addition: making this run at read uncommitted isolation sounds risky!!!
* Noel
August 29, 2007 at 9:36 pm
Try this instead.
While 1 = 1
Begin
    Begin Tran
    Set ROWCOUNT 10000
    Delete From [TABLE_NAME] where [CONDITION]
    If @@ROWCOUNT = 0
    Begin
        Commit Tran
        Break
    End
    Commit Tran
End
Set ROWCOUNT 0
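On SQL Server 2005, the same batching can be written with DELETE TOP instead of SET ROWCOUNT (which is deprecated for limiting data-modification statements from 2005 onwards); a minimal sketch, still using the placeholder table and condition:

While 1 = 1
Begin
    Delete Top (10000) From [TABLE_NAME] where [CONDITION]
    If @@ROWCOUNT = 0
        Break
End

Each DELETE here runs in its own autocommitted transaction, which has roughly the same effect as the explicit BEGIN TRAN / COMMIT TRAN per batch above.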
August 30, 2007 at 2:34 am
Is 25 mins for 10 million rows really that awful?
In any case, there are two parts to optimize.
The first is usually easier than the second.
The first part is the search side of the delete: in order to delete something, you need to find it first.
Here indexes can help, and also (as you're already doing) 'chunking' the total into smaller pieces. The trick is to find the optimal size of the 'chunks'.
This is 'select/search' optimization.
The second part is more or less hardware dependent.
The deleted rows cause write I/O in the table and in the transaction log, plus index maintenance on top of that.
So this part is highly dependent on the performance of your disks/controllers and stuff like that.
Also, things that are helpful in the first part may be hurtful here (e.g. added index maintenance).
Though, in general, a clustered index is 'good' (more helpful than hurtful) pretty much all the time nowadays.
All in all, it's a balancing act with a fair amount of trial and error to determine the optimal thresholds for your particular system. If you're chasing the 'absolute best' that your box can do on this particular delete, then you probably need to do some empirical testing; a rough sketch follows below.
If it's a one-shot deal, that seems like a waste of time.
If this is a recurring job, then it may be worthwhile to investigate how to optimize it best.
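For what it's worth, a minimal sketch of that kind of test (my illustration only; the table and column names are placeholders in the spirit of the other posts in this thread):

DECLARE @t0 datetime
SET @t0 = getdate()

SET ROWCOUNT 50000   -- candidate chunk size; repeat the test with several values
DELETE FROM tblBigTable WHERE [month] < '20060101'
SET ROWCOUNT 0

SELECT datediff(ms, @t0, getdate()) AS elapsed_ms   -- compare elapsed time per chunk size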
/Kenneth
August 30, 2007 at 6:52 am
I agree that 25 min is, in my opinion, fairly respectable. A trick for getting the row count:
-- The declarations and values below are assumed; they were not part of the original post
DECLARE @RowCount int, @NumDelete int, @RunMinutes int, @EndTime datetime
SET @NumDelete = 10000
SET @RunMinutes = 10

-- rowcnt comes from sysindexes, which avoids a slow COUNT(*) against the big table
select @RowCount = max(rowcnt) from sysindexes where id = object_id('tblBigTable')

-- Run for no more than 10 minutes
select @EndTime = dateadd(mi, @RunMinutes, getdate())

SET ROWCOUNT @NumDelete
delete -- TOP (@NumDelete) -- 2005 only
FROM tblBigTable
where [datetime] < dateadd(year, -1, getdate())

while @@ROWCOUNT <> 0
BEGIN
    IF getdate() > @EndTime
        BREAK
    SET ROWCOUNT @NumDelete
    delete -- TOP (@NumDelete) -- 2005 only
    FROM tblBigTable
    where [datetime] < dateadd(year, -1, getdate())
END
SET ROWCOUNT 0   -- reset so later statements are not capped
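As an aside (not part of the original reply): on SQL Server 2005 the same approximate row count can be read from the catalog views instead of sysindexes, e.g.

select @RowCount = sum(rows)
from sys.partitions
where object_id = object_id('tblBigTable') and index_id in (0, 1)   -- heap or clustered index only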
August 30, 2007 at 10:36 am
Given what your code appears to be doing (looping through the table, 10 million rows at a time, looking for records to delete), an alternative approach might help: