slow DELETE!

  • Hi,

    I need to delete about 10 million rows from a 30-column table containing 30 million rows, using the following code.

    The DELETE takes more than 25 minutes. Is there anything I can do to speed it up?

    Thanks

    BEGIN TRAN
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

    DECLARE @start_id int
    DECLARE @end_id int
    DECLARE @total int

    SET @start_id = (SELECT MIN(id) FROM [table])
    SET @end_id   = @start_id + 1000000
    SET @total    = (SELECT MAX(id) FROM [table])

    WHILE @end_id <= @total
    BEGIN
        DELETE [table] WITH (ROWLOCK)
        WHERE id BETWEEN @start_id AND @end_id
          AND [month] < '01/01/2006'
        OPTION (MAXDOP 1)

        SET @start_id = @end_id + 1
        SET @end_id   = @start_id + 1000000
    END

    COMMIT TRAN

  • Couple of things to check:

    What indexes are there on the table? Indexes have to be maintained as deletions take place, which slows the process. An option to consider would be to drop the indexes and recreate them after the deletes.

    Are there foreign keys with delete constraints ?
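
    If you do try dropping and recreating indexes around the delete, the pattern is roughly this. The object names here are only placeholders (check sp_helpindex to see what you actually have), and as noted further down the thread, the clustered index is usually best left in place:

    -- Sketch only: [table] and IX_table_somecol are placeholder names.
    -- Dropping nonclustered indexes before a mass delete avoids maintaining them
    -- row by row; recreate them once the delete has finished.
    DROP INDEX [table].[IX_table_somecol]            -- SQL 2000 syntax
    -- DROP INDEX [IX_table_somecol] ON [table]      -- SQL 2005 syntax

    -- ... run the batched DELETE here ...

    CREATE NONCLUSTERED INDEX IX_table_somecol
        ON [table] (some_column)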

     

    Thanks for your reply.

    The table has a clustered index. Will dropping it speed up the DELETE?

    No FK constraints on the table.

  • Leave the clustered index. You may need to add an index on the month column.

    Also, the log size will grow. This might give you some more insight:

    http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=65&messageid=222360
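
    If the [month] column isn't already the leading column of an index, something along these lines could let the delete seek instead of scan. The index name is just a placeholder, and DBCC SQLPERF(LOGSPACE) is a quick way to keep an eye on the log growth mentioned above:

    -- Placeholder index name; check sp_helpindex '[table]' first.
    CREATE NONCLUSTERED INDEX IX_table_month
        ON [table] ([month])

    -- Watch transaction-log usage while the batched delete runs
    DBCC SQLPERF(LOGSPACE)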

     

    Have you considered making smaller transactions? The incremental deletes you're doing get committed only at the very end of the transaction. Something like:

    DECLARE @start_id int
    DECLARE @end_id int
    DECLARE @total int

    SET @start_id = (SELECT MIN(id) FROM [table])
    SET @end_id   = @start_id + 1000000
    SET @total    = (SELECT MAX(id) FROM [table])

    WHILE @end_id <= @total
    BEGIN
        BEGIN TRAN
        SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

        DELETE [table] WITH (ROWLOCK)
        WHERE id BETWEEN @start_id AND @end_id
          AND [month] < '01/01/2006'
        OPTION (MAXDOP 1)

        SET @start_id = @end_id + 1
        SET @end_id   = @start_id + 1000000

        COMMIT TRAN
    END

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user... All right - what was my emergency again?

    In addition: running this under READ UNCOMMITTED isolation sounds risky!


    * Noel

  • Try this instead.

    WHILE 1 = 1
    BEGIN
        BEGIN TRAN
        SET ROWCOUNT 10000

        DELETE FROM [TABLE_NAME] WHERE [CONDITION]

        IF @@ROWCOUNT = 0
        BEGIN
            COMMIT TRAN
            BREAK
        END

        COMMIT TRAN
    END

    SET ROWCOUNT 0
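
    On SQL Server 2005 you can get the same batching without SET ROWCOUNT (which is on the deprecation path for data-modification statements) by using DELETE TOP. A sketch, with the table name as a placeholder and the date filter taken from the original post:

    -- SQL Server 2005+ only; [TABLE_NAME] is a placeholder.
    WHILE 1 = 1
    BEGIN
        BEGIN TRAN

        DELETE TOP (10000) FROM [TABLE_NAME]
        WHERE [month] < '01/01/2006'

        IF @@ROWCOUNT = 0
        BEGIN
            COMMIT TRAN
            BREAK
        END

        COMMIT TRAN
    END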

  • Is 25 mins for 10 million rows really that awful?

    In any case, there are two parts to optimize.

    The first is usually easier than the second.

    First part is the search part of the delete, since in order to delete something, you need to find it first.

    Here indexes can help, and also (as you're already doing) 'chunking' the total into smaller pieces. The trick is to find the optimal size of the 'chunks'.

    This is 'select/search' optimization.

    Second part is more or less hardware dependent.

    The deleted rows cause write I/O in the table and in the transaction log, and on top of this index maintenance as well.

    So, this is highly dependent on the performance of your disks/controllers and stuff like that.

    Also, things that are helpful in the first part may be hurtful here (e.g. extended index maintenance).

    Though, in general, a clustered index is 'good': more helpful than hurtful pretty much all the time nowadays.

    All in all, it's a balancing operation with a fair amount of trial and error to determine the optimal thresholds of your particular system. If you're chasing the 'absolute best' that your box can perform on this particular delete, then you probably need to do some empirical testing.

    If it's a one-shot deal, it seems like a waste of time.

    If this is a recurring job, then it may be worthwhile to investigate how to optimize it best.
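
    For the empirical-testing side, a crude way to compare chunk sizes is to time a single batch at each candidate size against a test copy of the table. A rough sketch, reusing the [table]/[month] names from the original post:

    -- Rough timing harness: run on a test copy, one candidate batch size at a time.
    DECLARE @batch_size int, @t0 datetime
    SET @batch_size = 100000        -- try e.g. 10000, 100000, 1000000
    SET @t0 = GETDATE()

    SET ROWCOUNT @batch_size
    DELETE FROM [table] WHERE [month] < '01/01/2006'
    SET ROWCOUNT 0

    SELECT @batch_size AS batch_size,
           DATEDIFF(ms, @t0, GETDATE()) AS elapsed_ms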

    /Kenneth

     

     

    I agree that 25 min is, in my opinion, fairly respectable. A trick for getting the row count:

    -- sysindexes gives the row count much more cheaply than COUNT(*)
    DECLARE @RowCount int, @NumDelete int, @RunMinutes int, @EndTime datetime
    SET @NumDelete = 10000
    SET @RunMinutes = 10

    SELECT @RowCount = MAX(rowcnt)
    FROM sysindexes
    WHERE id = OBJECT_ID('tblBigTable')

    -- Run for no more than @RunMinutes (here 10) minutes
    SELECT @EndTime = DATEADD(mi, @RunMinutes, GETDATE())

    SET ROWCOUNT @NumDelete
    DELETE -- TOP (@NumDelete) -- 2005 only
    FROM tblBigTable
    WHERE [datetime] < DATEADD(year, -1, GETDATE())

    WHILE @@ROWCOUNT <> 0
    BEGIN
        IF GETDATE() > @EndTime
            BREAK

        SET ROWCOUNT @NumDelete
        DELETE -- TOP (@NumDelete) -- 2005 only
        FROM tblBigTable
        WHERE [datetime] < DATEADD(year, -1, GETDATE())
    END

    SET ROWCOUNT 0

     

    Given what your code appears to be doing (looping through the table, a million ID values at a time, looking for records to delete), an alternative approach might help:

    • Consider partitioning the table, if possible around your heaviest I/O requirements. For example, would it make sense to partition by date, then simply swap the partition away? (A rough sketch of the switch follows after this list.)
    • Consider / check the ratio of deletes vs data left over, in particular around a practical partitioning scheme. It may be more efficient to restage the "left behind" records into a temp table, then swap the partitions.
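
    A minimal, self-contained sketch of the switch-out idea, assuming SQL Server 2005 (partitioning requires Enterprise Edition) and purely placeholder object names; the staging table must match the partitioned table's structure and sit on the same filegroup as the partition being switched:

    -- Mini demo of switching an old partition out instead of deleting row by row.
    -- All object names here are placeholders, not the OP's real table.
    CREATE PARTITION FUNCTION pf_month (datetime)
        AS RANGE RIGHT FOR VALUES ('2006-01-01')       -- partition 1 = pre-2006 rows

    CREATE PARTITION SCHEME ps_month
        AS PARTITION pf_month ALL TO ([PRIMARY])

    CREATE TABLE big_table_demo
        (id int NOT NULL, [month] datetime NOT NULL)
        ON ps_month ([month])

    -- Staging table: same structure, same filegroup as partition 1, and empty
    CREATE TABLE big_table_demo_old
        (id int NOT NULL, [month] datetime NOT NULL)
        ON [PRIMARY]

    -- Metadata-only operation: the pre-2006 partition moves out without row-by-row deletes
    ALTER TABLE big_table_demo
        SWITCH PARTITION 1 TO big_table_demo_old

    -- Then drop or archive the staging table at leisure
    DROP TABLE big_table_demo_old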

     

