Optimizing: Deleting 1,000,000 rows

  • Hello,

     

Can anybody give me some help on how to optimize the following delete?

I have run it many times, but every 30,000 to 40,000 deleted rows it hits a deadlock:

    Server: Msg 1205, Level 13, State 50, Line 1

      Transaction (Process ID 55) was deadlocked on {lock} resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

     

Do you have any idea how I can perform this delete without affecting the performance of the database?

     

    Thanks and regards,

    Jorge

     

    DECLARE @intRowsDelete int

    SET @intRowsDelete = 1 -- Force first iteration

    WHILE @intRowsDelete > 0

    BEGIN

       SET ROWCOUNT 10000    -- Size of each batch

       delete from Tablename where (Treated = 1) AND DateInserted < '2006-12-01 00:00:00.000'

       AND (ErrorID IS NOT NULL)

       SELECT @intRowsDelete = @@ROWCOUNT -- Actual number deleted

       SET ROWCOUNT 0        -- Reset batch size

       -- Optional delay here if you want to allow other user access to the database

   WAITFOR DELAY '00:00:05'

    END

     


Do you have an index on DateInserted and/or ErrorID?

Otherwise, I guess your delete query performs a table scan, and that way (dead)locking may occur more frequently!

You can create an index just to support your cleanup operation and remove it afterward, as sketched below.
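
A minimal sketch of that idea, using the table and column names from the original post (the index name is made up):

    -- Temporary index covering the delete's predicate, so each batch
    -- can seek instead of scanning the whole table.
    CREATE NONCLUSTERED INDEX IX_Tablename_Cleanup
        ON Tablename (DateInserted, Treated, ErrorID)

    -- ... run the batched delete here ...

    -- Remove the index once the cleanup is finished.
    DROP INDEX Tablename.IX_Tablename_Cleanup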

    Johan


  • Thanks a lot,

That is what I had already done, and it works very nicely.

    Regards,

    Jorge Mendes


What I have also noticed is that if you have a clustered index on the date column, the execution plan would be ideal, i.e. it might do a clustered index delete. A sketch follows.
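
For illustration only (the index name is made up, and building a clustered index on a large table is itself an expensive, blocking operation):

    -- Hypothetical: cluster the table on the date column so the range
    -- predicate DateInserted < '2006-12-01' maps to one contiguous slice.
    CREATE CLUSTERED INDEX CIX_Tablename_DateInserted
        ON Tablename (DateInserted)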

     


    Kindest Regards,

    John Burchel (Trainee Developer)

  • Also watch out for foreign keys.  Deletes have to verify a foreign key constraint is not being violated and sometimes this can take a long time.

Just to add to Michael Earl's reply:

The guideline for any RDBMS is that by default one should provide exact foreign-key indexes (in the dependent objects, of course) unless it is proven that they hurt performance too much (and one is willing to pay the price at parent-delete time).

You may even consider creating the FK index only at delete-cycle time and removing it after the cycle, as sketched below.
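
A hedged sketch of that idea; the dependent table and FK column here are hypothetical, since the thread doesn't name them:

    -- Hypothetical dependent table: index the FK column so each parent-row
    -- delete can verify "no referencing rows" with a seek, not a scan.
    CREATE NONCLUSTERED INDEX IX_ChildTable_TablenameID
        ON ChildTable (TablenameID)

    -- ... run the delete cycle against the parent table here ...

    DROP INDEX ChildTable.IX_ChildTable_TablenameID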

    Johan


A few more suggestions (a combined sketch follows this post):

- Temporarily disable any triggers on the table.

- Make DateInserted the first column in the WHERE clause.

- Create a composite index on the DateInserted and Treated columns.

- Update statistics.

What does the execution plan look like when you run your script?
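
A combined sketch of those suggestions, using the table name from the original post (the trigger and index names are made up):

    -- Hypothetical trigger name; disabling it skips trigger work per delete.
    ALTER TABLE Tablename DISABLE TRIGGER trg_Tablename_OnDelete

    -- Composite index matching the delete's predicate.
    CREATE NONCLUSTERED INDEX IX_Tablename_DateInserted_Treated
        ON Tablename (DateInserted, Treated)

    -- Refresh statistics so the optimizer sees current distributions.
    UPDATE STATISTICS Tablename

    -- ... run the batched delete here ...

    ALTER TABLE Tablename ENABLE TRIGGER trg_Tablename_OnDelete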

Be very careful if you want to disable triggers and/or constraints, because that applies to _every_ user of the database for the time you've disabled them!

Basically, if you disable something in your database, you should first prevent _all_ others from manipulating data while you are performing your maintenance, and don't forget to re-enable everything you disabled earlier on! And be sure you don't enable things that were disabled for other reasons!

    Johan


1) There are a number of 'improvements' to your query:

set ROWCOUNT once, outside the loop

use explicit transactions and error checking

DECLARE @intRowsDelete int, @err int

SET ROWCOUNT 10000    -- I might go with 50K here?

WHILE 1 = 1
BEGIN
   BEGIN TRAN

   DELETE FROM Tablename
   WHERE (Treated = 1)
     AND DateInserted < '2006-12-01 00:00:00.000'
     AND (ErrorID IS NOT NULL)

   SELECT @err = @@ERROR, @intRowsDelete = @@ROWCOUNT -- Actual number deleted

   IF @err <> 0
   BEGIN
      ROLLBACK TRAN
      PRINT 'ERROR!!'
      BREAK
   END

   COMMIT TRAN

   IF @intRowsDelete = 0
   BEGIN
      BREAK  -- Done with all deletes
   END

   -- Optional delay here if you want to allow other users access to the database
   WAITFOR DELAY '00:00:05'
END

SET ROWCOUNT 0

     

     

2) You may consider dropping ALL indexes on Tablename except one on (DateInserted, ErrorID), especially if the deletes total a significant fraction of your total number of rows. When done, rebuild the indexes. This is much faster than index maintenance during the deletes.
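
A hedged aside: on SQL Server 2005 and later, DELETE TOP (n) is the recommended replacement for limiting DML with SET ROWCOUNT (which is deprecated for that use). A minimal sketch with the same table and predicate:

    WHILE 1 = 1
    BEGIN
       -- TOP (10000) caps the batch size without relying on SET ROWCOUNT.
       DELETE TOP (10000) FROM Tablename
       WHERE Treated = 1
         AND DateInserted < '2006-12-01 00:00:00.000'
         AND ErrorID IS NOT NULL

       IF @@ROWCOUNT = 0
          BREAK  -- Nothing left to delete

       WAITFOR DELAY '00:00:05'
    END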

     

    TheSQLGuru


Thanks a lot, everybody.

    Regards,

    Jorge


It seems you already have your answer, but FWIW...

I had a similar problem, but I handled it differently due to my application/environment. See my post http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=9&messageid=329377#bm330390

One thing you should think about is using a table lock hint. That way your delete can only start once it has gained an exclusive table lock; if that is not possible, lower the number of deletes done per batch. When you try to delete a large number of rows, you may have many indexes that also need locks, and if your table is fragmented, SQL Server will have to take far too many locks, giving rise to deadlocks. Also, I would either rebuild or reindex (drop/recreate) the tables and, as others stated, create a covering index to make deleting records easier without a table scan or massive index scans. A sketch of the hint follows.
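
A minimal sketch of the table lock hint, assuming the table and predicate from the original post (note that an exclusive table lock blocks all other readers and writers for the duration of each batch):

    SET ROWCOUNT 10000   -- Keep batches small so the exclusive lock is short-lived

    DELETE FROM Tablename WITH (TABLOCKX)
    WHERE Treated = 1
      AND DateInserted < '2006-12-01 00:00:00.000'
      AND ErrorID IS NOT NULL

    SET ROWCOUNT 0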

    --Frank

     
