What way faster to delete?

  • There are 2 tables, tblBig (20 million rows) and tblSmall (1000 rows), both have clustered index on id column. I need to delete from tblBig rows where id exists in tblSmall. Actually, I tryed both ways, I mean: 1) delete tblBig where id in (select id from tblSmall) and 2) delete tblBig from tblBig b inner join tblSmall s on b.id=s.id. To compare effeciency of these queries used Execution plan and Statistics IO but, frankly, did not get what way is faster. Any tips are welcome. Thanks

  • Make sure that all foreign key constraints referencing tableBig (Table being deleted from) are either dropped, or set to nocheck, this will speed up your delete considerably.

    If your just comparing the two different queries for delete, I think the performance of each would be similar, Which looks like you have shown.

     

  • Ray's correct.  Since you are already doing a check for existance in tblsmall you should ensure that SQL doesn't also do a double check for existance (maintaining referential integrity) in tblsmall...  Also holds for other referenced tables.

    In the wider context, how many rows do you think would be deleted at a time?  Are any other users making use of these tables whilst the delete is occurring (assuming the delete takes a while)?  If they are in use, are they being read or updated?

    If a long running transaction is an issue for you, you could attempt to break the transaction into smaller chunks by using a while loop, set ROWCOUNT and using @@ROWCOUNT to test if the loop should continue or not.

  • Thanks, guys. I'll definately delete foreign key from tblBig. Regarding others Ian questions:eventually I need to delete from tblBig 15 million rows (from 20)and from 3 related tables. TblSmall is a chunk of data (by 1000 rows) I use to delete from tblBig. Unfortunately tblBig is in use permanently- by read and update. One more time, Ray and Ian, thanks for help.

  • If you are going to delete 15 million rows out of 20 million, then there is a possiblility where you can instead insert the remaining 5 million into another table, then rename the original table as _OLD and rename the new table as the original table.

    It would depend on how your Big table is being used in your application(s). If there is a window during which the Big table could be treated as if it were a Read-Only table - then you can use this approach, and it will be much faster.

    Regards

    CK


    I feel the need - the need for speed

    CK Bhatia

  • Thanks. Actually I've  already done this way

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply