Speeding up a huge delete

  • Hi all,

    I've got a 120 mil row table I need to trim down as follows:

    delete webproxylog

    where week_number = 36

    There are 10 distinct values in the 'Week_number' column, evenly distributed down the length of the table, so 12 million rows peer week.  I need to delete weeks 36-40, inclusive (and retain 41-46)...asap!!!

    Is there a more effective way of doing this delete?  I also have a 'date' column and can delete by day if that's somehow faster.

    Thanks,

     

    Jaybee.

  • SELECT * INTO Temp_CurrTable FROM dbo.CurrTable WHERE 1 = 0

    INSERT INTO (Col1, Col2) Temp_CurrTable SELECT Col1, Col2 FROM dbo.CurrTable where Week Between 41 and 46 (or use the date column, just use the clustered index for this).

    Recreate indexes (starting with clustered), checks constraints and defaults.

    DROP old table, rename new table.

     

    You can also have a gradual delete that you could run which would not affect the performance of the server but it could take over 1 day to delete all that data... and you said it's urgent so I'll let you decide which one to use!

  • I would suggest that you create the temp table with the required data.

    and then re-create indexes on them.

     


    Kindest Regards,

    Sureshkumar Ramakrishnan

  • I'm afraid there's no room for a temp table...yes we need some extra disks and I've had them ordered, but meanwhile I have to get these deletes done quickly..

     

     

    JB

  • 1 : Take full backup of the db (or at least that table >> bcp to text file and zip).

     

    2 :

    SET ROWCOUNT 10000

    DECLARE @rc AS INT

    SET @rc = 1

    WHILE @rc > 0

    BEGIN

    DELETE FROM dbo.BigTable WHERE ClustCol <= @EndDate

    SET @rc = @@ROWCOUNT

    WAITFOR DELAY 00:00:01 --give at elast 1 second for other operations to complete, I'd suggest 2 seconds if you're not in a big hurry

    END

    SET ROWCOUNT 0

    Take more frequent log backups as you'll fill it faster.

  • Does that code imply I'll need to make the Date column a clustered key?  There is no Clustered key on the table.

  • Don't have to but I would strongly suggest using where condition that can use an index!!

     

    Make it as narrow as possible... since you seem to be short on space ATM!

     

    EDIT 7777 POSTS

  • Sounds like a plan so far mate...I'd better ask, would the time taken to re-sort the rows exceed the time that the usage of a clustered index would save?

    Tried doing straight deletes of Week 36 (which I imagine is a tenth of the db), forget to keep clearing out the log more often than it's hourly schedule, and the damn thing fell over after a 5 hour run (presumably a 2.5 hour rollback) Used what little time I had left after returning from the sauna to retry the delete, with logdumps between sips of of Starbucks Christmas splosh, but it's probably collapsed again. 

    The DB is at 85% of mdf capacity, not sure how much of the disk is left over, but in any case, I'll find out tomorrow. 

    My advice to ALL you DBA's...don't take on a half-baked system without warning the bean counters of some serious expenses to come - most of them into your own pockets!

    JB

     

  • Looks like you need to add into the loop 2 statements before WAITFOR:

    BACKUP LOG WITH TRUNCATE_ONLY

    DBCC SHRINKDATABASE

    Sounds terrible, but it seems to be the only exit from this dead end.

    It's gonna take ages to run, but it will lead yo to freedom.

    Use it only everything else failed. And don't forget about FULL BACKUP prior this!!!

    _____________
    Code for TallyGenerator

  • Well not only would it take time but it would require 250% of the space the table occupies now (while doing the work).  So I would withdraw this solution in your case.

     

    A non clustered index might be a good option but I would try something else first :

    How long does it take to run the delete query once (with set rowcount 10000)?

    If that speed is acceptable, then you can have it run then wait 1 second before running again.  Assuming it runs in 1 second, then you delete about 18 000 000 rows / HOUR.  So you should be done in less than 4 hours at this pace.  I doubt you'd get it running much much faster than that.

  • Also implied with this  >>> run off hours if at all possible!!!!!

  • And for SHRINKDATABASE specify you need to shrink LOG file only.

    Then you can run it not only off hours. It will methodically delete records by small portions and let everybody work with the database.

    _____________
    Code for TallyGenerator

  • I thought a Clustered Key was just a teeeny object in one of the Sys tables that stipulates the re-sorting of the heap into the order of one of the columns?  I think the NON-Clustered keys are the ones with 2.1 x the data...jeeez, it's been years since I did my MCP in all this.

    I haven't tried the rowcount idea at all yet, primarily because I can't figure out how to integrate the syntax...I'm an admin thrown in to babysit a live system that should still be in dev.  Though that 18m rows per hour sounds pretty damn tempting!!

    JB

  • It takes 2.5X times (IIRC) while it resorts the data, it basically makes a copy of the table in tempdb while working this out.  Then it also has to write the operations to log files.  So it's expensive with that much data and that little free space.

     

    18 M/hour is only if the delete of 10K rows takes only 1 sec.  Which I have no idea if it will... With a clustered index it would certainly take around that but without it I don't know.  That's why i asked you to run the statement once to see how it goes.

     

    If you need more help implementing this then we'll need more info (Table definition) so we can write the script for you.

  • Not sure how shrinking the LOG file will help clear out the .mdf - I've been dumping it periodically today...as far as I remember it is about 300Gb in size, but don't quote me, it's been a very long day and 9.42pm is no time to be worrying about bloody databases when a young lady is giving you her "Come to bed" eyes!!

    Stuff it, I'll pick this one up in the morning guys, really appreciate your help tonight!!

    JB.

Viewing 15 posts - 1 through 15 (of 47 total)

You must be logged in to reply to this topic. Login to reply