Deleting rows from a table...best way?

  • Hi All,

    Another question from me....

    I need to delete rows from a table that are older than 7 days/a week. I have been testing it using this query:

    Delete from SiteEvents

    WHERE TimeStamp < DATEADD(DAY,DATEDIFF(Day,-7,GETDATE()),-7)

    Would there be a more efficient way of doing this? It's taking a while (there is alot of data in there)but just thought I would ask more experienced people if maybe I am doing something cackhanded/long winded.

    Many thanks!

  • Hi Karen,

    There are a number of strategies; which one is best depends on your precise circumstances. You can find a good general technique in this SSC article: http://www.sqlservercentral.com/articles/67898/. There is some important extra advice in the discussion thread associated with that article.

    If you are deleting a large fraction of the rows in the table, it might be quicker to copy the rows you want to keep out, truncate the table, and then reload the saved rows. If you choose this option, be sure to meet the conditions for minimally-logged bulk load as described here: http://msdn.microsoft.com/en-us/library/ms190422(v=sql.100).aspx

    Going forward, if this is Enterprise Edition, you might consider partitioning the table to make archiving (or deleting) old data practically instantaneous. There is a good description of bulk loading techniques with partitioning in this paper: http://msdn.microsoft.com/en-us/library/dd425070.aspx

    Last point, it is usually good to avoid using TSQL keywords as column names. Timestamp is one such keyword (though it is deprecated in favour of the new name 'rowversion').

    Paul

  • Hi

    Why is there an extra 7 in the code?

    I would use the ffg script to get midnight a week ago:

    select dateAdd(dd,datediff(dd,0,getdate()-7),0)

    but if I wanted to keep everything from this week only:

    > dateAdd(dd,datediff(dd,0,getdate()-6),0):ermm:

  • Hi Terry,

    I ended up changing it to this:

    DELETE FROM SiteEvents

    WHERE TimeStamp < DATEADD(DD, -7, GetDate())

    This works fine. I now of course have to do some investigation into lock strategies since it did cause a teeny tiny blip when it ran on Friday.

    Thanks both for your help and links it is much appreciated!

  • I have found that if you delete about 5000 or more rows, you will cause a table lock and a lockout of other applications that need to modify records in a table.

    A way I found to avoid this situation is to create a SQL loop that deletes the TOP 4000 rows in conjunction with your WHERE clause on the timestamp column, then wait for a few milliseconds, then repeat the loop, until @@ROWCOUNT = 0 is returned, at which point your deletion operations are completed.

    LC

  • Hi crainlee2, thanks for this I had also read this point too.

    I deleted anything over 7 days 3000 rows at a time in the end and it actually didn't take that long. Now the job runs hourly and it's never more than 2000 rows although I am still keeping an eye on it.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply