best way to delete two million rows

  • hi guys, what is the best way to delete two million rows from a table that the date created is after january 5th?, someone told me to do an insert the rows to a temp table then truncate the original table and then insert the rows from the temp, does that make sense?, or just to do a

    delete *

    from tablename

    where datecreated>= '1/5/08'

  • DBA (3/10/2008)


    hi guys, what is the best way to delete two million rows from a table that the date created is after january 5th?, someone told me to do an insert the rows to a temp table then truncate the original table and then insert the rows from the temp, does that make sense?, or just to do a

    delete *

    from tablename

    where datecreated>= '1/5/08'

    What is the proportion of the number of deleted rows and the number of total rows? Copying the data and then truncating, copying back is a good alternative, but you cannot always truncate a table (see BOL for the restrictions). Disabling your non-clustered indexes during the delete or during moving the data back, and then rebuilding them will also speed things up. Also it is worth considering deleting in smaller transactions. Also, when moving data back, do try to use minimally logged operations (select into, ...) and switch to bulk recovery mode before, then switch back to full.

    Regards,

    Andras


    Andras Belokosztolszki, MCPD, PhD
    GoldenGate Software

  • - that may be a good option if 2M is the majority or rows in your table.

    - If not, you may want to use a while loop to just delete in small chuncks to

    avoid massive locking or lock-escalation.

    while 1 = 1

    begin

    delete top ( 10000 ) *

    from tablename

    where datecreated>= '1/5/08'

    if @@rowcount = 0 BREAK

    end

    If you use this with simple logging, chances that your log-files grow are

    small. (you should run a backup before !)

    - after your delete opration, you should consider to run table maintenance (rebuild indexes) and refresh the statistics.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • hi my plan is to run this script once, delete 2 million rows out of 20 million, then create another query to do it every 3 minutes. so then the script you created should be for the second one correct?

  • I agree with both above. I typically use batches to remove small amounts of rows since I can't always move the data and truncate the table, but that would be a great way to do it.

    I have had large transactions (1M rows and above), run a long time, hours, while the same 1M rows done 1000 at a time run in minutes.

  • DBA (3/10/2008)


    hi my plan is to run this script once, delete 2 million rows out of 20 million, then create another query to do it every 3 minutes. so then the script you created should be for the second one correct?

    You might as well start with it from the first time on.

    (see Steves reply)

    Maybe just pick a higher top number.

    If you delete 2M rows in one transaction:

    - your log-file(s) may grow (even with simple logging).

    - massive locking will occur.

    - ongoing applications will suffer when scanning or lock escalation occur.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ok... so this is the query i have to run first to delete the 2 million rows out of the 20

    ALTER INDEX IX_Messages_1 ON tblMessages DISABLE

    GO

    DElete *

    from tblMessages

    where

    datecreated>= '1/5/08'

    ALTER INDEX IX_Messages_1 ON tblMessages enable

    GO

    ALTER INDEX IX_Messages_1 ON tbMessages REBUILD

    GO

    and then i will have this sp to run every 5 minutes

    CREATE PROC usp_DELETE_Old_Messages

    AS

    BEGIN

    SET NOCOUNT ON

    while 1 = 1

    begin

    delete top ( 100 ) *

    from tblMessages

    where datecreated>= '1/5/08'

    if @@rowcount = 0 BREAK

    end

    Please tell me if it looks ok .

  • Looks good, assuming that the index you are disabling is a non-clustered index. Also, delete statements do not need the "*" part.

    Regards,

    Andras


    Andras Belokosztolszki, MCPD, PhD
    GoldenGate Software

  • Looking good.

    Indeed the delete statement does not need the "*" :blush:

    Think DRP: start with a full db backup for this operation !

    Since this proc only has this single while loop with one delete statement, I'd just put the while loop in a sqlagent job statement.

    I prefer straight forward job information.

    If - on the other hand - you have no control over the job or its content, I'd also prefer having it in a proc that I control :Whistling:

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • thanks guys!!! i apreciate it. 🙂

  • hi again, Server: Msg 156, Level 15, State 1, Line 4

    Incorrect syntax near the keyword 'top'.

    when i try to run the loop query, any idea why?

  • DBA (3/10/2008)


    hi again, Server: Msg 156, Level 15, State 1, Line 4

    Incorrect syntax near the keyword 'top'.

    when i try to run the loop query, any idea why?

    Are you running on a SS2005 server? (this is a 2005 forum). The error message suggests that you are on SQL Server 2000. On SQL Server 2000 you cannot use DELETE TOP ... 🙁

    Andras


    Andras Belokosztolszki, MCPD, PhD
    GoldenGate Software

  • yes i am so sorry i am working on a sql server 2000, so then how would i just get the top 100?

  • DBA (3/10/2008)


    yes i am so sorry i am working on a sql server 2000, so then how would i just get the top 100?

    You could issue the statement:

    SET ROWCOUNT 100

    delete .... (without the top ..)

    SET ROWCOUNT 0

    Regards,

    Andras


    Andras Belokosztolszki, MCPD, PhD
    GoldenGate Software

  • If you ARE on 2000, you can use SET ROWCOUNT instead.

    Meaning -

    ....

    SET ROWCOUNT 10000

    While 1=1

    Begin

    Delete

    from tblMessages

    where datecreated>= '1/5/08'

    if @@rowcount = 0 BREAK

    end

    SET ROWCOUNT 0

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply