March 18, 2010 at 10:35 am
I want to delete 2 million records from my table(based upon a condition)....what is the easiest way to do???
March 18, 2010 at 10:52 am
Well, the EASIEST way is to simply write a DELETE statment that does what you want. 😉
Since you are asking here I assume you are really asking about the MOST EFFICIENT way to delete 2 million rows.
There are multiple different methods depending on the size of the table.
If you are deleting most of the rows, I recommend creating a new table with only the rows you want to keep (using SELECT INTO), and then drop the old table and rename the new table to the same name as the old. Note that this will require extra care if there are any constraints or indexes on the table.
If the table is very big (more than 10 million rows) you can delete the data in several passes. Use SET ROWCOUNT to limit the amount of data deleted by each DELETE statement and run it several times.
March 18, 2010 at 11:04 am
sandip.vs (3/18/2010)
I want to delete 2 million records from my table(based upon a condition)....what is the easiest way to do???
Assuming that you read Stefan's post, and you need to perform a delete of these records (vs. copy into a new table as Stefan suggests), then this falls into one of those very few times where doing something in a loop is actually desirable... otherwise you'll end up with a table lock while deleting those records all at once.
do it something like this:
SET ROWCOUNT 10000
while exists (select 1 from MyTable where condition = true)
delete from MyTable where condition = true
This will delete 10,000 records at a time until all 2 million have been deleted.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
March 19, 2010 at 6:03 am
Stefan_G (3/18/2010)
If the table is very big (more than 10 million rows) you can delete the data in several passes. Use SET ROWCOUNT to limit the amount of data deleted by each DELETE statement and run it several times.
Much better to use TOP. The use of ROWCOUNT with anything other than SELECT is deprecated and will be removed from the next version of SQL Server.
Books Online - SET ROWCOUNT
Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in the next release of SQL Server. Do not use SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. Also, for DELETE, INSERT, and UPDATE statements that currently use SET ROWCOUNT, we recommend that you rewrite them to use the TOP syntax.
Also see Limiting Result Sets By Using TOP and PERCENT for additional reasons to avoid SET ROWCOUNT with data modification statements.
Be careful to take account of locking thresholds when deleting in batches. Lynn Pettis wrote a good article on exactly this process, see Deleting a Large Number of Records[/url]
Paul
March 19, 2010 at 6:31 am
Paul White NZ (3/19/2010)
Much better to use TOP. The use of ROWCOUNT with anything other than SELECT is deprecated and will be removed from the next version of SQL Server.
Nice. I started using this technique with SQL 2000 and I had not noticed that it was deprecated.
Thank you for that info.
/SG
March 19, 2010 at 6:39 am
Stefan_G (3/19/2010)
Nice. I started using this technique with SQL 2000 and I had not noticed that it was deprecated. Thank you for that info.
You are welcome. It comes as a surprise to many! Lynn's article is worth a read too - even for those very familiar with the basic method.
March 19, 2010 at 7:20 am
WayneS (3/18/2010)
sandip.vs (3/18/2010)
I want to delete 2 million records from my table(based upon a condition)....what is the easiest way to do???Assuming that you read Stefan's post, and you need to perform a delete of these records (vs. copy into a new table as Stefan suggests), then this falls into one of those very few times where doing something in a loop is actually desirable... otherwise you'll end up with a table lock while deleting those records all at once.
do it something like this:
SET ROWCOUNT 10000
while exists (select 1 from MyTable where condition = true)
delete from MyTable where condition = true
This will delete 10,000 records at a time until all 2 million have been deleted.
Whether you use TOP or SET ROWCOUNT, this will still keep the table pretty busy especially when it gets to the area of the table that people are using the most. My recommendation is that you add a WAITFOR DELAY of at least a second or two to let other processes "in".
--Jeff Moden
Change is inevitable... Change for the better is not.
March 19, 2010 at 9:14 am
and dont forget to set SET @@ROWCOUNT 0
after deletion
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
March 19, 2010 at 9:34 am
I found it might be better to use TOP syntax in the end anyway. The ROWCOUNT setting appears to be not taken into consideration by the query optimizer and this may affect execution plans for the queries.
I am not sure if it is OK to post link here to my blog, but I wrote a post about this some time ago.
Regards
Piotr
...and your only reply is slàinte mhath
March 19, 2010 at 9:35 am
Bhuvnesh (3/19/2010)
and dont forget to set****
after deletion
No, no, no! Please read my prior post on why not to use SET ROWCOUNT for this!
The command is SET ROWCOUNT anyway, not SET @@ROWCOUNT...:crazy:
March 19, 2010 at 9:44 am
Piotr.Rodak (3/19/2010)
The ROWCOUNT setting appears to be not taken into consideration by the query optimizer and this may affect execution plans for the queries.
Yes it mentions that in the second of the links I posted earlier. It's a bit more complex than it seems though - you might have noticed extra TOP operators in query plans with "IsRowCount = True". Craig Freedman covers it briefly here.
The other thing I dislike about SET ROWCOUNT is that its value stays in effect for any triggers fired by the operation. Unless the trigger was coded with an explicit SET ROWCOUNT 0, the results can be unfortunate.
March 19, 2010 at 9:52 am
Thanks Paul, I didn't see this post before. I really enjoy Craig's posts about execution plans.
Regards
Piotr
...and your only reply is slàinte mhath
March 19, 2010 at 10:37 am
ncodner (3/19/2010)
@WayneS - you're saying if I do the deletion in a loop there will be no locks on the table? Is this better than specifying 'no lock' in the query?
I would encourage you to read Lynn's article referenced earlier in the thread.
March 19, 2010 at 12:11 pm
First, I want to thank Paul for referencing my article and the nice comments regarding it as well.
Second, Jeff makes a good point as well if this is a heavily used table. If you look at the code I wrote in my article, you will notice that if the database is not using the simple recovery model, I have inserted a transaction log backup during the delete process. This was done to help control the size of the databases' transaction log. You could use that same logic to include a WAITFOR DELAY as well or in place of the transaction log backup.
Viewing 15 posts - 1 through 15 (of 30 total)
You must be logged in to reply to this topic. Login to reply