March 24, 2008 at 4:41 pm
I want to delete millions of records in the development environment, based on some date, every month. Is there any way to avoid using the transaction log? I know that when we delete, the operation is written to the transaction log and then to the main database.
If I don't have to use the transaction log, it will be pretty fast for millions of rows.
Any ideas will be appreciated.
March 24, 2008 at 4:52 pm
If these rows make up your entire table, use TRUNCATE and your log will not grow.
Truncate will not work if your table is the parent table of a foreign key relationship.
Otherwise you may want to switch to the simple recovery model and delete in batches. Your log file(s) will not grow if you keep these transactions small enough to fit into the current size.
Set nocount on

Declare @BatchSize   int
Declare @RowsDeleted bigint
Declare @MyRowcount  bigint

Set @BatchSize   = 5000   -- Modify as needed !!!
Set @RowsDeleted = 0

while 0 = 0
begin
    DELETE top ( @BatchSize )
    FROM <your_table>             -- placeholder: fill in your table name
    WHERE <your_date_condition>   -- placeholder: fill in your date-based filter

    set @MyRowcount = @@rowcount
    if @MyRowcount = 0 break

    set @RowsDeleted = @RowsDeleted + @MyRowcount

    -- just to see it advancing .. (% = modulo)
    if @RowsDeleted % 100000 = 0
        print convert(varchar(26), @RowsDeleted) + ' - ' + convert(varchar(26), getdate(), 121)
end

Print '[' + cast(@RowsDeleted as varchar(15)) + '] Rows deleted.'
Since size does matter, have you considered partitioning this table?
That way you may be able to simply switch out and drop a partition of data (the sliding-window approach).
Check BOL.
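To make that concrete, here is a rough sketch of the sliding-window idea on SQL Server 2005. All object names and the partition number are placeholders; it assumes dbo.BigTable is already partitioned on your date column and that dbo.BigTable_Staging has an identical structure on the same filegroup:

-- switch the oldest partition out of the big table; this is a metadata-only
-- operation, so it is nearly instant and barely touches the log
ALTER TABLE dbo.BigTable
    SWITCH PARTITION 1 TO dbo.BigTable_Staging

-- clean up the switched-out data with a minimally logged truncate
TRUNCATE TABLE dbo.BigTable_Staging

-- then MERGE the now-empty boundary out of the partition function
-- and SPLIT in a new boundary for the next period (the "sliding window")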
Johan
March 24, 2008 at 5:04 pm
Thanks, buddy.
I like the idea of deleting in batches; that way the transaction log files will not grow.
I cannot use the truncate option because there is a condition associated with the delete.
Good stuff.
March 24, 2008 at 9:27 pm
In addition, deleting in smaller batches is faster. It also helps avoid blocking other processes, because the deletion will not hold locks on the table for too long.
You may add a WAITFOR DELAY of about 5 seconds in the while loop after each batch, just to give the other processes a chance.
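For example, inside the batch-delete loop above, right after the @@rowcount check (the 5-second delay is just an illustrative value):

-- pause between batches so other processes get a chance at the table
WAITFOR DELAY '00:00:05'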
Agree with ALZDBA; partitioning is the way to go on 2005.
March 25, 2008 at 1:24 pm
I am trying to delete records in batches and it's taking forever. There are 65 million rows in one table, I am deleting based on some condition, and that table joins with some other tables.
I came up with another idea:
1. First copy the data we want to keep to a flat file.
2. Truncate the table.
3. Import the data from the flat file back into the table.
This way is much faster.
Any suggestions, guys?
March 25, 2008 at 1:33 pm
This is what I have done in the past: using bcp to export the data from the table into a file, truncating the table, and then re-importing the data into the table again.
Tip: order the data when you are exporting it, in the order of the primary key, and then when you are importing it use the -h (hint) switch to declare that order. This makes life much nicer.
But as to whether or not it is quicker will depend on the volume of data you are working with.
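Something along these lines, as a sketch only; the server, database, table, key column, and file names are all placeholders:

rem export the rows to keep, ordered by the primary key (native format, trusted connection)
bcp "SELECT * FROM MyDb.dbo.BigTable WHERE CreatedDate >= '20080101' ORDER BY Id" queryout keep.dat -n -T -S MyServer

rem then, in SSMS: TRUNCATE TABLE dbo.BigTable

rem re-import, telling bcp the data is already sorted on the clustered key
bcp MyDb.dbo.BigTable in keep.dat -n -T -S MyServer -h "ORDER (Id ASC), TABLOCK"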
March 26, 2008 at 2:43 am
balbirsinghsodhi (3/25/2008)
I am trying to delete records in a batch and it's taking forever. there are 65 million rows in one table and I am deleting based on some condition and that table is joining with some other tables. I came up with another idea.
1. First copy the data to flat file which we want to keep
2. Truncate the table
3. Import data from flat file to the table.
This way is much faster.
any suggestions guys. ?
Indeed, it depends on the "fraction" of data you want to keep.
Maybe even a temp table may help out (sort the data according to your clustering key columns).
And keep in mind to use an "ORDER BY [clust-key-col(s)]" when you insert the data back into the original table, and to use a bulk insert if possible.
Doing things this way, you may avoid an extra table maintenance (reindex) step.
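As a rough sketch of that round trip, assuming a hypothetical table dbo.BigTable whose clustered index is on a column called Id, a date filter for the rows to keep, and no identity column to worry about:

-- copy the rows to keep into a temp table
SELECT *
INTO   #keep
FROM   dbo.BigTable
WHERE  CreatedDate >= '20080101'

-- empty the original table with a minimally logged truncate
TRUNCATE TABLE dbo.BigTable

-- put the rows back in clustered-key order so the pages are written in sequence;
-- TABLOCK takes one lock on the (empty) table instead of row-level locks
INSERT INTO dbo.BigTable WITH (TABLOCK)
SELECT *
FROM   #keep
ORDER BY Id

DROP TABLE #keep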
Johan
March 26, 2008 at 12:34 pm
This suggestion assumes that the volume of data you wish to retain is reasonably small; a rough T-SQL outline follows the steps.
1. Set the simple recovery model on the host database.
2. Run sp_rename to rename the original table.
3. Create a new table with the original table's name, without indexes.
4. INSERT INTO the empty table those rows which you wish to retain.
5. Create your indexes on the new table.
6. Run sp_createstats or manually create your stats.
7. Grant privileges to match those of the original table, along with any constraints.
8. Drop the original (renamed) table, as long as you've coded a check that @@ERROR = 0.
9. Reset to your original recovery mode.
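A rough outline of those steps; all database, object, and column names below are placeholders:

ALTER DATABASE MyDb SET RECOVERY SIMPLE                      -- step 1

EXEC sp_rename 'dbo.BigTable', 'BigTable_Old'                -- step 2

CREATE TABLE dbo.BigTable                                    -- step 3: same columns, no indexes yet
(
    Id          int      NOT NULL,
    CreatedDate datetime NOT NULL
    -- ... remaining columns ...
)

INSERT INTO dbo.BigTable                                     -- step 4: rows to retain
SELECT Id, CreatedDate -- , ...
FROM   dbo.BigTable_Old
WHERE  CreatedDate >= '20080101'

CREATE CLUSTERED INDEX IX_BigTable_Id ON dbo.BigTable (Id)   -- step 5
EXEC sp_createstats                                          -- step 6

-- step 7: re-grant permissions and recreate constraints here

IF @@ERROR = 0
    DROP TABLE dbo.BigTable_Old                              -- step 8

ALTER DATABASE MyDb SET RECOVERY FULL                        -- step 9: or your original mode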
March 26, 2008 at 4:06 pm
Thanks guys.
Here is what I have done:
1. Create a temp table to store the data I have to keep. (My tempdb and log are on a separate drive and have lots of space, so space is not a problem.)
2. Truncate the real table.
3. Drop all the indexes on the real table.
4. Insert into the real table from the temp table.
5. Create the indexes on the real table.
This is faster and is working well.
Thanks for all your help. You guys rock.
I have noticed one thing: playing with large data is not a joke.
March 26, 2008 at 4:39 pm
That's great.
I would just like to note one thing about this solution: you can potentially lose the data you want to save if your batch fails after the table truncate but before the insert from tempdb back into the truncated table has completed. Truncate deallocates pages by removing the pointers to them, so once a page is reallocated the data cannot be reclaimed.
Just a heads up. :)
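One way to reduce that exposure, sketched below and assuming the kept rows are already in a temp table called #keep: wrap the truncate and the reload in a single transaction. TRUNCATE TABLE is transactional in SQL Server, so if the insert fails the rollback brings the truncated rows back (the table name is a placeholder):

BEGIN TRY
    BEGIN TRAN

    TRUNCATE TABLE dbo.BigTable

    INSERT INTO dbo.BigTable
    SELECT * FROM #keep

    COMMIT
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK          -- the truncated rows are restored
    PRINT 'Reload failed: ' + ERROR_MESSAGE()
END CATCH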
March 26, 2008 at 7:24 pm
balbirsinghsodhi (3/25/2008)
I am trying to delete records in a batch and it's taking forever. there are 65 million rows in one table and I am deleting based on some condition and that table is joining with some other tables. I came up with another idea.
1. First copy the data to flat file which we want to keep
2. Truncate the table
3. Import data from flat file to the table.
This way is much faster.
any suggestions guys. ?
Another way may be to do a SELECT INTO to keep the data you need, then drop the original table and rename the new one... I am sure others have suggested this to you too.
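A quick sketch of that variant (names are placeholders; SELECT INTO is minimally logged under the simple or bulk-logged recovery model):

-- copy only the rows you need into a brand new table
SELECT *
INTO   dbo.BigTable_New
FROM   dbo.BigTable
WHERE  CreatedDate >= '20080101'

-- drop the original and take over its name
DROP TABLE dbo.BigTable
EXEC sp_rename 'dbo.BigTable_New', 'BigTable'

-- then recreate indexes, constraints, and permissions on the renamed table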