January 6, 2004 at 4:00 pm
What is the best technique for deleting and arciving. I currently run a basic script that
1. Populates a tempory table with a list of orders to archive.
2. Archives those orders' records to a set of archive tables on another server.
THis is dones using an INSERT join like...
INSERT INTO arc_server.db..arc_table
SELECT e.* from source_table e
join temp_table d on e.order_num = d.order_num
3. Deletes the source records. It uses a DELETE join similar to the INSERT above.
Would I be better off to use replication, dts, or bcp out to txt file and bcp back into the archive tables. I would suppose dts is the best. If so, should I be employing parallel threads, table locks or any other best practice. Thanks for any advice.
smv929
January 6, 2004 at 5:58 pm
I like your method. It's simple, fast, almost bulletproof. Other methods will be more complex and have more places for things to go wrong.
The biggest issue is to be sure there are no errors before performing the delete. Might include a check that joins the two tables and looks for xx rows before doing the delete. @@rowcount from teh insert can give you the xx to look for.
January 7, 2004 at 12:34 am
Maybe you should take locking into consideration.
Your methode will do fine, but when used with big volumes, the delete might lockout the table. You could avoid this by using smaler chunks.
First fill your archive table, then :
set nocount on
declare @rowcount int
set @rowcount = 1
while @rowcount > 0
begin
begin tran TopDelete
delete e
from source_table e
inner join (select top 5000 d.order_num
from source_table s
inner join temp_table d on s.order_num = d.order_num ) SelTop
-- I suppose Order_Num is your unique key ?
on e.order_num = SelTop.order_num
set @rowcount = @@rowcount
commit tran TopDelete
end
When using simple archive logging, this will also minimize growth of your log-files.
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
January 7, 2004 at 8:59 pm
alzdba,
Thanks. That's an elegant approach to batching a large result set. There's one complication: Many of the tables I delete/archive are Order-related. The order_header table has one record per order. However, the order_detail table has multiple orders per line. An order may have from 1, 5, 20, 50 (for one customer, up to 1500 lines).
So the above script will delete exactly 5000 records from the order_header record, which is great. However, the problem is when the source table is order_detail. The DELETE statement will allow many more records than 5000 to be deleted since order_detail has multiple lines per order. Is there a similar way to limit the batch to 5000 when deleting from the order_detail table? Thanks again.
smv929
January 8, 2004 at 12:28 am
What I do now in such cases is I determine the average of dependant rows per dependant table for one parent row and adjust the top-statement. I also document this in the script !
So basicaly what I do is give it a number of parent-id's on which it should perform dependant deletes.
Some control is better than none.
You can, however, also perform these delete based on a "TopSel" part where you select the primary-keys of you dependant table when joined with the targeted parent table. Then your top-n will be exactly top-n and because you've deleted the previouse top-n dependant-rows, the next run will perform deletes to the next top-n rows,...
e.g.
-- replace the delete query in the while loop with this one
delete ChildTb
from T_Dependant1 ChildTb
inner join
(select top 1000 D.PkCol1, D.PkCol2 --all PK-columns
from T_Dependant1 D
inner join T_Parent P
on P.Pk = D.Related_Pk
-- ? order by D.PkCol1, D.PkCol2 --all PK-columns
) SelTop
on ChildTb.PkCol1 = SelTop.PkCol1
and ChildTb.PkCol2 = SelTop.PkCol2
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply