June 29, 2022 at 12:55 pm
Hello friends,
I am planning to set up a process for archival. I have thought through the configuration part, but execution-wise all I can think of is sequential processing. I want to make it generic but can't think of a way around that. The setup I am thinking of is something like this:
1. A configuration table listing each base table and its history table, along with other information such as the datetime column and batch size.
2. A procedure that builds dynamic SQL, moves data to the history table in batches, and then purges it from the base table.
3. For every batch, a log entry will be written to a log table.
The problem is that the procedure will work sequentially. How can I make it run in parallel, so that multiple tables can be processed at the same time and logging to the log table doesn't create any locking issues? Any suggestions?
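For illustration, a minimal sketch of what such configuration and log tables could look like - every table and column name here is just an assumption, not a fixed design:

CREATE TABLE dbo.ArchiveConfig (
    ConfigId         int IDENTITY(1,1) PRIMARY KEY,
    BaseTableName    sysname NOT NULL,   -- table to purge from
    HistoryTableName sysname NOT NULL,   -- table to archive into
    DateTimeColumn   sysname NOT NULL,   -- column used to find rows old enough to move
    RetentionDays    int NOT NULL,       -- keep this many days in the base table
    BatchSize        int NOT NULL        -- rows to move per batch
);

CREATE TABLE dbo.ArchiveLog (
    LogId          bigint IDENTITY(1,1) PRIMARY KEY,
    ConfigId       int NOT NULL,
    BatchStartTime datetime2 NOT NULL DEFAULT SYSDATETIME(),
    RowsMoved      int NULL,
    ErrorMessage   nvarchar(4000) NULL
);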
June 29, 2022 at 1:39 pm
Why do you want to use dynamic SQL? Why not a procedure specific for each table? How many tables are you archiving?
You can't make a single procedure or job execute multiple SQL statements simultaneously by itself. But you can achieve this by starting a separate job for each batch you want to run in parallel, which you could do from a stored procedure or a master/control job. The job starts will be sequential, but unless each one completes instantaneously, the net effect will be multiple jobs executing simultaneously.
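As a rough sketch of that approach - assuming one pre-created SQL Agent job per table, with the job names below purely made up - a control procedure could simply fire off each job; sp_start_job returns as soon as the job has started, so the jobs run concurrently:

-- sp_start_job is asynchronous: each call returns immediately,
-- so the archive jobs below end up running in parallel.
EXEC msdb.dbo.sp_start_job @job_name = N'Archive - Orders';
EXEC msdb.dbo.sp_start_job @job_name = N'Archive - OrderDetails';
EXEC msdb.dbo.sp_start_job @job_name = N'Archive - AuditTrail';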
June 29, 2022 at 3:24 pm
Instead of using an INSERT followed by a DELETE for this process, you can use DELETE ... OUTPUT. This deletes the data from the main table and outputs what has been deleted into the history table in a single statement.
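A minimal sketch of that pattern, assuming an Orders/OrdersHistory pair with matching columns (the names and the cutoff logic are assumptions):

DECLARE @BatchSize  int = 5000,
        @CutoffDate datetime2 = DATEADD(DAY, -365, SYSDATETIME());

-- Delete one batch from the base table and capture the deleted rows
-- directly into the history table in the same statement.
DELETE TOP (@BatchSize)
FROM dbo.Orders
OUTPUT deleted.OrderId, deleted.CustomerId, deleted.OrderDate
INTO dbo.OrdersHistory (OrderId, CustomerId, OrderDate)
WHERE OrderDate < @CutoffDate;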
The problem with any archive/purge scheme is schema drift. As soon as a new column is added to the base table that has not also been added to the archive table, you have a problem. If you have written the code using '*' (a very, very bad idea) it will break because that new column doesn't exist in the archive table. If you then figure it would be easier to just add the new column to the archive table, you run into the issue of adding the new columns in the correct order - and if that doesn't happen, your process either breaks or ends up copying data into the wrong archive column.
The other problem is new tables that have been created with RI to the table(s) you are archiving/purging. As soon as that new table has been added and there is related data in it, the process breaks (which won't happen immediately - at least not until the data ages past your minimum purge requirement).
So - with all that said, you can handle the schema drift by using dynamic SQL to build the DELETE/OUTPUT with the columns from the source into the destination, plus an additional procedure that validates the destination has all of the columns. For the RI, you would then just add the new table into the process when it breaks - which is much simpler than trying to code a solution that identifies the RI and adds the new table to the process automatically.
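A hedged sketch of how that dynamic statement might be built - the column list is taken from the system catalog, using only columns that exist in both tables, so a column added to the base table alone doesn't break the statement (the table names are assumptions; STRING_AGG needs SQL Server 2017 or later):

DECLARE @Cols       nvarchar(max),
        @OutputCols nvarchar(max),
        @Sql        nvarchar(max);

-- Build the column list from columns present in BOTH tables;
-- a separate validation step can still report any mismatch.
SELECT @Cols       = STRING_AGG(QUOTENAME(b.name), ', '),
       @OutputCols = STRING_AGG('deleted.' + QUOTENAME(b.name), ', ')
FROM sys.columns AS b
JOIN sys.columns AS h
  ON h.name = b.name
 AND h.object_id = OBJECT_ID(N'dbo.OrdersHistory')
WHERE b.object_id = OBJECT_ID(N'dbo.Orders');

SET @Sql = N'DELETE TOP (@BatchSize) FROM dbo.Orders '
         + N'OUTPUT ' + @OutputCols
         + N' INTO dbo.OrdersHistory (' + @Cols + N') '
         + N'WHERE OrderDate < @CutoffDate;';

EXEC sys.sp_executesql @Sql,
     N'@BatchSize int, @CutoffDate datetime2',
     @BatchSize  = 5000,
     @CutoffDate = '2021-01-01';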
Final thoughts: Instead of using a traditional archive/purge process - consider using a temporal table. A temporal table will take care of your history table automatically keeping everything separate as needed.
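For reference, a minimal sketch of a system-versioned temporal table (names are assumptions; requires SQL Server 2016 or later). Once SYSTEM_VERSIONING is on, rows you update or delete in the current table are preserved in the history table automatically:

CREATE TABLE dbo.Orders (
    OrderId    int       NOT NULL PRIMARY KEY,
    CustomerId int       NOT NULL,
    OrderDate  datetime2 NOT NULL,
    -- period columns maintained by SQL Server
    ValidFrom  datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    datetime2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.OrdersHistory));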
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs