June 3, 2016 at 4:20 am
Hi SSIS Experts,
I am new to SSIS, i got a task to archive the data from production to Archive DB and then delete the data from production keeping 13 months of data in production. There is around 300+ tables of these i have to archive around 50 tables. Out of these 50 table 6 tables have size around 1 TB.
Listing out the the 2 methods which we are planning.
1. Using 50 data flow tasks in a sequence container.
2. Using SELECT * FROM...INSERT INTO.. where table name and column name can be stored in some configuration table and through loop we can archive the data.
Which will be the better option?
Is there any other better method so please let me know.
What precautions(Performanec tips) i have to take while doing the archive process so that it should not affect the Production server?
Please give your suggestion
Thanks
June 3, 2016 at 9:41 am
Just for illustration purposes, let's assume you have a transactional table called Sales. Create another table called SalesHistory.
use StoreOps;
create table Sales
(
SaleID int not null identity(1,1) primary key
, SaleDate datetime default getdate()
);
create index ix_Sales_SaleDate on Sales ( SaleDate );
use StoreOpsArchive;
create table SalesHistory
(
SaleID int not null primary key
, SaleDate datetime
);
You can use either of the following statements to simultaneously delete from one table while inserting the same deleted rows into another table having the same columns.
delete from Sales
output deleted.SaleID, deleted.SaleDate
into StoreOpsArchive.SalesHistory ( SaleD, SaleDate )
where SaleDate < dateadd( month, -13, SaleDate );
If the archive table has exactly the same column layout, the the following is an even more simplified, because it doesn't refernece specific column names.
delete from Sales
output deleted.*
into StoreOpsArchive.SalesHistory
where SaleDate < dateadd( month, -13, SaleDate );
The initial purge / archive operation will likely take considerable time and transaction log space, because you may be deleting millions of accumulated rows. However, if you run subsequent operations daily, then it will be more manageable.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
June 3, 2016 at 9:44 am
So this would basically be a SSIS workflow consisting of T-SQL tasks; one for each table you want to archive.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
June 4, 2016 at 12:42 am
Thanks Eric for the suggestion...it is better than INSERT INTO option. To implement this shall i go with 2nd method where i am using configuration table to store the table names and columns names (for archive condition most preferably this column will be date field) and then using while loop to process each tables.
Can you give suggestions how to implement transaction or logging so that anything goes wrong it will be properly logged.
June 6, 2016 at 6:22 am
The DB's are not in same server they are in different server and there is no linked server.
If i use this DELETE with OUTPUT clause it will not work even if i provide the server.DB.schema.table
So i guess i have to go with INSERT INTO Select * statement.
Any suggestions.
June 6, 2016 at 7:17 am
SQL006 (6/4/2016)
Thanks Eric for the suggestion...it is better than INSERT INTO option. To implement this shall i go with 2nd method where i am using configuration table to store the table names and columns names (for archive condition most preferably this column will be date field) and then using while loop to process each tables.Can you give suggestions how to implement transaction or logging so that anything goes wrong it will be properly logged.
Single statements are contained within an implicit transaction. So, in the following example, if an error occurs midpoint, both the delete from Sales and the insert into SalesHistory will be rolled back.
delete from Sales
output deleted.SaleID, deleted.SaleDate
into StoreOpsArchive.SalesHistory ( SaleD, SaleDate )
where SaleDate < dateadd( month, -13, SaleDate );
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
June 6, 2016 at 7:41 am
If it is same server, it is good to use TSQL as Eric suggested. If it across servers, you can even use a data flow to leverage the SSIS pipeline for performance. Yes you might end up in creating data flow tasks for each table.
For transaction, you may need to enable DTC for this.
https://msdn.microsoft.com/en-us/library/cc304421.aspx
This ideally is background job and this would be a rare case where I would prefer a row by row logic to minimize blocking.
It is good to design a configurable batch, and filter criteria for archival
June 7, 2016 at 6:28 am
Thanks for the reply.
i am started creating package for this archival process, but it getting
error variable "USER:Tablename" does not contain a valid data object ssis
Following steps i had done to create package
1. Config table created to store the table name
CREATE TABLE [dbo].[Config](
[ID] [int] NULL,
[Tablename] [varchar](50) NULL,
[Columns] [varchar](50) NULL
) ON [PRIMARY]
GO
INSERT INto [dbo].[Config]
SELECT 1,'dbo.Emp','Date'
UNION ALL
SELECT 2,'dbo.address','Date'
1. Create variable Tablename, datatype =object, scope =package
2.Create variable SingleTablename, datatype =string, scope =package,value= dbo.Emp
3.Created Execute SQL Task
SQL Statement = SELECT Tablename from dbo.config
ResultSet = Single Row
In Result Set Tab
Variable name = User::Tablename, Resultname = 0
4. Added FOR EACH LOOP Container
Enumerator = For Each ADO enumerator
ADO object source variable = User::Tablename
Enumeration mode = Rows in the first table
Variable mappings
User::SingleTablename, Index =0
5.Added dataflowtask in FEL container
Drag the source OLEDB provided the connection string
Data access mode = Table name or view name variable
Variable name = User::SingleTablename
DRAG the destination OLEDB provided the connection string
Data access mode = Table name or view name variable- Fast load
Variable name = User::SingleTablename
Mapping is correct.
Package saved and executed
Error is coming like this "USER:Tablename" does not contain a valid data object ssis
While debugging the data is coming in this variable USER:Tablename .
i checked twice the variable is declared only once
Any suggestions will be helpful
Thanks
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply