June 29, 2016 at 2:52 am
Context
Looking to implement a database archival process that could be used across different product databases.
SQL Server 2008 R2 - Standard Edition
Product database schemas are subject to minor but regular change
Main product databases keep data for the legally required time.
No sensitive data (e.g. PCI) is stored.
Unlikely that archived data will be queried - a restore to a new database can be used for this purpose.
Previously used methods of archiving data
Have looked at other archival methods in the past
- Product_Archive Database
- Partitioned table (not available in Standard Edition)
Have used the following method (rough sketch after this list)
- Backup full database using SQL Server backup
- Delete from the live database the data that is no longer required (older than x date)
- Advantages - Simple
- Disadvantages - Manual; the full backup is large; will the backup format still be restorable on newer server versions in 3, 5, 7, 9 years' time?
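Rough sketch of that approach (instance, database, table and retention period below are placeholders; Invoke-Sqlcmd comes from the sqlps snap-in/module):

$backupFile = "D:\Archive\ProductDb_$(Get-Date -Format yyyyMMdd).bak"

# Full backup of the live database
Invoke-Sqlcmd -ServerInstance "MyServer" `
    -Query "BACKUP DATABASE ProductDb TO DISK = N'$backupFile';"

# Then purge everything older than the retention cut-off from the live copy
Invoke-Sqlcmd -ServerInstance "MyServer" -Database "ProductDb" `
    -Query "DELETE FROM dbo.Orders WHERE OrderDate < DATEADD(YEAR, -7, GETDATE());"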
Alternative method
PowerShell archive script
Configuration file per product database listing the tables to be archived and the archive criterion (records where the DateColumn value is older than the cut-off date). Table order within the configuration file respects the referential integrity constraints.
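One possible shape for the configuration file (the format isn't fixed yet, so a plain CSV is assumed; table, column and date values are made up, with child tables listed before their parents so the deletes respect the foreign keys):

@"
TableName,DateColumn,CutOffDate
dbo.OrderLines,CreatedDate,2013-06-29
dbo.Orders,OrderDate,2013-06-29
"@ | Set-Content "D:\Archive\archive-config.csv"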
For each table (rough sketch after this list)
- Use the SQL Server Management Objects (SMO) API to script out the table schema
- Use bcp to create a format file for the table being archived
- Use bcp to extract archive data to comma separated value format
- Run TSQL to delete archived records from the database
- Zip all output to an archive zip
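Rough per-table sketch of the archive pass - instance, database and paths are placeholders, the config is the CSV assumed above, and error handling, logging and transaction control are left out:

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo") | Out-Null

$serverName = "MyServer"
$dbName     = "ProductDb"
$outDir     = "D:\Archive\$dbName"
$server     = New-Object Microsoft.SqlServer.Management.Smo.Server $serverName
$db         = $server.Databases[$dbName]
$tables     = Import-Csv "D:\Archive\archive-config.csv"

foreach ($row in $tables) {
    $schema, $table = $row.TableName.Split('.')
    $base  = Join-Path $outDir $table
    $where = "$($row.DateColumn) < '$($row.CutOffDate)'"

    # 1. Script out the table schema with SMO
    $db.Tables[$table, $schema].Script() | Set-Content "$base.sql"

    # 2. Create a bcp format file for the table
    & bcp "$dbName.$($row.TableName)" format nul -c "-t," -f "$base.fmt" -S $serverName -T

    # 3. Extract the rows to be archived as comma separated values
    & bcp "SELECT * FROM $dbName.$($row.TableName) WHERE $where" queryout "$base.csv" -c "-t," -S $serverName -T

    # 4. Delete the archived rows from the live database
    $db.ExecuteNonQuery("DELETE FROM $($row.TableName) WHERE $where")
}

# 5. Zip all output (Compress-Archive needs PowerShell 5; System.IO.Compression.ZipFile is an alternative)
Compress-Archive -Path "$outDir\*" -DestinationPath "$outDir.zip"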
A parallel archive restore script would be used to load archived data to a new database if required.
PowerShell restore script
Unzip archive zip
For each table in the archive zip (rough sketch after this list)
- Use the SQL Server Management Objects (SMO) API to create the table from the schema script produced by the archive script.
- Use bcp to reload the data into the newly created table from the archive CSV file, using the bcp format file created by the archive script.
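Matching restore sketch, again with placeholder names; the target database is assumed to already exist (empty) and the dbo schema is assumed for brevity:

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo") | Out-Null

$serverName = "MyServer"
$dbName     = "ProductDb_Restore"
$workDir    = "D:\Restore\ProductDb"

# Unzip the archive (Expand-Archive needs PowerShell 5)
Expand-Archive "D:\Archive\ProductDb.zip" -DestinationPath $workDir

$server = New-Object Microsoft.SqlServer.Management.Smo.Server $serverName
$db     = $server.Databases[$dbName]

foreach ($schemaFile in Get-ChildItem $workDir -Filter *.sql) {
    $table = $schemaFile.BaseName

    # 1. Recreate the table from the scripted schema
    $db.ExecuteNonQuery((Get-Content $schemaFile.FullName -Raw))

    # 2. Bulk load the archived rows using the saved format file
    & bcp "$dbName.dbo.$table" in "$workDir\$table.csv" -f "$workDir\$table.fmt" -S $serverName -T
}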
Advantages
- Scripted
- Data archive format is basic (comma separated value)
- Schema is stored along with the archive data
- Data archived is targeted to specific tables
Disadvantages
- Slightly more complex
Any thoughts on this proposed method? Viable? Other advantages/disadvantages?
Thanks
June 29, 2016 at 7:21 am
Any reason to not just back up the database and then clean out the old data from the live copy?
A backup has the advantage of maintaining point-in-time consistency between tables, where a data extract won't do that.
The other advantage is simplicity of restoration.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
June 29, 2016 at 9:05 am
Definitely more complex, but it has the advantage of not being dependent on a specific tool or backup format in the future. It's got more places where things could break, but they could also be recovered.
However, there's one thing I'd note. Sometimes you don't want data to be recoverable 9 years later. Tax law is 7, IIRC, and other industries might want less (or more). I'd get a legal opinion for the company and then decide. These days I wouldn't keep the data beyond what I had to, specifically because of lawsuits. I once had to restore a 9 year old database, and management wasn't happy that we could.