Archive older data

Question

Archive older data

farax_x

Hall of Fame

Points: 3579
More actions
May 11, 2011 at 9:59 am

#389502

Hi all,
I have a server(Sql server 2008 R2) with large amount of data (about 2TB). How can I divide my data to multiple servers ? and hold this state dynamically for example main server always have data from 6 month ago til now and older data move to archive server ?
I would b glad to have your opinion or document resources about this solutions.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Mike01 SSChampion Points: 11395 More actions · Answer 1

Hopefully the table and the Archive able are partitioned and have the same layouts, constraints, PKs, filegroupings etc. If this is the case, then you could swap out the oldest partition in the table and swap it into the archive table.

For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

farax_x Hall of Fame Points: 3579 More actions · Answer 2

Mike01 (5/11/2011)
Hopefully the table and the Archive able are partitioned and have the same layouts, constraints, PKs, filegroupings etc. If this is the case, then you could swap out the oldest partition in the table and swap it into the archive table.

tnx for your answer,

Should i write query(use linked server) for transfer data between servers or Sql server has a proper tool to doing this tasks?

in other words ?Any solution or tool for doing this exist?

And I wanna know What solutions for the isolation and maintenance of new data and old in separated servers suggested in enterprise scale ? I would b glad to if u can introduce an article about this type of solutions.

tnx

SQLRNNR SSC Guru Points: 281334 More actions · Answer 3

Have the business requirements for data retention and when data should be archived been clearly defined?

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

farax_x Hall of Fame Points: 3579 More actions · Answer 4

SQLRNNR (5/11/2011)
Have the business requirements for data retention and when data should be archived been clearly defined?

Yes, I have 2 servers, one for live data and other one for archived data.

live data are those data that their creation date is after specified period of time(for example 6 month) and archived data are older (archived) that is the main property is that they are read-only (no delete or update needed).

I want to separate live data from those live data that are being in archive period and transfer to archive server monthly.

The aim of this strategy is to improve performance, because SQL server (in live server) deal with small amount of data.

farax_x Hall of Fame Points: 3579 More actions · Answer 5

farax_x

Hall of Fame

Points: 3579

May 12, 2011 at 2:22 am

#1323511

No one comments?

farax_x Hall of Fame Points: 3579 More actions · Answer 6

farax_x

Hall of Fame

Points: 3579

May 12, 2011 at 2:26 pm

#1323968

No one comments?

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 7

It isn't that simple - I've been doing this type of stuff for over 12 years with SQL Server, since 7.0.

Here's a doc I wrote about partitioning - it was part of a presentation on issues concerning archiving and such

http://www.grumpyolddba.co.uk/gonein60ns/GoneIN60nsWeb.pdf

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

vikingDBA SSCertifiable Points: 5356 More actions · Answer 8

From reading the original post, I'm not seeing that the poster is using partioning.

If that is the case, I would suggest the following for a purely home-grown solution. I by no means intend this to be considered "best practice", just something I would consider doing.

If I were doing this, and I have thought about it often, I would create a column in the table to be cleaned, and the new table on the other server, with the name of something like CleanStatus, being an integer, and a default value of 0.

1)First, make sure all records have a 0 in them (UPDATE tablename SET CleanStatus=0)

2)Identify any needed records to clean, and place a 1 in the CleanStatus column

UPDATE tablename

SET CleanStatus=1

WHERE somedate > #months old --pseudocode

3)Then move the records to the new server/database/table.

INSERT INTO newtable

SELECT * FROM sourceserver.owner.sourcetable WHERE CleanStatus=1

Would use a Linked Server to pull this over.

4)Confirm that the number moved is the number with a 1 on the CleanStatus on the source and destination tables to confirm that you

brought over the same number of records.

5) If so, can reset the new table CleanStatus=0.

6) Delete source table records with the CleanStatus=1, since you have confirmed they have been moved to the other database.

This way you only mess with the ones you have positively identified.

Just my suggestion.

Bob

Mike01 SSChampion Points: 11395 More actions · Answer 9

First off, a 2Tb table should be partitioned, otherwise any type of management, querying etc... will be a nightmare.

Adding a column, then updating it, moving it to another table and then deleting is a lot of touches of the same data. If the table isn't partitioned, then the updating alone will take a long time and alot of transaction log. Plus any issues that existing processes might run into because of the new column.

Let's assume that the table isn't partitioned. I think the first step would be to figure out what the retention date is (i.e. retain 6 months, so any records older than today - 6 months get archived). Then copy these records to the new table and delete them from the existing. This process should probably be run daily (weekly at the latest), to process the minimum amount of records.

If the table is partitioned (and I'm assuming it would be by date), we could switch the partition out to a "staging table" for a new partition. Then copy the data from the "staging table" to the Archive table. As a last step, we can drop\Truncate the "staging table"

For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 10

There are a few shortcomings with partitioning not least of which is the issue of the partitioning column having to be part of the clustered key. This means if you have a PK ( say an int ) you cannot enforce uniqueness except by using a trigger or indexed view, because all unique keys must inlcude the partitioning column. You also can't use full text with a partitioned table and you can't rebuild the indexes on-line.

He did actually say 2TB database. The real problem with archiving isn't the data moving it's the removing old data - and this is where partitioning is useful.

If you have a table gaining say 4 million rows a day trying to delete those is a nightmare.

All I'm saying is that although you're quite right that partitioning is the way to go, and yes big tables that are partitioned offer lots of advantages it's not always quite so clear cut as that.

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

farax_x Hall of Fame Points: 3579 More actions · Answer 11

farax_x

Hall of Fame

Points: 3579

May 16, 2011 at 8:17 am

#1325075

tnx a lot for your attentions 😀

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 12

1) Federated databases, where you have data spread across different physical servers, was NEVER a fully functional and useful thing in SQL Server. If you need to be able to report or access all data on all machines that is what you need. But it isn't good solution.

2) 2TB of data is NOT something that you NEED to scale across multiple servers using modern hardware. Design things correctly and get sufficient hardware and 2TB of data on SQL Server be no problem at all.

3) Hire a consultant to help you a) figure out what you REALLY need and b) help you get there. You have little hope of operating at that scale without experience and training on how to get there from where you are.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

farax_x Hall of Fame Points: 3579 More actions · Answer 13

TheSQLGuru (5/17/2011)
1) Federated databases, where you have data spread across different physical servers, was NEVER a fully functional and useful thing in SQL Server. If you need to be able to report or access all data on all machines that is what you need. But it isn't good solution.
2) 2TB of data is NOT something that you NEED to scale across multiple servers using modern hardware. Design things correctly and get sufficient hardware and 2TB of data on SQL Server be no problem at all.
3) Hire a consultant to help you a) figure out what you REALLY need and b) help you get there. You have little hope of operating at that scale without experience and training on how to get there from where you are.

Tnx for your answer,

1) You r right, deal with different physical servers is so hard but I'm system developer (programmer) and handle this problem in my application by using different connection strings that known by data creation date

2)I have some limitations here that cant have my favorite servers and hardware and I have to use items that is provided. I recently get this system, unfortunately this system have a bad database structure and design that cause this slowness in this scale.

3)Its a good idea but I'm not the decision here ! I would really glad to learn what r archiving data solutions in world that I can use !