Need assistance to copy data from Prod to test server

  • Hi All,

    I'd like to know the best way, or any suggestions, to complete the following task:

    I have a 500 GB DW database and need to feed a test server on a weekly basis with the last two quarters of data.

    Please let me know the best way to complete this task.

    Thanks

    Dave

  • Hi Dave,

    My first question would be whether you need all of the tables within the database. It may be better to isolate a number of key tables and then script those to copy the data from server to server. This may be the preferred option even if you have to bring across all the tables.

    In its simplest form, you could try something like this (it assumes both tables have the columns Col1, Col2, Col3, Dated):

    INSERT INTO [TestDatabase].[TestSchema].[TestTable]   --Table you want to populate
    SELECT Col1, Col2, Col3, Dated                         --Columns you want to copy across
    FROM [LiveDatabase].[LiveSchema].[LiveTable]           --Table where the original data is
    WHERE Dated > DATEADD(DAY, -180, GETDATE())            --Filter to get just the last 180 days (roughly two quarters)

    Ta

    David

    ============================================================
    David

    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Why don't you use SSIS for loading the data between servers?

  • To move only a sub-set of the data you'll have to build a manual process. SSIS is the obvious choice, but you could use straight T-SQL, PowerShell, or any other type of programming language that lets you run queries between the structures. But moving only a sub-set of the data will be difficult to set up and maintain. It will be much easier to copy all the data through a backup and restore, and then clean out the data you don't want.
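
    A minimal sketch of that restore-then-trim approach, assuming a single date column (Dated) to filter on and placeholder database, file, and table names:

    -- Restore the latest production backup over the test copy (paths and logical file names are assumptions)
    RESTORE DATABASE [TestDatabase]
    FROM DISK = N'\\BackupShare\ProdDW_Full.bak'
    WITH MOVE 'ProdDW_Data' TO N'D:\Data\TestDatabase.mdf',
         MOVE 'ProdDW_Log'  TO N'L:\Log\TestDatabase.ldf',
         REPLACE;

    -- Then trim anything older than the last two quarters from each table that has a date column
    DELETE FROM [TestDatabase].[TestSchema].[TestTable]
    WHERE Dated < DATEADD(QUARTER, -2, GETDATE());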

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks all.

    I need all the tables; there are close to 480 tables in the DB.

  • Do all the tables have the same columns?

    Do all the tables have a single column for filtering the last quarters?

    Is this a Kimball design with fact and dimension?

    Please see my answers below:

    Do all the tables have the same columns? NO

    Do all the tables have a single column for filtering the last quarters? Yes, there is a date column on them.

    Is this a Kimball design with fact and dimension? I was informed that it is a star design.

    So if I use backup and restore, I need to write code to clean the records down to the last 2 quarters, then do inserts on a weekly basis plus delete old records, correct?

    Anything else I need to know?

  • How much storage do you have available? If you do a backup, restore, and delete, unless you shrink the database, the DB in the test environment will still be 500 GB.

    How much data are we talking about? Is the amount of data moving over relatively small, or are there hundreds of thousands or millions of records? If there is a huge amount of data, I would probably not do a T-SQL script.

    If you have enough storage and if you can have data older than 2 quarters (even if you don't use it), why not just do backups and restores to the test environment and then leave it as is?
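
    If the space in test does matter, a shrink after the delete is one rough option; a minimal sketch, assuming the restored copy is called TestDatabase:

    -- Release the space freed by the deletes back to the OS; note this can run for a long
    -- time on a 500 GB database and will fragment indexes, so rebuild them afterwards
    DBCC SHRINKDATABASE (TestDatabase, 10);   -- leave about 10% free space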

  • qew420 (2/25/2014)


    Please see my answers below:

    Do all the tables have the same columns? NO

    Do all the tables have a single column for filtering the last quarters? Yes, there is a date column on them.

    Is this a Kimball design with fact and dimension? I was informed that it is a star design.

    So if I use backup and restore, I need to write code to clean the records down to the last 2 quarters, then do inserts on a weekly basis plus delete old records, correct?

    Anything else I need to know?

    A normal star design would have a single fact table which has data loaded every day, week, etc. The other tables would be dimensions and require a full (or incremental) load at every refresh. I would double-check with development whether they need all 400+ tables incrementally loaded. If so, then yes, this is a significant effort.

    Depending on how many quarters you have in the original database, deleting the remainder could take a significant amount of time. If you partition all the tables on the date column, you could switch the old data out in seconds, but you will still need 100% of the space to perform the restore each refresh. Not a very elegant approach.
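
    A minimal sketch of that partition-switch idea, assuming the fact table is already partitioned on the date column and using made-up table names:

    -- Empty staging table with an identical structure, created on the same
    -- filegroup as the partition being switched out (a requirement of SWITCH)
    CREATE TABLE dbo.FactSales_Staging (
        Col1  INT,
        Col2  INT,
        Dated DATETIME
    );

    -- Metadata-only operation: moves the oldest partition out in seconds
    ALTER TABLE dbo.FactSales SWITCH PARTITION 1 TO dbo.FactSales_Staging;

    -- Reclaim the space once the old rows are no longer needed
    DROP TABLE dbo.FactSales_Staging;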

    I hate to even ask, but do the tables have foreign keys? If so, that will complicate matters even more.

    In the past, and starting next week actually, I have transferred entire data warehouse systems to new platforms using .NET code to loop through hundreds of tables. This will still take a significant amount of time each refresh.

    In general, I would recommend moving towards a data mart approach where data is ETL'd into a new structure, which enables data to be loaded on a more frequent basis, such as daily. The job will run for a few minutes to a few hours daily and avoids an outage on each full refresh.
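
    A minimal sketch of the incremental load such a job would run, assuming a linked server to production (PRODSERVER here) and made-up table and column names:

    DECLARE @LastLoaded DATETIME;

    -- Find the newest row already in the test copy
    SELECT @LastLoaded = ISNULL(MAX(Dated), '19000101')
    FROM [TestDatabase].[TestSchema].[FactSales];

    -- Pull only the rows added since the last load
    INSERT INTO [TestDatabase].[TestSchema].[FactSales] (Col1, Col2, Dated)
    SELECT Col1, Col2, Dated
    FROM [PRODSERVER].[LiveDatabase].[LiveSchema].[FactSales]
    WHERE Dated > @LastLoaded;

    -- And age out anything older than the last two quarters
    DELETE FROM [TestDatabase].[TestSchema].[FactSales]
    WHERE Dated < DATEADD(QUARTER, -2, GETDATE());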

  • Thanks SSC and all others

    I think the ETL approach will be the best until MS comes up with a wizard to solve this.

    Maybe back up only six months and restore (I know, I'm daydreaming) :hehe:
