Urgent - Database Backup (with minimal data)

  • Hi All,

    I have a production database that is 5 terabytes in size.

    Now there is a need for another server (Dev/QA) with the same structure as production, BUT with only about 5% of the data from production.

    The 5% figure is not a must, as long as the data in each table on the new server is much smaller than in production.

    There is an existing abc.BAK file to work with.

    Your ideas, please.

    Thanks

    Daniel

  • Can't do that directly. You have to restore a full copy, delete the excess data, then back up, ship that backup to QA, and restore it there.

    Red Gate has a tool that reads the .bak file and "hosts" it so that you don't have to restore it. Maybe that's a valid alternative here.
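
    For illustration, here is a minimal T-SQL sketch of that restore/trim/re-backup cycle. All database, file and table names (ProdCopy, dbo.SomeLargeTable, the paths, the date rule) are placeholders, not the OP's actual objects; check the real logical file names with RESTORE FILELISTONLY first.

        -- 1) Restore the production backup onto a work server.
        RESTORE DATABASE ProdCopy
        FROM DISK = N'D:\Backups\abc.BAK'
        WITH MOVE N'Prod_Data' TO N'E:\Data\ProdCopy.mdf',
             MOVE N'Prod_Log'  TO N'F:\Log\ProdCopy_log.ldf',
             STATS = 10;

        -- 2) Delete the excess in batches to keep the transaction log manageable.
        WHILE 1 = 1
        BEGIN
            DELETE TOP (50000) FROM dbo.SomeLargeTable
            WHERE SomeDate < '20110101';     -- whatever rule trims the table to ~5%
            IF @@ROWCOUNT = 0 BREAK;
        END;

        -- 3) Back up the trimmed copy, then ship it to Dev/QA and restore there.
        BACKUP DATABASE ProdCopy TO DISK = N'D:\Backups\ProdCopy_small.BAK' WITH INIT;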

  • Thanks for your quick reply,

    Restore and delete records?

    Since the database is that big, it takes a long time to restore it on another server.

    Plus, copying the backup file over the network takes a long time.

    Any advice?

  • Have a look at the Red Gate tool Ninja mentioned, or consider something manual (like Red Gate's SQL Data Compare).

    There's no way to take a backup of just a portion of the data in each table of a database.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Move to SQL 2008 Enterprise or SQL 2008 R2 Standard.

    You will have access to data and backup compression, which in theory could cut that time by a factor of 2 to 5.

    In the meantime, I don't see any way around going through that cycle at least once.

    Once you script out the "tear down" of the data, you could always make a point of reloading that DB periodically with fresher data.
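
    As a quick illustration of the compression option (Enterprise in 2008, Standard from 2008 R2 onward), it is just a backup option; the database name and path below are placeholders.

        -- Compressed backup: smaller file to copy over the network, usually faster too.
        BACKUP DATABASE ProdDB
        TO DISK = N'D:\Backups\ProdDB_compressed.BAK'
        WITH COMPRESSION, INIT, STATS = 10;

        -- Or make compression the server-wide default so existing backup jobs pick it up.
        EXEC sp_configure 'backup compression default', 1;
        RECONFIGURE;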

  • Plan Z might be to simply script out the DB and then use a data generation tool.

    It could work for dev, but QA would need the real data, so it doesn't help there.

  • Sorry, but I don't have an easy solution for you... but here is what we do.

    We have a two-server dev environment. One is for our daily/weekly builds, which is generated from build scripts (schema and sample data), and the other is a pre-QA deployment system, again built from scripts.

    Our QA environment is an exact copy of Live (schema/data/hardware) so that not only functional tests but also performance tests can be performed.

    If it were me and I only needed 5% of the data, I would look at building some scripts that generate the data...
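
    A rough sketch of what such a generation script could look like in T-SQL; the table and columns (dbo.Customer etc.) are made up for illustration and real scripts would follow the production schema.

        -- Hypothetical target table.
        CREATE TABLE dbo.Customer
        (
            CustomerID   int IDENTITY(1,1) PRIMARY KEY,
            CustomerName varchar(50) NOT NULL,
            CreatedOn    datetime    NOT NULL
        );

        -- Generate 10,000 sample rows from a numbers CTE.
        ;WITH Numbers AS
        (
            SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
            FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b
        )
        INSERT INTO dbo.Customer (CustomerName, CreatedOn)
        SELECT 'Customer ' + CAST(n AS varchar(10)),
               DATEADD(DAY, -(n % 365), GETDATE())
        FROM Numbers;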

  • sql_lock (5/4/2011)

    Sorry, but I don't have an easy solution for you... but here is what we do.

    We have a two-server dev environment. One is for our daily/weekly builds, which is generated from build scripts (schema and sample data), and the other is a pre-QA deployment system, again built from scripts.

    Our QA environment is an exact copy of Live (schema/data/hardware) so that not only functional tests but also performance tests can be performed.

    If it were me and I only needed 5% of the data, I would look at building some scripts that generate the data...

    Same train of thought here... with only 5% it means dev work, which requires just a couple of rows to make sure it works.

    Performance is another story. I could see a point in leaving that part to QA after the code returns the correct results.

    The amount of work to DELETE your data down to 5% of the size of production is probably just as much as the work to INSERT data up to 5% of production. I'd go that route. It's going to be much quicker and easier to create a blank database and then add data to it. It will also be much, much easier to recreate that database in multiple environments, which just won't be possible going the other way.

    And yeah, doing that you could use Red Gate SQL Data Compare to pull data from your production system.

    I would not necessarily recommend using Red Gate Virtual Restore for this situation. In all my tests it restores the database in about 60% of the time, which is great. And it uses almost no disk space initially (it needs enough to roll forward any transactions completed during the backup process). However, when you go to delete 95% of your data, you're going to need 95% of the space in order to store those differentials in your stub database files. Further, it will have to reconcile the data for you on the fly, and removing 95% of a 5 TB database is really not going to function well. So this is not a situation where that particular tool is going to help. Sorry.

    A Red Gate tool you might want to look at in addition to SQL Data Compare is SQL Data Generator. This can be used to insert and create data in an existing data model. It's very programmable, and I use it all the time for creating this sort of test data set. You can even have it run queries against your production data as part of its own data load (although that can be slower than generating data).
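
    For the "build up to 5%" approach without any tool, a plain T-SQL sketch along these lines could also work; PRODSRV (a linked server to production), the database names and the Orders columns are all assumptions for illustration.

        -- Pull roughly 5% of one table from production into the empty Dev/QA database.
        INSERT INTO DevQA.dbo.Orders (OrderID, CustomerID, OrderDate, Amount)
        SELECT TOP (5) PERCENT
               OrderID, CustomerID, OrderDate, Amount
        FROM PRODSRV.ProdDB.dbo.Orders
        ORDER BY OrderDate DESC;   -- recent data tends to be the most useful for testing
        -- If OrderID is an IDENTITY column, wrap this in SET IDENTITY_INSERT ... ON/OFF.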

    Disclosure: For those who don't read signatures, I work for Red Gate.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • You can mount your production database on the QA server with Red Gate's Virtual Restore (http://www.red-gate.com/products/dba/sql-virtual-restore/). A few other products can do something similar (Idera has one as well). It will create minimal sparse files on the QA server to store writes/changes and read from the backup files.

    Disclosure: I work for Red Gate.

    The problem with saying 5% of the data is that you can't just take the top 5% of data from production because of FKs/data relationships. You need a "related" 5%, which is more problematic to get. That means custom development work to remove the other 95% and keep only what you need.
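
    To make the "related 5%" idea concrete, here is a hedged sketch with made-up parent/child tables (dbo.Customer and dbo.Orders, with an FK on CustomerID): sample the parent rows first, then pull only the child rows that reference them, so the foreign keys in Dev/QA stay satisfied.

        -- 1) Take roughly 5% of the parent rows.
        SELECT TOP (5) PERCENT CustomerID
        INTO #SampledCustomers
        FROM ProdDB.dbo.Customer
        ORDER BY NEWID();            -- random sample; any business rule works too

        INSERT INTO DevQA.dbo.Customer (CustomerID, CustomerName)
        SELECT c.CustomerID, c.CustomerName
        FROM ProdDB.dbo.Customer AS c
        JOIN #SampledCustomers AS s ON s.CustomerID = c.CustomerID;

        -- 2) Pull only the child rows whose parents made it into the sample.
        INSERT INTO DevQA.dbo.Orders (OrderID, CustomerID, OrderDate)
        SELECT o.OrderID, o.CustomerID, o.OrderDate
        FROM ProdDB.dbo.Orders AS o
        JOIN #SampledCustomers AS s ON s.CustomerID = o.CustomerID;
        -- IDENTITY columns will need SET IDENTITY_INSERT ON/OFF around each insert.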

  • Thanks All,

    Appreciated
