August 18, 2010 at 1:41 pm
Hi Experts,
I ran into this challenge recently and thought you might seen this before.
I got a huge database with 1.5 TB in size. It's a pain to backup/restore it on dev server for developers to play around with. I already scripted out the empty database onto the dev server with lookup tables populated. Now the challenge is: Is there a way to select just portion/subset of the prod data and populate them onto the empty database on dev server. This portion of data must satisfy all the FK contraints. For example:
Order table
Customer table
State table
City table
Order references to Customer
Customer references to State, City
State references to City
lets say today developers ask me to populate the empty database on dev server with data such as:
All orders for all customers who resides in MN that has city as Minneapolis only, so they can do some development and testing for those orders only
How do I dynamically reverse engineering or loop get all these data then generate some sort of INSERT statements in order of dependency?
Is there any scripts or tools out there to accomplis this goal?
I tried DataSubset from Grid-Tools and it seems to provide exactly what i want, but the tools have many bugs.
my ultimate goald is: pass in a query like this
select order.*
from order
inner join customer on order.customerid = customer.customerid
inner join state on customer.stateid = state.stateid
inner join city on state.cityid = city.cityid
where state.name = 'MN' and city.name = 'Mpls'
....then a tool/script will be able to figure out only what tables it needs to touch and only what data it need to extract (not all data)
In example above, lets say there are 100 orders from 10 customers in 1 state in 1 city, then a tool/script will only generate insert statements for those data only (not more and not less). To make matter more complicated, i also like the tool/script be able to extract data from any other table that references to those data tables from my query (i.e. Order, Customer) as well, .e.g. Invoice (invoice.orderid = order.orderid), Region (region.customerid=customer.customerid), etc... Lets say there are 80 invoices and 3 regions that references to those 100 orders and 10 customers.
Thank in advance,
Thanh
August 18, 2010 at 10:27 pm
best will be to write your own script to get the subset of the data. you can populate the parent tables first and then keep on moving down the heirarchy for data insertion. you can use top clause in the insert select statements
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply