Migration of the same data in several directions

  • Hi guys.

    I have a large industrial application. There are 3 environments: PROD, REL and DEV (actually they don't relate to the software development process, databases are just used for different stages of data entry). I need to develop a functionality which copies data between these databases (DEV to PROD, PROD to REL, REL to PROD, REL to DEV). What exactly and when has to be copied is set up by site administrators. For example they can specify that "All data related to this product will be copied from PROD to REL at September 1st". (note, I have to copy only data, database schemes are same). In addition I have to coly some inages (flat files in file system).

    There are several problems. Not all data have to be copied, but only data for specific product, so, I cannot just overwrite the databases. Instead of it I need to select required data from source DB and copy it to the destination one. (It is about 80 tables). Also, I cannot just copy data as it is, I have to change the data a bit during migration. For example if I insert Product in the destination database, it will have ProductId which is not the same as in source database because it is an identity column. In addition, some tables have XML columns which contain Ids of entries in database, so these XMLs have to by changed too. And finally, customer needs the possibility to copy data in several directions. In short, migration logic is going to be quite complex.

    I'm looking for the best way how to achieve this purpose. For now, I have 4 options:

    1). Create stored procedure to copy all data. It will be a tree of stored procedures which call each other to copy all data. It is the easiest way for me. But the problem is that obviously I need to create linked servers for all environments, hardcode names of databases on these linked servers. In that case I will have to create 6 identical sets of stored procedures (1 set for each direction).

    2). Create SSIS package. In that case I will be able to pass connections to source and desctination databases. But, creation of such complex SSIS package is a really hell. The second problem is performance, I think.

    3). Use any 3rd party tool. I don't know exactly which tool, but maybe this one http://www.red-gate.com/products/sql-development/sql-data-compare/.

    4). Something else....

    Maybe somebody has ran into similar problem already? Could you share your opinion with me?

  • I'm sure the majority of responses will be to use SSIS, I'd probably use stored procs.

    Linked servers are not the end of the world, for example It's quite possible to create a proc which will create linked servers and if you're you're only doing rpc calls then they are very good. Mainly I think linked servers have a bad name due to cross server queries which can ( usually ) perform very badly.

    You can name a linked server and link that name to a server - therefore you could use the same proc call e.g. "remoteserver" just takes a different ip address, that way your one proc could be used for all calls - you're just changing the underlying address. ( hope that makes sense )

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • How will you decide if a product is already in the target database? Obviously, you can't use the ProductId so you have to use the "natural key". And what is suppose to happen if you find one and the new data don't match the original ones? Will the source database "always win" or will there be a logic that PROD has higher priority than REL than DEV? If so, a decision need to be made how to react if there is the same Product but with different values when copying data from DEV to PROD: will it just be ignored or will a process be triggered to get the actual data from PROD to DEV?

    Depending on the business requirements it might be possible to have just one set of procedures that could create the linked server on the fly and clean up afterwards. But this concept also requires a few more thoughts (e.g. how to deal with concurrency if the sproc is called again while it's still running but with a different target)

    I'm not sure if it's a valid concept that still guarantees data consistency within one db. Even if it is, it'll take quite a bit of work to make it reliable and robust.

    SSIS is definitely an option to handle it technically. Depending on the skills level available it is probably at the same level of complexity as the stored procedure approach is. The complexity is not driven by the tool but rather by the business process 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply