Optimized way to insert data in table on another server

  • Hello.

    We have 2 application whose database reside in different servers say Server A and Server B. We have requirement to insert huge amount of data from A to B. However this transfer is based on certain event on application on Server A which the app on server B will not know And this data will be generated in SQL only and transferred to DB on Server B.

    Kindly help in what will be the best way to transfer the data.

    Thanks and Regards,

    Saumik

    • This topic was modified 3 years, 9 months ago by  saum70.
  • If you're going to use linked servers to transfer between tables, keep in mind that depending on which server runs the SQL to transfer rows, the performance difference can vary substantially. If performance matters and I'm using linked servers, I try to make sure I'm "pulling" from the source server to the destination server, that is, the destination server runs the SQL that then pulls data from the source server.

    In your case, this means that, if you plan on using linked servers, that destination server will have to be "informed" somehow to start moving rows from the source server's table.

    I have included a good page that explains the performance differences between running inserts on the source server to a remote destination server, or running inserts on the destination server from the remote source server. If you can't manage to arrange things that way, you'll probably want to avoid linked servers and go for maybe an ssis job or dot net. Just my 2 cents worth!

    https://www.sqlserverscience.com/performance/slow-inserts-across-linked-server/

  • Hi,

    Thanks for reply. I had thought of the same of pulling the data from source. the flow will be as

    Source stp calls Destination stp. This destination stp connect and pull data from source table. All this happens within a session

    However will this not be issue of loop back reference

    Regards,

    Saumik

     

  • I'm not sure of the entirety of restrictions of running remote stored procedures. If a boatload of programming depends on functionality that I'm not sure of, I just go ahead and write a small test. If there are performance concerns, it might be worth putting a small but measureable workload in the test.

    I'm not sure about the loop back bit. I think if you had one stp call another stp that subsequently calls the first stp, things might not work well. But if one stp calls another stp that then does something that does not reference the first stp then I'm thinking that there would be less call for concern.

    Obviously I wouldn't perform a test that has the potential to destroy production servers, that isn't something I would recommend. Its up to you to decide whether you can safely test your ideas or not.

     

  • As an alternate solution, you could have an SSIS package that pulls data from server A and pushes it to server B and you MAY be able to use SSIS to handle the transforms.  Alternately, SSIS could call the stored procedure on server A to get the data or however you get the data.  SSIS pulls all the data into memory and then pushes it out to the destination server.  If this is TONS of data and will overfill the RAM on the SSIS server, you may want to break it up into smaller chunks for moving the data across (by day, by week, by month, by serial number, by something... partition up the data and move it in chunks).  Then on Server A, when the application or event on Server A is triggered, it starts the SSIS package to move the data across.

    Personally, I am not a huge fan of linked servers.  There are a lot of performance considerations with them and you can get weird query performance using them.  One example of weird performance - if you use the 4 part name for the linked server connection, it will work fine as long as the estimate of 1 row is close.  The other scary bit is with RPC Out being turned on.  Then your linked server needs to connect as someone who has permissions for the stored procedure calls.  If you pick a SQL account, you need to be VERY careful with permissions as anyone who can connect to the SQL instance can then run anything against the remote server that the account has permissions to.

    My opinion, SSIS is a safer option at the cost of some RAM management (SSIS runs in separate memory space than SQL Server, so you need to leave some open memory for SSIS or it will page to disk and performance will drop drastically).  And due to licensing, SSIS needs a SQL license, so you can't just put SSIS on its own box without buying another license for SQL Server (do double check that with your SQL licensing rep as you may have a deal that I am unaware of).

    Downside of SSIS is if the metadata changes, the package will fail to run.  By metadata, I mean column names, column datatypes, table definitions (depending on how the package is built), etc.  SSIS is VERY fussy about things remaining consistent.  It is easy to fix when this happens (open the package in VS and open all objects that have a warning and correct any errors), but it is a pain in the butt.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply