Linked Server Connections

  • I've got two SQL servers, each has a linked server configured to the other, and I want to transfer data between them using the linked server connections.  I realize there are other ways to move the data but I'm primarily trying to understand the dramatic difference between the following scenarios.

    For simplicity let's say that my servers are named ServerA and ServerB, I have a table (MyTable) in database MyDB on both servers whose schema is the same on both servers and that I have a large (5 million rows) amount of data on server B that I want to move to server A.

    Scenario 1 - *PUSH*

    Using Query Analyzer I connect to Server B and issue

    INSERT INTO ServerA.MyDB.dbo.MyTable

    SELECT * FROM MyTable

    -- moves the data from ServerB to ServerA and takes a few hours

    Scenario 2 - *PULL*

    Using Query Analyzer I connect to Server A and issue

    INSERT INTO MyTable

    SELECT * FROM ServerB.MyDB.dbo.MyTable

    -- moves the data from ServerB to ServerA and takes a few minutes

    Can anyone help me understand why there is such a large disparity between these two scenarios?  Thanks.

    Steve Gutzwiller

  • how about the activity on both servers?

    I don't think the difference can be that marked (few hours vs few minutes) when that is the case either the source and destination tables had a different amount of rows or the activity (CPU/MEM/DISK) is very different

    Just my $0.02  


    * Noel

  • I'd check the number of rows transefered between the 2 servers. One might have only a few k and the other 10M rows...

  • Isn't that part of what I just said ?

     


    * Noel

  • Maybe... not what I understood on the first read.

  • Then you haven't lived until you get a table that has a cross join for all columns in the table as indexes! And the other table has no indexes! This can be the cause for minutes vs. hours.

    Rule of thumb, my experience is that the process is always faster when executed on the server receiving the data.

    Andy

  • Which would make sens because the local server executing the select could use the best plan instead of shipping all the data and letting the other server sort it out.

  • Server utilization and number of rows transfered aren't the problem.  I've seen this issue when moving data from a staging server to production while the rest of our system is down.  I've got the only connection.

    I think its got something to do with how the query gets executed through the linked server in each case.  In the case of the PUSH the Query Plan shows a "Remote Insert" step which doesn't sound terribly efficient and while the query is running I see calls to sp_cursoropen being made from the source server which doesn't sound terribly efficient.  I don't see the same kind of activity in the PULL scenario.

    I know the queries get executed differently which might explain the large difference in execution time but I still don't know WHY they get executed differently.

    Steve Gutzwiller

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply