July 11, 2005 at 2:03 pm
I've got two SQL servers, each has a linked server configured to the other, and I want to transfer data between them using the linked server connections. I realize there are other ways to move the data but I'm primarily trying to understand the dramatic difference between the following scenarios.
For simplicity let's say that my servers are named ServerA and ServerB, I have a table (MyTable) in database MyDB on both servers whose schema is the same on both servers and that I have a large (5 million rows) amount of data on server B that I want to move to server A.
Scenario 1 - *PUSH*
Using Query Analyzer I connect to Server B and issue
INSERT INTO ServerA.MyDB.dbo.MyTable
SELECT * FROM MyTable
-- moves the data from ServerB to ServerA and takes a few hours
Scenario 2 - *PULL*
Using Query Analyzer I connect to Server A and issue
INSERT INTO MyTable
SELECT * FROM ServerB.MyDB.dbo.MyTable
-- moves the data from ServerB to ServerA and takes a few minutes
Can anyone help me understand why there is such a large disparity between these two scenarios? Thanks.
Steve Gutzwiller
July 11, 2005 at 2:52 pm
how about the activity on both servers?
I don't think the difference can be that marked (few hours vs few minutes) when that is the case either the source and destination tables had a different amount of rows or the activity (CPU/MEM/DISK) is very different
Just my $0.02
* Noel
July 11, 2005 at 2:59 pm
I'd check the number of rows transefered between the 2 servers. One might have only a few k and the other 10M rows...
July 11, 2005 at 3:30 pm
Isn't that part of what I just said ?
* Noel
July 11, 2005 at 5:18 pm
Maybe... not what I understood on the first read.
July 12, 2005 at 12:49 am
Then you haven't lived until you get a table that has a cross join for all columns in the table as indexes! And the other table has no indexes! This can be the cause for minutes vs. hours.
Rule of thumb, my experience is that the process is always faster when executed on the server receiving the data.
Andy
July 12, 2005 at 6:31 am
Which would make sens because the local server executing the select could use the best plan instead of shipping all the data and letting the other server sort it out.
July 12, 2005 at 11:28 am
Server utilization and number of rows transfered aren't the problem. I've seen this issue when moving data from a staging server to production while the rest of our system is down. I've got the only connection.
I think its got something to do with how the query gets executed through the linked server in each case. In the case of the PUSH the Query Plan shows a "Remote Insert" step which doesn't sound terribly efficient and while the query is running I see calls to sp_cursoropen being made from the source server which doesn't sound terribly efficient. I don't see the same kind of activity in the PULL scenario.
I know the queries get executed differently which might explain the large difference in execution time but I still don't know WHY they get executed differently.
Steve Gutzwiller
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply