June 3, 2008 at 1:49 am
What are the performance issues when using join queries against a linked server?
I have more than 60 lakhs of records in the table on that server, and I need to do an inner join from that remote table to a local table. I know that will not give good query performance, but how can I do it with acceptable performance?
Regards,
KVGaneshbabu
June 3, 2008 at 3:47 am
Not sure what a lakh is, but certainly back on 2000 if you joined to a linked server it had a habit of bringing the whole table back and doing the join locally. Which tended to hurt a bit on large tables...
June 3, 2008 at 3:53 am
60 lakhs = 6 million 🙂
June 3, 2008 at 10:45 am
I am facing the same problem as well. I have a small number of records, and when I retrieve them from a linked server using joins, it takes noticeably longer, especially when the other server is busy. By "more time" I mean a process that takes 2 minutes to run under normal conditions takes 30+ minutes to run if the server is busy.
This is on 2005 64-bit.
June 3, 2008 at 12:32 pm
MS changed something with linked servers in SQL 2005 that caused several performance hits that didn't exist in 2000. I have several queries that used to take seconds across the link that now take several minutes.
I try to run the queries on the server the data resides on and then pass the results over the linked server.
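A minimal sketch of that approach, assuming a hypothetical linked server named RemoteSrv (with its "RPC Out" option enabled) and made-up table names, is to run the aggregation remotely with EXECUTE ... AT and pull only the summarized rows back into a temp table:

```sql
-- Hypothetical names: RemoteSrv is the linked server, RemoteDb.dbo.BigTable the remote table.
-- EXECUTE ('...') AT requires the linked server's "RPC Out" option to be enabled.
CREATE TABLE #RemoteSummary
(
    CustomerID int   NOT NULL,
    OrderCount int   NOT NULL,
    TotalValue money NOT NULL
);

-- Filter and aggregate on the remote server; only the small result set crosses the wire.
INSERT INTO #RemoteSummary (CustomerID, OrderCount, TotalValue)
EXECUTE
(
    'SELECT CustomerID, COUNT(*), SUM(OrderValue)
     FROM   RemoteDb.dbo.BigTable
     WHERE  OrderDate >= ''20080101''
     GROUP BY CustomerID'
) AT RemoteSrv;

-- The join against local data now happens entirely on this server.
SELECT  l.CustomerName, r.OrderCount, r.TotalValue
FROM    dbo.LocalCustomers AS l
        INNER JOIN #RemoteSummary AS r ON r.CustomerID = l.CustomerID;
```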
June 3, 2008 at 12:58 pm
Are these join queries run during the day in production or can they be batched for overnight processing?
June 3, 2008 at 1:27 pm
If your local tables have "little" data you may want to specify "REMOTE" when doing the join so that it happens on the server with large tables.
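For reference, a minimal sketch of what that hint looks like, with a hypothetical linked server name and made-up tables (REMOTE is only valid on INNER joins and is intended for a small local table joined to a large remote one):

```sql
-- Hypothetical names: RemoteSrv is the linked server, dbo.SmallLocalTable is local.
-- The REMOTE join hint asks the optimizer to perform the join on the server that owns
-- the right-hand (large, remote) table instead of dragging it across the network.
SELECT  l.KeyCol, r.SomeValue
FROM    dbo.SmallLocalTable AS l
        INNER REMOTE JOIN RemoteSrv.RemoteDb.dbo.BigRemoteTable AS r
            ON r.KeyCol = l.KeyCol;
```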
-Noel
June 4, 2008 at 3:44 am
Is there any article we can read to understand how linked servers work in 2005? Hopefully we can get some answers from that and improve performance.
June 4, 2008 at 10:06 am
I've used "multi-server" queries and transactions pretty extensively in SQL 2000, and although 2005 has added features that would probably make some of the things that I've done a little bit "easier," the basic principles still apply. There really is nothing unusual or difficult about writing a query that joins data across multiple servers. The key, just as when trying to optimize queries on a single server, is to limit the number of rows that have to be scanned. The only difference here is that scanning can take place either on the server where the rows exist (cheap) or on a remote server (expensive).
Utilize OpenQuery, for example, to perform a grouping of data in a large remote table before it is moved across the network, rather than grouping it locally using a four-part name (LinkedServer.Database.Owner.Table). Use temp tables to load data from a local database to a remote database, or vice versa, and then join to the temp table, again using OpenQuery, so that all of the scanning of data is done "locally."
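As a concrete (and purely hypothetical) sketch of that first suggestion, assuming a linked server named RemoteSrv and made-up table names:

```sql
-- Hypothetical names: RemoteSrv is the linked server, RemoteDb.dbo.Sales the remote table.

-- Grouping via a four-part name risks pulling the whole remote table over before aggregating:
--   SELECT CustomerID, SUM(Amount) FROM RemoteSrv.RemoteDb.dbo.Sales GROUP BY CustomerID;

-- OPENQUERY ships the query text to the remote server, so only the grouped rows come back.
SELECT  q.CustomerID, q.TotalAmount
FROM    OPENQUERY
        (
            RemoteSrv,
            'SELECT CustomerID, SUM(Amount) AS TotalAmount
             FROM   RemoteDb.dbo.Sales
             GROUP BY CustomerID'
        ) AS q
        INNER JOIN dbo.Customers AS c ON c.CustomerID = q.CustomerID;
```

Note that OPENQUERY only accepts a literal string, not a variable, so dynamic filters have to be built with dynamic SQL.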
If you want more help on this particular query, post some more info. Show me the query and a sample result set, and the number of rows from each table that meet the conditions of your query.
June 5, 2008 at 5:06 am
That's what I do for some operations, Eric, but there is still a problem accessing the data. Last week I had a problem copying 100K rows into a temp table from a remote server. This usually takes 2 to 3 seconds, but on one particular day last week it took 26 minutes, and the only reason is that the remote server was busy using all its CPU. How can I overcome this problem? Have you faced problems like this before?
Cheers
June 5, 2008 at 7:06 am
Do you have to have the data live and real-time or can you work from a daily snapshot? I had a similar situation with a previous employer. Fortunately, my environment worked well from a daily snapshot so I copied the data from the linked server nightly when both servers were relatively idle. During the day, the process took over an hour. At night, it went down to roughly 15 minutes.
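A rough sketch of such a nightly refresh, with hypothetical server and table names, could be as simple as a job step that rebuilds a local snapshot table from the linked server:

```sql
-- Hypothetical names: RemoteSrv is the linked server, RemoteDb.dbo.Orders the remote table.
-- A nightly job step that refreshes a local snapshot so daytime queries never touch the link.
IF OBJECT_ID('dbo.RemoteOrders_Snapshot', 'U') IS NOT NULL
    DROP TABLE dbo.RemoteOrders_Snapshot;

SELECT  *
INTO    dbo.RemoteOrders_Snapshot
FROM    OPENQUERY
        (
            RemoteSrv,
            'SELECT OrderID, CustomerID, OrderDate, Amount
             FROM   RemoteDb.dbo.Orders
             WHERE  OrderDate >= DATEADD(day, -90, GETDATE())'
        );

-- Index the snapshot so local joins stay fast during the day.
CREATE CLUSTERED INDEX CIX_RemoteOrders_Snapshot
    ON dbo.RemoteOrders_Snapshot (OrderID);
```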
June 5, 2008 at 3:50 pm
Well, yes and no. I've had to deal with a variety of performance issues, but I haven't specifically faced this issue. I've more commonly faced i/o problems than CPU utilization issues.
That said, which of these options sounds most promising to you:
1) Schedule a job to run when the CPU is not busy to cut down on the possibility of a transmission error.
2) Provide real-time monitoring of CPU utilization so that you can warn a user, before trying to get the data, that it may take "longer than usual" (a rough monitoring sketch follows below).
3) Prioritize the job of getting the data so that it will still work well even if the CPU is over-utilized.
If you have another idea on how to solve this, maybe I can still be of assistance, but I at least have some ideas on each of the three options above.
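On option 2, here is a rough sketch of reading recent CPU utilization from the scheduler-monitor ring buffer, run on (or against) the busy server before kicking off a large pull. sys.dm_os_ring_buffers is undocumented and its XML layout can vary between builds, so treat this purely as illustrative:

```sql
-- Rough sketch only: sys.dm_os_ring_buffers is undocumented and the record layout
-- may differ between builds. Returns the most recent CPU snapshot SQL Server recorded.
SELECT TOP (1)
        rb.record.value('(Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS sql_cpu_pct,
        rb.record.value('(Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int')         AS idle_pct
FROM
(
    SELECT  CONVERT(xml, record) AS record, [timestamp]
    FROM    sys.dm_os_ring_buffers
    WHERE   ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
      AND   record LIKE N'%<SystemHealth>%'
) AS rb
ORDER BY rb.[timestamp] DESC;

-- A job step could check idle_pct and postpone the big copy when the box is already pegged.
```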
June 5, 2008 at 4:35 pm
I think you guys might be missing something... Yep, I'm sure that you're preaching to the choir when it comes to running jobs at night, etc...
The real problem the OP pointed out was that their app sucked for performance during the day and was fine at night. That means that having users log in to the system is causing the system great pain performance-wise. It's got nothing to do with night runs... it has to do with normal daily use of the application and the hardware/pipeline it uses. Could be some bad code in the application itself... could be something stupid like a coffee pot plugged into the same power source as a router (took us weeks to find THAT one).
--Jeff Moden
Change is inevitable... Change for the better is not.