4 table join on a linked server taking too long... Is there a best practice in this sitch?

  • In a nightly process, I'm executing a SELECT that JOINs 4 tables on a linked server. Sometimes this takes 25 minutes, and sometimes it hangs indefinitely. When it hangs, I can see a call to sp_cursorfetch that's blocking a lot of other processes. That's bad.

    I'll admit that the amount of data being moved from the linked server to my local machine every night is "too much". Ultimately, we'll fix that using a CDC approach, but for the time being, is there a best practice for moving SELECTed data from one SQL2005 server to another?

    Should I be using OPENQUERY? BCP out, then BCP in? Should I move the tables 1 at a time and reconstruct the JOIN on my local machine? (Last would be 3 or 4 times the amount of data being moved, but I'm wondering if doing the JOIN on the linked server is causing the issue...)

    What's the best practice here? THANKS SSC!

    -- author's note: I think the blocking process on the linked server is "sp_cursorfetch". I'm going from memory. could be another similar. definitely sp_%cursor%. Thanks.

  • What is the datatype that you are joining on? Number (int, smallint, etc..) types seem to join better. Also, if you can generate a subset of data either in an OPENQUERY locally, or a view remotely you will likely have better performance. Also, a 4 table join with a linked server is probably not a good idea, you might write the output of the 3 tables into a temp table or table variable and then join that to the linked server table..

    I have also have a case where nothing I could do would get better performance so I build an SSIS package to pull a subset of data locally, with a period just outside what all the queries would need. It was painful, but that package ran about 5 minutes a night when I didn't care about performance and the queries then ran like lightning..

  • In these cases I have created stored procedures on the source and then called the stored procedures. I have seen dramatic performance improvement in these cases.

    Are any of the joins on a local table?

  • get rid of the linked server !

    Why: query stability !

    Keep in mind if sqlserver cannot find a "by default" good plan, it may pull over all data of the individual linked objects and resolve the join at the local instance !

    Your query also may suffer "linked server" (b)locking and off course you are one of the many remote server users !

    That's also one of the reasons I always avoid "remote" server queries.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I come across this kind of issues almost daily, since I'm working on a SQL Server app that reads data from AS/400 with linked server queries. It wasn't me to choose this strategy, it was already that way when I was hired...

    I changed lots of queries from 4-part names to OPENQUERY and the improvement is impressive.

    When I have to move huge amounts of data, I create a communication table with the same definition as the data I expect from the query and I pull data with a SSIS package. It changed in some case from 20 min to 30 seconds.

    I would give it a try.

    -- Gianluca Sartori

  • Jack, none of the tables in the JOIN clauses are local. This would support your approach, no?

    Thanks for the suggestions all!

  • Yes, the fact that all the tables are on the linked server would support creating a stored procedure on that server.

  • Maybe just create a view (on the remote server) and query that.

    Test it and see what the effects are.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Depending on the size of the data, copying it all local and then doing the query could be significantly faster. This is especially true if you can restrict the data copied with appropriate where clauses.

    What about setting up replication to keep local copies up to date? I have a client that has gotten tremendous perf gains in a system that once used LOTS of remote server queries. They are just a complete and total DOG when it comes to perf, and present other problems such as blocking as well.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks gang. We'll have something implemented soon, I'll provide performance metrics on what I find.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply