Linked Server Difference between Openquery /Exec At/Four part Linked Server

Question

Linked Server Difference between Openquery /Exec At/Four part Linked Server

deepak.a

SSCertifiable

Points: 5330
More actions
May 13, 2011 at 12:28 am

#250889

Hi All,
Have often to join or query data from one server another using a Linked Server, I need to know whats the difference
between using an Open Query or an Exec at or four part Linked server also called Distributed query
Which of the below three will give best performance and what are the prons and cons
> Open Qyuery
> Linked server four part queries are also called distributed queries
> Exec At
Thanks And Regards
Deepak.A

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 1

aha -- not a simple answer as it just depends. If you start making distributed queries across servers you have to get the indexing and queries absolutley right to avoid unwanted data loads.

Federated servers usually partitioned views, I've never deployed these in a production system but I have done a lot of testing .. it can work well. Openquery I've always avoided but I forget now why ( of course things might have changed ).

Your question doesn't really sound as if you're looking for a business solution otherwise you'd already have a good idea of which way to go, perhaps you could be a bit more precise on what you hope to achieve?

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

deepak.a SSCertifiable Points: 5330 More actions · Answer 2

colin.Leversuch-Roberts (5/13/2011)
aha -- not a simple answer as it just depends. If you start making distributed queries across servers you have to get the indexing and queries absolutley right to avoid unwanted data loads.
Federated servers usually partitioned views, I've never deployed these in a production system but I have done a lot of testing .. it can work well. Openquery I've always avoided but I forget now why ( of course things might have changed ).
Your question doesn't really sound as if you're looking for a business solution otherwise you'd already have a good idea of which way to go, perhaps you could be a bit more precise on what you hope to achieve?

What i wish to know exactly is , when we want perform the Remote statement executions on federated server

what are the best guide lines to follow. currently im using four part Linked serevr for this purpose some times this is

resulting in bad xecution plans requiring large amounts of data from remote servers/queries to reterive one or few records

even though we have indexed columns in the where clause.

and now im looking to use Exec At or open query. Af course i cant replcae Four part Linked Server query to open query or exec At

Directly. because of many reasons like in exec At we cant join local table and remote table because it send the query remote server to execute.

so can you please suggest few guidelines here to optimize the remote query.

Thanks

Deepak.A

Syed Jahanzaib Bin hassan SSChampion Points: 11410 More actions · Answer 3

If you can use Replication then go to this because linked server have performance issue always because network will involve and you will fetch whole data from the Server A to Server B.Fast solution is Replicate your desire tables on Server B and then apply queries on it and NOTE it distributed queries always require 2 Phase commit,In the SELECTION of the data from 1 server to another then there is no any distributed transaction but when you write a transaction of DML and apply commit and rollback then there must Distrubuted transaction for Example

There is no any 2 phase commit

========================

Select tb1.* from tab1 tb1 inner join [serverA].[db1].[dbo].[tb2] tb2 on tb1.id = tb2.id

There is a 2 phase commit

====================

Insert into tab1

select * from [serverA].[db1].[dbo].[tb2]

that is an example,we can say DML from 1 Server to another is must be a distributed

DO NOT use linked Server for daily routine queries,if you dont have enough resources major is NETWORK

My Blog
www.aureus-salah.com

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 4

yes that's the problem with federated servers. I asked because I wasn't sure if your model was partitioning across the servers. For queries across servers I tend to execute a rpc call to bring back the remote data and then minipulate the results with the host server - over the years I've found this is the best performing although not always the most elegant.

Generally with federated servers deployed as partitioned views you need to include the partioned column(s) in the where clause of your query otherwise as you've discovered the results can be poor. Allan Mitchell first showed this many years ago at an early UK SSUG meeting, probably ten years ago now.

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

Orlando Colamatteo SSC Guru Points: 182279 More actions · Answer 5

Syed Jahanzaib Bin hassan (5/13/2011)
If you can use Replication then go to this because linked server have performance issue always because network will involve and you will fetch whole data from the Server A to Server B.Fast solution is Replicate your desire tables on Server B and then apply queries on it and NOTE it distributed queries always require 2 Phase commit,In the SELECTION of the data from 1 server to another then there is no any distributed transaction but when you write a transaction of DML and apply commit and rollback then there must Distrubuted transaction for Example
There is no any 2 phase commit
========================
Select tb1.* from tab1 tb1 inner join [serverA].[db1].[dbo].[tb2] tb2 on tb1.id = tb2.id
There is a 2 phase commit
====================
Insert into tab1
select * from [serverA].[db1].[dbo].[tb2]
that is an example,we can say DML from 1 Server to another is must be a distributed
DO NOT use linked Server for daily routine queries,if you dont have enough resources major is NETWORK

The above post is very poor advice and there are too many wrong statements within it to bother trying to correct it.

Depending (there is that word again) on the volume of data changes a setup involving transactional replication can utilize infinitely more network and system resources than periodically running a well-crafted query using Linked Servers...by the way some transactional replication setups utilize Linked Servers under the covers to push updates to subscribers.

With that out of the way...

You can safely use Linked Servers in a production setting but do your research and test before settling on a solution. As with most things there are good and bad ways to use Linked Servers.

To try to compare and contrast the methods you asked about:

- OPENQUERY guarantees the SQL will execute on the remote server and only bring back the results from that query to the local server

- EXEC() AT is similar to OPENQUERY in that the SQL will always execute on the remote server except that you cannot use the results locally within a JOIN...you can use them however to INSERT them into a local table. Also, EXEC() AT allows you to provide SQL in a variable whereas OPENQUERY will not accept a variable which many times forces the use of dynamic SQL.

- The Four-part Linked Server is a bit more intuitive to write than the other two IMO but it is much more sneaky...I do not know all the internals but suffice it to say that you have less control over where queries are executed when using four-part naming. I typically avoid using them within a query involving JOINs and only use them to bring data back from a remote table to store in a local temp/staging table for later use...something like INSERT INTO #temp_table (col1, col2) SELECT remote_col1, remote_col2 FROM linked_server_name.remote_database.remote_schema.remote_table WHERE remote_col1 = 'some_constant' ;

Like GrumpyOldDBA I opt to pull all remote data into local tables and then work with them, whether that means executing an INSERT INTO...EXEC() AT providing a complex JOIN query for remote execution or an INSERT INTO...SELECT* FROM [four-part linked table]. Somehow using the divide-and-conquer and only bringing over the remote data I absolutely need makes me think I have more control over what's going on 😉

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

Syed Jahanzaib Bin hassan SSChampion Points: 11410 More actions · Answer 6

Depending (there is that word again) on the volume of data changes a setup involving transactional replication can utilize infinitely more network and system resources than periodically running a well-crafted query using Linked Servers...by the way some transactional replication setups utilize Linked Servers under the covers to push updates to subscribers.

Transaction Replication utilize network at the time of single transaction complete on the Publisher,those statements are not correct then explain it

What is the best for the performance

1-You query to a Database which one exist on the same Server ?

2-You query to a Database which one exist on another Server ?

prove it ? how 2 one is best as you said ?

My Blog
www.aureus-salah.com

deepak.a SSCertifiable Points: 5330 More actions · Answer 7

colin.Leversuch-Roberts (5/13/2011)
yes that's the problem with federated servers. I asked because I wasn't sure if your model was partitioning across the servers.

yes we have partitioned across and also we have added check constraint on each table primary keys(i.e ([ID]>=(1000000000000000.) AND [ID]<=(1999999999999999.)) ,([ID]>=(2000000000000000.) AND [ID]<=(2999999999999999.)))

For queries across servers I tend to execute a rpc call to bring back the remote data and then minipulate the results with the host server - over the years I've found this is the best performing although not always the most elegant.

i have done few sceneria's but for some use case i cant able to implement RPC

Generally with federated servers deployed as partitioned views you need to include the partioned column(s) in the where clause of your query otherwise as you've discovered the results can be poor. Allan Mitchell first showed this many years ago at an early UK SSUG meeting, probably ten years ago now.

Yes , we have created the DPV (Distributed partition views) the problem im facing with this is i'm not always able to use the partitioned column in jojn clause or in where clause some times i may need join / filter based on different columns along with the partitioned column , in such case im facing the remote query parametrization problem and join is performing in local first it is copying huge volume of data from the remote and then it is performing the join in the local site.

can you please share how to solve this problems , what are the steps we need to follow in this kind of situations

Thanks,

Deepak.A

Orlando Colamatteo SSC Guru Points: 182279 More actions · Answer 8

Syed Jahanzaib Bin hassan (5/13/2011)
Depending (there is that word again) on the volume of data changes a setup involving transactional replication can utilize infinitely more network and system resources than periodically running a well-crafted query using Linked Servers...by the way some transactional replication setups utilize Linked Servers under the covers to push updates to subscribers.
Transaction Replication utilize network at the time of single transaction complete on the Publisher,those statements are not correct then explain it

You are correct...and with a table handling a high-volume or INSERT, UPDATE or DELETE transactions the resulting network utilization to keep subscribers up to date can be substantial...more substantial than say a Linked Server query run only once or a small number of times per day.

Syed Jahanzaib Bin hassan (5/13/2011)
What is the best for the performance
1-You query to a Database which one exist on the same Server ?
2-You query to a Database which one exist on another Server ?

Of course a local query will be easier for SQL Server to deliver than a remote query...however that was not the point. The point is that adding replication to an environment can use more network and system resources than periodically running a well-crafted query using Linked Servers. Please re-read my post. I posted in response to your barely intelligible comment:

DO NOT use linked Server for daily routine queries,if you dont have enough resources major is NETWORK

I can hardly decipher what your comment means however it appears as if you are recommending replication in favor of Linked Servers without knowing the volume of data in question and the frequency with which the query using Linked Servers would need to run. Replication may be a good way to go, however you do not know all the variables yet you still made a blanket statement about choosing replication over Linked Servers...therefore your advice is poor.

prove it ? how 2 one is best as you said ?

You say to prove it? I have seen you respond that way to other people that have disputed your posts as well. I doubt I (or anyone else) could prove anything to you in this forum. You don't bother reading the posts on this site carefully and understanding them before spouting off or acting as a Google-proxy and posting abunch of unrelated links with no explanation as to why you posted them. If I tried to prove anything to you I would expect more of what just happened...that you would partially read my post, jump to some inaccurate conclusion, and then present some contrived example that proves something tangential to the topic in an attempt to discredit my comments. Good luck with that approach in life...I hope no one is paying you for assistance with SQL Server.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 9

yup I agree with you opc.three. You can't just declare replication will solve this that and the other. when databases/tables are large making changes or re-initialising the objects can become a nightmare - no matter what bandwidth you have once tables approach a TB this type of stuff is difficult. Federated Servers are often so because of geographical requirements or load ( for instance ).

I don't have an answer for the poor distributed query I'm afraid - I'd suggest you might look into partitioning the tables in an attempt to reduce the active data sets. I don't know if you can pre aggregate or prepare the data to try to improve matters.

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

Orlando Colamatteo SSC Guru Points: 182279 More actions · Answer 10

colin.Leversuch-Roberts (5/16/2011)
yup I agree with you opc.three. You can't just declare replication will solve this that and the other. when databases/tables are large making changes or re-initialising the objects can become a nightmare - no matter what bandwidth you have once tables approach a TB this type of stuff is difficult. Federated Servers are often so because of geographical requirements or load ( for instance ).

Another very good point. Replication introduces numerous challenges into the equation not the least of which is the init piece, especially for large tables.

I don't have an answer for the poor distributed query I'm afraid - I'd suggest you might look into partitioning the tables in an attempt to reduce the active data sets. I don't know if you can pre aggregate or prepare the data to try to improve matters.

Me neither. If you do not have the ability to force the use of the partitioning key in all queries when using 4-part naming then I'm with GrumpyOldDBA, I would also recommend trying to break down the problem so you can reduce the initial remote data set. Partitioning so you don't involve all data on the remote server is a very good idea and should be transparent to your current queries.

Another option is to completely switch from issuing queries with 4-part object names to start using stored procedures for data retrieval. The stored procedures could execute queries that make use of alternate non-partition-key indexes on the remote server(s) via EXEC() AT and/or OPENQUERY pulling that data into local temp/staging tables before working with it to build the resultset the user really wants. This could potentially give you more granular control over the size of the initial dataset pulled from the remote servers.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato