March 23, 2011 at 2:38 pm
It's the DBAs vs. the Developers, again.
No, not really. But I was asked a good question by one of our developers and I could use some help with it.
The issue is how data is handled during the processing of a linked server query. The subject query will join eleven tables, with most of them on the "linked" server. The tables are wide, but only a few columns will actually be utilized in the query. The challenge is to design the process so that the minimum of data will cross the pipe. The developer asked me whether the SQL engine on the remote server would do some of the work and only send filtered results back to the originating server, or whether the entire table would need to be accessed and filtered on the home server.
Good question. And if it turns out that the remote SQL engine won't do any work, can we employ it more effectively by creating views on the remote tables that will pre-extract just the columns we are interested in?
Thanks for your help,
Elliott
March 23, 2011 at 2:44 pm
Start by looking at the execution plan to see exactly what it is doing.
However, it is likely that it is pulling back too much data, so consider doing a call to a remote stored procedure to optimize the remote processing and minimize the traffic.
March 23, 2011 at 2:44 pm
Hi Elliott,
I think this may help:
http://www.sql-server-performance.com/tips/linked_server_p1.aspx
There are a lot of caveats when running a distributed query. You have a very wise developer on your team to ask this question before the performance problem occurs 🙂
March 23, 2011 at 3:03 pm
Thanks for both replies. The query hasn't been written yet, so we haven't been able to use the execution plan, but it appears clear that we should create some processes on the remote server to do some of the processing/filtering there.
And yes, I am very lucky to have a great relationship with Development. Sometimes when I say "no, there's a better way", they actually listen!
Cheers,
Elliott
March 23, 2011 at 3:20 pm
Elliott Berkihiser (3/23/2011)
The developer asked me whether the SQL engine on the remote server would do some of the work and only send filtered results back to the originating server, or whether the entire table would need to be accessed and filtered on the home server.
It can, but in most cases it won't.
Good question. And if it turns out that the remote SQL engine won't do any work, can we employ it more effectively by creating views on the remote tables that will pre-extract just the columns we are interested in?
No, views (unless indexed) are just stored SELECT statements and (AFAIK, for remote views as well) get inlined during compilation.
What I normally recommend for this kind of scenario is OPENQUERY rather than 4-part naming. Take the piece that should be remote and send it to the remote server; don't let the servers decide.
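For example, something along these lines (a rough sketch; the linked server, table and column names are invented):

```sql
-- 4-part naming: the local server's optimiser decides how much work is
-- remoted, and may pull far more data across the pipe than you expect.
SELECT t.col1, t.col2
FROM [REMOTE_SERVER].RemoteDb.dbo.BigTable AS t
WHERE t.col3 = 'X' ;

-- OPENQUERY: the quoted query is sent as-is to the remote server, which
-- does the filtering and projection before anything crosses the pipe.
SELECT q.col1, q.col2
FROM OPENQUERY([REMOTE_SERVER],
    'SELECT col1, col2 FROM dbo.BigTable WHERE col3 = ''X''') AS q ;
```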
p.s. Go Developers! 😀
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 23, 2011 at 3:59 pm
I'm with Gail. Most of the time OPENQUERY puts the processing on the remote machine. You can get some filtering done there and then only bring back the data you need (Go Marketers).
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 23, 2011 at 4:05 pm
Gail, Grant
Thanks.
I understand how OPENQUERY directs the processing to the remote machine, and I'll look at that.
I was curious about the situation with Views. Since the main filtering we want is to select only a few columns, but most of the rows, wouldn't a view on the remote machine have the effect of limiting the amount of data sent back to the originating server?
(Gail, if you've already answered that, I apologize, but I didn't grasp it)
Elliott
March 23, 2011 at 4:11 pm
Elliott Berkihiser (3/23/2011)
Gail, Grant
Thanks.
I understand how OPENQUERY directs the processing to the remote machine, and I'll look at that.
I was curious about the situation with Views. Since the main filtering we want is to select only a few columns, but most of the rows, wouldn't a view on the remote machine have the effect of limiting the amount of data sent back to the originating server?
(Gail, if you've already answered that, I apologize, but I didn't grasp it)
Elliott
The view won't make that any different from simply listing those columns in the SELECT clause. Views don't store data; they just store a query, which retrieves data pretty much like any other query. When you query a view, you're just querying a query; no filtering is done except as you define it.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 23, 2011 at 4:16 pm
OPENQUERY is your friend. It guarantees the query will be executed on the remote server.
With 11 tables, most of which are on the remote server, you have your hands full, that's for sure. The only complaint I have with OPENQUERY is that it will not accept a variable for the query parameter.
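One workaround for that limitation (just a sketch; the linked server and table names are made up) is to build the entire statement, OPENQUERY included, as a string and EXEC that:

```sql
DECLARE @filter NVARCHAR(20) = N'X' ;

-- Build the query that should run remotely, escaping any quotes in the value.
DECLARE @remote_sql NVARCHAR(MAX) =
    N'SELECT col1, col2 FROM dbo.BigTable WHERE col3 = '''
    + REPLACE(@filter, N'''', N'''''') + N'''' ;

-- Embed it in an OPENQUERY call by doubling its quotes, then EXEC the result.
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT * FROM OPENQUERY([REMOTE_SERVER], '''
    + REPLACE(@remote_sql, N'''', N'''''') + N''')' ;

EXEC (@sql) ;
```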
Here are a couple more options for you to consider.
1. Use a remote stored procedure:
If enough of the work can be done on the remote server without access to the local data, while still reducing the dataset to a size you are comfortable transmitting over the pipe, then you could explore creating a stored procedure on the remote server. You could then execute the remote procedure using four-part naming and pipe the results into a local temp table:
IF OBJECT_ID(N'tempdb..#tmp_table') IS NOT NULL
DROP TABLE #tmp_table ;
GO
CREATE TABLE #tmp_table (id INT, name VARCHAR(100)) ;
GO
INSERT INTO #tmp_table
EXEC [.\SQLEXPRESS_2008].test.dbo.test_remote_proc ;
SELECT *
FROM #tmp_table ;
GO
2. Use EXEC...AT:
Another technique I have found useful is to use the EXEC...AT syntax because it lets you use a variable and you can still pipe the results into a local temp table:
IF OBJECT_ID(N'tempdb..#tmp_table') IS NOT NULL
DROP TABLE #tmp_table
GO
CREATE TABLE #tmp_table (id INT, name VARCHAR(100)) ;
GO
DECLARE @query NVARCHAR(MAX) = N'SELECT 1 AS id, ''my name'' as name;' ;
INSERT INTO #tmp_table
(
id,
name
)
EXEC (
@query
) AT [LINKED_SERVER_NAME] ;
SELECT *
FROM #tmp_table ;
GO
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
March 23, 2011 at 4:20 pm
opc.three (3/23/2011)
2. Use EXEC...AT: Another technique I have found useful is to use the EXEC...AT syntax because it lets you use a variable and you can still pipe the results into a local temp table:
... ooooh, that's neat. I hadn't seen that one before.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
March 24, 2011 at 2:01 am
Elliott Berkihiser (3/23/2011)
I was curious about the situation with Views. Since the main filtering we want is to select only a few columns, but most of the rows, wouldn't a view on the remote machine have the effect of limiting the amount of data sent back to the originating server?
By the time execution starts, there's no mention of the view left. Let's look at an example with just one server. Assume for a moment that there are two tables, T1 and T2, each with 10 columns. A view is defined:
CREATE VIEW V1
AS
SELECT t1.c1, t1.c2, t2.c4, t2.c5, t2.c6
FROM t1 INNER JOIN t2 ON t1.c1 = t2.c2
WHERE t2.c10 IS NOT NULL
Then a query is executed against that view:
SELECT * FROM V1 WHERE c6 > 0
During parsing, the name of the view is replaced by its definition:
SELECT * FROM (
SELECT t1.c1, t1.c2, t2.c4, t2.c5, t2.c6
FROM t1 INNER JOIN t2 ON t1.c1 = t2.c2
WHERE t2.c10 IS NOT NULL
) V1 WHERE c6 > 0
That query is what is then optimised and executed.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 24, 2011 at 8:13 am
Craig Farrell (3/23/2011)
opc.three (3/23/2011)
2. Use EXEC...AT: Another technique I have found useful is to use the EXEC...AT syntax because it lets you use a variable and you can still pipe the results into a local temp table:
... ooooh, that's neat. I hadn't seen that one before.
Neat indeed. I've learned something... 🙂
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes