March 23, 2011 at 2:38 pm
It's the DBAs vs. the Developers, again.
No, not really. But I was asked a good question by one of our developers and I could use some help with it.
The issue is how data is handled during the processing of a linked server query. The subject query will join eleven tables, with most of them on the "linked" server. The tables are wide, but only a few columns will actually be utilized in the query. The challenge is to design the process so that the minimum of data will cross the pipe. The developer asked me whether the SQL engine on the remote server would do some of the work and only send filtered results back to the originating server, or whether the entire table would need to be accessed and filtered on the home server.
Good question. And if it turns out that the remote SQL engine won't do any work, can we employ it more effectively by creating views on the remote tables that will pre-extract just the columns we are interested in?
Thanks for your help,
Elliott
March 23, 2011 at 2:44 pm
Start by looking at the execution plan to see exactly what it is doing.
However, it is likely that it is pulling back too much data, so consider doing a call to a remote stored procedure to optimize the remote processing and minimize the traffic.
March 23, 2011 at 2:44 pm
Hi Elliott,
I think this may help:
http://www.sql-server-performance.com/tips/linked_server_p1.aspx
There are a lot of caveats when running a distributed query. You have a very wise developer on your team to ask this question before the performance problem occurs 🙂
March 23, 2011 at 3:03 pm
Thanks for both replies. The query hasn't been written yet, so we haven't been able to use the execution plan, but it appears clear that we should create some processes on the remote server to do some of the processing/filtering there.
And yes, I am very lucky to have a great relationship with Development. Sometimes when I say "no, there's a better way", they actually listen!
Cheers,
Elliott
March 23, 2011 at 3:20 pm
Elliott Berkihiser (3/23/2011)
The developer asked me whether the SQL engine on the remote server would do some of the work and only send filtered results back to the originating server, or whether the entire table would need to be accessed and filtered on the home server.
It can, but in most cases it won't.
Good question. And if it turns out that the remote SQL engine won't do any work, can we employ it more effectively by creating views on the remote tables that will pre-extract just the columns we are interested in?
No, views (unless indexed) are just stored SELECT statements and (AFAIK, for remote views as well) get inlined during compilation.
What I normally recommend for this kind of scenario is OPENQUERY rather than 4-part naming. Take the piece that should be remote and send it to the remote server; don't let the servers decide.
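For example, something along these lines (a rough sketch; the linked server, table and column names are invented):

```sql
-- 4-part naming: the local server's optimiser decides how much work is
-- remoted, and may pull far more data across the pipe than you expect.
SELECT t.col1, t.col2
FROM [REMOTE_SERVER].RemoteDb.dbo.BigTable AS t
WHERE t.col3 = 'X' ;

-- OPENQUERY: the quoted query is sent as-is to the remote server, which
-- does the filtering and projection before anything crosses the pipe.
SELECT q.col1, q.col2
FROM OPENQUERY([REMOTE_SERVER],
    'SELECT col1, col2 FROM dbo.BigTable WHERE col3 = ''X''') AS q ;
```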
p.s. Go Developers! 😀
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 23, 2011 at 3:59 pm
I'm with Gail. Most of the time OPENQUERY puts the processing on the remote machine. You can get some filtering done there and then only bring back the data you need (Go Marketers).
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 23, 2011 at 4:05 pm
Gail, Grant
Thanks.
I understand how OPENQUERY directs the processing to the remote machine, and I'll look at that.
I was curious about the situation with Views. Since the main filtering we want is to select only a few columns, but most of the rows, wouldn't a view on the remote machine have the effect of limiting the amount of data sent back to the originating server?
(Gail, if you've already answered that, I apologize, but I didn't grasp it)
Elliott
March 23, 2011 at 4:11 pm
Elliott Berkihiser (3/23/2011)
Gail, Grant
Thanks.
I understand how OPENQUERY directs the processing to the remote machine, and I'll look at that.
I was curious about the situation with Views. Since the main filtering we want is to select only a few columns, but most of the rows, wouldn't a view on the remote machine have the effect of limiting the amount of data sent back to the originating server?
(Gail, if you've already answered that, I apologize, but I didn't grasp it)
Elliott
The view won't make that any different from simply listing those columns in the SELECT clause. Views don't store data; they just store a query, which retrieves data pretty much like any other query. When you query a view, you're just querying a query; no filtering is done except as you define it.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 23, 2011 at 4:16 pm
OPENQUERY is your friend. It guarantees the query will be executed on the remote server.
With 11 tables, most of which are on the remote server, you have your hands full, that's for sure. The only complaint I have with OPENQUERY is that it will not accept a variable for the query parameter.
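One workaround for that limitation (just a sketch; the linked server and table names are made up) is to build the entire statement, OPENQUERY included, as a string and EXEC that:

```sql
DECLARE @filter NVARCHAR(20) = N'X' ;

-- Build the query that should run remotely, escaping any quotes in the value.
DECLARE @remote_sql NVARCHAR(MAX) =
    N'SELECT col1, col2 FROM dbo.BigTable WHERE col3 = '''
    + REPLACE(@filter, N'''', N'''''') + N'''' ;

-- Embed it in an OPENQUERY call by doubling its quotes, then EXEC the result.
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT * FROM OPENQUERY([REMOTE_SERVER], '''
    + REPLACE(@remote_sql, N'''', N'''''') + N''')' ;

EXEC (@sql) ;
```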
Here are a couple more options for you to consider.
1. Use a remote stored procedure:
If enough of the work can be done on the remote server without access to the local data, while still reducing the dataset to a size you are comfortable transmitting over the pipe, then you could explore creating a stored procedure on the remote server. You could then execute the remote procedure using four-part naming and pipe the results into a local temp table:
IF OBJECT_ID(N'tempdb..#tmp_table') IS NOT NULL
DROP TABLE #tmp_table ;
GO
CREATE TABLE #tmp_table (id INT, name VARCHAR(100)) ;
GO
INSERT INTO #tmp_table
EXEC [.\SQLEXPRESS_2008].test.dbo.test_remote_proc ;
SELECT *
FROM #tmp_table ;
GO
2. Use EXEC...AT:
Another technique I have found useful is to use the EXEC...AT syntax because it lets you use a variable and you can still pipe the results into a local temp table:
IF OBJECT_ID(N'tempdb..#tmp_table') IS NOT NULL
DROP TABLE #tmp_table
GO
CREATE TABLE #tmp_table (id INT, name VARCHAR(100)) ;
GO
DECLARE @query NVARCHAR(MAX) = N'SELECT 1 AS id, ''my name'' as name;' ;
INSERT INTO #tmp_table
(
id,
name
)
EXEC (
@query
) AT [LINKED_SERVER_NAME] ;
SELECT *
FROM #tmp_table ;
GO
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
March 23, 2011 at 4:20 pm
opc.three (3/23/2011)
2. Use EXEC...AT: Another technique I have found useful is to use the EXEC...AT syntax because it lets you use a variable and you can still pipe the results into a local temp table:
... ooooh, that's neat. I hadn't seen that one before.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
March 24, 2011 at 2:01 am
Elliott Berkihiser (3/23/2011)
I was curious about the situation with Views. Since the main filtering we want is to select only a few columns, but most of the rows, wouldn't a view on the remote machine have the effect of limiting the amount of data sent back to the originating server?
By the time execution starts, there's no mention of the view left. Let's look at an example with just one server. Assume for a moment that there are two tables, T1 and T2, each with 10 columns. A view is defined:
CREATE VIEW V1
AS
SELECT t1.c1, t1.c2, t2.c4, t2.c5, t2.c6
FROM t1 INNER JOIN t2 ON t1.c1 = t2.c2
WHERE t2.c10 IS NOT NULL
Then a query is executed against that view:
SELECT * FROM V1 WHERE c6 > 0
During parsing, the name of the view is replaced by its definition:
SELECT * FROM (
SELECT t1.c1, t1.c2, t2.c4, t2.c5, t2.c6
FROM t1 INNER JOIN t2 ON t1.c1 = t2.c2
WHERE t2.c10 IS NOT NULL
) V1 WHERE c6 > 0
That query is what is then optimised and executed.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 24, 2011 at 8:13 am
Craig Farrell (3/23/2011)
opc.three (3/23/2011)
2. Use EXEC...AT: Another technique I have found useful is to use the EXEC...AT syntax because it lets you use a variable and you can still pipe the results into a local temp table:
... ooooh, that's neat. I hadn't seen that one before.
Neat indeed. I've learned something... 🙂
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes