October 9, 2013 at 10:48 am
I've tried running two different queries: one based on a much smaller query that didn't scale, and one that I thought would scale better, since it uses a CTE to pre-process the data set being tested. I've run both separately and cancelled each after an hour. Both join columns are varchar and neither is nullable. Running different portions of each query returns quick results. It's comparing ~56k rows to ~588k rows.
The Thing That Shall Not Scale:
select a1.sam_batch, count(*) as Records
from aww_sample a1,
[p42].sample.dbo.wireless_blocks p2
where substring(a1.sam_phonenumber, 4, 9) = p2.match
and a1.sam_batch between 437 and 449
group by a1.sam_batch
From Query To Eternity:
with cc(sam_batch, phone) as (
select sam_batch, SUBSTRING(sam_phonenumber, 4, 9)
from aww_sample
where sam_batch between 437 and 449
)
select cc.sam_batch, count(*)
from cc,
[p42].sample.dbo.wireless_blocks p2
where cc.phone = p2.match
group by cc.sam_batch
So then I re-wrote things a touch and ran it from the server with the larger table. It completed in 2 seconds (mostly) flat:
with cc(sam_batch, phone) as (
select sam_batch, SUBSTRING(sam_phonenumber, 4, 9)
from [phlsql].capabilities.dbo.aww_sample
where sam_batch between 437 and 449
)
select cc.sam_batch, count(*) as Records
from cc,
sample.dbo.wireless_blocks p2
where cc.phone = p2.match
group by cc.sam_batch
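For reference, the same idea can be made explicit with OPENQUERY run from the p42 side: the quoted text is sent to [phlsql] as-is, so only the filtered (sam_batch, phone) pairs come back over the wire. This is just a sketch using the same object names as above, not something I've timed:
SELECT cc.sam_batch, COUNT(*) AS Records
FROM OPENQUERY([phlsql],
    'SELECT sam_batch, SUBSTRING(sam_phonenumber, 4, 9) AS phone
     FROM capabilities.dbo.aww_sample
     WHERE sam_batch BETWEEN 437 AND 449') AS cc
INNER JOIN sample.dbo.wireless_blocks p2
    ON cc.phone = p2.match
GROUP BY cc.sam_batch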
Is there a smart SQL reason for this, or is the network the culprit?
October 10, 2013 at 3:46 am
erikd (10/9/2013)
Is there a smart SQL reason for this, or is the network the culprit?
Yes, it could be either; there are many factors that could explain your concern (actually, surprise) here.
A couple of days back I saw a post where Paul White explained how the way a query is written (or re-designed) can boost performance.
It basically comes down to how the data is handled in the buffer pool, whether table spooling / worktables happen in tempdb, and how much help the indexes give when retrieving the data (for example a clustered index with key lookups followed by a sort).
So there are other factors too. I spent some time searching for that post so I could refer to it here, but couldn't retrieve it. I hope that either Gail, Jeff or Paul will hit this post and do the explanation. 🙂
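A quick way to see which of those factors is in play (just a sketch; run it on the server that issues the slow query, with the actual execution plan turned on):
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- ... run the slow query from the first post here ...

-- In the actual plan, a Remote Query / Remote Scan operator returning all
-- ~588k rows from [p42], or heavy Worktable reads in the STATISTICS IO output,
-- points at the linked-server transfer or spooling rather than the local join.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;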
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
October 10, 2013 at 6:17 am
/* ON THE SERVER phlsql */
CREATE PROCEDURE dbo.FetchData
AS
SELECT sam_batch, SUBSTRING(sam_phonenumber, 4, 9) AS phone
FROM capabilities.dbo.aww_sample
WHERE sam_batch BETWEEN 437 AND 449;
GO
/* ON THE SERVER p42 */
CREATE TABLE #Results (sam_batch [DATATYPE], phone VARCHAR(9));
INSERT INTO #Results
EXEC [phlsql].capabilities.dbo.FetchData;
SELECT r.sam_batch, COUNT(*) as Records
FROM #Results r
INNER JOIN sample.dbo.wireless_blocks p2
ON r.phone = p2.match
GROUP BY r.sam_batch;
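One caveat with the INSERT ... EXEC step: the linked server definition for phlsql on p42 needs RPC Out enabled, or the remote procedure call will fail. A sketch, assuming the option is still at its default of off:
-- Run on p42; allows outgoing remote procedure calls through the phlsql linked server.
EXEC sp_serveroption @server = 'phlsql', @optname = 'rpc out', @optvalue = 'true';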
October 10, 2013 at 7:04 am
Sean Pearce (10/10/2013)
/* ON THE SERVER phlsql */
CREATE PROCEDURE dbo.FetchData
AS
SELECT sam_batch, SUBSTRING(sam_phonenumber, 4, 9) AS phone
FROM capabilities.dbo.aww_sample
WHERE sam_batch BETWEEN 437 AND 449;
GO
/* ON THE SERVER p42 */
CREATE TABLE #Results (sam_batch [DATATYPE], phone VARCHAR(9));
INSERT INTO #Results
EXEC [phlsql].capabilities.dbo.FetchData;
SELECT r.sam_batch, COUNT(*) as Records
FROM #Results r
INNER JOIN sample.dbo.wireless_blocks p2
ON r.phone = p2.match
GROUP BY r.sam_batch;
This is probably the best solution because network traffic to support the join is minimised. There's another way to do it, using a join hint to send the small table to the server hosting the big table:
SELECT
    a1.sam_batch,
    COUNT(*) AS Records
FROM aww_sample a1
INNER REMOTE JOIN [p42].sample.dbo.wireless_blocks p2
    ON SUBSTRING(a1.sam_phonenumber, 4, 9) = p2.match
WHERE a1.sam_batch BETWEEN 437 AND 449
GROUP BY a1.sam_batch
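The REMOTE hint asks for the join to be performed at the site of the right-hand table, which is why the small local aww_sample table sits on the left and the [p42] table on the right; Books Online also notes it should only be used when the left input has fewer rows than the right, which fits here (~56k vs ~588k rows).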
October 10, 2013 at 8:33 am
Thanks, Sean and Chris. It was a one-off query, otherwise I would probably have moved some things around. But it got me thinking about what the heck was taking so long.
That's a neat trick with the remote hint. Another simple thing I've seen on here that I'm mad at myself for not knowing.
I'm going to try it out with another query that gets beaten up a bit by comparisons against another remote table. As a novice looking at execution plans, I was somewhat surprised to find the two remote queries each taking up 49% of the entire plan; mostly because I thought other parts of the query were bad enough to take up more than 1% 😉
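(Worth noting: those operator percentages are derived from the optimizer's estimated costs, even in an "actual" plan, so a remote query showing 49% on paper can behave very differently at run time.)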