Join performance

Question

Post reply

Join performance

Confusing Queries

Ten Centuries

Points: 1137
More actions
February 1, 2012 at 9:58 am

#252146

The join below takes 18 seconds on development box and on production box its taking more than 5 mins. As I do not have enough access to execution plan on production box. I posted execution plan from development box. Number of rows are same both in development and in production. We can not afford that much time for this join.
GEN_SUBORDERS has 35 million rows
STAGE_RETURNS_CLS_TMP has 2283 rows.
I have also attached Execution plan.
Select
S.SITE_ID,
S.SUBORDER_ID,
S.SUBORDER_RECEIVED_DATE
FROM
dbo.GEN_SUBORDERS S INNER JOIN
dbo.STAGE_RETURNS_CLS_TMP C ON C.DS_ORDER_ID = S.SUBORDER_ID

Viewing 15 posts - 1 through 15 (of 29 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 1

Please ask someone who does have sufficient permission to get the actual execution plan from the production server.

Is that the entire query? No filters, no aggregation?

Have you tried exists, since it appears that you don't actually want any columns from the one table?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Ernest Libertucci Ten Centuries Points: 1394 More actions · Answer 2

From what information you have given, I believe you should look at indexing for this.

At a minimum, you would probably want indexes with the join columns as the left-most column to support the join, e.g. gen_suborders.suborder_id and stage_returns_cls_tmp.ds_order_id.

In SQL 2005 or later you can use include as well and add get_suborders.site_id and gen_suborders.suborder_received_date to an index on site_id for an index seek.

Of course, the topic of indexing is a complex one and there are many considerations when adding new ones, so knowledge of the domain can affect these choices (existing clustered index, available disk space, select vs insert/update/delete etc).

Confusing Queries Ten Centuries Points: 1137 More actions · Answer 3

can you please explain this further...I did not get it.... "In SQL 2005 or later you can use include as well and add get_suborders.site_id and gen_suborders.suborder_received_date to an index on site_id for an index seek."

Geoff A SSChampion Points: 11511 More actions · Answer 4

include is a statement that is used when creating the index.

USE AdventureWorks2008R2;

GO

CREATE INDEX IX_Document_Title

ON Production.Document (Title, Revision)

INCLUDE (FileName);

jcb SSCertifiable Points: 6574 More actions · Answer 5

Someone droped the index at production?

If can only make selects at production box try a select top 10 id column order by id column for each table.

if its too slow then there are no indexes for it since it ill need to perform a full table scan.

Confusing Queries Ten Centuries Points: 1137 More actions · Answer 6

This is the task we are performing....sometimes it takes 40 to 45 minutes.....and sometimes it just takes 1 minutes...

I am also attaching actual execution plan on production...it did not take long though but, I want to over come those high bumps of 40 or 45 mins.

UPDATE

STAGE_RETURNS_CLS_TMP

SET

MATCH_FLAGS = CASE WHEN MATCH_FLAGS&1 = 1 THEN 4 ELSE MATCH_FLAGS + 4 END,

SITE_ID_CLEAN = S.SITE_ID,

DS_ORDER_ID_CLEAN = S.SUBORDER_ID,

ORDER_CREATED_DATE = S.SUBORDER_RECEIVED_DATE,

DML_OPCODE = 2,

DML_UPDATE_DATE = GETDATE()

FROM

STAGE_RETURNS_CLS_TMP C INNER JOIN

GEN_SUBORDERS S ON C.DS_ORDER_ID = S.SUBORDER_ID

WHERE

C.MATCH_FLAGS = 1 AND

C.DS_ORDER_ID_CLEAN IS NULL

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 7

The plans are identical, so it's not missing indexes or anything like that.

Have you checked for blocking when it runs slow? Have you checked what the wait_type is when it runs slow.

p.s. What is that update query? If doesn't match the execution plan (which is a straight select)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 8

One thing I did notice is that the memory grant for the hash join is much smaller on production. Possibly greater workload so less available memory. Have you checked for hash spills (SQLTrace)?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Geoff A SSChampion Points: 11511 More actions · Answer 9

if that query ran multiple times and you are setting your isolation level to READ COMMITTED, it would explain its 40 minute run time behavior.

Jared SSC Guru Points: 61793 More actions · Answer 10

What happens when you put your where conditions into your join?

Jared
CE - Microsoft

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 11

SQLKnowItAll (2/1/2012)
What happens when you put your where conditions into your join?

If you're talking about the update, nothing will happen. It's an inner join, conditions in the where or join are completely equivalent in results and performance.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jared SSC Guru Points: 61793 More actions · Answer 12

GilaMonster (2/1/2012)
SQLKnowItAll (2/1/2012)
What happens when you put your where conditions into your join?
If you're talking about the update, nothing will happen. It's an inner join, conditions in the where or join are completely equivalent in results and performance.

Good to know! I suppose the optimizer realizes to treat it this way? Because in my head it made sense in an order of operations, so my guess is that the optimizer knows what's going on and treats it as such.

Jared
CE - Microsoft

Ernest Libertucci Ten Centuries Points: 1394 More actions · Answer 13

Both filters in the WHERE clause refer to the smaller table, so they won't impact performance so much. I would recommend creating the following index on the development server and testing, since by the query plan it is running an index scan on the large table and this would use an index seek, which is much faster. I also agree with the previous suggestion that blocking could be an issue if multiple updates are happening against the same table, but we don't know this from the information given.

CREATE INDEX IX_YourNamingConvention

ON dbo.GEN_SUBORDERS (SUBORDER_ID)

INCLUDE (SITE_ID, SUBORDER_RECEIVED_DATE);

If you use a separate filegroup for indexes, create it on that filegroup, of course.

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 14

This is something of a long shot based on observation. There are two key changes.

UPDATE c SET

MATCH_FLAGS = CASE WHEN c.MATCH_FLAGS&1 = 1 THEN 4 ELSE c.MATCH_FLAGS + 4 END,

SITE_ID_CLEAN = S.SITE_ID,

DS_ORDER_ID_CLEAN = S.SUBORDER_ID,

ORDER_CREATED_DATE = S.SUBORDER_RECEIVED_DATE,

DML_OPCODE = 2,

DML_UPDATE_DATE = GETDATE()

FROM STAGE_RETURNS_CLS_TMP C

WHERE EXISTS (SELECT 1 FROM GEN_SUBORDERS S WHERE C.DS_ORDER_ID = S.SUBORDER_ID)

AND C.MATCH_FLAGS = 1

AND C.DS_ORDER_ID_CLEAN IS NULL

Edit: changed to where exists

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]