Issue with a slow Query

Question

Issue with a slow Query

T.Ashish

SSCarpal Tunnel

Points: 4248
More actions
June 9, 2014 at 12:06 am

#307978

Hi all,
I have attached a script and its execution plan for your review.
We are working for a long time on optimizing this script, and latest modification I did was splitting script into small blocks and using UNION ALL.
This helped a lot but still it takes some 45 mins on prod environment. I'm not able to find anything more in this script to optimize.
Kindly review and suggest something in this script. Any help will be highly appreciated.
Regards.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 1

T.Ashish (6/9/2014)
Hi all,
I have attached a script and its execution plan for your review.
We are working for a long time on optimizing this script, and latest modification I did was splitting script into small blocks and using UNION ALL.
This helped a lot but still it takes some 45 mins on prod environment. I'm not able to find anything more in this script to optimize.
Kindly review and suggest something in this script. Any help will be highly appreciated.
Regards.

The optimizer is timing out attempting to find a good plan for this complicated query, see "Reason for Early Termination of Statement" in the property tab of the SELECT in the plan. What this means is that the plan isn't guaranteed to be "good enough", and there's usually no point in analysing it further. A typical response to this finding would be to break the statement up into two or more separate queries which can be properly analysed, and running intermediate results into #temp tables.

Your statement consists of four queries UNIONed together using the ALL keyword (UNION ALL) except for the last query, where the ALL keyword is omitted. Are you sure you know the implications of this - even though the plan shows the cost of the deduplication as only 1% of the total?

The four queries are variations on a theme, only the WHERE clause is different. I'd recommend you isolate the core query including any columns later used as join/filter predicates and run the results of this query into a #temp table. Then either query the temp table four times, or merge the four remaining WHERE clauses - you will probably find the first approach easier to implement than the second. If you do implement this approach, you're likely to get a good enough plan for the core query - which will be worthwhile investigating for possible optimisations (note the warnings in the existing plan - "No join predicate").

Here are a few tips to help with splitting up the query:

Ensure that the core query is exactly the same for all four of the unioned queries.

Work with each query separately

Use a CTE to represent the core query.

Something like this:

;WITH CoreQuery AS (

SELECT

wck.work_item_check_id as object_id,

pd.mode_id,

p.personal_id,

[wck_worker_id] = wck.worker_id,

[p_personal_id] = p.personal_id,

[jp_cord_id] = jp.cord_id,

[wo_new_owner_id] = wo.new_owner_id,

[w_worker_id] = w.worker_id

FROM dbo.person p

INNER JOIN dbo.work_item_check wck

on wck.buyer_code = p.company_code

INNER JOIN bsr1_offline.dbo.work_item_offline pd

on pd.object_id = wck.work_item_check_id and pd.mode_id in (870, 1350) and pd.activity_id = 10179

LEFT JOIN dbo.worker w

on w.worker_id = wck.worker_id

LEFT JOIN dbo.work_order wo

on wck.work_order_id = wo.work_order_id

LEFT join dbo.job_performance jp

on wo.job_performance_id = jp.job_performance_id

WHERE p.role_id !='5'

AND (p.access_to_all_site != 'ALL' OR p.access_to_all_bu != 'ALL' OR p.access_to_all_checkpoint != 'ALL')

AND p.access_to_flag = 1

AND NOT EXISTS (

SELECT 1 FROM bsr1_offline.dbo.offline_staging boos

WHERE boos.object_id = wck.work_item_check_id and boos.mode_id = pd.mode_id and boos.personal_id = p.personal_id)

)

SELECT *

FROM CoreQuery

WHERE 1 = 1

AND

(wck_worker_id <> '' AND

(

p_personal_id IN (jp_cord_id, wo_new_owner_id)

OR EXISTS (select 'x'

from dbo.worker_checkpoint wcc

where wcc.worker_id = w_worker_id

and wcc.active_flag = 1

and wcc.supervisor_id = p_personal_id)

)

This part of the exercise helps to confirm that the core query has been correctly identified. It also helps you to identify columns in the WHERE clauses which will need to be in the output from the core query. Do this for each of the four queries until you've identified all of the columns required for each of the WHERE clauses. When you're done, run the results of the query into a #temp table and query it four times, something like this:

SELECT

object_id,

mode_id,

personal_id

FROM #temp

WHERE 1 = 1

AND

(wck_worker_id <> '' AND

(

p_personal_id IN (jp_cord_id, wo_new_owner_id)

OR EXISTS (select 'x'

from dbo.worker_checkpoint wcc

where wcc.worker_id = w_worker_id

and wcc.active_flag = 1

and wcc.supervisor_id = p_personal_id)

)

UNION ALL

-- etc.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

T.Ashish SSCarpal Tunnel Points: 4248 More actions · Answer 2

Thanks Chris.

Yes, I had already tried creating # table as you advised. This was showing lot of improvement in i/o stats as main part was executing only 1 time instead of 4 times. But It could not show any major time gain so I had to revert.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

T.Ashish (6/9/2014)
Thanks Chris.
Yes, I had already tried creating # table as you advised. This was showing lot of improvement in i/o stats as main part was executing only 1 time instead of 4 times. But It could not show any major time gain so I had to revert.

I'm not sure what you're saying here. You have to split up or otherwise simplify the original query, one way or another, because the optimizer is timing out.

The simplest way would be to separate the four unioned queries and run their results into one or more #temp tables. That will give you four fairly simple plans to analyse.

The method most likely to improve performance is to identify the common part and run it only once, as I described earlier. Even if you get no immediate performance gain, you are at least in a position where you can tweak using clues from the plans - which you can't do with any confidence with the query as it is.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

wall str33t SSC Veteran Points: 253 More actions · Answer 4

Take one of the slowest queries and add top 1000 to it and run it without the unions and start picking away the pieces to which is the bottleneck. Change your "in" clauses to a (value1 or value2), always avoid "in" clauses. Next make sure that the cols in the where clauses are indexed properly

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

wall str33t (6/9/2014)
Change your "in" clauses to a (value1 or value2), always avoid "in" clauses.

Um, why?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1004329 More actions · Answer 6

wall str33t (6/9/2014)
...always avoid "in" clauses

Since they're frequently faster than regular Inner Joins, why would I want to do that?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

wall str33t SSC Veteran Points: 253 More actions · Answer 7

In clauses are generally slower because people write them instead of joins, so they don't scale well, but in your example they are prob faster than separate or statements.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

wall str33t (6/9/2014)
Take one of the slowest queries and add top 1000 to it ...

Using TOP to choke the result set can dramatically change the execution plan whether or not the query has an ORDER BY clause, in which case you would end up, in effect, tuning a different query to the one you're interested in.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

wall str33t (6/9/2014)
In clauses are generally slower because people write them instead of joins,

Are you sure about that? Got some reproducible examples which show that 'ins are generally slower than joins'?

Also, are you distinguishing between IN (val1, val2, val3, ...) and IN (SELECT ...), which are processed completely differently?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

ScottPletcher SSC Guru Points: 101097 More actions · Answer 10

You might want to give the query below a try, although it will output a given row only once, even if it would have matched multiple conditions in the original query.

select

wck.work_item_check_id as object_id, pd.mode_id, p.personal_id

from

dbo.person p

inner join dbo.work_item_check wck on wck.buyer_code = p.company_code

inner join bsr1_offline.dbo.work_item_offline pd on pd.object_id = wck.work_item_check_id and pd.mode_id in (870, 1350) and pd.activity_id = 10179

LEFT JOIN dbo.worker w on w.worker_id = wck.worker_id

LEFT JOIN dbo.work_order wo on wck.work_order_id = wo.work_order_id

LEFT join dbo.job_performance jp on wo.job_performance_id = jp.job_performance_id

where

p.role_id !='5' and (p.access_to_all_site != 'ALL' OR p.access_to_all_bu != 'ALL' OR p.access_to_all_checkpoint != 'ALL') and p.access_to_flag = 1

and not exists (select 1 from bsr1_offline.dbo.offline_staging boos where boos.object_id = wck.work_item_check_id and boos.mode_id = pd.mode_id and boos.personal_id = p.personal_id)

and

wck.worker_id != '' and

1 = case

when

(

p.personal_id = jp.cord_id or p.personal_id = wo.new_owner_id

or exists (select 'x'

from dbo.worker_checkpoint wcc

where wcc.worker_id = w.worker_id and wcc.active_flag = 1 and wcc.supervisor_id = p.personal_id)

)

then 1

when

((

(p.access_to_all_site = 'ALL'

OR exists (select 'x' from dbo.person_site ps where ps.site_id = wo.site_id and ps.personal_id = p.personal_id))

and (p.access_to_all_bu = 'ALL'

OR exists ( select 'x' from dbo.person_bu pb where pb.bu_id = wo.bu_id and pb.personal_id = p.personal_id))

and exists ( select 'x' from dbo.worker_checkpoint wcc , dbo.person_checkpoint pcc

where wcc.worker_id = w.worker_id and pcc.personal_id = p.personal_id and wcc.checkpoint_id = pcc.checkpoint_id)))

then 1

when

(p.personal_id = jp.cord_id or p.personal_id = wo.new_owner_id

or exists (select 'x' from dbo.worker_checkpoint wcc

where wcc.worker_id = w.worker_id and wcc.active_flag = 1 and wcc.supervisor_id = p.personal_id))

or

((p.access_to_all_site = 'ALL'

OR exists (select 'x' from dbo.person_site ps where ps.site_id = wo.site_id and ps.personal_id = p.personal_id))

and (p.access_to_all_bu = 'ALL'

OR exists ( select 'x' from dbo.person_bu pb where pb.bu_id = wo.bu_id and pb.personal_id = p.personal_id))

and (p.access_to_all_checkpoint = 'ALL'))

then 1

when

((p.personal_id = jp.cord_id or p.personal_id = wo.new_owner_id or p.personal_id = jp.owner_id or p.personal_id = wo.creator_id)

or

((p.access_to_all_site = 'ALL'

OR exists (select 'x' from dbo.person_site ps where ps.site_id = wo.site_id and ps.personal_id = p.personal_id)

)

and

(p.access_to_all_bu = 'ALL'

OR exists (select 'x' from dbo.person_bu pb where pb.bu_id = wo.bu_id and pb.personal_id = p.personal_id)

)

and

((p.access_to_all_checkpoint = 'ALL'

OR exists ( select 'x' from dbo.work_order_checkpoint wocc , dbo.person_checkpoint pcc

where wocc.work_order_id = wo.work_order_id and pcc.personal_id = p.personal_id and wocc.checkpoint_id = pcc.checkpoint_id)

or exists ( select 'x' from dbo.worker_checkpoint wcc , dbo.person_checkpoint pcc

where wcc.worker_id = w.worker_id and pcc.personal_id = p.personal_id and wcc.checkpoint_id = pcc.checkpoint_id)))))

then 1

else 0 end

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jeff Moden SSC Guru Points: 1004329 More actions · Answer 11

GilaMonster (6/10/2014)
wall str33t (6/9/2014)
In clauses are generally slower because people write them instead of joins,
Are you sure about that? Got some reproducible examples which show that 'ins are generally slower than joins'?
Also, are you distinguishing between IN (val1, val2, val3, ...) and IN (SELECT ...), which are processed completely differently?

Second that... Let's see some proof.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

wall str33t SSC Veteran Points: 253 More actions · Answer 12

Also, are you distinguishing between IN (val1, val2, val3, ...) and IN (SELECT ...), which are processed completely differently?

Yes, this is what I was referring to when I mentioned joins

wall str33t SSC Veteran Points: 253 More actions · Answer 13

Keep in mind that the slow query may extend passed your t-sql here, it could be a result of poorly placed indexes, not enough indexes, fragmentation, the list goes on and on about how the server is also managed

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 14

wall str33t (6/10/2014)
Also, are you distinguishing between IN (val1, val2, val3, ...) and IN (SELECT ...), which are processed completely differently?
Yes, this is what I was referring to when I mentioned joins

If you were referring to the IN (SELECT ...) form, then why did you say to change IN (<comma-delimited list>), which is processed in a completely different way? Also, do you have that proof of IN being slower than a join?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass