July 7, 2011 at 12:44 pm
Thanks in advance to those with knowledge and willingness to help.
This problem has been baffling me for days.
I have a simple update like
UPDATE t
SET
col1 = t2.col,
col2 = t3.col,
col3 = t4.col,
col4 = t5.col,
col5 = t6.col,
col6 = t7.col
FROM
<table 1> t
INNER JOIN <table 2> t2 ON
...
LEFT JOIN <table 3> t3 ON
...
LEFT JOIN <table 4> t4 ON
...
LEFT JOIN <table 5> t5 ON
...
LEFT JOIN <table 6> t6 ON
...
LEFT JOIN <table 7> t7 ON
...
The table <table 1> has 5M rows.
The tables <table 2> through <table 7> have between 1M and 30M rows. Indexes are all created, and UPDATE STATISTICS has been run on all the tables.
The above update is VERY SLOW (it takes hours).
Just to compare: when I run the following update
UPDATE t
SET
col1 = 1,
col2 = 'xxx',
col3 = 1,
col4 = 'xxx',
col5 = NULL,
col6 = NULL
FROM
<table 1> t
INNER JOIN <table 2> t2 ON
...
LEFT JOIN <table 3> t3 ON
...
LEFT JOIN <table 4> t4 ON
...
LEFT JOIN <table 5> t5 ON
...
LEFT JOIN <table 6> t6 ON
...
LEFT JOIN <table 7> t7 ON
...
thus replacing the column references with scalar values, the update takes 30 seconds.
Notice: the only difference between the two updates is the SET clause; everything else, including the FROM and JOINs, is the same.
I observed that the execution plan for the first update uses many levels of parallelism, while the plan for the second uses only one.
All great ideas would be highly appreciated.
Alexander.
July 7, 2011 at 1:05 pm
It seems like the available indexes can't be used in the first statement since they are not covering indexes (e.g. an index for table2 should include not only the columns used as join predicates but also, preferably as an included column, the column t2.col). The same applies to the rest of the tables.
But without seeing the actual update statement, including the table and index definitions for all tables involved, it's hard to tell.
The parallelism you noticed is most probably caused by index scans. We would need the actual execution plan to analyze any further.
There are several options to improve the update (including the divide'n'conquer approach) but as mentioned we'd need more details.
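For example, something along these lines (all names here are made up, since the actual table and index definitions weren't posted):
-- Hypothetical names only: assuming <table 2> is dbo.Table2, the join
-- predicate is on JoinKey, and t2.col is the column copied into col1
CREATE NONCLUSTERED INDEX IX_Table2_JoinKey
ON dbo.Table2 (JoinKey)
INCLUDE (col); -- "covering": the update can read col straight from the index
               -- instead of doing a key lookup or scanning the base table
The same pattern would apply to tables 3 through 7.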
July 7, 2011 at 1:18 pm
Check your join criteria as well... any chance something could be invalid, like JOIN Table2 T2 ON t1.ID = t1.ID (when it should be t1.ID = t2.ID)?
Lowell
July 7, 2011 at 1:29 pm
Hi,
I think you should check your join conditions.
Are the columns you are joining on all indexed, or are you joining on non-indexed columns?
One more thing: are you going from the parent table to the child table while joining?
Thanks,
Roshan
July 7, 2011 at 2:15 pm
How fast does it return results when you replace the update with a SELECT?
Are there NULLs in the columns on which you're joining the tables?
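Something like this (same placeholders as in your original post) would show whether the reads alone are already slow, or only the write is:
-- Sketch only: same FROM/JOIN block as the slow update, but as a SELECT
SELECT t2.col, t3.col, t4.col, t5.col, t6.col, t7.col
FROM <table 1> t
INNER JOIN <table 2> t2 ON
...
-- remaining LEFT JOINs to <table 3> thru <table 7> as in the original
Unlike just counting rows, this forces the same column reads the slow update has to do.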
When I run COUNT(*) using the FROM condition it returns in 30 seconds.
July 7, 2011 at 2:41 pm
Obviously, COUNT(*) will return the result faster, as it just counts the number of rows.
When using COUNT(*) you are just joining the tables and counting rows.
While in the update you are reading values from each table and, depending on your conditions, updating those values.
Of course this will take more time than just counting rows.
July 7, 2011 at 3:04 pm
roshan.zanwar (7/7/2011)
Obviously, COUNT(*) will return the result faster, as it just counts the number of rows. When using COUNT(*) you are just joining the tables and counting rows.
While in the update you are reading values from each table and, depending on your conditions, updating those values.
Of course this will take more time than just counting rows.
Please note: the two updates I am referring to in my original post have the same FROM and JOIN conditions. The only difference is in the SET clause: the first one uses columns from the tables in the FROM clause, the other one (for time comparison) uses scalar values.
The first one runs in hours, the second in minutes.
Any great ideas?
Alexander.
July 7, 2011 at 3:22 pm
With LEFT JOINs, if they play no part in the query (i.e. none of their columns are used), the query processor will remove them from the query plan, which would explain the runtime difference between the first and second updates.
I would try doing the updates a column/table at a time to see whether one of the joins between the tables is causing the problem, or whether it's the overall complexity of the number of joins in the update.
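For example (same placeholders as the original post), timing each statement separately:
-- Sketch of the one-table-at-a-time test
UPDATE t
SET col1 = t2.col
FROM <table 1> t
INNER JOIN <table 2> t2 ON
...

UPDATE t
SET col2 = t3.col
FROM <table 1> t
INNER JOIN <table 2> t2 ON
...
LEFT JOIN <table 3> t3 ON
...

-- and so on, adding one table/column at a time up to <table 7>
Whichever step blows up the runtime points at the join (or missing index) causing the problem.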
Martin Cairns
July 7, 2011 at 8:23 pm
Hi,
It seems to be a missing index.
Solution: run your query through the Database Engine Tuning Advisor to get clearer recommendations. If you feel they are really worthwhile, test them on a dev server first, then implement in production.
Cheers...
July 8, 2011 at 11:02 am
To repeat what I stated earlier:
It seems like the available indexes can't be used in the first statement since they are not covering indexes (e.g. an index for table2 should include not only the columns used as join predicates but also, preferably as an included column, the column t2.col). The same applies to the rest of the tables.
But without seeing the actual update statement, including the table and index definitions for all tables involved, it's hard to tell.
The parallelism you noticed is most probably caused by index scans. We would need the actual execution plan to analyze any further.
There are several options to improve the update (including the divide'n'conquer approach) but as mentioned we'd need more details.
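Just to illustrate what I mean by divide'n'conquer, a rough sketch (assuming <table 1> has an integer key column called ID, which is a made-up name here):
DECLARE @BatchStart INT = 1;
DECLARE @BatchSize INT = 100000;
DECLARE @MaxID INT;
SELECT @MaxID = MAX(ID) FROM <table 1>;

WHILE @BatchStart <= @MaxID
BEGIN
    UPDATE t
    SET col1 = t2.col -- remaining SET clauses as in the original
    FROM <table 1> t
    INNER JOIN <table 2> t2 ON
    ...
    -- remaining LEFT JOINs as in the original
    WHERE t.ID BETWEEN @BatchStart AND @BatchStart + @BatchSize - 1;

    SET @BatchStart = @BatchStart + @BatchSize;
END;
Smaller batches keep the transaction log and locking under control and make it much easier to see where the time actually goes.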
Did you try to apply the index concept I mentioned? Or can you at least provide the actual execution plan as an .sqlplan file?
@G.Lakshmanan
I strongly vote against using the DTA. Most of the "suggestions" I got before I gave up on it were, to put it kindly, "misleading".
July 8, 2011 at 5:34 pm
LutzM (7/8/2011)
To repeat what I stated earlier: It seems like the available indexes can't be used in the first statement since they are not covering indexes (e.g. an index for table2 should include not only the columns used as join predicates but also, preferably as an included column, the column t2.col). The same applies to the rest of the tables.
But without seeing the actual update statement, including the table and index definitions for all tables involved, it's hard to tell.
The parallelism you noticed is most probably caused by index scans. We would need the actual execution plan to analyze any further.
There are several options to improve the update (including the divide'n'conquer approach) but as mentioned we'd need more details.
Did you try to apply the index concept I mentioned? Or can you at least provide the actual execution plan as an .sqlplan file?
@G.Lakshmanan
I strongly vote against using the DTA. Most of the "suggestions" I got before I gave up on it were, to put it kindly, "misleading".
Please explain: what are covering indexes? How does setting values to indexed columns (or not) affect the execution time?
Thanks, Alexander.