November 6, 2009 at 7:19 am
Need help with this store procedure. It sometimes runs ok, others it never finishes. Developer needs to run it manually after waiting for hours. I suspect part of the reason is that is doing lot of inserts on non clustered indexes but I am not sure. Or, some locks on the soon to be updated rows but I have no evidence of that.
The two main tables are called col_details and col_eclipse. The 1st table has six indexes, 5 of them are non clustered and are being severely updated during the process.
I've attached the MS-SQL plan. By the way, is SQL2008 Enterprise
Thanks in advance!
PS:
For this part of the code, when the INSERT starts to retrieve the soon to be updated rows ...
update om
set om.System = 'SIQP Indirect',
om.Valid = 'Y',
om.MCC = e.MCC,
om.Note = null
from col_detail om, col_eclipse e
where om.DealID = e.DealID
--new business models added #GG - v2.4 18/02/2008
and e.BusinessModel in ('Volume Channel','Volume Promotion','Value Indirect','Global Indirect')
and om.[timestamp] >= @dt_temp
I am suggesting this
Non clustered indexes on
col_detail.DealID
col_eclipse.BusinessModel
col_detail.Timestamp
Which implies change TimeStamp column from Clustered to Non clustered index. The other two suggestions need to be created. Like I previously said, not sure about how to improve the other portion of the query plan, the Index Update, Query Plan shows several of them ...
November 6, 2009 at 9:30 am
I cannot look at the queryplan at the moment but:
1. In SQL2008, using MERGE is safer than UPDATE FROM as it avoids non-deterministic problems
by throwing an exception if a one to many join occurs.
MERGE INTO col_detail om
USING col_eclipse e
ON om.DealID = e.DealID
AND e.BusinessModel IN ('Volume Channel','Volume Promotion','Value Indirect','Global Indirect')
AND om.[timestamp] >= @dt_temp
WHEN MATCHED
THEN UPDATE
SET [System] = 'SIQP Indirect',
[Valid] = 'Y',
[MCC] = e.MCC,
[Note] = null;
2. Looking at the JOIN conditions you could try the following indexes:
CREATE NONCLUSTERED INDEX col_detail_timestamp_DealID
ON col_detail ([timestamp], DealID)
GO
CREATE NONCLUSTERED INDEX col_eclipse_DealID_BusinessModel
ON col_eclipse (DealID, BusinessModel)
GO
3. When updating a lot of rows there may well be blocking. If there is, look into doing the update
in batches of say 1000. (MERGE TOP (1000) INTO...)
November 6, 2009 at 10:01 am
Thanks Ken
If you have the chance to check the query plan later, please do.
I believe your Index creation code is the same I'm proposing, right? Three separate non clustered indexes for each of the columns participating on the join condition ... am I right?
I've read about the new MERGE statement, but haven't seen it in action. Your recommendation is interesting. I'm willing to check how query plan change on development environment but still not sure why the intermittent issue .... 2 or 3 days during the month, running for hours ... other days just take min. SP runs every day at night.
November 6, 2009 at 10:05 am
jocampo (11/6/2009)
I believe your Index creation code is the same I'm proposing, right? Three separate non clustered indexes for each of the columns participating on the join condition ... am I right?
No, I am suggesting a composite nonclustered index on each table to cover the join.
ps If your UPDATE FROM is coded correctly, I doubt if MERGE will make much difference to the query plan. MERGE is safer as it catches sloppy, and potentially dangerous, one to many joins.
November 6, 2009 at 10:16 am
Why a composite and not three different Indexes? what could be the difference in terms of performance ... also, you mentioned MERGE could avoid non deterministic issues, can you elaborate more?
My suspicion is that the sp takes too much time 'cause something is creating locks at the execution time. Developer told me they have a front end application that runs some SELECT statements against the same table but I have no technical data to check it yet.
November 6, 2009 at 10:22 am
jocampo (11/6/2009)
Why a composite and not three different Indexes?
Because a 3-column index means that SQL can do a seek on all three columns in one go. 3 individual index means that SQL will pick 1, seek on that and then look up the rest of the columns and filter. Very rare that SQL will use 3 indexes on 1 table for a single query (unless there's an OR)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 6, 2009 at 10:29 am
A merge join (lookup in BOL) can only use one index on each table. The join and filter conditions are:
col_detail.[timestamp] -- first to get om.[timestamp] >= @dt_temp
col_detail.DealID - used to join to col_eclipse
col_eclipse.DealID - used to join to col_detail
col_eclipse.BusinessModel - used to filter col_eclipse
Even with these indexes, the optimizer may decide not to use them based on the STATS.
If the range of [timestamp]'s varies a lot for each call of the SP, you may also have a problem with
parameter sniffing. The quick way to check this is to put 'WITH RECOMPILE' at the start of the SP.
This blog gives a rough outline of the problems with UPDATE FROM:
http://sqlblog.com/blogs/hugo_kornelis/archive/2008/03/10/lets-deprecate-update-from.aspx
November 6, 2009 at 1:24 pm
Thanks everyone for the hints.
did you folks have the chance to check the attached query plan? ... any suggestions? still I wonder why the sp takes 19hrs some days and minutes other... expired index stats maybe, 'cause the indexes?
Thanks,
November 6, 2009 at 2:50 pm
Parameter sniffing?
Stale statistics?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 6, 2009 at 3:59 pm
It would be better if you posted actual execution plan, not the estimated one.
If I may ask a few questions:
Is stg_upfront.TotalValueUSD a varchar field? Why?
Why do you exclude scientific notation - are these data invalid?
Is stg_upfront.StartDate a character field? Why?
Regards
Piotr
...and your only reply is slàinte mhath
November 7, 2009 at 7:09 am
That's the real one 🙂 ... we ran that on dev environment and I captured that info ...
November 7, 2009 at 7:46 am
Why don't I see then actual rows counts? 🙂
Piotr
...and your only reply is slàinte mhath
November 7, 2009 at 10:55 am
jocampo (11/7/2009)
we ran that on dev environment and I captured that info ...
How? If you pulled it later out of the plan cache it won't have the run-time information and that's what we need to see.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 7, 2009 at 1:16 pm
I know what you guys mean
This is the estimated, I did not run it myself. But I had the opportunity to see the sp in action and the information I provided in the attachment did not change too much.
This is what I did ...
I asked the developer to run it manually ... I went to Management Studio and from the running queries, I pulled out the query execution plan ... the developer run it from his own SQL console ....
November 7, 2009 at 9:19 pm
jocampo (11/7/2009)
I asked the developer to run it manually ... I went to Management Studio and from the running queries, I pulled out the query execution plan ... the developer run it from his own SQL console ....
So you queried a DMV to get the plan? That will get you the estimated exec plan (the plan without run-time information). What we're asking for is the actual plan. Run the sp (or get the dev to) with the "Include actual execution plan" option enabled and save the plan generated by that. That is the only way to get the run-time info with the plan.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 15 (of 33 total)
You must be logged in to reply to this topic. Login to reply