8hr hour query ?!?!?!?!?!?!?!

  • Hey everyone. I am having a rather bad day. Must be blind or something. The following statement is taking roughly 8 hours to run to completion.

    DELETE FROMContactGroupMember

    FROM ContactGroupMember AS CGM

    INNER JOIN #FinalResults AS FR ON CGM.ContactGroupId = FR.ContactGroupID

    WHERE cgm.ContactId <> fr.OwnerID

    ContactGroupMember as 2 index. Clustered on ContactGroupID, ContactID. Nonclustered on ContactID. The table also has roughly 17m records in the table.

    CREATE TABLE ContactGroupMember

    (ContactGroupID INT NOT NULL REFERENCES ContactGroup(ID)

    , ContactID INT NOT NULL REFERENCES Contact(ID)

    , PRIMARY KEY CLUSTERED (ContactGroupID, ContactID)

    )

    #FinalResults has 0 indexes. Roughly 600-700k records max on the table, and usually closer to 100k.

    CREATE TABLE #FinalResults

    (ContactGroupID INT

    , OwnerID INT

    )

    Anyone got any thoughts on why it would be taking centuries to complete this statment? I have a passing thought that it has to do with the cgm.ContactId <> fr.OwnerID. However, I am not certain. If I am correct, what would be the best way to rewrite this statement to not take centuries to complete? This isn't a one time complete, but rather a process that should be getting run a few hundred times and hour at minimum.

    Ultimate goal of the code is I need to delete from ContactGroupMember where the ContactGroupMember.ContactGroupID and ContactGroupMember.ContactID are not in the #FinalResult List

    So

    ContactGroupMember

    GroupID, ContactID

    1,1

    1,2

    1,3

    #FinalResults

    GroupID, OwnerID

    1,2

    1,3

    1,4

    After delete, ContactGroupMember should contain 1,2 and 1,3 records. (I have another insert for the 1,4 record).

    2005 database.

    Thanks,

    Fraggle

  • Personally, I'd start investigating your wait_state on the spid running the process and looking into your locks. You sound blocked, not slow.

    EDIT: If you're sure it's just slow, we need to see the execution plans. Actual by preference, estimated if nothing else.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Evil Kraig F (12/14/2011)


    Personally, I'd start investigating your wait_state on the spid running the process and looking into your locks. You sound blocked, not slow.

    EDIT: If you're sure it's just slow, we need to see the execution plans. Actual by preference, estimated if nothing else.

    WaitType is CXPacket. No blocking. I am working on an actual, but the estimated is garbage b/c the temp table doesn't have any data.

    Fraggle.

  • Try running it with a maxdop hint OPTION (MAXDOP 1) and see how that changes things.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • btw, if I've understood you properly

    DELETE FROM ContactGroupMember

    WHERE NOT EXISTS (SELECT 1 FROM #FinalResults AS FR WHERE ContactGroupMember.ContactGroupId = FR.ContactGroupID AND ContactGroupMember.ContactId = fr.OwnerID)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Fraggle-805517 (12/14/2011)


    Hey everyone. I am having a rather bad day. Must be blind or something. The following statement is taking roughly 8 hours to run to completion.

    DELETE FROMContactGroupMember

    FROM ContactGroupMember AS CGM

    INNER JOIN #FinalResults AS FR ON CGM.ContactGroupId = FR.ContactGroupID

    WHERE cgm.ContactId <> fr.OwnerID

    Seems like an accidental CROSS JOIN (Double Triangular join, in this case) with a touch of the "Halloween Effect" (essentially, a cursor within a cursor behind the scenes). See Gail's article at the second link in my signature line below for how to post the execution plan so we can read it.

    As Gail said, setting MAXDOP to 1 will usually fix these types of things.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/14/2011)


    with a touch of the "Halloween Effect" (essentially, a cursor within a cursor behind the scenes).

    ????

    Something you've named 'halloween effect'? Because what's normally referred to as Halloween problem is purely for updates and related to the possibility of updating rows multiple times, and SQL has rules built in so that it will never fall prey to the Halloween problem

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (12/15/2011)


    Jeff Moden (12/14/2011)


    with a touch of the "Halloween Effect" (essentially, a cursor within a cursor behind the scenes).

    ????

    Something you've named 'halloween effect'? Because what's normally referred to as Halloween problem is purely for updates and related to the possibility of updating rows multiple times, and SQL has rules built in so that it will never fall prey to the Halloween problem

    No, of course I didn't name it that. And, yes, it does happen for Deletes and, no, SQL doesn't have all the rules you think it might to prevent Halloweening on updates. I've fixed it many a time in SQL Server. Granted that it's normally because of people writing the UPDATE without the target of the update in the FROM clause, but SQL Server gives no warning and certainly doesn't prevent it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GilaMonster (12/14/2011)


    btw, if I've understood you properly

    DELETE FROM ContactGroupMember

    WHERE NOT EXISTS (SELECT 1 FROM #FinalResults AS FR WHERE ContactGroupMember.ContactGroupId = FR.ContactGroupID AND ContactGroupMember.ContactId = fr.OwnerID)

    This will generate different results to the original DELETE statement, but it does match the description "I need to delete from ContactGroupMember where the ContactGroupMember.ContactGroupID and ContactGroupMember.ContactID are not in the #FinalResult List".

    With larger tables like these and to reduce contention I'd do this in two steps:

    SELECT ContactGroupID, ContactID

    INTO #Stage

    FROM ContactGroupMember

    WHERE NOT EXISTS (SELECT 1 FROM #FinalResults AS FR WHERE ContactGroupMember.ContactGroupId = FR.ContactGroupID AND ContactGroupMember.ContactId = fr.OwnerID)

    CREATE UNIQUE CLUSTERED INDEX [CX_cc] ON #Stage (ContactGroupID, ContactID)

    DELETE c

    FROM ContactGroupMember c

    INNER JOIN #Stage s ON s.ContactGroupID = c.ContactGroupID AND s.ContactID = c.ContactID


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Jeff Moden (12/15/2011)


    And, yes, it does happen for Deletes and, no, SQL doesn't have all the rules you think it might to prevent Halloweening on updates. I've fixed it many a time in SQL Server. Granted that it's normally because of people writing the UPDATE without the target of the update in the FROM clause, but SQL Server gives no warning and certainly doesn't prevent it.

    Then we're talking about 2 different things, because the Halloween problem (named by IBM when they encountered it on Halloween) is purely for updates and SQL does completely prevent it by either picking a plan that doesn't use an index key'd on the column being updated or by putting an eager spool into the plan. If it ever didn't protect against it, it would be a huge incorrect results bug.

    Halloween protection isn't about performance, it's about incorrect results after an update and it's by no means a SQL Server problem, all relational databases have to have something in place to prevent that

    A brief summary: http://www.benjaminnevarez.com/tag/halloween-protection/. A patent filed for a halloween prevention method http://www.patentstorm.us/patents/6122644.html

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • And the moment has arrived. After a total execution time of 15:54:22 I finally have an execution plan for everyone to view.

    Jeff/Gila, can either of you explain the Holloween Effect (and why they named it that)?

    I am going to try some of the suggestions now.

    Thanks,

    Fraggle.

  • Fraggle-805517 (12/15/2011)


    And the moment has arrived. After a total execution time of 15:54:22 I finally have an execution plan for everyone to view.

    Jeff/Gila, can either of you explain the Holloween Effect (and why they named it that)?

    I am going to try some of the suggestions now.

    Thanks,

    Fraggle.

    Halloween = old bug. Not relevant anymore.

    Imagine you update the salary by 10%. Clustered index on Salary (not 100% sure on that part)

    PK1 = 100000

    PK2 = 105000

    So the engine would update 100K to 110K, move the page to the 2nd slot after the sort. Update 105K to ±116K.

    Then hit the new version of PK1 with 110K and reupdate it to 121K. Making a double update and bad data.

  • Fraggle-805517 (12/15/2011)


    Jeff/Gila, can either of you explain the Holloween Effect (and why they named it that)?

    In short, it's an effect in relational databases where an update can affect a row more than once. Consider the following:

    UPDATE Employees

    SET Salary = Salary*1.1

    WHERE Salary < 20000

    And consider that SQL uses an index on the Salary field to satisfy that. If it works row by row (which, inside the engine is how everything currently works anyway), it could fetch a row, update it. As a result of the update the row moves within the index and, later on, the engine retrieves the same row and updates it again.

    Net result, once that update finishes, all rows will have a salary > 20000, even if they started at 1000.

    It was named the Halloween effect because it was discovered on Halloween back sometime in the late 70's. SQL protects against that by separating the read and update portions of an update's operation, ensuring that it reads all the rows that must be affected before it starts updating if it's using an index that has a key that is being updated.

    http://www.benjaminnevarez.com/tag/halloween-protection/

    But I suspect Jeff's talking about something else, because Halloween is only for updates and it's a solved problem.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • An update on this. The MAXDOP statement worked the best. It took the entire process down from 8-15hours to under 5 minutes.

    Thanks,

    Nathan

  • BTW, thanks to both of you for explaining the Holloween Effect.

    Fraggle

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply