Coincidence or Causality

  • Paul White (12/13/2009)


    Steve Jones - Editor (12/13/2009)


    Paul, that's interesting. I've not seen people writing joins as groups like you have. I think those parens are affecting things.

    No parentheses version:

    SELECT D.D

    FROM #B B

    JOIN #A A

    ON A.A = B.B

    JOIN #D D

    JOIN #C C

    ON D.D = C.C

    ON C.C = B.B

    OPTION (FORCE ORDER);

    This produces a the same 'bushy' plan. My point is just to highlight the fact that this sort of plan is never normally considered by the optimizer: if you want to join the results of two joins, you have to perform tricks like this - where the written order does have an impact. The important thing is not the parentheses (though they do make the intention clearer - at least to my way of thinking); it is the OPTION clause that helps us specify an particular plan.

    I ought to mention too that the written order of the tables is important if an OBJECT or SQL plan guide exists which requires the textual form of the submitted query to match the guide. Re-arranging the query in this case would prevent the plan guide from being applied, potentially resulting in a suboptimal query plan.

    Actually, the old SQL Server Enterprise Manager used to write joins like that. Some where quite effective compared to the way most humans write joins.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh... wow... Paul beat me to it and, for once, I'm glad so that I'm not the first to say something that most other people will disagree with. I quite accidently discovered the beauty of manipulating the optimizer by the order in which things are joined a long time ago when I decided that logically grouping joins in derived tables for the sake of readability and understanding of the code also made it easier for SQL Server to do it's job. I also ran into the same thing when I was trying to figure out someone elses code using the query designer in the old Enterprise Manager. It would frequently do several joins followed by several ON's and, yes, Rebecca, it did make a difference in performance just as Paul's fine examples did. It turned hash joins into merge joins and got rid of a lot of interim work tables as a result.

    Ironically, it's a myth that the myth that order of joins AND ON's doesn't matter. It's just that most people don't know how to do it correctly.

    Even Microsoft doesn't know all of the ins and outs of SQL Server and that's no myth... just ask any really good DBA or SQL Server Developer. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Paul White (12/13/2009)


    There is at least one case where the written order of the tables does matter.

    Man... very, very well done. Thanks, Paul.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • In fact, Paul, you should write an article on this. Should be about as good a fight as the Running Total article. You'll drive the other MVP's absolutely nuts. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/14/2009)


    In fact, Paul, you should write an article on this. Should be about as good a fight as the Running Total article. You'll drive the other MVP's absolutely nuts. 😛

    Now *that* genuinely made me laugh out loud.

  • Not getting into the discussion of the specific querys mentioned, but rather the general topic of the editorial. This erronious line of thinking is not exactly new, please check the following quote from Wikipedia

    Post hoc ergo propter hoc, Latin for "after this, therefore because (on account) of this", is a logical fallacy (of the questionable cause variety) which states, "Since that event followed this one, that event must have been caused by this one." It is often shortened to simply post hoc and is also sometimes referred to as false cause, coincidental correlation or correlation not causation.

    Since no one (other than my high school Latin teacher) has made any new latin sayings in a few hundred years, it may be a sign that it will be hard to get people to stop, but keep trying Steve, we all need to improve

  • Steve Jones - Editor (12/13/2009)


    Deadlocks don't occur within a query, they are within a transaction when you are updating two objects.

    They can't in theory occur within a transaction, they occur between two (or more) transactions.

    Tom

  • Correct, Tom. Thanks for that note. A transaction is required.

  • Tom.Thomson (1/10/2010)


    Steve Jones - Editor (12/13/2009)


    Deadlocks don't occur within a query, they are within a transaction when you are updating two objects.

    They can't in theory occur within a transaction, they occur between two (or more) transactions.

    Two comments on that:

    1. You're pretty much always in a transaction - even if it is the implied transaction that wraps a single statement.

    2. It is certainly possible to experience some types of deadlock within a single statement. Examples: an intra-query parallelism deadlock or a deadlock arising from code executing via the CLR.

    Paul

  • Paul White (1/11/2010)


    Tom.Thomson (1/10/2010)


    Steve Jones - Editor (12/13/2009)


    Deadlocks don't occur within a query, they are within a transaction when you are updating two objects.

    They can't in theory occur within a transaction, they occur between two (or more) transactions.

    Two comments on that:

    1. You're pretty much always in a transaction - even if it is the implied transaction that wraps a single statement.

    2. It is certainly possible to experience some types of deadlock within a single statement. Examples: an intra-query parallelism deadlock or a deadlock arising from code executing via the CLR.

    Paul

    I'm inclined to say that deadlock arising from intra-query parallelism is a clear sign of careless engine design (introducing intra-query parallelism in such a way that the transaction gets rolled back just throws away work and risks having a transaction that never completes) since it is easily avoided if the designer thinks about it (by keeping a cache of original values for the duration of the transaction). We avoided intra-transaction deadlock in the late eighties in design of massively parallel databases, so I was pretty appalled to see it happening in SQL Server 2000 over a decade later (the words "in theory" in my comment arose from that defect of SQL Server 2000). Maybe the parallel server version of Oracle (part of Oracle release 4 or 5?) had it fixed it even before then? I haven't seen the problem in SQL Server 2008, but I haven't done much with that (just playing, no serious work) so would not necessarily have stumbled over it, the fault may still be there. Deadlock arising from code executing via CLR could easily be fixed the same way, it seems to me, unless what the code executed via CLR does is something pretty bizarre.

    Tom

    Tom

  • I'd think that an intra-query deadlock is a bug of some sort. However, it's a good point to be aware of.

  • Steve Jones - Editor (1/11/2010)


    I'd think that an intra-query deadlock is a bug of some sort. However, it's a good point to be aware of.

    Indeed, Steve...http://blogs.msdn.com/bartd/archive/2008/09/24/today-s-annoyingly-unwieldy-term-intra-query-parallel-thread-deadlocks.aspx

  • Interesting, I am facing just such information in a deadlock at the moment, helps explain why there are sometimes 4 or more process ids in the process list, which gave me a headache for a while:

    resource-list

    2010-01-08 09:00:25.64 spid15s ridlock fileid=1 pageid=358727 dbid=14 objectname=vmware_virtualcenter_rdc.dbo.VPX_SAMPLE_TIME1 id=lock426777c0 mode=X associatedObjectId=72057594822852608

    2010-01-08 09:00:25.64 spid15s owner-list

    2010-01-08 09:00:25.64 spid15s owner id=processc24f28 mode=X

    2010-01-08 09:00:25.64 spid15s waiter-list

    2010-01-08 09:00:25.64 spid15s waiter id=process8ea5c8 mode=S requestType=wait

    2010-01-08 09:00:25.64 spid15s keylock hobtid=72057596031139840 dbid=14 objectname=vmware_virtualcenter_rdc.dbo.VPX_HIST_STAT1 indexname=PK_VPX_HIST_STAT1 id=lockffffffffaadb7580 mode=X associatedObjectId=72057596031139840

    2010-01-08 09:00:25.64 spid15s owner-list

    2010-01-08 09:00:25.64 spid15s owner id=process8ea5c8 mode=X

    2010-01-08 09:00:25.64 spid15s waiter-list

    2010-01-08 09:00:25.64 spid15s waiter id=process3105f18 mode=S requestType=wait

    2010-01-08 09:00:25.64 spid15s exchangeEvent id=portffffffff801f3400 nodeId=8

    2010-01-08 09:00:25.64 spid15s owner-list

    2010-01-08 09:00:25.64 spid15s owner event=e_waitNone type=producer id=process3105f18

    2010-01-08 09:00:25.64 spid15s waiter-list

    2010-01-08 09:00:25.65 spid15s waiter event=e_waitPipeGetRow type=consumer id=process97e2f8

    2010-01-08 09:00:25.65 spid15s exchangeEvent id=portffffffff801f2700 nodeId=5

    2010-01-08 09:00:25.65 spid15s owner-list

    2010-01-08 09:00:25.65 spid15s owner event=e_waitNone type=producer id=process97e2f8

    2010-01-08 09:00:25.65 spid15s waiter-list

    2010-01-08 09:00:25.65 spid15s waiter event=e_waitPortOpen type=consumer id=processc24f28

    procees id was the 310f18 was the victim and 8ea5c8 the only process not running under the same spid, so the real clash I believe was between these two (classic acces of objects in different order). From the blog looks like this is where the true deadlock actually is, but perhaps the parallelism against VPX_HIST_STAT1 is worth investigating too (by the vendors of this code!) .

    note: posted as an example related to discussion, not looking for a fix! 🙂

    ---------------------------------------------------------------------

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply