More RBAR and "Tuning" UPDATEs

  • Jeff Moden (3/15/2008)


    Compare that to where C/C++ programmers have been for over 15 years: they can write new code targeted to run on multiple different platforms right from the start AND expect it to work

    So, there's absolutely nothing in C/C++ that is ever outside the standards amongst different vendors? I find that very difficult to believe in this very competative world.

    Not at all what I said. Of course they have extensions in C/C++. However, they also implement the full standard plus it is very well know and well documented what is standard vs. non-standard as well as what is re-targetable and what is not. That means that you really can implement high-performing, high-quality, full-featured applications using only standard C/C++ and fully expect them to compile for different environments and under different vendors' compilers.

    Standards are nice... but if you never have anything that's non-standard, then you have nothing new to become standard in the future. I'm going to keep pushing the envelope. πŸ˜‰

    You can do both, you know. πŸ˜€

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Matt Miller (3/15/2008)


    I am not willing to shell out money to get my hands on the actual SQL standard from ANSI, but considering some of the dreck that's been coming out of a lot of the so-called "working groups" (XPATH/XQUERY or CSS, to name a few) - I have to say that a healthy amount of skepticism ought to be applied to those "standards".

    *sigh* Implication by Association. You should not be passing judgement on the SQL group based on the actions of two completely different groups, especially when you haven't even read its work.

    ... I keep hearing the same criticism about the "standard" SQL: heavy on syntax rules, but light on the semantics backing the syntax up.

    Who said that? I don't even know what that would mean, let alone whether it's a bad thing.

    I mean - really. At what point did ANSI become the one and only authority on what SHOULD be in structured query language?

    In 1986 when the owner of SQL (IBM) submitted it to ANSi for standards review. That responsibility is jointly shared with ISO which ANSI is a member of.

    (they're not since to the best of my knowledge - there are still 2 bodies trying to put out standards on SQL).

    Who is that other body? I hope you don't mean ISO since they are cooperating organizations, not competitors. ISO by the way is authorized by the 157 nations who have Standards agencies that are members of it. That is a little bit more authority than Microsoft or Oracle.

    And - who's out "proving" that the so-called experts are in fact proposing what really IS the very best way?

    The problem with this is, who gets to decide what "best" is? You? Me? Oracle? Microsoft? Well if you ignore standards then it's Microsoft and we are reduced to begging them to implement what we want and since they know that very few of us can afford to extricate ourselves from their highly propietary environments, we don't really have a lot of leverage there. What the standards organizations do is to at least give us a chance to have a place at the table, to have some say in what we think that "best" ought to be.

    Are they really that much better at figuring this out then the rest of us? Or is this another occasion where we're getting theory shoved at us that has no chance at a decent implementation based on how hardware and software works these days?

    Funny, I heard these exact same complaints for 5 years after Codd & Dates book came out, and mostly from people who had never read them. Thank goodness someone did read thier work and did implement it despite how hardware and software worked in those days, or it never would have changed.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Hmm, I tried to swear off "ranting" some years ago, but that doesn't seem to be working out so well... Sorry.

    Anyway, for those who may actually be interested in what parts of Transact-SQL are standard and what may not be, here is a good readable summary site (yes, I do admit that the official Standards can be hard to read):

    http://troels.arvin.dk/db/rdbms/

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • DAvid (3/15/2008)


    I am surprised (being introduced to other DBMS before SS) that the location of the predicate relating the summarised aggregate sub query and the table for update had any effect on the query execution.

    Whether expressed in the where clause of the correlated sub query referring back to the table being updated or returned from the derived sub query and specified in a join would have any effect on the query execution. For me they are different semantics for the same logical update.

    Thanks for pointing out that when the join predicate is more appropriate than a correlated sub query where predicate when the column is not covered by an appropriate index.

    Thanks DAvid... maybe without intending to, you bring up a good point... sometimes the table belongs to a 3rd party app and you can "void" the support contract if you do anything to their tables including the addition of a simple index.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hmm, I tried to swear off "ranting" some years ago, but that doesn't seem to be working out so well... Sorry.

    I do understand your passion about it and thanks for the link. I'll take a look.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • rbarryyoung (3/15/2008)


    *sigh* Implication by Association. You should not be passing judgement on the SQL group based on the actions of two completely different groups, especially when you haven't even read its work.

    I love reading standards. Theory, especially when backed up with actual real-life appications and scenarios the theory is supposed to address, is very useful. I just have a problem having to PAY to read the standards. It defeats the purpose of a standard. After all - they're supposed to be internationally recognized standards, run by (supposedly) government-sponsored agencies. Meaning - I've already paid for it once and don't intend to pay large sums for it again.

    ... I keep hearing the same criticism about the "standard" SQL: heavy on syntax rules, but light on the semantics backing the syntax up.

    who said that? I don't even know what that would mean, let alone whether it's a bad thing.

    Meaning - several of the blogs point out that it's often more interested in what the FORM of the statement, and the description of the MEANING of the statement is left not fully defined (which again - is a common point of contention with these "standards" in a lot of technical areas). For example - it's one of the reasons why the windowing functions in SQL didn't get defined to allow for "running" aggregate, since the standard didn't say it should

    (they're not since to the best of my knowledge - there are still 2 bodies trying to put out standards on SQL).

    Who is that other body? I hope you don't mean ISO since they are cooperating organizations, not competitors. ISO by the way is authorized by the 157 nations who have Standards agencies that are members of it. That is a little bit more authority than Microsoft or Oracle.

    Actually - I AM talking about ISO. I'm fully aware of the "cooperating" status. I also know that just because they cooperate doesn't mean that they put out 100% compatible standards.

    Are they really that much better at figuring this out then the rest of us? Or is this another occasion where we're getting theory shoved at us that has no chance at a decent implementation based on how hardware and software works these days?

    Funny, I heard these exact same complaints for 5 years after Codd & Dates book came out, and mostly from people who had never read them. Thank goodness someone did read thier work and did implement it despite how hardware and software worked in those days, or it never would have changed.

    I've read Codd and Date's stuff, thanks. Again - theory is great, but it DOES need to be implementable, or else all it ever becomes is all that much more useless paper (see the Third manifesto if you want to see my drift - every one of dozen or so drafts end with "and this is perfectly useless in any real-life scenario" or some very close approximation thereof).

    Again - I am more than happy to do my research and have my voice at the table like you mentioned. I do have to doubt that they really welcome our voice at the table, since they seem to have NO interest in hearing what is really needed. If you want to hear opinions - you don't charge a few thousand dollars just for the privilege of reading said standard, and hide it away so that only the cognoscenti and academics may comment.

    I've always been a big fan of having standards - I just don't imbue them with god-like abilities to predict the future any better than anyone else involved, nor do I presume that they are in any way complete or comprehensive of the needs in the real-world. They're just a core, and not the whole picture.

    I also think that the theorists do also need to come out of the ivory towers and get their hands dirty from time to time. And by the way - having the vendors at the table along with the theorist, THAT's a good thing IMO (keeps both sides "honest").

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • This is a top class article: clear, deep and useful.

  • Hi Jeff and all,

    I had just composed a lengthy reply, and when I posted it I got an error about an expired session and all my text was lost. Have I ever told you how I HATE web-based discussion forums? :angry:

    Anyway, here is the shorter reconstructed version...

    Jeff Moden (3/14/2008)


    It's a shame that some folks have to get nasty or condescending, but that's kind of a lesson all by itself. πŸ˜‰

    I hope it's not me you're talking about. If it is, please accept my apologies.

    I always try to criticise only IDEAS I disagree with, and never the person presenting them. However if I did get carried away this time, or if something came across differently from how I meant it to (maybe lost in translation?), I am really sorry.

    someone said that the article said subqueries are RBAR and are bad. I didn't say that at all. Go back and look... I said correlated sub-queries are RBAR and are bad.

    That IS me you're talking about πŸ™‚

    My bad. I meant to include the word correlated in my reply. As Barry demonstrates with an example, the query optimizer is in fact able to internally "rewrite" a correlated subquery as a join.

    I also said that code portability is a myth and some folks pretty much hammered on that for quite some time. They also hammered on using SQL Server's proprietary Update that has a FROM clause because it's not ANSI. Then they wrote that it shouldn't ever be used because Microsoft is saying it'll be deprecated. HEH, THEN they turned around and admitted to using it a couple of paragraphs later.

    That's me as well πŸ˜€

    Except I did not say that UPDATE FROM will be deprecated. I made such a suggestion on Connect, and Microsoft has said that they are "looking at the future of the FROM clause". I now see that the links to my blog post and the suggestion on connect are barely visible, so here they are again:

    * Blog post about UPDATE FROM at http://sqlblog.com/blogs/hugo_kornelis/archive/2008/03/10/lets-deprecate-update-from.aspx.

    * Suggestion to deprecate UPDATE FROM at https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=332437.

    And yes, I do use UPDATE FROM in situations where the performance gain is huge and important. Like Barry, I try to code for portability, but sometimes have to make the trade off. I'd like to write portable code all the time, but if that would result in some unmaintainable monster or some slug-like performance, I'll have to cut my losses. More on this on my blog post as well.

    While we're at it, some folks worry about the documented "fact" that the UPDATE FROM will cause "indeterminent results". Let me just say that a correctly laid out UPDATE FROM with the Target Table in the FROM clause has never caused an error for me.

    The indetermined results occur when more than a single row matches the criteria (usually because of incorrect specs or a coding error). If this happens in the subquery version, you'll get an error. If this happens in the joined version, there is no error, no warning - SQL Server will just pick one of the matching rows for you. And in the worst case, that is exactly the one you intended, so your test run will produce the expected results and you won't notice the error lurking in your code. Until of course, one day, it will produce errors in the production database.

    Ok, let's get down to an example that'll really show why I generally don't allow correlated subqueries in my shop... especially on Updates...

    I won't quote the long example here, but let me compliment you on a GREAT example. This is indeed a compelling case for UPDATE FROM for now, and for MERGE once we get SQL Server 2008 in our hands. Your example illustrates the point I'm trying to make in my blog post even more effectively than my own example.

    I just don't think that this is an example of RBAR. I'd rather call it SBAS (subquery by agonizing subquery). Every subquery added adds the same amount to the execution time, because each subquery is incorporated into the plan as if it were different from the others - so all the work is done over and over again for each of the subqueries. A better optimizer would have prevented that. Full support for row constructors would have prevented that. And using the UPDATE FROM or (in SQL Server 2008) prevents it as well, since it almost spells out to the optimizer what the better plan is in this specific case.

    But I do indeed agree that in cases like this, until you can use MERGE, UPDATE FROM is vastly superior to the ANSI-compliant UPDATE with several subqueries.

    And THAT's a big part of what I was trying to get across in the article, folks.

    And on THAT, we are in full agreement! πŸ˜‰

    And now I'm going to copy the entire post in my clipboard, save it to some text file on my disk and only then will I hit the post button πŸ™‚

    [And a good thing too - I just had the same error occur again! :(]


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Matt Miller (3/16/2008)


    I love reading standards. Theory, especially when backed up with actual real-life appications and scenarios the theory is supposed to address, is very useful. I just have a problem having to PAY to read the standards. It defeats the purpose of a standard. After all - they're supposed to be internationally recognized standards, run by (supposedly) government-sponsored agencies. Meaning - I've already paid for it once and don't intend to pay large sums for it again.

    Well that I agree with. I am afraid that too many of the older organizations (of all types, not just standards) became far too dependent on their publishing revenue as a source of income instead of just as a means of defraying their publishing costs. So when the Internet came along, they just couldn't adapt to the much better model that newer standards groups like W3C use. Most of the academic journals are having the same problem.

    The following site is an excellent summary of the state of the SQL standards as well as how the process works. They have an interesting note about being able to get some of the ISO documents for $18 instead of the usual $130+. Supposedly they have the same content, but, I haven't followed up to see what the caveats might be: http://www.jcc.com/sql.htm.

    I've always been a big fan of having standards - I just don't imbue them with god-like abilities to predict the future any better than anyone else involved, nor do I presume that they are in any way complete or comprehensive of the needs in the real-world.

    Neither do I. Nor do I think that predicting the future or what will be best for everyone is a reasonable goal for them, or anybody else for that matter. I do however think that standardizing current practice and codifying established theory and pricipals is a reasonable goal.

    I also think that the theorists do also need to come out of the ivory towers and get their hands dirty from time to time. And by the way - having the vendors at the table along with the theorist, THAT's a good thing IMO (keeps both sides "honest").

    Agreed, although "honesty" isn't what the vendors bring to the table, more like "buy-in" and "implementability" although they posture with both of those things. I don't have anything against the vendors, I worked for one of them for 10 years, and I still work very closely with many of them (in fact some of them are my best customers), I am just realistic about what their real goals are in all of this.

    By the way, there is a third group that you do not mention: customers and users of the techonology(s). IMHO, this is the most important group, unfortunately it is also the least involved because it doesn't see the importance.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • anvi (3/16/2008)


    This is a top class article: clear, deep and useful.

    Thanks for the feedback, Anvi. What's even better than my humble article is what folks have to say in this thread. I don't always agree with what's said, but it's good solid feedback and a great exhange of ideas both pro and con. That's what makes this forum so worth while.

    Again, thanks for taking the time to read the article and I really do appreciate the feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hugo Kornelis (3/16/2008)


    ]

    Heh... How often I've been bit by the infamous timeout on this forum... I too have learned to at least do a {CTRL-A}{CTRL-C} before I hit PREVIEW or Post...

    Hugo, thanks for your post. You're a gentleman and a scholar. Heh... if nothing else, the two of us certainly got folks thinking and some of them also took the time from their busy days to join us in this thread. Each of us shares a similar passion... not so much to bend folks into our way of thinking... it's because we each believe we are correct because of our own experiences and we each strive to try to make it easier for the next poor slob that may have start his or her day by typing the word SELECT... πŸ™‚ Well done.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok, folks... I was operating on some old experiences with correlated sub-queries... Hugo and Barry's posts made me look deeper and, after studying the actual execution plans in both SQL Server 2000 and 2005, I have to admit, they're both correct... it turns out to not be RBAR in either of the two examples I gave. All personal passion asside, each correlated sub-query is, in fact, evaluated as if it were a derived table. To summarize, a single correlated sub-query with proper indexes will beat the inner join in SQL Server 2000 and tie the inner join in SQL Server 2005. Correlated sub-queries lose when more than one is attempted with the same predicate compared to the inner join.

    If you have a 3rd party solution that forbids the modification of the schema by adding an index (and possibly voiding a maintenance agreement), then an inner join will beat a single correlated subquery.

    The great point is that even though I was a miserable failure in that I didn't consider the effect of an index in the original article, the folks that took their good time to do a bit of additional analysis brought that fact, and others not intended, to good light.

    And with, that, my hat is off to all that participated. Thanks, folks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/16/2008)


    Ok, folks... I was operating on some old experiences with correlated sub-queries... Hugo and Barry's posts made me look deeper and, after studying the actual execution plans in both SQL Server 2000 and 2005, I have to admit, they're both correct... it turns out to not be rebar in either of the two examples I gave. All personal passion asside, each correlated sub-query is, in fact, evaluated as if it were a derived table. To summarize, a single correlated sub-query with proper indexes will beat the inner join in SQL Server 2000 and tie the inner join in SQL Server 2005. Correlated sub-queries loose when more than one is attempted with the same predicate compared to the inner join.

    If you have a 3rd party solution that forbids the modification of the schema by adding an index (and possibly voiding a maintenance agreement), then an inner join will be a single correlated subquery.

    The great point is that even though I was a miserable failure in that I didn't consider the effect of an index in the original article, the folks that took their good time to do a bit of additional analysis brought that fact, and others not intended, to good light.

    And with, that, my hat is off to all that participated. Thanks, folks.

    Learning by one's own mistakes - one of my "favorite" ways to learn...A good article just got better (and in that unintended way...)

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Thanks, Matt... that'll teach me... I normally do a lot more research and "due diligence" before I post an article. I got in a hurry this time and didn't walk all the bases to see where they were before I got up to bat... it showed, too. I got tagged at 3 of the bases, spiked at home, and got in an argument with at least 2 people in the crowd. πŸ˜€

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I wouldn’t totally agree with you on this. My first thoughts were its good to have a derive table β€œit definitely helps to identify the records which are in OrderHeader at different stages”.

    Few things that concerns me

    - The chances are that some sort of scan would be performed on the join TB's

    - All the data will be queried at once in each derive table

    If OrderHeader was very large table I would probably thought of storing a transformed data set in #temp table(From OrderHeader) and finally doing the required update to the destination table.

    Your Thoughts!!!

Viewing 15 posts - 46 through 60 (of 76 total)

You must be logged in to reply to this topic. Login to reply