UPDATEs vs. Left Joins - which is faster/more efficient?

  • Hello, all - I've heard/seen it both ways - updates are faster/more efficient IF you restrict the scope so that it applies to only the relevant rows VS. having left joined tables. I'm curious as to the consensus of the group on this subject, as I am building a data mart for our researcher. Right now, it's all left joins, but I'm thinking of writing an UPDATE version and then racing them - I'd still be interested in your input.

    Thank you and happy holidays!! 🙂

    Donna B.

  • Since an update statement by definition can only update a single table I am not clear on what you are asking.

    I suppose the update could be an "update from" and you have left joins in the from part, but what are you trying to compare.

    Can you post more details, preferably with code to reproduce?

    Mike

  • dbursey (12/29/2010)


    Hello, all - I've heard/seen it both ways - updates are faster/more efficient IF you restrict the scope so that it applies to only the relevant rows VS. having left joined tables. I'm curious as to the consensus of the group on this subject, as I am building a data mart for our researcher. Right now, it's all left joins, but I'm thinking of writing an UPDATE version and then racing them - I'd still be interested in your input.

    I'm probably having a very bad day - I can understand each single portion of the post but the meaning of the post as a whole is giving me a hard time.

    1- You do want to restrict UPDATE to the affected rows only.

    2- The way you restrict UPDATE to the affected rows only may vary depending of a number of elements including but not limited to the imagination of the transact-sql developer; which one is better? easy money! provided all of them "do the job" the one that does less I/O is the better one... trace them and look at buffer_gets, the one doing the less buffer_gets is the one you want to implement.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • I'm not really sure what you are asking. Have a look at the first article in my signature for how to help us help you for an idea of the kind of information we might need.

    Are you looking for the best way to do an update, we'll need sample data, ddl etc... If you're looking for how a Left join might compare to not exists or not in, then have a look at Gail Shaw's blog posts where she compares Joins In and Exists... http://sqlinthewild.co.za/index.php/2009/08/17/exists-vs-in/ there were 5-6 of them if I remember correctly and they all link back and forth to each other. Perhaps your answer may lie in her writings...

    Also, keep in mind that much of it depends on your environment/hardware etc. The best way would be for you to write your solution both way sand then compare them in your test environment. That way you'll know for sure.

    -Luke.

    To help us help you read this[/url]For better help with performance problems please read this[/url]

  • This is not a question asking "how to" do either one of these things - I know how to do them. I'm just curious with regard to preference/performance - given these two options for getting large volumes of data into a table, which would you use? Or would you use something totally different?

    Does this help?

    Thank you --

    Donna B

  • Thanks, Luke - that's what I was thinking of doing, and what I meant when I said that I was going to write the code both ways and then "race" them to see which was more efficient and faster - I was simply curious to see if there was any "off-the-top-of-the-head" rule of thumb for one way versus the other.

    Thank you --

    DonnaB

  • dbursey (12/29/2010)


    This is not a question asking "how to" do either one of these things - I know how to do them. I'm just curious with regard to preference/performance - given these two options for getting large volumes of data into a table, which would you use? Or would you use something totally different?

    Well... I think you missed part of my answer, the one that reads like "trace them, look at buffer_gets and implement the one that does less I/O" 🙂

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • dbursey (12/29/2010)


    This is not a question asking "how to" do either one of these things - I know how to do them. I'm just curious with regard to preference/performance - given these two options for getting large volumes of data into a table, which would you use? Or would you use something totally different?

    Does this help?

    Thank you --

    Donna B

    A lot of testing by various folks on this site showed that usually the Left Join was slower on a large number of records... as in millions. I don't remember if this was doing an update, or just the select.

    As everyone else has stated - test it for your situation - that's something that we can't duplicate.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Thank you, I appreciate the input - it confirms what I've heard (in a scattered sort of way). I'm in the process of coding my UPDATE version to test right now, actually.

  • Hi, Pablo - You're right - my eye passed over that part of your answer. Sorry about that! One of those things that I know of, but have not really used before, so this is good practice.

    Cesko, I'm not trying to combine these two, I'm asking about comparing them, actually. Using them together seems like a particularly bad thing!

    In the bit that I've done with the update vs. the left join, the response time is so much better, even a human can see it. This question has been answered for me, and I appreciate the input. Now I have to go and investigate MERGE, as I have not heard of that. A bonus!

    Thank you all so much - I appreciate the responses!

    DonnaB

  • Not sure you can ever do an apples-to-apples comparison. Sometimes one will be better, sometimes the other. Sometimes cardinality estimates will make the opitimizer pick one or the other - and that choice may or may not be optimal given ACTUAL rowcounts. Way too many variables here.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply