T-SQL and UPDATE - How smart is SQL with UPDATEs?

  • You have a table with 1 million rows. You need to make all of the rows the same value, but you know that 5% of them do not have the correct value. Would it save the hard drive's trouble by using a WHERE clause in the UPDATE? So if the value needs to be "Albuquerque," does the run faster (and save the drive's trouble) by issuing for example:

    UPDATE table

    SET column = 'Albuquerque'

    WHERE column <> 'Albuquerque'

     

    Assumptions: this is a real table not a temp table. There is an index on the column in question.

    In other words, is SQL smart enough to skip replacing a value with the exact same value or will it write it to every single row in the table anyway?

     

    The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge. - Stephen Hawking

  • If you do not filter the update (use a where clause), SQL Server will update every single row in the table. I would recommend using the where clause because that is going to reduce the impact of the update, not necessarily reduce the impact to the disk drives.

    The impact on the disk drives is less important than the locking of the table during the update. Depending on hardware, updating a million rows could take a second - or several minutes. If the update were to take 2 seconds, try to think how long 2 seconds is for an end user in the application.

    It doesn't sound like a lot, but that will be a noticable wait for the users.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • It's a bit of a tricky question. First let's start by saying that yes - it will only update those rows that match the WHERE clause if you give it one, and will update ALL rows if you don't.

    That said - is that actually less work to use the WHERE clause than not? You MIGHT be able to use the index to get to the rows you need to update, but this will cause "bookmark lookups" to happen to actually access the record to update, and - the index will need to be updated (index is now entirely useless since every row in the table has that value). So it depends, since you're replacing some work with other work. Your specific case will tell you what is actually faster.

    Frankly - I'd consider a third option. Drop the index FIRST, then run the update with the WHERE clause.Depending on how much of a percentage of the table needs to change - this may be faster than the update with the index in place.

    One last hypothetical to consider: if you had something like an auditing table tracking changes to said table, applying the WHERE clause will affect how many rows you end up putting into the audit table for this operation (which tends to tilt it towards using the WHERE clause).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • In addition to Matt's note, a WHERE clause definitely helps. At the very least it's less log records to write and then backup, store on disk, etc.

    However indexing matters, and Matt raises things to think about.

  • Thank you very much for the well-informed responses. I finally have an answer to this.

    I used to be a FoxPro programmer and using the WHERE clause was needed in that DBMS too in these situations.

    The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge. - Stephen Hawking

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply