UPDATE when the values are the same

Question

Post reply

UPDATE when the values are the same

chris 76827

SSC-Addicted

Points: 413
More actions
August 12, 2012 at 4:19 pm

#260092

Does an UPDATE query overwrite existing values if the already existing value is the same as the one your updating with?

Viewing 15 posts - 1 through 15 (of 159 total)

You must be logged in to reply to this topic. Login to reply

Scott D. Jacobson SSCertifiable Points: 6149 More actions · Answer 1

I think what you mean is if you have a data set like this:

CREATE TABLE USERS ( TYPEID INT,USERNAME VARCHAR(30),)

INSERT VALUES (1,'JSMITH')

INSERT VALUES (2,'JDOE')

INSERT VALUES (1,'ASMITH')

and you do something like this:

UPDATE USERS

SET TYPEID = 3

WHERE TYPEID = 1

will the update statement update both JSMITH and ASMITH? The simple answer is yes, it will.

Sergiy SSC Guru Points: 110208 More actions · Answer 2

CELKO (8/12/2012)
Does an UPDATE query overwrite existing values if the already existing value is the same as the one your updating with?
Who cares?

Triggers, for example. transaction logs. Users sitting in frount of the frozen UI waiting while the database finishes the useless activity.

Update dbo.Event

SET Status = 1

WHERE Eventdate < GETDATE()

and Status = 0

Without checking for the status value update will take forever on a big history table and add a big chunk to the Transaction Log file.

If there is a FOR UPDATE trigger it will be executed for all records having Eventdate in the past.

With Status = 0 added to the query it will actually update only recently added/processed records which have not been updated before. And trigger(s) will be run only for that smaller subset.

When I fixed similar query on one of production databases it removed 8GB of data added daily to the TRN file. Not to mention easing the pressure on the server, improving performance not only this but other systems having databases hosted on the same SQL Server.

_____________
Code for TallyGenerator

chris 76827 SSC-Addicted Points: 413 More actions · Answer 3

Ask a simple yes/no question and get a few tangent replies, damn.

Its quite simple. You have a table as follows.

CREATE TABLE Test (

ID INT IDENTITY(1,1)

Name VARCHAR(100)

)

INSERT INTO Test (Name) VALUES ('Tom')

UPDATE Test SET Name = 'Tom' WHERE ID = 1

Does the update statement overwrite Tom even though Tom is already the value.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 4

Why are you asking, and at what level of the engine are you interested in? (because the behaviour is interesting and complex when you get down to the deep internals)

At a high level, yes it will. The row qualifies for the update, so the row gets updated, it will count towards the rows affected and the row will appear in the inserted and deleted pseudo-tables in triggers (and in the OUTPUT clause)

When it comes to logging and to the actual modifications of the pages, it's more complex. It's not a simple yes/no question.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

chris 76827 SSC-Addicted Points: 413 More actions · Answer 5

Okay so getting a little more specific. Does the SQL Server edit the MDF file and overwrite the current value of that record or does it carry out a check to see if their the same to save writing to the HDD something that's already there.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

Well... To start all changes are made in memory and not to disk, and the changes to the files on disk are done later (by a background process) at a page level, not a row level...

As far as I can tell, updating a row and not actually changing the values does not mark the page as dirty, meaning it won't be written back to disk when the checkpoint process runs.

I'm not saying it isn't ever marked dirty, but in my simple tests with a single row in a table, updating the name to itself did not appear to mark the page as dirty.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

chris 76827 SSC-Addicted Points: 413 More actions · Answer 7

Yeah sorry I appreciate its all done in memory but I was referring to the write back stage.

That's pretty much answered my question, cheers.

Phil Parkin SSC Guru Points: 246985 More actions · Answer 8

chris 76827 (8/13/2012)
Yeah sorry I appreciate its all done in memory but I was referring to the write back stage.
That's pretty much answered my question, cheers.

That does not mean that you should not do the check though:

update x

set y = 3

where y <> 3 or y is null

will run faster, in my experience, than

update x

set y = 3

especially if many of the rows are already 3.

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

Sergiy SSC Guru Points: 110208 More actions · Answer 9

GilaMonster (8/13/2012)
As far as I can tell, updating a row and not actually changing the values does not mark the page as dirty, meaning it won't be written back to disk when the checkpoint process runs.

Actually, it does.

I can offer you a simple test.

Create a new database with FULL recovery mode. Create a single table in it.

Insert several rows of data.

Then run a script having a loop with a single statement:

UPDATE MyTable

Set ColValue = ColValue

And watch the Log file growing.

What means that pages are certainly written to disk.

_____________
Code for TallyGenerator

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 10

Sergiy (8/13/2012)
GilaMonster (8/13/2012)
As far as I can tell, updating a row and not actually changing the values does not mark the page as dirty, meaning it won't be written back to disk when the checkpoint process runs.
Actually, it does.
I can offer you a simple test.
Create a new database with FULL recovery mode. Create a single table in it.
Insert several rows of data.
Then run a script having a loop with a single statement:
UPDATE MyTable
Set ColValue = ColValue
And watch the Log file growing.
What means that pages are certainly written to disk.

Nope. There will be operations, like checkpoints, transactions, background processes, but the update itself is neither logged (it's not actually a data modification) nor are the pages dirtied in the process.

Trivial to prove.

CREATE TABLE [dbo].[Test](

[ID] [int] IDENTITY(1,1) NOT NULL,

[Name] [varchar](100) NULL

) ON [PRIMARY]

insert into Test (Name) Values ('Tom')

Checkpoint -- all dirty pages to disk, truncate log

Now, identify the page no of that table's data page (exercise left for the reader)

-- update the row to itself

CHECKPOINT -- just to be sure and to allow for multiple tests

BEGIN TRANSACTION

UPDATE test SET name = name

SELECT [Current LSN] ,

Operation ,

Context ,

[Transaction ID] ,

AllocUnitId ,

AllocUnitName ,

[Checkpoint Begin] ,

[Num Transactions] ,

[Checkpoint End] ,

[Dirty Pages] ,

[Log Record] FROM fn_dblog(NULL, null)

SELECT * FROM sys.dm_os_buffer_descriptors AS bd WHERE page_id = '3783' -- the sole data page for this table

COMMIT

Results:

The log has nothing more than the checkpoint and the begin transaction, there's no LOP_MODIFY_ROW which would be there if the update was logged. (if the query had been after the commit, there's be a LOP_COMMIT_XACT as well). It's the checkpoint, begin and commit which would have caused the log to grow in your test.

The DMV showing the state of the pages in memory shows that the page that the table's sole row it on is unmodified after the update.

Now there will likely be cases where this is not true, but for this trivial case, updating the row to itself neither dirties the page nor logs the update.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Sergiy SSC Guru Points: 110208 More actions · Answer 11

CHECKPOINT -- just to be sure and to allow for multiple tests

DECLARE @I int

SET @I = 10

WHILE @I > 0

BEGIN

BEGIN TRANSACTION

UPDATE test SET name = name

SELECT [Current LSN] ,

Operation ,

Context ,

[Transaction ID] ,

AllocUnitId ,

AllocUnitName ,

[Checkpoint Begin] ,

[Num Transactions] ,

[Checkpoint End] ,

[Dirty Pages] ,

[Log Record] FROM fn_dblog(NULL, null)

COMMIT

SET @I = @I - 1

END

Now - run it and watch the number of records in fn_dblog growing by 2 after each cycle.

_____________
Code for TallyGenerator

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 12

Yes as I showed in the screenshot in the post right above there's the begin xact and commit xact log records (which would be your two per cycle).

This is what's is in the log after that test of 10 updates:

The update itself is not logged. If it was, there would be a LOP_MODIFY_ROW for each update (10 of them), which there is not.

(btw, you'd get the begin and commit even if there was no explicit begin and commit, as all modifications are in implicit transactions)

If I change the query and update name to "Bob", the log records look like this

That Lop_Modify_Row is the log record for the actual update, and it is missing in the case where the row is updated to itself, hence we can conclude that the update is not logged when updating the value to itself.

The question was not, does the update cause any log records to be written to the log. The question was, does the update dirty the page if the update makes no changes. It does not, as I showed with the DMV previously.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

chris 76827 SSC-Addicted Points: 413 More actions · Answer 13

The log file increaseing in size merely implies the mdf file is being written to, it's not really conclusive evidence that it is being writtent to. Just because one grows it doesn't mean the other must surely.

Sergiy SSC Guru Points: 110208 More actions · Answer 14

GilaMonster (8/15/2012)
The question was not, does the update cause any log records to be written to the log. The question was, does the update dirty the page if the update makes no changes. It does not, as I showed with the DMV previously.

I would not be so sure.

I changed the script to bring out more info:

USE [TEST]

GO

CREATE TABLE [dbo].[Test](

[ID] [int] IDENTITY(1,1) NOT NULL,

[Name] [varchar](100) NULL

) ON [PRIMARY]

insert into Test (Name) Values ('TOM')

insert into Test (Name) Values ('Andy')

insert into Test (Name) Values ('Bob')

insert into Test (Name) Values ('Tim')

Checkpoint -- all dirty pages to disk, truncate log

SELECT OBJECT_ID('[dbo].[Test]') ObjID, * FROM TEST

Checkpoint -- all dirty pages to disk, truncate log

UPDATE test

SET name = 'Tom'

WHERE NAME = 'Tom'

SELECT [Current LSN] ,

Operation ,

Context ,

[Transaction ID] ,

AllocUnitId ,

AllocUnitName ,

[Checkpoint Begin] ,

[Num Transactions] ,

[Checkpoint End] ,

[Dirty Pages] ,

[Log Record] FROM fn_dblog(NULL, null)

Checkpoint -- all dirty pages to disk, truncate log

SELECT * FROM TEST

DECLARE @I int

SET @I = 4

WHILE @I > 0

BEGIN

BEGIN TRANSACTION

UPDATE test

SET name = NAME

WHERE @I%2 = 0 OR name <> NAME

EXEC sp_lock

COMMIT

SELECT [Current LSN] ,

Operation ,

Context ,

[Transaction ID] ,

AllocUnitId ,

AllocUnitName ,

[Checkpoint Begin] ,

[Num Transactions] ,

[Checkpoint End] ,

[Dirty Pages] ,

[Log Record] FROM fn_dblog(NULL, null)

Checkpoint -- all dirty pages to disk, truncate log

SET @I = @I - 1

END

GO

Please run the code with Execution Plan recorded.

1. Define "the same".

the code

UPDATE test

SET name = 'Tom'

WHERE NAME = 'Tom'

updates the value with the same, according to the column definition.

But for some reason SQL Server ignores the collation settings and actually writes the page.

So, it's not "the same" by data definition, it's "the same" by binary contents of the page(s).

2. Execution plan shows that "even" updates have 4 records committed to the UPDATE part, and "odd" ones have 0 records committed.

3. Messages show "(4 row(s) affected)" for "even" updates and "(0 row(s) affected)" for "odd" updates.

4. SP_LOCK indicates exclusive locks applied on each of 4 records in "even" updates, when condition name <> NAME is not applied.

To me it means that SQL Server treats those records as perfectly "dirty".

5. "Even" updates add LOP_BEGIN_XACT and LOP_COMMIT_XACT to the log, when "Odd" ones do not do this.

But you're right - those updates don't end up in the log file.

Considering all of the above I can see only one explanation:

1. UPDATE actually writes to the pages, regardless if the new values are the same or different from the old values.

But it writes to the pages in memory.

2. CHECKPOINT compares binary contents of the pages in memory and writes to disk only those ones which have become different after completed transactions.

When same values are updated there is a record about a committed transaction, so CHECKPOINT has to verify the dirty pages and write to the disk any ones that have been changed. When the query filters out the same values there is no record about a committed transaction, so CHECKPOINT will not even start analysing contents of the pages.

3. DMV shows the outcome of CHECKPOINT, not UPDATE, therefore the pages updated in memory don't appear there.

_____________
Code for TallyGenerator