August 12, 2012 at 4:19 pm
Does an UPDATE query overwrite existing values if the already existing value is the same as the one your updating with?
August 12, 2012 at 8:14 pm
I think what you mean is if you have a data set like this:
CREATE TABLE USERS ( TYPEID INT,USERNAME VARCHAR(30),)
INSERT VALUES (1,'JSMITH')
INSERT VALUES (2,'JDOE')
INSERT VALUES (1,'ASMITH')
and you do something like this:
UPDATE USERS
SET TYPEID = 3
WHERE TYPEID = 1
will the update statement update both JSMITH and ASMITH? The simple answer is yes, it will.
August 12, 2012 at 10:18 pm
CELKO (8/12/2012)
Does an UPDATE query overwrite existing values if the already existing value is the same as the one your updating with?
Who cares?
Triggers, for example. transaction logs. Users sitting in frount of the frozen UI waiting while the database finishes the useless activity.
Update dbo.Event
SET Status = 1
WHERE Eventdate < GETDATE()
and Status = 0
Without checking for the status value update will take forever on a big history table and add a big chunk to the Transaction Log file.
If there is a FOR UPDATE trigger it will be executed for all records having Eventdate in the past.
With Status = 0 added to the query it will actually update only recently added/processed records which have not been updated before. And trigger(s) will be run only for that smaller subset.
When I fixed similar query on one of production databases it removed 8GB of data added daily to the TRN file. Not to mention easing the pressure on the server, improving performance not only this but other systems having databases hosted on the same SQL Server.
_____________
Code for TallyGenerator
August 13, 2012 at 1:50 am
Ask a simple yes/no question and get a few tangent replies, damn.
Its quite simple. You have a table as follows.
CREATE TABLE Test (
ID INT IDENTITY(1,1)
Name VARCHAR(100)
)
INSERT INTO Test (Name) VALUES ('Tom')
UPDATE Test SET Name = 'Tom' WHERE ID = 1
Does the update statement overwrite Tom even though Tom is already the value.
August 13, 2012 at 2:02 am
Why are you asking, and at what level of the engine are you interested in? (because the behaviour is interesting and complex when you get down to the deep internals)
At a high level, yes it will. The row qualifies for the update, so the row gets updated, it will count towards the rows affected and the row will appear in the inserted and deleted pseudo-tables in triggers (and in the OUTPUT clause)
When it comes to logging and to the actual modifications of the pages, it's more complex. It's not a simple yes/no question.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 13, 2012 at 3:35 am
Okay so getting a little more specific. Does the SQL Server edit the MDF file and overwrite the current value of that record or does it carry out a check to see if their the same to save writing to the HDD something that's already there.
August 13, 2012 at 3:59 am
Well... To start all changes are made in memory and not to disk, and the changes to the files on disk are done later (by a background process) at a page level, not a row level...
As far as I can tell, updating a row and not actually changing the values does not mark the page as dirty, meaning it won't be written back to disk when the checkpoint process runs.
I'm not saying it isn't ever marked dirty, but in my simple tests with a single row in a table, updating the name to itself did not appear to mark the page as dirty.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 13, 2012 at 4:04 am
Yeah sorry I appreciate its all done in memory but I was referring to the write back stage.
That's pretty much answered my question, cheers.
August 13, 2012 at 4:38 am
chris 76827 (8/13/2012)
Yeah sorry I appreciate its all done in memory but I was referring to the write back stage.That's pretty much answered my question, cheers.
That does not mean that you should not do the check though:
update x
set y = 3
where y <> 3 or y is null
will run faster, in my experience, than
update x
set y = 3
especially if many of the rows are already 3.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
August 13, 2012 at 10:30 pm
GilaMonster (8/13/2012)
As far as I can tell, updating a row and not actually changing the values does not mark the page as dirty, meaning it won't be written back to disk when the checkpoint process runs.
Actually, it does.
I can offer you a simple test.
Create a new database with FULL recovery mode. Create a single table in it.
Insert several rows of data.
Then run a script having a loop with a single statement:
UPDATE MyTable
Set ColValue = ColValue
And watch the Log file growing.
What means that pages are certainly written to disk.
_____________
Code for TallyGenerator
August 14, 2012 at 2:16 am
Sergiy (8/13/2012)
GilaMonster (8/13/2012)
As far as I can tell, updating a row and not actually changing the values does not mark the page as dirty, meaning it won't be written back to disk when the checkpoint process runs.Actually, it does.
I can offer you a simple test.
Create a new database with FULL recovery mode. Create a single table in it.
Insert several rows of data.
Then run a script having a loop with a single statement:
UPDATE MyTable
Set ColValue = ColValue
And watch the Log file growing.
What means that pages are certainly written to disk.
Nope. There will be operations, like checkpoints, transactions, background processes, but the update itself is neither logged (it's not actually a data modification) nor are the pages dirtied in the process.
Trivial to prove.
CREATE TABLE [dbo].[Test](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](100) NULL
) ON [PRIMARY]
insert into Test (Name) Values ('Tom')
Checkpoint -- all dirty pages to disk, truncate log
Now, identify the page no of that table's data page (exercise left for the reader)
-- update the row to itself
CHECKPOINT -- just to be sure and to allow for multiple tests
BEGIN TRANSACTION
UPDATE test SET name = name
SELECT [Current LSN] ,
Operation ,
Context ,
[Transaction ID] ,
AllocUnitId ,
AllocUnitName ,
[Checkpoint Begin] ,
[Num Transactions] ,
[Checkpoint End] ,
[Dirty Pages] ,
[Log Record] FROM fn_dblog(NULL, null)
SELECT * FROM sys.dm_os_buffer_descriptors AS bd WHERE page_id = '3783' -- the sole data page for this table
COMMIT
Results:
The log has nothing more than the checkpoint and the begin transaction, there's no LOP_MODIFY_ROW which would be there if the update was logged. (if the query had been after the commit, there's be a LOP_COMMIT_XACT as well). It's the checkpoint, begin and commit which would have caused the log to grow in your test.
The DMV showing the state of the pages in memory shows that the page that the table's sole row it on is unmodified after the update.
Now there will likely be cases where this is not true, but for this trivial case, updating the row to itself neither dirties the page nor logs the update.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 14, 2012 at 8:31 pm
CHECKPOINT -- just to be sure and to allow for multiple tests
DECLARE @I int
SET @I = 10
WHILE @I > 0
BEGIN
BEGIN TRANSACTION
UPDATE test SET name = name
SELECT [Current LSN] ,
Operation ,
Context ,
[Transaction ID] ,
AllocUnitId ,
AllocUnitName ,
[Checkpoint Begin] ,
[Num Transactions] ,
[Checkpoint End] ,
[Dirty Pages] ,
[Log Record] FROM fn_dblog(NULL, null)
COMMIT
SET @I = @I - 1
END
Now - run it and watch the number of records in fn_dblog growing by 2 after each cycle.
_____________
Code for TallyGenerator
August 15, 2012 at 1:41 am
Yes as I showed in the screenshot in the post right above there's the begin xact and commit xact log records (which would be your two per cycle).
This is what's is in the log after that test of 10 updates:
The update itself is not logged. If it was, there would be a LOP_MODIFY_ROW for each update (10 of them), which there is not.
(btw, you'd get the begin and commit even if there was no explicit begin and commit, as all modifications are in implicit transactions)
If I change the query and update name to "Bob", the log records look like this
That Lop_Modify_Row is the log record for the actual update, and it is missing in the case where the row is updated to itself, hence we can conclude that the update is not logged when updating the value to itself.
The question was not, does the update cause any log records to be written to the log. The question was, does the update dirty the page if the update makes no changes. It does not, as I showed with the DMV previously.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 15, 2012 at 2:06 am
The log file increaseing in size merely implies the mdf file is being written to, it's not really conclusive evidence that it is being writtent to. Just because one grows it doesn't mean the other must surely.
August 15, 2012 at 9:03 pm
GilaMonster (8/15/2012)
The question was not, does the update cause any log records to be written to the log. The question was, does the update dirty the page if the update makes no changes. It does not, as I showed with the DMV previously.
I would not be so sure.
I changed the script to bring out more info:
USE [TEST]
GO
CREATE TABLE [dbo].[Test](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](100) NULL
) ON [PRIMARY]
insert into Test (Name) Values ('TOM')
insert into Test (Name) Values ('Andy')
insert into Test (Name) Values ('Bob')
insert into Test (Name) Values ('Tim')
Checkpoint -- all dirty pages to disk, truncate log
SELECT OBJECT_ID('[dbo].[Test]') ObjID, * FROM TEST
Checkpoint -- all dirty pages to disk, truncate log
UPDATE test
SET name = 'Tom'
WHERE NAME = 'Tom'
SELECT [Current LSN] ,
Operation ,
Context ,
[Transaction ID] ,
AllocUnitId ,
AllocUnitName ,
[Checkpoint Begin] ,
[Num Transactions] ,
[Checkpoint End] ,
[Dirty Pages] ,
[Log Record] FROM fn_dblog(NULL, null)
Checkpoint -- all dirty pages to disk, truncate log
SELECT * FROM TEST
DECLARE @I int
SET @I = 4
WHILE @I > 0
BEGIN
BEGIN TRANSACTION
UPDATE test
SET name = NAME
WHERE @I%2 = 0 OR name <> NAME
EXEC sp_lock
COMMIT
SELECT [Current LSN] ,
Operation ,
Context ,
[Transaction ID] ,
AllocUnitId ,
AllocUnitName ,
[Checkpoint Begin] ,
[Num Transactions] ,
[Checkpoint End] ,
[Dirty Pages] ,
[Log Record] FROM fn_dblog(NULL, null)
Checkpoint -- all dirty pages to disk, truncate log
SET @I = @I - 1
END
GO
Please run the code with Execution Plan recorded.
1. Define "the same".
the code
UPDATE test
SET name = 'Tom'
WHERE NAME = 'Tom'
updates the value with the same, according to the column definition.
But for some reason SQL Server ignores the collation settings and actually writes the page.
So, it's not "the same" by data definition, it's "the same" by binary contents of the page(s).
2. Execution plan shows that "even" updates have 4 records committed to the UPDATE part, and "odd" ones have 0 records committed.
3. Messages show "(4 row(s) affected)" for "even" updates and "(0 row(s) affected)" for "odd" updates.
4. SP_LOCK indicates exclusive locks applied on each of 4 records in "even" updates, when condition name <> NAME is not applied.
To me it means that SQL Server treats those records as perfectly "dirty".
5. "Even" updates add LOP_BEGIN_XACT and LOP_COMMIT_XACT to the log, when "Odd" ones do not do this.
But you're right - those updates don't end up in the log file.
Considering all of the above I can see only one explanation:
1. UPDATE actually writes to the pages, regardless if the new values are the same or different from the old values.
But it writes to the pages in memory.
2. CHECKPOINT compares binary contents of the pages in memory and writes to disk only those ones which have become different after completed transactions.
When same values are updated there is a record about a committed transaction, so CHECKPOINT has to verify the dirty pages and write to the disk any ones that have been changed. When the query filters out the same values there is no record about a committed transaction, so CHECKPOINT will not even start analysing contents of the pages.
3. DMV shows the outcome of CHECKPOINT, not UPDATE, therefore the pages updated in memory don't appear there.
_____________
Code for TallyGenerator
Viewing 15 posts - 1 through 15 (of 159 total)
You must be logged in to reply to this topic. Login to reply