Log File Space during Updates of a Table

Question

Post reply

Log File Space during Updates of a Table

Grizzly Bear

Ten Centuries

Points: 1192
More actions
June 10, 2011 at 1:15 pm

#239722

I have a DB on QA Server, which has its Log Space set to 10 GB Restricted Growth. (We like setting our limits to Log Files).
I have the DB in SIMPLE recovery mode.
I have many tables, but there is a table which has 35 Million Rows, and we were running an UPDATE on the table. We got an error saying Log File Cannot be expanded, so failing (Something to this extent). We increased the Log File Size t0 15 GB and it worked.
I checked the table size, it was 3.5 GB. We have only 1 Non Clustered Index, apart from a Clustered Index. The Non Clustered Index does not contain the Updated Column.
Question is why is the Log File of 10 GB getting Full when I am trying to Update a table of only 3.5 GB ?
And yes there is nothing else running in the DB. I did collect stats running DBCC SQLPERF (logspace) every 10 secs and found out that at its peak the Log Space used was 88% of 15 GB and at non-peak -- this is less than 1% Which means at its peak -- this transaction required almost 12 - 13 GB, even though the table is only 3.5 GB. Assuming the whole table gets written to the Log File even then this is un-explained.
How does any update statement utilize the Log File ?

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

GSquared SSC Guru Points: 260824 More actions · Answer 1

Log space used != table space used.

Have you dug into a log parsing technique like DBCC LOG?

There's a lot going on in there. Everything needed to make sure that any update/delete/insert can be rolled back and maintain database ACIDity. That can be a lot more than simple database values.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 2

At minimum the log used for the update will be 2x the size of the data changing + log header, log block, etc overhead. That's cause it logs old and new values so that it can roll forward or backward.

SQL will also reserve enough space to be able to log compensating updates should the transaction be rolled back, so that's at least another 2x data changing + overhead. So we have around 14 GB easily just there (less if you're modifying a non-key column, more if you're modifying a key column that will require a split update)

And that's all assuming that no page splits occur, if any do that's lots more logging from that.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 3

Grizzly Bear (6/10/2011)
I have a DB on QA Server, which has its Log Space set to 10 GB Restricted Growth. (We like setting our limits to Log Files).
I have the DB in SIMPLE recovery mode.
I have many tables, but there is a table which has 35 Million Rows, and we were running an UPDATE on the table. We got an error saying Log File Cannot be expanded, so failing (Something to this extent). We increased the Log File Size t0 15 GB and it worked.
I checked the table size, it was 3.5 GB. We have only 1 Non Clustered Index, apart from a Clustered Index. The Non Clustered Index does not contain the Updated Column.
Question is why is the Log File of 10 GB getting Full when I am trying to Update a table of only 3.5 GB ?
And yes there is nothing else running in the DB. I did collect stats running DBCC SQLPERF (logspace) every 10 secs and found out that at its peak the Log Space used was 88% of 15 GB and at non-peak -- this is less than 1% Which means at its peak -- this transaction required almost 12 - 13 GB, even though the table is only 3.5 GB. Assuming the whole table gets written to the Log File even then this is un-explained.
How does any update statement utilize the Log File ?

It sounds like you have a bit of "Halloweening" going on in the UPDATE because of an illegal form of the update. Please post the code so I can see if that's correct or not.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Grizzly Bear Ten Centuries Points: 1192 More actions · Answer 4

It is a simple UPDATE statement.

UPDATE <tablename> SET <columnName> = 1 WHERE <datefield> < 'TodaysDate'

And no the <ColumnName> is not part of any Index.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 5

Grizzly Bear (6/11/2011)
It is a simple UPDATE statement.
UPDATE <tablename> SET <columnName> = 1 WHERE <datefield> < 'TodaysDate'
And no the <ColumnName> is not part of any Index.

I can't help if you're going to paraphrase the query. Please post the actual query. It would also be helpful if you posted the CREATE TABLE statement and included the indexes, PK, FK's, triggers, etc, (you can easily do that by setting the scripting options to include all of that).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Grizzly Bear Ten Centuries Points: 1192 More actions · Answer 6

Unfortunately I am not allowed to disclose more information as a Company Policy.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 7

Obfuscate the table name and column names if you have to. Jeff needs to see the structure more than anything else.

Grizzly Bear Ten Centuries Points: 1192 More actions · Answer 8

OK here it is .

This table had 35 Million Rows.

We were Updating the BitColumn Field.

There was also a Trigger which would write to another table, which is at the bottom for every row updated.

CREATE TABLE [dbo].[Table1](

[Table1ID] [uniqueidentifier] NOT NULL,

[CompanyID] [int] NOT NULL,

[ECID] [int] NOT NULL,

[CreationDate] [datetime] NOT NULL,

[BitColumn] [bit] NOT NULL,

CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED

(

[Table1ID] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 70) ON [PRIMARY]

) ON [PRIMARY]

GO

ALTER TABLE [dbo].[Table1] ADD CONSTRAINT [DF_Table1_Table1ID] DEFAULT (newid()) FOR [Table1ID]

GO

ALTER TABLE [dbo].[Table1] ADD CONSTRAINT [DF_Table1_CreationDate] DEFAULT (getdate()) FOR [CreationDate]

GO

ALTER TABLE [dbo].[Table1] ADD CONSTRAINT [DF_Table1_BitColumn] DEFAULT (0) FOR [BitColumn]

GO

TRIGGER TABLE

CREATE TABLE [dbo].[Table1_SplitEdit](

[Table1ID] [uniqueidentifier] NOT NULL,

[CreatedDate] [datetime] NOT NULL

) ON [PRIMARY]

GO

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

Grizzly Bear (6/13/2011)
There was also a Trigger which would write to another table, which is at the bottom for every row updated.

Which means you need to account for that as well in the log space calculations (and its reserved log space in case of rollback)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 10

Grizzly Bear (6/13/2011)
OK here it is .
This table had 35 Million Rows.
We were Updating the BitColumn Field.
There was also a Trigger which would write to another table, which is at the bottom for every row updated.
CREATE TABLE [dbo].[Table1](
[Table1ID] [uniqueidentifier] NOT NULL,
[CompanyID] [int] NOT NULL,
[ECID] [int] NOT NULL,
[CreationDate] [datetime] NOT NULL,
[BitColumn] [bit] NOT NULL,
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
(
[Table1ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 70) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Table1] ADD CONSTRAINT [DF_Table1_Table1ID] DEFAULT (newid()) FOR [Table1ID]
GO
ALTER TABLE [dbo].[Table1] ADD CONSTRAINT [DF_Table1_CreationDate] DEFAULT (getdate()) FOR [CreationDate]
GO
ALTER TABLE [dbo].[Table1] ADD CONSTRAINT [DF_Table1_BitColumn] DEFAULT (0) FOR [BitColumn]
GO
TRIGGER TABLE
CREATE TABLE [dbo].[Table1_SplitEdit](
[Table1ID] [uniqueidentifier] NOT NULL,
[CreatedDate] [datetime] NOT NULL
) ON [PRIMARY]
GO

Sorry I'm late with an answer.

Except for the fact that you have a trigger on the table (and judging from the trigger table, may not even come into play), I don't see anything obvious in any of that would cause the logfile to grow as much as you say. The only thing I can think of is that every system has a "tipping point" for updates. For example (only), a million row update may only take 3 seconds, a two million row update may only take 6 seconds. Following that pattern, a three million row update may only take 9 seconds. Again, following that same patter, you would expect a 4 million row update to only take 12 seconds but, instead, it takes 20 minutes. I've never checked into precisely why there's such a jump in duration at the "tipping point" but, with what you've said about the log file size, I'm thinking that SQL Server simply loses it's mind and starts writting the equivalent of a Cartesian product in the log file.

Of course, I could be wrong.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)