June 14, 2011 at 5:45 pm
Lee,
ROW_OVERFLOW_DATA - Holds variable length data stored in varchar, nvarchar, varbinary, or sql_variant columns that exceed the 8,060 byte row size limit."
I believe the above is related to a single row - where the row is defined such that it can hold greater than 8060 bytes. For example:
CREATE TABLE dbo.LargeRow (int identity primary key clustered, data1 varchar(8000), data2 varchar(8000));
Then, if we insert a row where data1 has 4000 characters and data2 has 4000 characters - this will all fit on a single page. Update the row and add 100 characters to either data1 and data2 and you now get data that will overflow the page.
In other words - you will only get overflow row data when a single row will not fit on a page.
Another thing to look at is the fragmentation level of the clustered index - right after the insert is done. In my simple test, I added the ALTER INDEX because the index was already fragmented.
This all goes back to the original statement that using IDENTITY, even with 100% fill factor will eliminate fragmentation. What this test shows is that we will see fragmentation during the insert and during updates. Deletes don't really cause fragmentation (e.g. page splits) - but will leave gaps and reduces the page used percentage.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
June 15, 2011 at 10:41 am
Jeffrey Williams-493691 (6/14/2011)
Lee,ROW_OVERFLOW_DATA - Holds variable length data stored in varchar, nvarchar, varbinary, or sql_variant columns that exceed the 8,060 byte row size limit."
I believe the above is related to a single row - where the row is defined such that it can hold greater than 8060 bytes. For example:
CREATE TABLE dbo.LargeRow (int identity primary key clustered, data1 varchar(8000), data2 varchar(8000));
Then, if we insert a row where data1 has 4000 characters and data2 has 4000 characters - this will all fit on a single page. Update the row and add 100 characters to either data1 and data2 and you now get data that will overflow the page.
In other words - you will only get overflow row data when a single row will not fit on a page.
Another thing to look at is the fragmentation level of the clustered index - right after the insert is done. In my simple test, I added the ALTER INDEX because the index was already fragmented.
This all goes back to the original statement that using IDENTITY, even with 100% fill factor will eliminate fragmentation. What this test shows is that we will see fragmentation during the insert and during updates. Deletes don't really cause fragmentation (e.g. page splits) - but will leave gaps and reduces the page used percentage.
The SplitTest table creation statement in the code below will create a "alloc_unit_type_desc" of ROW_OVERFLOW_DATA:
USE TempDB
SET NOCOUNT ON
IF OBJECT_ID('Tally') IS NOT NULL DROP TABLE Tally
CREATE TABLE Tally(N INT)
DECLARE @Counter INT
SET @Counter = 1
WHILE @Counter <= 10000
BEGIN
INSERT INTO Tally (N) VALUES (@Counter)
SET @Counter = @Counter + 1
END
Create Table SplitTest(
id int identity Primary Key Clustered
,data1 varchar(4022)
,data2 varchar(4022));
Go
Insert Into SplitTest
Select replicate('x', 30), replicate('x', 30)
From Tally t
Where n <= 8000
Go
Alter Index All On dbo.SplitTest Rebuild;
Go
Declare @tableID int;
Set@tableID = object_id('dbo.SplitTest');
Select *
From sys.dm_db_index_physical_stats(db_id('tempdb'), @tableID, Null, Null, 'Detailed');
Go
--Update dbo.SplitTest
--Set data1 = replicate('x', 4022)
--Where id = 1;
--Go
Declare @tableID int;
Set@tableID = object_id('dbo.SplitTest');
Select *
From sys.dm_db_index_physical_stats(db_id('tempdb'), @tableID, Null, Null, 'Detailed');
Go
Drop Table dbo.SplitTest;
Go
Changing either of the VARCHAR column lengths in the SplitTest table creation statement from 4022 to 4021 eliminates the ROW_OVERFLOW_DATA designation.
The lengths of the values in the REPLICATE statement in the INSERT clause used to populate the table do not influence the ROW_OVERFLOW_DATA designation. Neither does the execution of the UPDATE statement.
To complete my response and address your final assertion, a table that has no UPDATES performed, only INSERTs, and DELETEs of the oldest records in a table, will have only insignificant fragmentation. This is because only the first and last pages in the clustered index are ever fragmented.
LC
June 15, 2011 at 6:38 pm
Lee Crain (6/15/2011)
Changing either of the VARCHAR column lengths in the SplitTest table creation statement from 4022 to 4021 eliminates the ROW_OVERFLOW_DATA designation.The lengths of the values in the REPLICATE statement in the INSERT clause used to populate the table do not influence the ROW_OVERFLOW_DATA designation. Neither does the execution of the UPDATE statement.
That is very interesting - thanks for performing that test. I did not expect that to be the case based upon the documentation, but it does appear that way.
To complete my response and address your final assertion, a table that has no UPDATES performed, only INSERTs, and DELETEs of the oldest records in a table, will have only insignificant fragmentation. This is because only the first and last pages in the clustered index are ever fragmented.
I have not validated the level of fragmentation after an insert, and a delete only leaves space available on the page and doesn't actually cause any fragmentation.
Anyways, I can agree with that generalization.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
June 16, 2011 at 7:58 am
Jeffrey Williams-493691 (6/15/2011)
I have not validated the level of fragmentation after an insert, and a delete only leaves space available on the page and doesn't actually cause any fragmentation.
I agree. My misstatement.
From Books Online, the definition of fragmentation: "Fragmentation exists when indexes have pages in which the logical ordering, based on the key value, does not match the physical ordering inside the data file."
Available space in a page does not equate to discontinuous ordering of rows.
LC
Viewing 4 posts - 46 through 48 (of 48 total)
You must be logged in to reply to this topic. Login to reply