(Be sure to checkout the FREE SQLpassion Performance Tuning Training Plan, where you are getting week by week via email all the essential knowledge you need to know about performance tuning on SQL Server.)
In todays blog posting I want to show you quickly about what happens in SQL Server when we are deleting records from a table. Let’s create in the first step a simple table, where 4 records fit onto one page of 8kb.
-- Create a simple table where 4 records fit onto 1 page CREATE TABLE TestTable ( Col1 INT IDENTITY(1, 1), Col2 CHAR(2000) ) GO
Afterwards we insert 4 records, so that we can fill completely one page:
-- Insert 4 records INSERT INTO TestTable VALUES ( REPLICATE('1', 2000) ), ( REPLICATE('2', 2000) ), ( REPLICATE('3', 2000) ), ( REPLICATE('4', 2000) ) GO
To dig into the details of our heap table, we will use the DBCC PAGE command to dump the allocated page out. Therefore we also have to enable the trace flag 3604 so that SQL Server returns us the result from the DBCC PAGE command directly into our session window in SQL Server Management Studio:
-- Enable the Trace Flag 3604 DBCC TRACEON(3604) GO
We can use the DBCC IND command to return all the allocated pages to a specific table or index.
-- Retrieve all pages of the table DBCC IND(DataModifications, TestTable, -1) GO
As you can see from the output, 2 pages are belonging to our table: the data page itself, and the IAM (index allocation map) page.
I’m taking here the page id of 145, and dumping out the page through the DBCC PAGE command:
-- Dump out one specific page DBCC PAGE (DataModifications, 1, 145, 2) GO
When you are using as 3rd parameter the dump option 2, SQL Server returns you a hex dump of the page including the so-called Row Offset Array at the end of the page without interpreting the data in any way.
The Row Offset Array just points to the physical location on the page, where every record is stored. The 1st record is always stored directly after the page header at offset 96 (0x60h). As you can also see that the Row Offset Array always grows backwards. Let’s delete now the 2nd record from the table:
-- Delete a record from the table DELETE FROM TestTable WHERE Col1 = 2 GO
Normally you will expect here that the record is deleted from the page. But this is not really the case: when you execute the DBCC PAGE command again, you will see that the content of the old record is still available on the page. The only thing that SQL Server has done during the DELETE operation was the invalidation of the corresponding slot in the Row Offset Array at the end of the page.
As you can see, the 2nd slot has an offset of 0×0, which is invalid, means our record is deleted. At the beginning of a page, you will always find the page header of 96 bytes. Let’s delete now the other 3 remaining records from the table.
-- Delete all the remaining records from the table DELETE FROM TestTable GO
When you look again on the page with the DBCC PAGE command, you can see that the whole content of the page is unchanged: EVERY data of EVERY record is still physically present on the page itself! But every records points in the Row Offset Array to the offset 0×0, means every record is deleted. It doesn’t matter if you are using here tables with or without a Clustered Index – the old data is always still present on the page.
The question is now, when SQL Server will reinitialize the page? When you are now inserting a new record, SQL Server will overwrite the old content of the page. But in our case, only the physical part, where the 1st record was stored. You are still able to see the contents of the other 3 “deleted” rows. But when you look at the Row Offset Array at the end of the page, you can see that it was completely reinitialized by SQL Server, which means you have now only 1 slot in it:
Just think about that behavior the next time, when you are granting applications sysadmin privileges. With the appropriate commands, these application will be able to see old “deleted” data
Thanks for reading
-Klaus