All this talk of testing backups and data integrity and checksums has me thinking that maybe now would be a good time to discuss page verification options in SQL Server. Sometimes bad things happen to good SANs and data pages aren’t quite the same when we read them as when we wrote them to disk. In SQL Server 2000, the only option you had to detect possible problems in you IO subsystem was torn page detection. Beginning in SQL 2005, however, we were given another option: Checksum. Checksum is generally considered the better choice, but let’s drill down a little further.
Checksum vs. Torn Page Detection
You might wonder why the switch from Torn Page Detection to Checksums, and what’s the difference. The difference is in how they validate IO. When it writes a page, Torn Page Detection grabs the first 2 bits of every 512 byte sector on each page and stores those bits in the page header. When the page is later read back in from disk, SQL Server compares those header bits with the bits from the sector, to make sure they’re still the same. Now, that check is better than nothing, but you can see where it might potentially miss corruption in the rest of the sector, right?
The Checksum option, on the other hand, creates a checksum value using the content of the entire page, and saves that value in the header. When a page is read from disk, a checksum is created again and compared to the saved checksum. Since any difference in the bytes read for that page will result in a different checksum value, this is a far more thorough method for validating IO.
Making the switch
Checksum is the default page verification option for databases created in SQL Server 2005 and later. If your database has been upgraded from an earlier version, you’ll need to manually switch to Checksum. Switching your PAGE_VERIFY option to checksum doesn’t magically create a checksum on all data pages, though. A checksum is only created when a page is modified on disk. So if you have a lot of read-only tables, they won’t have checksums on them. Want to see? (Of course you do.)
First, I’ll create a database with no page verification. I’ll then create a table and load some records into it. I issue a checkpoint and flush the clean buffers to disk, to make sure I’m not reading from the buffer cache.
CREATE DATABASE [Checksum_test] ON PRIMARY ( NAME = N'Checksum_test', FILENAME = N'F:\d1db09ms_data\Checksum_test.mdf' , SIZE = 3072KB , FILEGROWTH = 1024KB ) LOG ON ( NAME = N'Checksum_test_log', FILENAME = N'F:\d1db09ms_logs\Checksum_test_log.ldf' , SIZE = 18432KB , FILEGROWTH = 10%) GO ALTER DATABASE [Checksum_test] SET RECOVERY SIMPLE GO ALTER DATABASE [Checksum_test] SET PAGE_VERIFY NONE GO USE Checksum_test GO CREATE TABLE tab1 ( a smallint identity (1,1), b char(1)); GO INSERT INTO tab1 (b) values ('c') GO 1000 CHECKPOINT; GO DBCC DROPCLEANBUFFERS; GO
In order to view the checksum information we’ll need to use the DBCC IND and DBCC PAGE commands. We also set trace flag 3604 so the output from DBCC will come to our screen.
select object_id('tab1') DBCC TRACEON (3604); GO -- get page numbers DBCC IND ('Checksum_test', 350624292, 1) -- view page DBCC PAGE ('Checksum_test', 1, 154, 3)
Here’s a portion of the DBCC PAGE output.
PAGE HEADER:
Page @0x00000006FFE66000
m_pageId = (1:154) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0×4 m_level = 0 m_flagBits = 0×8000
m_objId (AllocUnitId.idObj) = 194 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594050641920
Metadata: PartitionId = 72057594038779904 Metadata: IndexId = 0
Metadata: ObjectId = 350624292 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 7 m_slotCnt = 642 m_freeCnt = 392
m_freeData = 6516 m_reservedCnt = 0 m_lsn = (105:709:3)
m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0
m_tornBits = 0
See that m_tornBits field? That’s where our checksum or torn page validation value would be if we were using it. A zero value means neither is present for this page.
So let’s alter our database and turn on checksums and re-examine the page.
ALTER DATABASE [Checksum_test] SET PAGE_VERIFY CHECKSUM GO DBCC PAGE ('Checksum_test', 1, 154, 3)
The output:
PAGE HEADER:
Page @0x00000006FFE66000
m_pageId = (1:154) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0×4 m_level = 0 m_flagBits = 0×8000
m_objId (AllocUnitId.idObj) = 194 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594050641920
Metadata: PartitionId = 72057594038779904 Metadata: IndexId = 0
Metadata: ObjectId = 350624292 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 7 m_slotCnt = 642 m_freeCnt = 392
m_freeData = 6516 m_reservedCnt = 0 m_lsn = (105:709:3)
m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0
m_tornBits = 0
No change. Setting PAGE_VERIFY to checksum didn’t cause SQL Server to go out and create a checksum value for every page. Which is a good thing, really, as that would be a very expensive process.
So let’s update a record on that page and check it again:
update tab1 set b = 'z' where a=10 DBCC PAGE ('Checksum_test', 1, 154, 3)
The output:
PAGE HEADER:
Page @0x00000006FFE66000
m_pageId = (1:154) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0×4 m_level = 0 m_flagBits = 0×8000
m_objId (AllocUnitId.idObj) = 194 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594050641920
Metadata: PartitionId = 72057594038779904 Metadata: IndexId = 0
Metadata: ObjectId = 350624292 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 7 m_slotCnt = 642 m_freeCnt = 392
m_freeData = 6516 m_reservedCnt = 0 m_lsn = (105:1157:2)
m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0
m_tornBits = 0
Hey! We changed the page, SQL Server should have generated a checksum, right? Nope. Remember, checksums are there to validate the IO process. They’re only created when the page is written to disk. Right now all we have is a dirty page in memory.
So let’s issue a checkpoint and flush that page back to disk.
CHECKPOINT; GO DBCC PAGE ('Checksum_test', 1, 154, 3)
Now we have a checksum value:
PAGE HEADER:
Page @0x00000006FFF78000
m_pageId = (1:154) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0×4 m_level = 0 m_flagBits = 0×8200
m_objId (AllocUnitId.idObj) = 194 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594050641920
Metadata: PartitionId = 72057594038779904 Metadata: IndexId = 0
Metadata: ObjectId = 350624292 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 7 m_slotCnt = 642 m_freeCnt = 392
m_freeData = 6516 m_reservedCnt = 0 m_lsn = (105:1157:2)
m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0
m_tornBits = –1166377712