In the last weblog post I have talked about the differences in unique and non-unique
non-clustered indexes on a unique clustered index. In this weblog post I want to talk
about the differences of non-clustered indexes defined on a non-unique clustered index.
As you already know from this posting,
SQL Server handles non-unique clustered indexes internally different as unique clustered
indexes. If you define a non-unique clustered index, SQL Server adds the so called uniquifier to
your index records, which leads to a 4 byte overhead per each index row in the navigation
structure of your clustered index.
The following listing creates again our customers table, defines this time a non-unique
clustered index on it, and finally two non-clustered indexes, where one of them is
unique and the other is non-unique.
-- Create a table
with 393 length + 7 bytes overhead = 400 bytes
-- Therefore
20 records can be stored on one page (8.096 / 400) = 20,24
CREATE TABLE Customers
(
CustomerID INT NOT NULL,
CustomerName CHAR(100) NOT NULL,
CustomerAddress CHAR(100) NOT NULL,
Comments CHAR(189) NOT NULL
)
GO
-- Create a non
unique clustered index on the previous created table
CREATE CLUSTERED INDEX idx_Customers ON Customers(CustomerID)
GO
-- Insert
80.000 records
DECLARE @i INT = 1
WHILE (@i <= 20000)
BEGIN
DECLARE @j INT = 1
INSERT INTO Customers VALUES
(
@i,
'CustomerName' + CAST(@i AS CHAR) + CAST(@j AS CHAR),
'CustomerAddress' + CAST(@i AS CHAR),
'Comments' + CAST(@i AS CHAR)
)
SET @j += 1;
INSERT INTO Customers VALUES
(
@i,
'CustomerName' + CAST(@i AS CHAR) + CAST(@j AS CHAR),
'CustomerAddress' + CAST(@i AS CHAR),
'Comments' + CAST(@i AS CHAR)
)
SET @j += 1;
INSERT INTO Customers VALUES
(
@i,
'CustomerName' + CAST(@i AS CHAR) + CAST(@j AS CHAR),
'CustomerAddress' + CAST(@i AS CHAR),
'Comments' + CAST(@i AS CHAR)
)
SET @j += 1;
INSERT INTO Customers VALUES
(
@i,
'CustomerName' + CAST(@i AS CHAR) + CAST(@j AS CHAR),
'CustomerAddress' + CAST(@i AS CHAR),
'Comments' + CAST(@i AS CHAR)
)
SET @i += 1
END
GO
-- Create a unique
non clustered index on the clustered table
CREATE UNIQUE NONCLUSTERED INDEX idx_UniqueNCI_CustomerID
ON Customers(CustomerName)
GO
-- Create a non-unique
non clustered index on the clustered table
CREATE NONCLUSTERED INDEX idx_NonUniqueNCI_CustomerID
ON Customers(CustomerName)
GO
When you now do a short look into the sys.dm_db_index_physical_stats DMV,
you can see that the unique non-clustered index takes 107 bytes per index row in the
navigation levels, where the non unique non-clustered index takes 117 bytes on the
average (minimum 111 bytes, maximum 117 bytes). Let's analyze the differences and
dump out the index root page of the unique non-clustered index through the DBCC
PAGE command:
DBCC PAGE(NonUniqueClusteredIndexStructure_NonClusteredIndex, 1, 4529, 3)
GO
As you can see from the following picture SQL Server only stores the unique non-clustered
key in the index root level (and also in the intermediate levels), because the non-clustered
key is already unique by itself:
When you examine the byte by byte representation of the unique non-clustered index
record, you can see that SQL Server uses here the following bytes:
- 1 Byte: Status Bits
- n Bytes: Unique Non-Clustered Index Key – in this case 100 bytes
- 4 Bytes: PageID
- 2 Bytes: FileID
When you dump out the leaf-level of the unique non-clustered index, you can see that
SQL Server points through the non-unique clustered key and the uniquifier to the correct
record in the clustered table:
The conclusion here is that the unique non-clustered index on a non-unique clustered
index makes only the 4 byte overhead in the leaf level of the clustered index, because
here SQL Server directly points to the correct record. There is no additional overhead
involved in the non-leaf levels of the unique non-clustered index.
Let's now dump out the index root page of the non-unique non-clustered index defined
on our non-unique clustered index:
This is now a really interesting output! The key of the index record must be by design
unique. How can SQL Server make a non-unique non-clustered index key unique? Easy
- SQL Server just adds the clustered index key (4 bytes). But the clustered index
key is also not unique by default, therefore SQL Server also adds the uniquifier (4
bytes), so you have an resulting overhead of 8 bytes per index row, when the uniquifier
is not equal to 0. When the uniquifier is equal to 0, you get an overhead of 4 bytes,
because in this case the uniquifier is not stored physically in the index record,
and a 0 is assumed by SQL Server automatically. When you again analyze the byte by
byte representation you can see the following bytes:
- 1 Byte: Status Bits
- n Bytes: Unique Non-Clustered Index Key – in this case 100 bytes
- n Bytes: Non-Unique Clustered Index Key – in this case 4 bytes
- 4 Bytes: PageID
- 2 Bytes: FileID
- 4 Bytes: Some bytes used by the uniquifier
- 4 Bytes: The uniquifier value itself, when it is not equal to 0
The minimum length of the index record is therefore 111 bytes and the maximum length
is 117 bytes already found out earlier through the sys.dm_db_index_physical_stats DMV.
When you finally dump out the leaf-level of the non-unique non-clustered index, you
get the following result:
As you can see the leaf-level is the same as the leaf-level in a unique non-clustered
index defined on a non unique clustered index. The leaf level just points through
the clustered key (CustomerID) and the uniquifier to the correct
record in the clustered table. This example has shown you that there is huge overhead
(8 bytes per index row) when you define a non-unique non-clustered index on a non-unique
clustered index, because SQL Server must make the non-clustered index records internally
unique and needs therefore several bytes of storage overhead. You can download the
T-SQL script for this posting here.
-Klaus