index column order for a clustered primary key

Question

index column order for a clustered primary key

R.P.Rozema

SSChampion

Points: 12317
More actions
November 22, 2011 at 4:12 am

#252324

Recently I noticed that some of the indexes supporting primary keys on our tables did not have the column order that was specified in the primary key definition for those tables. The result was that sometimes sub-optimal query plans were generated for queries against these tables. I started investigating and found that the index columns were ordered by their physical location in the row, not that of the key definition. I also found that this happens only on clustered primary keys: if I change the primary key from clustered into nonclustered, the index follows the column ordering in the primary key definition. When I change it back into a clustered key, the index follows the physical ordering of the columns again. The only way to make the index column order on a clustered primary key as I need it, is by re-arranging the physical column layout on the table.
I initially found and tested this on Microsoft SQL Server 2008 R2 (SP1) - 10.50.2500.0 (X64), but the same behavior is shown in Microsoft SQL Server 2005 - 9.00.4266.00 (X64).
I can nowhere find anything about this behavior in Books Online or MSDN. Does anyone have a link that makes this behavior official? i.e. so I can define an (internal) procedure that the physical column order must be properly specified to get the correct indexes?
Consider the following test tables, note that the column order on the primary key differs from that on the tables:
use tempdb
go
create table dbo.test (
colA int not null,
colB int not null,
constraint pkTest primary key clustered (colB, colA)
);
go
create table dbo.test2 (
colA int not null,
colB int not null,
constraint pkTest2 primary key nonclustered (colB, colA)
);
go
Below query lists the index columns for all indexes on these tables. Once ordered by their index_column_id and once by their key_ordinal. It demonstrates that a difference only occurs on clustered indexes.
select object_schema_name(ix.object_id) + '.' + object_name(ix.object_id) as ,
ix.name as [index],
ix.type_desc,
stuff((
select ', ' + col_name(sc.object_id, sc.column_id) as [text()]
from sys.index_columns sc
where sc.object_id = ix.object_id
and sc.index_id = ix.index_id
order by sc.index_column_id
for xml path('index-column'), type
).value('.','varchar(max)'), 1, 2, '') as [index columns],
stuff((
select ', ' + col_name(sc.object_id, sc.column_id) as [text()]
from sys.index_columns sc
where sc.object_id = ix.object_id
and sc.index_id = ix.index_id
order by sc.key_ordinal
for xml path('index-column'), type
).value('.','varchar(max)'), 1, 2, '') as
from sys.indexes ix
where ix.object_id in (object_id('dbo.test'), object_id('dbo.test2'))
order by 1, ix.index_id
Both tables have physical column order colA, colB, but only if the primary key is nonclustered, the index has the column order as I defined it: colB, colA. For the clustered primary key the column order is now colA, colB.
table index type_desc index columns key columns
dbo.test pkTest CLUSTERED colA, colB colB, colA
dbo.test2 NULL HEAP NULL NULL
dbo.test2 pkTest2 NONCLUSTERED colB, colA colB, colA
I know this does not seem hard to fix. Simply re-arranging the columns in the table fixes the problem, but that requires downtime for our 24x7 operation. Plus it is one more thing to be aware of and we didn't know about it. So my question is: is there any documentation that describes this or is it one more "little known feature" ?
edit: tried to improve alignment in the table

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 1

There's actually nothing wrong here.... (though it certainly appears there is at first, second and 3rd glance)

The index_column_id doesn't say that it gives the columns in the order that they appear in the index, it just says 'ID of the index column. index_column_id is unique only within index_id.' One could assume that's the key order, but apparently not in all cases. This could catch people badly. Worth a blog post I think, sometime.

It's the key_ordinal column that shows what order the columns are within the index and that shows that you have ColB, ColA in both cases, which is exactly what you expect and what.

From BoL: key_ordinal: Ordinal (1-based) within set of key-columns.

If we dive down into the internals and have a look at the raw index page for table test (the one with the cluster that 'appears' to have the columns in the wrong order) you can see that they are in the order you intend in the index.

Interesting find.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 2

p.s. I checked with queries too. A filter on ColB does an index seek and a filter solely on ColA does a scan, so ColB is the leading column of the index and ColA the second, as defined in the index definition.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

R.P.Rozema SSChampion Points: 12317 More actions · Answer 3

So I was wrong indeed (and I will quickly revoke the QoD based on this :Whistling:)

I do think I'll put up a new one, just to illustrate how easily one can misunderstand this key_ordinal / index_column_id.

Many thanks Gail.

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?