PRIMARY KEY Vs CLUSTERED INDEX row ordering

  • Hello,

    I have been going through the excellent Stairway series on indexes on this site and while trying out a few things I came across something puzzling.

    If I create a CLUSTERED index on a column or columns, it then appears that selecting * from the associated table shows the returned results set ordered in the logical sequence of the CLUSTERED index. Great! I get it.

    OK, so I then decided to create the same index, but as a PRIMARY KEY. Now, as I understand it, a PK is by default a CLUSTERED index, but it also has a UNIQUE constraint associated with it. OK, so a PK is just a CLUSTERED index that must be unique.

    I then selected all columns from this table expecting to see the results set ordered as per the PRIMARY KEY - just as in the case of a CLUSTERED index - but no, the table appears not to be sequenced using the PRIMARY KEY.

    I did expect to see the results set ordered just as in the case on non PK clustered indexes, so perhaps someone could tell me why this is so?

    I know it's perhaps a rather trivial concern, but when things don't work as expected, I like to have some clue as to why so that I feel I understand the concept properly.

    Regards

    Steve

  • Hi Steve, if you're not using an order by clause, there's no guarantee on the order of the results returned.

  • Yes, I do remember reading that somewhere.

    However, this wasn't aluded to in the article I mentioned. In fact, the article showed an example of how using a CLUSTERED key will sequence the returned rows.

    Also, as I recall, the article also mentioned that adding a CLUSTERED index to a table (One, that I might add has no other indexes) essentially re-orders the table.

    I am left wondering how do I reconcile thiese opposing views?

  • raotor (2/7/2012)


    Yes, I do remember reading that somewhere.

    However, this wasn't aluded to in the article I mentioned. In fact, the article showed an example of how using a CLUSTERED key will sequence the returned rows.

    Also, as I recall, the article also mentioned that adding a CLUSTERED index to a table (One, that I might add has no other indexes) essentially re-orders the table.

    I am left wondering how do I reconcile thiese opposing views?

    Hi Steve, which article in the series is it?

    Yes, creating a clustered index will order the table on disk, in the order of the index columns specified at creation.

    DML operations can mess with this order & create index fragmentation. At which point the table isn't necessarily in order.

    However this is all about the order on disk - there's still no guarantee on what order the rows are returned to a client when there's no order by clause.

    Cheers

    Gaz

  • An index enforcing a primary key constraint will indeed be created clustered by default, but if there is already a clustered index on the table, the new index will be non-clustered. Could that be what is happening in your case?

    John

  • I was reading the Stairway to SQL Server Indexes - Level 4 I believe.

    However, I think you may have explained the apparent discrepancy in what I think I'm seeing.

    If you create a CLUSTERED index on a table and nothing else happens I.e DML activity, then it does appear that the results set is ordered. But, of course, you cannot guarantee this because of the reasons you mentioned.

    I can see now what you mean.

    Thanks.

    Regards

    Steve

  • John Mitchell-245523 (2/7/2012)


    An index enforcing a primary key constraint will indeed be created clustered by default, but if there is already a clustered index on the table, the new index will be non-clustered. Could that be what is happening in your case?

    John

    Hi John,

    Apologies, I should've explained. I removed the CLUSTERED index before I added the equivalent PRIMARY KEY version. So, there were no indexes present when using a PK.

  • Without an order by the results are returned in whatever order the last query operator left them in. If all the query did was a scan of an index (and no parallelism was involved), that will be the order of the index scanned, but that's just a side-effect of the way the query processor works. Add in parallelism, hash join or any other non-order preserving operation and suddenly the order changes.

    The clustered index defines the logical order of the rows in the table. Nothing more. It does not guarantee the physical order, nor does it guarantee the order that data will be returned in.

    No order by = no guarantee of order of the resultset. End of Story.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (2/7/2012)


    Without an order by the results are returned in whatever order the last query operator left them in. If all the query did was a scan of an index (and no parallelism was involved), that will be the order of the index scanned, but that's just a side-effect of the way the query processor works. Add in parallelism, hash join or any other non-order preserving operation and suddenly the order changes.

    The clustered index defines the logical order of the rows in the table. Nothing more. It does not guarantee the physical order, nor does it guarantee the order that data will be returned in.

    No order by = no guarantee of order of the resultset. End of Story.

    Wonderfully put. I like especially the last line 🙂 When you ask for a set of data ...

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply