B Tree Indexes

  • Hi,

    I would highly appreciate if you shed some light on the following.

    When I was reading the database book by Elmasri, I learnt that B+ Tree indexes at their leaf level have a set of key values along with data pointers pointing to the records themselves. IN other words, the actual data was separate from the B+ tree index, as they said. Now, in SQL Server, however, the leaf level of a clustered index is the actual data in the underlying table itself. If you will, having all the data at the leaf level of the index would mean that at the level above the leaf level, there would be unimaginably more keys and pointers and therefore many more nodes, since having the entire data on the leaf level would mean that fewer rows will fit into one data page. I mean, it's obvious that if we store entire rows instead of (key, pointer to record) combinations at the leaf level, we will have to have less in each leaf node and therefore there would be far more pointers at the level above leaf level because of the far more nodes at the leaf level. For these many more pointers and therefore nodes, we would need more nodes at the level above and so on. My question is, what is mircosoft's rationale in having such a huge b tree index as the clustered index. why not a more lean index with just the key, pointer combinations at the leaf level.

    Thanks

    Karim

  • Well, since non-clustered indexes are stored that way there would be no idea in having clustered indexes being stored the same way, right? If you do not want to use a non-clustered index then don't, however that comes with specific problems you might run into if you do not use them (see my articles about clustered indexes for more, link in my signature). With a clustered index you get the data pages (and rows) sorted in logical order which you can benefit from in different ways, depending on which column(s) you choose as your index key(s).

  • "When I was reading the database book by Elmasri, I learnt that B+ Tree indexes at their leaf level have a set of key values along with data pointers pointing to the records themselves."

    Does this book also describe dense B+ Tree indexes ? Instead of dense, SQL Server uses the name "clustered" and Oracle uses "index organized".

    Regarding the physical implementation of a clustered index, this is well described in Books Online. On the BOL index tab, enter "clustered indexes" and then "architecture" and finally pick "Table and Index Architecture".

    SQL = Scarcely Qualifies as a Language

  • A clustered index requires more space than a non-clustered index but at most is just one more layer of depth in the B-tree.  This is offset by the fact that once the index lookup is completed the data has already been read.  This is not much of an advantage for a single-row lookup but makes a huge difference for a range lookup.

    The clustered index may be larger than a non-clustered index, but it is not larger than the combination of the non-clustered index and the data pages.

    If a clustered index exists on a table all other indexes for that table have a copy of the clustered index instead of a rowid in the leaf level, so a wide clustered index can be a bad idea.  But it is up to the designer to choose the index structure based on size and performance constraints.

    Tables with clustered indexes can be defragmented by DBCC DBREINDEX and DBCC INDEXDEFRAG.  Heaps (tables without a clustered index) cannot be defragmented except by deleting and re-loading all rows (or by creating and dropping a temporary clustered index on the table).  Another performance issue is when a row in a heap table is updated with larger values for variable-length fields, and there is not enough room in the original page to store it, the row is saved in another page and a forwarding pointer is left at the original location so the rowid values in all indexes don't have to be updated.  A heap with many forwarding pointers requires a lot more I/O to scan.

    An update to a clustered index value requires the entire row to be moved to the correct location in the index, and if there is no room there a page split will occur to make room.  The same problem occurs with random inserts into the table.  These problems can be avoided by choosing clustered index fields that will not be updated and will have increasing values for all inserts, such as identity fields or date fields.

    The disadvantages of clustered indexes can be addressed by proper design and by regular maintenance plans.  The advantages of clustered indexes are query performance and control of table fragmentation.

  • Thank you, Scott and Chris. Scott: You mentioned that the size of a clustered index is not larger than the combined size of a corresponding non-clustered index and the data pages. But I am still confused about that. Since the rows at the leaf level will be fewer per data page, there will be more data pages than if there were only (key, record pointer) combinations at the leaf level. Correspondingly, there will be many more pointers at the index level above, and therefore more nodes or larger nodes at that level. Won't this make the size of the B-tree larger than a non-clustered index? Please clarify.

    Karim

  • It is true that there will be more index entries in a clustered index due to the larger number of leaf-level pages, but this adds at most one level to the depth of the b-tree.  This may slightly increase the number of pages required to store the index.  But you would need to traverse that extra level anyway to do the bookmark lookup required to find the data page for each index entry, so the pages read for a one-row lookup (by clustered index) is probably the same.

    In use, tables can become fragmented.  Tables with clustered indexes can be defragmented with regular maintenance plans.  Heaps can accumulate performance-robbing forwarding pointers, and cannot be defragmented easily.  In a high-availability production database you can switch from DBCC DBREINDEX to DBCC INDEXDEFRAG to address fragmentation without completely locking the table, but with a heap there is no good answer.  This could easily (with some usage patterns) make the heap table + nonclustered index grow larger than the table + clustered index.

    I will admit that your point about clustered indexes being larger may be true, in a very static analysis of a table that has never seen a delete or an update.  Most people would consider the performance benefits and maintenance advantages of clustered indexes to be much more important points.

  • Thanks very much Scott. I see your point. In fact, I found a flaw in my own reasoning. Since data pages have to be there anyway, it's basically a question of storing (key, pointer to record) combinations at the second-last level of a clustered index. Since rows in a clustered table are ordered by the clustered index key, we don't need to do that and just need a pointer per data page, not a pointer per record. In this way, having a clustered index would save on the number of index entries required to store the index and thus prove to be a storage-efficient way of indexing. Then the clustered index would, in fact, be smaller than a non-clustered index. What do you think?

  • I thought your original point was that a clustered index would require more pages to store the non-leaf part of the index due to the higher leaf-level page count, which is true.  You are also correct when you point out that the clustered index uses the table itself for the leaf level, eliminating the index leaf pages and resulting in a smaller index.

    But I think you are focusing on relatively minor issues.  When I think of size issues with respect to clustered indexes, the main issue that comes to mind is that the clustered index fields are used as row pointers in the leaf level of all non-clustered indexes.  A large clustered index repeated many times will greatly increase the total space required for all indexes, but this isn't something a good designer would do unless it made sense for the query workload and they had plenty of disk space.

    A clustered index that is well-designed for the queries used on the table is more likely to show up in execution plans as a highly efficient merge join.  A non-clustered index can be used very efficiently if it contains all the fields needed for a query (a covering index), but if other fields from the table are required you will see bookmark lookups or even table scans in the execution plan.  Seeing the I/O page count for a query drop by factors of 10 or 100 is more important in most cases than seeing a slight difference in total page count for index storage.

  • There are some great webcasts available (both live and on-demand) from Microsoft at this site: http://www.microsoft.com/events/series/msdnsqlserver2005.mspx#On-Demand%20Webcasts

    Indexes and fragmentation are discussed in parts 4 and 5 of a 10-part "A primer to Proper SQL Server Development"

  • Thanks a lot, Scott. The webcasts you mentioned are quite interesting. I really appreciate your help.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply