February 27, 2016 at 8:25 pm
Hello,
Today morning while having a discussion with my friend, I came to know about a common confusion.
I want to highlight the confusion and would request your expertise:
1. BTree: Is this Binary Tree or Balanced Tree
2. B+Tree: After reading about BTree, B+tree and some blogs, it looks like SQL indexes are arranged as B+trees, however, MSDN tells it to be BTree
So my confusion is whether BTree is Binary or Balanced tree?
Are the indexes stored as BTree or B+Tree, if yes is there any full form of B+Tree?
February 28, 2016 at 2:34 am
SQL Server rowstore indexes are stored as balanced trees (*). SQL Server does not use binary trees. I am not aware of any official ruling on what the shorthand "B-tree" means in a generic situation, but in the context of SQL Server, I'll go with balanced.
(*) They are actually "modified balanced trees". At least that's what I often read. So far, I have not been able to find any explanation on what exactly the alleged modification is.
February 28, 2016 at 10:17 am
Yes, we may call BTree as Balanced, though it is not official.
However the index structure is not a BTree, it's actually B+Tree.
When we see the difference between BTree and B+Tree, BTree stores data pointer at all the levels, however, B+Tree stores data pointers only at leaf level, which is true for SQL indexes where data pointers are stored in leaf level pages.
February 29, 2016 at 10:05 am
According to Knuth's definition, a B-tree of order m is a tree which satisfies the following properties:
1. Every node has at most m children.
2. Every non-leaf node (except root) has at least ?m/2? children.
3. The root has at least two children if it is not a leaf node.
4. A non-leaf node with k children contains k-1 keys.
5. All leaves appear in the same level
The implementation in SQL-server is different from these rules. Most changes are practical and can be called an improvement. A B-plus tree is a B-tree with 'alterations' in the rules.
Because of rule 2 and rule 5 a B-tree will always be fairly balanced. (To keep the tree fairly balanced the leaf nodes should also be at least (about) half full). *)
Although the B-tree is fairly balanced it is not a balanced tree.
Ben
*)
If all nodes (except the root) are always filled to at least (near) 50 percent. And all the leaves are at the same level, the tree will always be 'fairly' compact, and in general have at most one extra level compared to a perfectly build tree. But because 50 percent fill is allowed it can be very efficiently during data modifications.
February 29, 2016 at 11:27 am
Wikipedia has a good item on "B-Tree", which includes a quote from one of the original inventors - Ed McCreight (with Rudolf Bayer).
The "B" in B-tree does not have a specific meaning. They liked it because "B" had several possible interpretations - all of which they liked: "Boeing" (where they worked), "Bayer", and "Balanced".
"B-Trees" are generally not "binary trees", but are extensions of them. A "binary tree" is technically a tree with at most two child nodes under each parent node AND each node contains either the data record or one pointer to the single data record matching the node's key value. A B-Tree typically contains many more than two child nodes and pointers/records under a parent node, reducing the number of levels needed to be traversed to find the data record.
The technical distinctions between a "B-Tree" and a "B+Tree" is whether the nodes (not leaves) can have just keys and pointers to other lower-level nodes (B+tree) or keys plus pointers to the data records (B-tree). In a B+tree, you would walk the tree to a "leaf", which would contain only pointers to data records. In a B-tree, you would walk the tree to a node containing the desired key, which could contain pointers to other nodes and pointers to the data records.
There are several variants for improving performance: nodes/leaves which include pointers to the "successor" (left/right) nodes/leaves; nodes containing redundant keys to reduce split frequencies, etc.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply