Understanding B-Trees: Structure, Operations, and Implementation
Common Search Structures
| Structure Type | Data Format | Time Complexity |
|---|---|---|
| Sequential Search | Any | O(N) |
| Binary Search | Sorted | O(log N) |
| Binary Search Tree (BST) | Any | O(N) to O(log N) |
| Balanced BST (AVL, Red-Black) | Any | O(log N) |
| Hash Table | Any | O(1) average |
These structures are suitable for in-memory (internal) searches where the dataset fits entirely in RAM.
Limitations for Large Datasets
When data is too large for memory (e.g., 100GB), it resides on disk. Accessing disk is significantly slower than RAM. For a balanced BST with height O(log N), performing O(log N) disk I/O operations becomes a major bottleneck. While hash tables offer O(1) average access, severe collisions can degrade performance.
To accelerate data access:
- Use faster storage (e.g., SSD).
- Reduce tree height to minimize I/O operations → Multi-way balanced trees.
B-Tree Fundamentals
A B-Tree is a self-balancing, M-way search tree designed for efficient disk access. An M-order B-Tree (M>2) has these properties:
- The root has at least two children.
- Each internal node (except root) contains between ceil(M/2)-1 and M-1 keys, and between ceil(M/2) and M children.
- All leaf nodes reside on the same level.
- Keys within a node are sorted. The
kkeys partition the ranges of thek+1subtrees. - Node structure:
[P0, K1, P1, K2, P2, ..., Kn, Pn], whereKiare keys andPiare child pointers. All keys in the subtreePiare less thanKi+1.
Insertion Process (Using M=3 Example)
A 3-order B-Tree node can hold up to 2 keys and 3 children. For implementation ease, nodes are allocated with space for M keys and M+1 children.
Insert sequence: {53, 139, 75, 49, 145, 36, 101}.
Initial inserts (53, 139, 75): Inserting 75 fills the root node (keys: 53, 75, 139). A split is required.
Splitting a Node:
- Locate the median key (e.g., 75).
- Create a new sibling node.
- Move keys greater than the median to the sibling.
- Promote the median key to the parent node (creating a new root if none exists).
- Adjust child pointers.
After splitting, the tree has a root with key 75, left child {53}, right child {139}.
Subsequent inserts (49, 145, 36): Insert 49 and 145 into appropriate leaves. Inserting 36 causes a leaf overflow, requiring a split and promoting 49 to the parent (root). The root now holds {49, 75}.
Insert 101: Inserting 101 causes a leaf overflow. Splitting promotes 101 to the root, causing the root itself to overflow. The root splits, creating a new root with the median key 75. The tree remains perfectly balanced.
Insertion Summary:
- If tree is empty, create a new root.
- Traverse to find the appropriate leaf node for insertion.
- Insert the key into the leaf in sorted order.
- If the leaf now contains M keys, split it: a. Create a new sibling node. b. Move the upper half of keys (and children) to the sibling. c. Promote the median key to the parent. d. Recursively check if the parent now needs splitting.
- If splitting propagates to the root, create a new root.
Code Implementation
Node Structure
template<typename KeyType, int Order>
struct BTreeNode {
BTreeNode() : keyCount(0), parent(nullptr) {
for(int i = 0; i < Order; ++i) {
keys[i] = KeyType();
children[i] = nullptr;
}
children[Order] = nullptr;
}
KeyType keys[Order]; // Holds up to Order-1 keys, extra space for splitting
BTreeNode* children[Order+1]; // Pointers to children
BTreeNode* parent;
int keyCount; // Number of keys currently stored
};
Search Operation
Returns a pair containing the node and the index where the key is found (or the leaf node where insertion should occur).
template<typename KeyType, int Order>
pair<BTreeNode<KeyType, Order>*, int>
BTree<KeyType, Order>::locate(const KeyType& target) const {
BTreeNode<KeyType, Order>* curr = root;
BTreeNode<KeyType, Order>* prev = nullptr;
while(curr) {
int pos = 0;
// Linear search within the node (can be optimized to binary search)
while(pos < curr->keyCount) {
if(target == curr->keys[pos])
return {curr, pos};
if(target < curr->keys[pos])
break;
++pos;
}
prev = curr;
curr = curr->children[pos]; // Move to appropriate child
}
return {prev, -1}; // Key not found, prev is the potential leaf for insertion
}
Key Insertoin into a Node
Inserts a key (and its associated right child) in to a node, maintaining sorted order.
template<typename KeyType, int Order>
void BTree<KeyType, Order>::insertIntoNode(BTreeNode<KeyType, Order>* node,
const KeyType& newKey,
BTreeNode<KeyType, Order>* rightChild) {
int idx = node->keyCount - 1;
// Shift keys and children to the right to make space
while(idx >= 0 && newKey < node->keys[idx]) {
node->keys[idx + 1] = node->keys[idx];
node->children[idx + 2] = node->children[idx + 1];
--idx;
}
node->keys[idx + 1] = newKey;
node->children[idx + 2] = rightChild;
if(rightChild) rightChild->parent = node;
++node->keyCount;
}
Main Insertion Algorithm
template<typename KeyType, int Order>
bool BTree<KeyType, Order>::insert(const KeyType& key) {
if(!root) {
root = new BTreeNode<KeyType, Order>();
root->keys[0] = key;
root->keyCount = 1;
return true;
}
auto [node, foundIdx] = locate(key);
if(foundIdx != -1) return false; // Key already exists
BTreeNode<KeyType, Order>* curr = node;
KeyType keyToInsert = key;
BTreeNode<KeyType, Order>* newChild = nullptr;
while(true) {
insertIntoNode(curr, keyToInsert, newChild);
if(curr->keyCount < Order) return true; // No overflow
// Node overflow: split required
int midIdx = Order / 2;
KeyType midKey = curr->keys[midIdx];
// Create right sibling
BTreeNode<KeyType, Order>* sibling = new BTreeNode<KeyType, Order>();
int srcIdx = midIdx + 1, destIdx = 0;
for(; srcIdx < curr->keyCount; ++srcIdx, ++destIdx) {
sibling->keys[destIdx] = curr->keys[srcIdx];
sibling->children[destIdx] = curr->children[srcIdx];
if(curr->children[srcIdx])
curr->children[srcIdx]->parent = sibling;
// Clear original positions (optional, for clarity)
curr->keys[srcIdx] = KeyType();
curr->children[srcIdx] = nullptr;
}
// Copy the last child pointer
sibling->children[destIdx] = curr->children[srcIdx];
if(curr->children[srcIdx])
curr->children[srcIdx]->parent = sibling;
curr->children[srcIdx] = nullptr;
sibling->keyCount = destIdx;
curr->keyCount = midIdx; // Keys before the median remain
curr->keys[midIdx] = KeyType(); // Clear promoted key
if(curr == root) {
// Create new root
root = new BTreeNode<KeyType, Order>();
root->keys[0] = midKey;
root->children[0] = curr;
root->children[1] = sibling;
root->keyCount = 1;
curr->parent = root;
sibling->parent = root;
break;
} else {
// Propagate split upward
keyToInsert = midKey;
newChild = sibling;
curr = curr->parent;
}
}
return true;
}
Validation via Inorder Traversal
An inorder traversal of a B-Tree should yield keys in ascending order.
template<typename KeyType, int Order>
void BTree<KeyType, Order>::inorderPrint(BTreeNode<KeyType, Order>* node) const {
if(!node) return;
for(int i = 0; i < node->keyCount; ++i) {
inorderPrint(node->children[i]);
cout << node->keys[i] << " ";
}
inorderPrint(node->children[node->keyCount]);
}
B-Tree Height Analysis
For an M-order B-Tree containing N keys:
- Minimum Height: Achieved when nodes are fullest. Each node has M-1 keys. Height
h_minsatisfies:N ≤ (M-1)(1 + M + M^2 + ... + M^(h_min-1)). Solving givesh_min ≥ log_M(N+1). - Maximum Height: Achieved when nodes are sparsest (minimum keys). Root has at least 1 key, 2 children. Other nodes have at least ceil(M/2)-1 keys, ceil(M/2) childran. Height
h_maxsatisfies:N ≥ 1 + 2*(ceil(M/2)-1) * (ceil(M/2)^0 + ceil(M/2)^1 + ... + ceil(M/2)^(h_max-2)). Approximating,h_max ≈ log_{ceil(M/2)} ((N+1)/2).
Performance
B-Trees minimize disk I/O. For N=62 billion keys and M=1024, the maximum height is less than 4. Locating a key requires at most 4 disk reads, after which a binary search within the node finds the target.
B+ Trees and B* Trees
B+ Tree
A variant where:
- All key-value data resides in leaf nodes, which are linked in a sorted linked list.
- Internal nodes act as indexes, containing copies of keys to guide search.
- The number of child pointers in an internal node equals its number of keys.
Advantages over B-Tree:
- Efficient range queries and full sequential scans via the leaf linked list.
- Internal nodes are denser (store only keys), allowing more keys per disk block and reducing tree height.
- Consistent access time as all searches terminate at leaves.
B* Tree
An enhancement of B+ Tree where non-root, non-leaf nodes contain pointers to their siblings. Splitting strategy differs:
- B+ Tree Split: A full node splits 50/50, creating a new node.
- B Tree Split:* Attempts to redistribute keys to a non-full sibling first. Only if both are full does it perform a 2/3 split, creating a new node. This leads to higher space utilization.
Summary
- B-Tree: Balanced multi-way tree with data in all nodes.
- B+ Tree: Index keys in internal nodes, actual data in linked leaves. Preferred for database systems (e.g., MySQL InnoDB).
- B Tree:* Higher space utilization via sibling pointers and redistribution.
While B-Tree variants have higher space overhead and more complex insertion/deletion than in-memory structures like AVL trees or hash tables, their low height makes them superior for disk-backed storage systems where I/O cost dominates.