Scapegoat Tree: A Self-Balancing Binary Search Tree Implementation
A Scapegoat Tree is a type of self-balancing binary search tree that guarantees O(log n) time complexity for operations. When the tree contains at most m nodes simultaneously, its space complexity can also achieve O(m) due to its special non-pointer memory recycling mechanism.
The Scapegoat Tree employs a relatively straightforward approach, making its implementation simpler than other balanced trees like Treap or Splay. For scenarios where you need to implement balanced tree functionality quickly without extremely strict time requirements, the Scapegoat Tree can be an excellent choice.
Algorithmic Approach
The core idea behind Scapegoat Trees is rebuilding. When a subtree becomes unbalanced, we transform it into a linear sequence and then reconstruct it as a perfectly balanced binary search tree. This approach ensures that most subtrees remain balanced through occasional adjustments, maintaining overall tree balance.
In the worst-case scenario, every operation might trigger a subtree rebuild, resulting in O(n) time complexity. To prevent this, we need a mechanism to detect when a subtree has become unbalanced. This is where the balance factor comes into play. Typically set to 0.7 (within the range [0.5, 1]), the balance factor helps us determine when rebuilding is necessary.
For a given node x with left and right subtree sizes n and m respectively, and total subtree size k, we rebuild the subtree if max(n, m)/k > α, where α is the balance factor.
Detailed Operations
Node Existence Check
Before performing operations on a node, we need to verify if it's still active in the tree structure. A node is considered unavailable if it has no left or right children and its count is zero, indicating it has been logically deleted but not yet removed during rebuilding.
bool isNodeActive(int nodeIndex) {
return !(tree[nodeIndex].leftChild == 0 &&
tree[nodeIndex].rightChild == 0 &&
tree[nodeIndex].elementCount == 0);
}
Building Linear Sequences
To rebuild a subtree, we first convert it into a linear seuqence. This process recursively traverses the subtree, collecting nodes in order. The root of each subtree becomes the midpoint of its corresponding segment in the sequence.
During this process, nodes with zero counts are automatically filtered out, ensuring that only valid nodes are included in the sequence.
int flattenSubtree(int nodeIndex) {
if (isNodeActive(tree[nodeIndex].leftChild)) {
flattenSubtree(tree[nodeIndex].leftChild);
}
int sequencePosition = sequence.size();
if (tree[nodeIndex].elementCount > 0) {
sequence.push_back(nodeIndex);
elementCounts.push_back(tree[nodeIndex].elementCount);
values.push_back(tree[nodeIndex].nodeValue);
}
if (isNodeActive(tree[nodeIndex].rightChild)) {
flattenSubtree(tree[nodeIndex].rightChild);
}
return sequencePosition;
}
Rebuilding Subtrees
Once we have a linear representation of the subtree, we can reconstruct it as a perfectly balanced binary search tree. The middle element of the sequence becomes the root, with left and right subtrees built recursively from the remaining elements.
void rebuildSubtree(int nodeIndex, int leftBound, int rightBound) {
int mid = (leftBound + rightBound) / 2;
int leftSize = 0, rightSize = 0;
if (leftBound < mid) {
tree[nodeIndex].leftChild = sequence[(leftBound + mid - 1) / 2];
rebuildSubtree(tree[nodeIndex].leftChild, leftBound, mid - 1);
leftSize = tree[tree[nodeIndex].leftChild].subtreeSize;
} else {
tree[nodeIndex].leftChild = 0;
}
if (rightBound > mid) {
tree[nodeIndex].rightChild = sequence[(mid + 1 + rightBound) / 2];
rebuildSubtree(tree[nodeIndex].rightChild, mid + 1, rightBound);
rightSize = tree[tree[nodeIndex].rightChild].subtreeSize;
} else {
tree[nodeIndex].rightChild = 0;
}
tree[nodeIndex].elementCount = elementCounts[mid];
tree[nodeIndex].nodeValue = values[mid];
tree[nodeIndex].subtreeSize = leftSize + rightSize + tree[nodeIndex].elementCount;
}
Maintaining Balance
This function checks if a subtree has become unbalanced and initiates rebuilding if necessary. First, it calculates the balance ratio between the larger child subtree and the current subtree. If this ratio exceeds the balance factor α, the subtree is rebuilt.
A critical detail to note is that when rebuilding, the root node must be positioned at the midpiont of the sequence. If it's not already at the midpoint, we need to swap it with the element at the midpoint position before rebuilding.
void enforceBalance(int nodeIndex) {
double balanceRatio = max(tree[tree[nodeIndex].leftChild].subtreeSize,
tree[tree[nodeIndex].rightChild].subtreeSize) * 1.0 /
tree[nodeIndex].subtreeSize;
if (balanceRatio > BALANCE_FACTOR) {
sequence.clear();
elementCounts.clear();
values.clear();
int rootPosition = flattenSubtree(nodeIndex);
swap(sequence[rootPosition], sequence[(sequence.size() - 1) / 2]);
rebuildSubtree(nodeIndex, 0, sequence.size() - 1);
}
}
Core Operations
The Scapegoat Tree implements standard binary search tree operations with additional balance maintenance:
- Insertion: Adds a new node while maintaining balance through the enforceBalance function
- Deletion: Removes a node and potentially triggers rebuilding
- Search: Finds a node value in O(log n) time
- Rank operations: Determine the position of a value in the sorted sequence
Implementation Example
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
const int MAX_NODES = 100005;
const double BALANCE_FACTOR = 0.7;
struct TreeNode {
int leftChild, rightChild;
int elementCount, nodeValue, subtreeSize;
} tree[MAX_NODES];
int nodeCount = 1;
vector<int> sequence;
vector<int> elementCounts;
vector<int> values;
bool isNodeActive(int nodeIndex) {
return !(tree[nodeIndex].leftChild == 0 &&
tree[nodeIndex].rightChild == 0 &&
tree[nodeIndex].elementCount == 0);
}
int flattenSubtree(int nodeIndex) {
if (isNodeActive(tree[nodeIndex].leftChild)) {
flattenSubtree(tree[nodeIndex].leftChild);
}
int sequencePosition = sequence.size();
if (tree[nodeIndex].elementCount > 0) {
sequence.push_back(nodeIndex);
elementCounts.push_back(tree[nodeIndex].elementCount);
values.push_back(tree[nodeIndex].nodeValue);
}
if (isNodeActive(tree[nodeIndex].rightChild)) {
flattenSubtree(tree[nodeIndex].rightChild);
}
return sequencePosition;
}
void rebuildSubtree(int nodeIndex, int leftBound, int rightBound) {
int mid = (leftBound + rightBound) / 2;
int leftSize = 0, rightSize = 0;
if (leftBound < mid) {
tree[nodeIndex].leftChild = sequence[(leftBound + mid - 1) / 2];
rebuildSubtree(tree[nodeIndex].leftChild, leftBound, mid - 1);
leftSize = tree[tree[nodeIndex].leftChild].subtreeSize;
} else {
tree[nodeIndex].leftChild = 0;
}
if (rightBound > mid) {
tree[nodeIndex].rightChild = sequence[(mid + 1 + rightBound) / 2];
rebuildSubtree(tree[nodeIndex].rightChild, mid + 1, rightBound);
rightSize = tree[tree[nodeIndex].rightChild].subtreeSize;
} else {
tree[nodeIndex].rightChild = 0;
}
tree[nodeIndex].elementCount = elementCounts[mid];
tree[nodeIndex].nodeValue = values[mid];
tree[nodeIndex].subtreeSize = leftSize + rightSize + tree[nodeIndex].elementCount;
}
void enforceBalance(int nodeIndex) {
double balanceRatio = max(tree[tree[nodeIndex].leftChild].subtreeSize,
tree[tree[nodeIndex].rightChild].subtreeSize) * 1.0 /
tree[nodeIndex].subtreeSize;
if (balanceRatio > BALANCE_FACTOR) {
sequence.clear();
elementCounts.clear();
values.clear();
int rootPosition = flattenSubtree(nodeIndex);
swap(sequence[rootPosition], sequence[(sequence.size() - 1) / 2]);
rebuildSubtree(nodeIndex, 0, sequence.size() - 1);
}
}
void insertValue(int nodeIndex, int value) {
if (!isNodeActive(nodeIndex)) {
tree[nodeIndex].elementCount = 1;
tree[nodeIndex].nodeValue = value;
} else if (value < tree[nodeIndex].nodeValue) {
if (!isNodeActive(tree[nodeIndex].leftChild)) {
tree[nodeIndex].leftChild = ++nodeCount;
}
insertValue(tree[nodeIndex].leftChild, value);
} else if (value > tree[nodeIndex].nodeValue) {
if (!isNodeActive(tree[nodeIndex].rightChild)) {
tree[nodeIndex].rightChild = ++nodeCount;
}
insertValue(tree[nodeIndex].rightChild, value);
} else {
tree[nodeIndex].elementCount++;
}
tree[nodeIndex].subtreeSize++;
enforceBalance(nodeIndex);
}
void deleteValue(int nodeIndex, int value) {
tree[nodeIndex].subtreeSize--;
if (value < tree[nodeIndex].nodeValue) {
deleteValue(tree[nodeIndex].leftChild, value);
} else if (value > tree[nodeIndex].nodeValue) {
deleteValue(tree[nodeIndex].rightChild, value);
} else {
tree[nodeIndex].elementCount--;
}
enforceBalance(nodeIndex);
}
int findKthElement(int nodeIndex, int rank) {
if (rank <= tree[tree[nodeIndex].leftChild].subtreeSize) {
return findKthElement(tree[nodeIndex].leftChild, rank);
} else if (rank > tree[tree[nodeIndex].leftChild].subtreeSize + tree[nodeIndex].elementCount) {
return findKthElement(tree[nodeIndex].rightChild,
rank - tree[tree[nodeIndex].leftChild].subtreeSize - tree[nodeIndex].elementCount);
} else {
return tree[nodeIndex].nodeValue;
}
}
int main() {
int operationCount, value;
cin >> operationCount;
for (int i = 1; i <= operationCount; i++) {
cin >> operationCount >> value;
if (operationCount == 1) {
insertValue(1, value);
} else if (operationCount == 2) {
deleteValue(1, value);
} else if (operationCount == 3) {
cout << findKthElement(1, value) << endl;
}
}
return 0;
}</int></int></int></algorithm></vector></iostream>