Fading Coder

One Final Commit for the Last Sprint

Home > Tech > Content

Scapegoat Tree: A Self-Balancing Binary Search Tree Implementation

Tech Jun 9 2

A Scapegoat Tree is a type of self-balancing binary search tree that guarantees O(log n) time complexity for operations. When the tree contains at most m nodes simultaneously, its space complexity can also achieve O(m) due to its special non-pointer memory recycling mechanism.

The Scapegoat Tree employs a relatively straightforward approach, making its implementation simpler than other balanced trees like Treap or Splay. For scenarios where you need to implement balanced tree functionality quickly without extremely strict time requirements, the Scapegoat Tree can be an excellent choice.

Algorithmic Approach

The core idea behind Scapegoat Trees is rebuilding. When a subtree becomes unbalanced, we transform it into a linear sequence and then reconstruct it as a perfectly balanced binary search tree. This approach ensures that most subtrees remain balanced through occasional adjustments, maintaining overall tree balance.

In the worst-case scenario, every operation might trigger a subtree rebuild, resulting in O(n) time complexity. To prevent this, we need a mechanism to detect when a subtree has become unbalanced. This is where the balance factor comes into play. Typically set to 0.7 (within the range [0.5, 1]), the balance factor helps us determine when rebuilding is necessary.

For a given node x with left and right subtree sizes n and m respectively, and total subtree size k, we rebuild the subtree if max(n, m)/k > α, where α is the balance factor.

Detailed Operations

Node Existence Check

Before performing operations on a node, we need to verify if it's still active in the tree structure. A node is considered unavailable if it has no left or right children and its count is zero, indicating it has been logically deleted but not yet removed during rebuilding.

bool isNodeActive(int nodeIndex) {
    return !(tree[nodeIndex].leftChild == 0 && 
             tree[nodeIndex].rightChild == 0 && 
             tree[nodeIndex].elementCount == 0);
}

Building Linear Sequences

To rebuild a subtree, we first convert it into a linear seuqence. This process recursively traverses the subtree, collecting nodes in order. The root of each subtree becomes the midpoint of its corresponding segment in the sequence.

During this process, nodes with zero counts are automatically filtered out, ensuring that only valid nodes are included in the sequence.

int flattenSubtree(int nodeIndex) {
    if (isNodeActive(tree[nodeIndex].leftChild)) {
        flattenSubtree(tree[nodeIndex].leftChild);
    }
    
    int sequencePosition = sequence.size();
    if (tree[nodeIndex].elementCount > 0) {
        sequence.push_back(nodeIndex);
        elementCounts.push_back(tree[nodeIndex].elementCount);
        values.push_back(tree[nodeIndex].nodeValue);
    }
    
    if (isNodeActive(tree[nodeIndex].rightChild)) {
        flattenSubtree(tree[nodeIndex].rightChild);
    }
    
    return sequencePosition;
}

Rebuilding Subtrees

Once we have a linear representation of the subtree, we can reconstruct it as a perfectly balanced binary search tree. The middle element of the sequence becomes the root, with left and right subtrees built recursively from the remaining elements.

void rebuildSubtree(int nodeIndex, int leftBound, int rightBound) {
    int mid = (leftBound + rightBound) / 2;
    int leftSize = 0, rightSize = 0;
    
    if (leftBound < mid) {
        tree[nodeIndex].leftChild = sequence[(leftBound + mid - 1) / 2];
        rebuildSubtree(tree[nodeIndex].leftChild, leftBound, mid - 1);
        leftSize = tree[tree[nodeIndex].leftChild].subtreeSize;
    } else {
        tree[nodeIndex].leftChild = 0;
    }
    
    if (rightBound > mid) {
        tree[nodeIndex].rightChild = sequence[(mid + 1 + rightBound) / 2];
        rebuildSubtree(tree[nodeIndex].rightChild, mid + 1, rightBound);
        rightSize = tree[tree[nodeIndex].rightChild].subtreeSize;
    } else {
        tree[nodeIndex].rightChild = 0;
    }
    
    tree[nodeIndex].elementCount = elementCounts[mid];
    tree[nodeIndex].nodeValue = values[mid];
    tree[nodeIndex].subtreeSize = leftSize + rightSize + tree[nodeIndex].elementCount;
}

Maintaining Balance

This function checks if a subtree has become unbalanced and initiates rebuilding if necessary. First, it calculates the balance ratio between the larger child subtree and the current subtree. If this ratio exceeds the balance factor α, the subtree is rebuilt.

A critical detail to note is that when rebuilding, the root node must be positioned at the midpiont of the sequence. If it's not already at the midpoint, we need to swap it with the element at the midpoint position before rebuilding.

void enforceBalance(int nodeIndex) {
    double balanceRatio = max(tree[tree[nodeIndex].leftChild].subtreeSize, 
                            tree[tree[nodeIndex].rightChild].subtreeSize) * 1.0 / 
                         tree[nodeIndex].subtreeSize;
    
    if (balanceRatio > BALANCE_FACTOR) {
        sequence.clear();
        elementCounts.clear();
        values.clear();
        
        int rootPosition = flattenSubtree(nodeIndex);
        swap(sequence[rootPosition], sequence[(sequence.size() - 1) / 2]);
        rebuildSubtree(nodeIndex, 0, sequence.size() - 1);
    }
}

Core Operations

The Scapegoat Tree implements standard binary search tree operations with additional balance maintenance:

  • Insertion: Adds a new node while maintaining balance through the enforceBalance function
  • Deletion: Removes a node and potentially triggers rebuilding
  • Search: Finds a node value in O(log n) time
  • Rank operations: Determine the position of a value in the sorted sequence

Implementation Example

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

const int MAX_NODES = 100005;
const double BALANCE_FACTOR = 0.7;

struct TreeNode {
    int leftChild, rightChild;
    int elementCount, nodeValue, subtreeSize;
} tree[MAX_NODES];

int nodeCount = 1;
vector<int> sequence;
vector<int> elementCounts;
vector<int> values;

bool isNodeActive(int nodeIndex) {
    return !(tree[nodeIndex].leftChild == 0 && 
             tree[nodeIndex].rightChild == 0 && 
             tree[nodeIndex].elementCount == 0);
}

int flattenSubtree(int nodeIndex) {
    if (isNodeActive(tree[nodeIndex].leftChild)) {
        flattenSubtree(tree[nodeIndex].leftChild);
    }
    
    int sequencePosition = sequence.size();
    if (tree[nodeIndex].elementCount > 0) {
        sequence.push_back(nodeIndex);
        elementCounts.push_back(tree[nodeIndex].elementCount);
        values.push_back(tree[nodeIndex].nodeValue);
    }
    
    if (isNodeActive(tree[nodeIndex].rightChild)) {
        flattenSubtree(tree[nodeIndex].rightChild);
    }
    
    return sequencePosition;
}

void rebuildSubtree(int nodeIndex, int leftBound, int rightBound) {
    int mid = (leftBound + rightBound) / 2;
    int leftSize = 0, rightSize = 0;
    
    if (leftBound < mid) {
        tree[nodeIndex].leftChild = sequence[(leftBound + mid - 1) / 2];
        rebuildSubtree(tree[nodeIndex].leftChild, leftBound, mid - 1);
        leftSize = tree[tree[nodeIndex].leftChild].subtreeSize;
    } else {
        tree[nodeIndex].leftChild = 0;
    }
    
    if (rightBound > mid) {
        tree[nodeIndex].rightChild = sequence[(mid + 1 + rightBound) / 2];
        rebuildSubtree(tree[nodeIndex].rightChild, mid + 1, rightBound);
        rightSize = tree[tree[nodeIndex].rightChild].subtreeSize;
    } else {
        tree[nodeIndex].rightChild = 0;
    }
    
    tree[nodeIndex].elementCount = elementCounts[mid];
    tree[nodeIndex].nodeValue = values[mid];
    tree[nodeIndex].subtreeSize = leftSize + rightSize + tree[nodeIndex].elementCount;
}

void enforceBalance(int nodeIndex) {
    double balanceRatio = max(tree[tree[nodeIndex].leftChild].subtreeSize, 
                            tree[tree[nodeIndex].rightChild].subtreeSize) * 1.0 / 
                         tree[nodeIndex].subtreeSize;
    
    if (balanceRatio > BALANCE_FACTOR) {
        sequence.clear();
        elementCounts.clear();
        values.clear();
        
        int rootPosition = flattenSubtree(nodeIndex);
        swap(sequence[rootPosition], sequence[(sequence.size() - 1) / 2]);
        rebuildSubtree(nodeIndex, 0, sequence.size() - 1);
    }
}

void insertValue(int nodeIndex, int value) {
    if (!isNodeActive(nodeIndex)) {
        tree[nodeIndex].elementCount = 1;
        tree[nodeIndex].nodeValue = value;
    } else if (value < tree[nodeIndex].nodeValue) {
        if (!isNodeActive(tree[nodeIndex].leftChild)) {
            tree[nodeIndex].leftChild = ++nodeCount;
        }
        insertValue(tree[nodeIndex].leftChild, value);
    } else if (value > tree[nodeIndex].nodeValue) {
        if (!isNodeActive(tree[nodeIndex].rightChild)) {
            tree[nodeIndex].rightChild = ++nodeCount;
        }
        insertValue(tree[nodeIndex].rightChild, value);
    } else {
        tree[nodeIndex].elementCount++;
    }
    
    tree[nodeIndex].subtreeSize++;
    enforceBalance(nodeIndex);
}

void deleteValue(int nodeIndex, int value) {
    tree[nodeIndex].subtreeSize--;
    if (value < tree[nodeIndex].nodeValue) {
        deleteValue(tree[nodeIndex].leftChild, value);
    } else if (value > tree[nodeIndex].nodeValue) {
        deleteValue(tree[nodeIndex].rightChild, value);
    } else {
        tree[nodeIndex].elementCount--;
    }
    enforceBalance(nodeIndex);
}

int findKthElement(int nodeIndex, int rank) {
    if (rank <= tree[tree[nodeIndex].leftChild].subtreeSize) {
        return findKthElement(tree[nodeIndex].leftChild, rank);
    } else if (rank > tree[tree[nodeIndex].leftChild].subtreeSize + tree[nodeIndex].elementCount) {
        return findKthElement(tree[nodeIndex].rightChild, 
                            rank - tree[tree[nodeIndex].leftChild].subtreeSize - tree[nodeIndex].elementCount);
    } else {
        return tree[nodeIndex].nodeValue;
    }
}

int main() {
    int operationCount, value;
    cin >> operationCount;
    
    for (int i = 1; i <= operationCount; i++) {
        cin >> operationCount >> value;
        if (operationCount == 1) {
            insertValue(1, value);
        } else if (operationCount == 2) {
            deleteValue(1, value);
        } else if (operationCount == 3) {
            cout << findKthElement(1, value) << endl;
        }
    }
    
    return 0;
}</int></int></int></algorithm></vector></iostream>

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.