Search Algorithms and Binary Search Tree Implementation
Search algorithms are fundamental techniques for locating specific elements within data collections. These collections may be stored as arrays, linked lists, or hash tables. Choosing the right search strategy significantly impacts application performance.
Search Algorithm Categories
Sequential Search
Sequential search traverses elements from the beginning, comparing each item until finding the target or reaching the end. This approach handles unsorted data with out preprocessing but requires O(n) time complexity. For small datasets, this straightforward method remains practical.
Binary Search
Binary search achieves O(log n) efficiency on sorted collections by repeatedly halving the search interval. Compare the target with the middle element, then eliminate half the remaining elements based on the comparison result. This logarithmic performance requires pre-sorted data.
Hash-Based Lookup
Hash tables deliver O(1) average-case performance by computing direct memory addresses through a hash function. Collisions occur when multiple keys map to identical indices, potentially degrading performance. The quality of the hash function and collision resolution strategy determine actual lookup speed.
Algorithm Selection Criteria
Dataset characteristics guide algorithm selection. Unordered small collections suit sequential search. Sorted large datasets favor binary search. Hash tables excel with massive datasets when memory permits and keys distribute evenly.
Linear structures like arrays and linked lists impose O(n) search costs. Binary search achieves O(log n) on sorted data but requires costly upfront sorting. Binary Search Trees provide an elegant solution—elements remain ordered after insertion, enabling efficient searching without separate sorting operations.
Binary Search Tree
Historical Context
The Binary Search Tree concept emerged in the 18th century through Bernoulli's work, with D.L. Gries advancing practical applications in his 1960s programming treatise. Modern BSTs form the foundation for search engines, database indices, and sorted data structures.
Defining Properties
A BST maintains these invariants: eacch node stores a comparable key, duplicate keys are prohibited, and every node's key exceeds all keys in its left subtree while remaining less than all keys in its right subtree. This hierarchical ordering enables systematic traversal.
The tree above demonstrates efficient key location. Finding key 7 requires three comparisons: compare with 4 (larger, move right), then 6 (larger, move right), then 7 (match).
Operation complexity correlates with tree height. Balanced trees maintain O(log n) performance across search, insertion, and deletion. Unbalanced configurations degrade to O(n) worst-case when the structure resembles a linear chain.
Node Structure
private static class TreeNode {
int key;
Object value;
TreeNode left;
TreeNode right;
public TreeNode(int key) {
this.key = key;
this.value = key;
}
public TreeNode(int key, Object value) {
this.key = key;
this.value = value;
}
public TreeNode(int key, Object value, TreeNode left, TreeNode right) {
this.key = key;
this.value = value;
this.left = left;
this.right = right;
}
}
private TreeNode root;
Search Operations
Recursive approach
public Object find(int key) {
return search(root, key);
}
private Object search(TreeNode current, int key) {
if (current == null) {
return null;
}
if (key < current.key) {
return search(current.left, key);
} else if (current.key < key) {
return search(current.right, key);
}
return current.value;
}
Iterative approach
public Object find(int key) {
TreeNode current = root;
while (current != null) {
if (key < current.key) {
current = current.left;
} else if (current.key < key) {
current = current.right;
} else {
return current.value;
}
}
return null;
}
Generic Key Types
Supporting arbitrary comparable types requires implementing the Comparable interface:
public class BinarySearchTree<T extends Comparable<T>> {
private static class Node<T> {
T key;
Object value;
Node<T> left;
Node<T> right;
public Node(T key) {
this.key = key;
this.value = key;
}
public Node(T key, Object value) {
this.key = key;
this.value = value;
}
public Node(T key, Object value, Node<T> left, Node<T> right) {
this.key = key;
this.value = value;
this.left = left;
this.right = right;
}
}
private Node<T> root;
public Object find(T key) {
return search(root, key);
}
private Object search(Node<T> current, T key) {
if (current == null) {
return null;
}
int result = current.key.compareTo(key);
if (result > 0) {
return search(current.left, key);
} else if (result < 0) {
return search(current.right, key);
}
return current.value;
}
}
Alternatively, supply a Comparator during construction for external comparison logic, similar to java.util.TreeMap.
Extremum Location
Minimum - recursive
public Object minimum() {
return findMin(root);
}
private Object findMin(TreeNode node) {
if (node == null) {
return null;
}
if (node.left == null) {
return node.value;
}
return findMin(node.left);
}
Minimum - iterative
public Object minimum() {
if (root == null) {
return null;
}
TreeNode current = root;
while (current.left != null) {
current = current.left;
}
return current.value;
}
Maximum - recursive
public Object maximum() {
return findMax(root);
}
private Object findMax(TreeNode node) {
if (node == null) {
return null;
}
if (node.right == null) {
return node.value;
}
return findMax(node.right);
}
Maximum - iterative
public Object maximum() {
if (root == null) {
return null;
}
TreeNode current = root;
while (current.right != null) {
current = current.right;
}
return current.value;
}
Insertion
Recursive insertion
public void insert(int key, Object value) {
root = insertNode(root, key, value);
}
private TreeNode insertNode(TreeNode node, int key, Object value) {
if (node == null) {
return new TreeNode(key, value);
}
if (key < node.key) {
node.left = insertNode(node.left, key, value);
} else if (node.key < key) {
node.right = insertNode(node.right, key, value);
} else {
node.value = value;
}
return node;
}
The recursive version either updates an existing key's value or creates a new leaf node.
Iterative insertion
public void insert(int key, Object value) {
TreeNode current = root;
TreeNode parent = null;
while (current != null) {
parent = current;
if (key < current.key) {
current = current.left;
} else if (current.key < key) {
current = current.right;
} else {
current.value = value;
return;
}
}
if (parent == null) {
root = new TreeNode(key, value);
} else if (key < parent.key) {
parent.left = new TreeNode(key, value);
} else {
parent.right = new TreeNode(key, value);
}
}
Predecessor and Successor
The predecessor of a node represents the largest key smaller than the node's key. The successor represents the smallest key larger than the node's key.
In the example tree: 1 has no predecessor with successor 2; 2 has predecessor 1 with successor 3; 3 has predecessor 2 with successor 4.
In-order traversal naturally yields sorted elements, simplifying predecessor and successor identification.
Two patterns emerge when locating predecessors:
- When a node possesses a left subtree, its predecessor is the maximum element within that left subtree (2→1, 4→3, 6→5, 7→6)
- When no left subtree exists, the predecessor is the nearest ancestor reached via a left child path (3→2, 5→4, 8→7). Node 1 has no such ancestor.
Successor patterns mirror this:
- A right subtree indicates the minimum of that subtree serves as the successor (2→3, 3→4, 5→6, 7→8)
- Without a right subtree, the nearest ancestor accessed via a right child path becomes the successor (1→2, 4→5, 6→7). Node 8 has no such ancestor.
public Object predecessor(int key) {
TreeNode leftAncestor = null;
TreeNode current = root;
while (current != null) {
if (key < current.key) {
current = current.left;
} else if (current.key < key) {
leftAncestor = current;
current = current.right;
} else {
break;
}
}
if (current == null) {
return null;
}
if (current.left != null) {
return maximum(current.left);
}
return leftAncestor != null ? leftAncestor.value : null;
}
public Object successor(int key) {
TreeNode rightAncestor = null;
TreeNode current = root;
while (current != null) {
if (key < current.key) {
rightAncestor = current;
current = current.left;
} else if (current.key < key) {
current = current.right;
} else {
break;
}
}
if (current == null) {
return null;
}
if (current.right != null) {
return minimum(current.right);
}
return rightAncestor != null ? rightAncestor.value : null;
}
Removal
Deletion requires locating both the target node and its parent. Four scenarios apply:
- No left child: transplant right child to parent
- No right child: transplant left child to parent
- No children: handled by scenarios 1 or 2 with null
- Two children: replace with in-order successor, handle successor's descendants
Iterative removal
public Object remove(int key) {
TreeNode current = root;
TreeNode parent = null;
while (current != null) {
if (key < current.key) {
parent = current;
current = current.left;
} else if (current.key < key) {
parent = current;
current = current.right;
} else {
break;
}
}
if (current == null) {
return null;
}
if (current.left == null) {
transplant(parent, current, current.right);
} else if (current.right == null) {
transplant(parent, current, current.left);
} else {
TreeNode successor = current.right;
TreeNode successorParent = current;
while (successor.left != null) {
successorParent = successor;
successor = successor.left;
}
if (successorParent != current) {
transplant(successorParent, successor, successor.right);
successor.right = current.right;
}
transplant(parent, current, successor);
successor.left = current.left;
}
return current.value;
}
private void transplant(TreeNode parent, TreeNode removed, TreeNode replacement) {
if (parent == null) {
root = replacement;
} else if (removed == parent.left) {
parent.left = replacement;
} else {
parent.right = replacement;
}
}
Recursive removal
public Object remove(int key) {
List<Object> result = new ArrayList<>();
root = deleteNode(root, key, result);
return result.isEmpty() ? null : result.get(0);
}
private TreeNode deleteNode(TreeNode node, int key, List<Object> result) {
if (node == null) {
return null;
}
if (key < node.key) {
node.left = deleteNode(node.left, key, result);
return node;
}
if (node.key < key) {
node.right = deleteNode(node.right, key, result);
return node;
}
result.add(node.value);
if (node.left != null && node.right != null) {
TreeNode successor = node.right;
while (successor.left != null) {
successor = successor.left;
}
successor.right = deleteNode(node.right, successor.key, new ArrayList<>());
successor.left = node.left;
return successor;
}
return node.left != null ? node.left : node.right;
}
The recursive version stores the deleted value in a result list. When both children exist, it locates the in-order successor within the right subtree, recursively removes it, then substitutes the node.
Range Queries
Elements less than a value
public List<Object> lessThan(int key) {
List<Object> results = new ArrayList<>();
TreeNode current = root;
LinkedList<TreeNode> stack = new LinkedList<>();
while (current != null || !stack.isEmpty()) {
while (current != null) {
stack.push(current);
current = current.left;
}
TreeNode node = stack.pop();
if (node.key < key) {
results.add(node.value);
} else {
break;
}
current = node.right;
}
return results;
}
Elements greater than a value
public List<Object> greaterThan(int key) {
List<Object> results = new ArrayList<>();
TreeNode current = root;
LinkedList<TreeNode> stack = new LinkedList<>();
while (current != null || !stack.isEmpty()) {
while (current != null) {
stack.push(current);
current = current.left;
}
TreeNode node = stack.pop();
if (node.key > key) {
results.add(node.value);
}
current = node.right;
}
return results;
}
Reverse in-order traversal (RNL) offers better efficiency for greater-than queries:
public List<Object> greaterThan(int key) {
List<Object> results = new ArrayList<>();
TreeNode current = root;
LinkedList<TreeNode> stack = new LinkedList<>();
while (current != null || !stack.isEmpty()) {
while (current != null) {
stack.push(current);
current = current.right;
}
TreeNode node = stack.pop();
if (node.key > key) {
results.add(node.value);
} else {
break;
}
current = node.left;
}
return results;
}
Elements within a range
public List<Object> between(int lowerKey, int upperKey) {
List<Object> results = new ArrayList<>();
TreeNode current = root;
LinkedList<TreeNode> stack = new LinkedList<>();
while (current != null || !stack.isEmpty()) {
while (current != null) {
stack.push(current);
current = current.left;
}
TreeNode node = stack.pop();
if (node.key >= lowerKey && node.key <= upperKey) {
results.add(node.value);
} else if (node.key > upperKey) {
break;
}
current = node.right;
}
return results;
}
Performance Characteristics
Advantages
Balanced BSTs maintain O(log n) search, insertion, and deletion. Insertion and deletion operations remain straightforward. In-order traversal produces sorted sequences without additional sorting overhead.
Disadvantages
Ordered or near-ordered input produces severely unbalanced trees, degrading to O(n). Frequent modifications require self-balancing variants like Red-Black or AVL trees to preserve logarithmic performance. Duplicate keys complicate handling and encrease tree depth. Memory overhead per node becomes prohibitive for memory-constrained environments.