Consistent Hashing with and without Virtual Nodes in Java
A consistent hash places both data keys and server identifiers in to the same circular hash address space. Treat the hash output as a ring from 0 to 2^32−1. Each server is mapped to a point on the ring using a hash of its identity (for example, IP:port). To route a key, hash the key and walk clockwise to the first server hash; that server is the owner of the key.
- Hash space (ring): A 32-bit (or larger) space arranged circularly, 0 … 2^32−1.
- Mapping data: Hash(key) produces a point on the ring.
- Mapping servers: Hash(serverId) produecs one or more points on the same ring.
- Lookup: The owner of a key is the first server clockwice from Hash(key). If there is none clockwise, wrap to the first server on the ring.
Node membership changes impact only a small portion of keys:
- Removing a server reassigns only the keys that fell between the previous server and the removed server to the removed server’s clcokwise successor.
- Adding a server reassigns only keys in the interval between the new server’s predecessor and the new server.
Balance requires additional care. Pure consistent hashing with one point per server can distribute load unevenly because server hash positions may cluster. Introducing virtual nodes (multiple independent hash points per real server) evens out the distribution. Each virtual node is a distinct hash point that maps back to the same underlying server. In practice, dozens to thousands of virtual nodes per server are common depending on cluster size and variance targets.
Java implementation: basic ring (no virtual nodes)
import java.util.*;
/**
* Minimal consistent hash ring without virtual nodes.
*/
public final class SimpleConsistentHash {
private final NavigableMap<Long, String> ring = new TreeMap<>();
public SimpleConsistentHash(Collection<String> nodes) {
for (String n : nodes) addNode(n);
}
public synchronized void addNode(String nodeId) {
ring.put(hash(nodeId), nodeId);
}
public synchronized void removeNode(String nodeId) {
ring.remove(hash(nodeId));
}
public String locate(String key) {
if (ring.isEmpty()) return null;
long h = hash(key);
Map.Entry<Long, String> e = ring.ceilingEntry(h);
if (e == null) e = ring.firstEntry();
return e.getValue();
}
// FNV-1a 32-bit, returned as unsigned long in [0, 2^32)
private static long hash(String s) {
int h = 0x811C9DC5; // offset basis
final int p = 0x01000193; // prime
for (int i = 0; i < s.length(); i++) {
h ^= s.charAt(i);
h *= p;
}
return h & 0xFFFFFFFFL;
}
public static void main(String[] args) {
List<String> servers = Arrays.asList(
"192.168.0.0:111",
"192.168.0.1:111",
"192.168.0.2:111",
"192.168.0.3:111",
"192.168.0.4:111"
);
SimpleConsistentHash ring = new SimpleConsistentHash(servers);
for (String s : servers) {
System.out.printf("server=%s hash=%d%n", s, hash(s));
}
String[] keys = {"sunlight", "Moon", "Stars"};
for (String k : keys) {
System.out.printf("key=%s hash=%d -> %s%n", k, hash(k), ring.locate(k));
}
}
}
Java implementation: ring with virtual nodes (replicas)
import java.util.*;
/**
* Consistent hash ring using virtual nodes for better load balance.
* Each physical node is represented by `replicas` virtual points.
*/
public final class ConsistentHashWithReplicas {
private final NavigableMap<Long, String> ring = new TreeMap<>();
private final Map<String, Set<Long>> nodeToPoints = new HashMap<>();
private final int replicas;
public ConsistentHashWithReplicas(int replicas, Collection<String> initialNodes) {
if (replicas <= 0) throw new IllegalArgumentException("replicas must be > 0");
this.replicas = replicas;
for (String n : initialNodes) addNode(n);
}
public synchronized void addNode(String nodeId) {
Set<Long> points = new HashSet<>(replicas);
for (int i = 0; i < replicas; i++) {
String virtualId = nodeId + "#v" + i;
long h = hash(virtualId);
ring.put(h, nodeId); // value is the physical node
points.add(h);
System.out.printf("virtual=%s hash=%d added%n", virtualId, h);
}
nodeToPoints.put(nodeId, points);
}
public synchronized void removeNode(String nodeId) {
Set<Long> points = nodeToPoints.remove(nodeId);
if (points == null) return;
for (Long h : points) ring.remove(h);
}
public String locate(String key) {
if (ring.isEmpty()) return null;
long h = hash(key);
Map.Entry<Long, String> e = ring.ceilingEntry(h);
if (e == null) e = ring.firstEntry();
return e.getValue();
}
// FNV-1a 32-bit, returned as unsigned long in [0, 2^32)
private static long hash(String s) {
int h = 0x811C9DC5;
final int p = 0x01000193;
for (int i = 0; i < s.length(); i++) {
h ^= s.charAt(i);
h *= p;
}
return h & 0xFFFFFFFFL;
}
public static void main(String[] args) {
List<String> servers = Arrays.asList(
"192.168.0.0:111",
"192.168.0.1:111",
"192.168.0.2:111",
"192.168.0.3:111",
"192.168.0.4:111"
);
ConsistentHashWithReplicas ring = new ConsistentHashWithReplicas(5, servers);
String[] keys = {"sunlight", "Moon", "Stars"};
for (String k : keys) {
System.out.printf("key=%s hash=%d -> %s%n", k, hash(k), ring.locate(k));
}
}
}
Notes on behavior
- Lookup complexity is O(log N) for N ring points.
- With virtual nodes, N = replicas × number of physical nodes.
- To reduce memory while maintaining balance, tune
replicasto your cluster size and acceptable variance. A simple rule of thumb is replicas ≈ 100–200 for small clusters; scale up if you observe hot spots. - Any fast, good-distribution hash (e.g., xxHash, Murmur3) can replace FNV-1a; keep the output space consistent across nodes for deterministic routing.