Fading Coder

One Final Commit for the Last Sprint

Home > Tools > Content

Consistent Hashing with and without Virtual Nodes in Java

Tools 2

A consistent hash places both data keys and server identifiers in to the same circular hash address space. Treat the hash output as a ring from 0 to 2^32−1. Each server is mapped to a point on the ring using a hash of its identity (for example, IP:port). To route a key, hash the key and walk clockwise to the first server hash; that server is the owner of the key.

  • Hash space (ring): A 32-bit (or larger) space arranged circularly, 0 … 2^32−1.
  • Mapping data: Hash(key) produces a point on the ring.
  • Mapping servers: Hash(serverId) produecs one or more points on the same ring.
  • Lookup: The owner of a key is the first server clockwice from Hash(key). If there is none clockwise, wrap to the first server on the ring.

Node membership changes impact only a small portion of keys:

  • Removing a server reassigns only the keys that fell between the previous server and the removed server to the removed server’s clcokwise successor.
  • Adding a server reassigns only keys in the interval between the new server’s predecessor and the new server.

Balance requires additional care. Pure consistent hashing with one point per server can distribute load unevenly because server hash positions may cluster. Introducing virtual nodes (multiple independent hash points per real server) evens out the distribution. Each virtual node is a distinct hash point that maps back to the same underlying server. In practice, dozens to thousands of virtual nodes per server are common depending on cluster size and variance targets.

Java implementation: basic ring (no virtual nodes)

import java.util.*;

/**
 * Minimal consistent hash ring without virtual nodes.
 */
public final class SimpleConsistentHash {
    private final NavigableMap<Long, String> ring = new TreeMap<>();

    public SimpleConsistentHash(Collection<String> nodes) {
        for (String n : nodes) addNode(n);
    }

    public synchronized void addNode(String nodeId) {
        ring.put(hash(nodeId), nodeId);
    }

    public synchronized void removeNode(String nodeId) {
        ring.remove(hash(nodeId));
    }

    public String locate(String key) {
        if (ring.isEmpty()) return null;
        long h = hash(key);
        Map.Entry<Long, String> e = ring.ceilingEntry(h);
        if (e == null) e = ring.firstEntry();
        return e.getValue();
    }

    // FNV-1a 32-bit, returned as unsigned long in [0, 2^32)
    private static long hash(String s) {
        int h = 0x811C9DC5; // offset basis
        final int p = 0x01000193; // prime
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= p;
        }
        return h & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList(
            "192.168.0.0:111",
            "192.168.0.1:111",
            "192.168.0.2:111",
            "192.168.0.3:111",
            "192.168.0.4:111"
        );
        SimpleConsistentHash ring = new SimpleConsistentHash(servers);

        for (String s : servers) {
            System.out.printf("server=%s hash=%d%n", s, hash(s));
        }

        String[] keys = {"sunlight", "Moon", "Stars"};
        for (String k : keys) {
            System.out.printf("key=%s hash=%d -> %s%n", k, hash(k), ring.locate(k));
        }
    }
}

Java implementation: ring with virtual nodes (replicas)

import java.util.*;

/**
 * Consistent hash ring using virtual nodes for better load balance.
 * Each physical node is represented by `replicas` virtual points.
 */
public final class ConsistentHashWithReplicas {
    private final NavigableMap<Long, String> ring = new TreeMap<>();
    private final Map<String, Set<Long>> nodeToPoints = new HashMap<>();
    private final int replicas;

    public ConsistentHashWithReplicas(int replicas, Collection<String> initialNodes) {
        if (replicas <= 0) throw new IllegalArgumentException("replicas must be > 0");
        this.replicas = replicas;
        for (String n : initialNodes) addNode(n);
    }

    public synchronized void addNode(String nodeId) {
        Set<Long> points = new HashSet<>(replicas);
        for (int i = 0; i < replicas; i++) {
            String virtualId = nodeId + "#v" + i;
            long h = hash(virtualId);
            ring.put(h, nodeId); // value is the physical node
            points.add(h);
            System.out.printf("virtual=%s hash=%d added%n", virtualId, h);
        }
        nodeToPoints.put(nodeId, points);
    }

    public synchronized void removeNode(String nodeId) {
        Set<Long> points = nodeToPoints.remove(nodeId);
        if (points == null) return;
        for (Long h : points) ring.remove(h);
    }

    public String locate(String key) {
        if (ring.isEmpty()) return null;
        long h = hash(key);
        Map.Entry<Long, String> e = ring.ceilingEntry(h);
        if (e == null) e = ring.firstEntry();
        return e.getValue();
    }

    // FNV-1a 32-bit, returned as unsigned long in [0, 2^32)
    private static long hash(String s) {
        int h = 0x811C9DC5;
        final int p = 0x01000193;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= p;
        }
        return h & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList(
            "192.168.0.0:111",
            "192.168.0.1:111",
            "192.168.0.2:111",
            "192.168.0.3:111",
            "192.168.0.4:111"
        );
        ConsistentHashWithReplicas ring = new ConsistentHashWithReplicas(5, servers);

        String[] keys = {"sunlight", "Moon", "Stars"};
        for (String k : keys) {
            System.out.printf("key=%s hash=%d -> %s%n", k, hash(k), ring.locate(k));
        }
    }
}

Notes on behavior

  • Lookup complexity is O(log N) for N ring points.
  • With virtual nodes, N = replicas × number of physical nodes.
  • To reduce memory while maintaining balance, tune replicas to your cluster size and acceptable variance. A simple rule of thumb is replicas ≈ 100–200 for small clusters; scale up if you observe hot spots.
  • Any fast, good-distribution hash (e.g., xxHash, Murmur3) can replace FNV-1a; keep the output space consistent across nodes for deterministic routing.

Related Articles

Efficient Usage of HTTP Client in IntelliJ IDEA

IntelliJ IDEA incorporates a versatile HTTP client tool, enabling developres to interact with RESTful services and APIs effectively with in the editor. This functionality streamlines workflows, replac...

Installing CocoaPods on macOS Catalina (10.15) Using a User-Managed Ruby

System Ruby on macOS 10.15 frequently fails to build native gems required by CocoaPods (for example, ffi), leading to errors like: ERROR: Failed to build gem native extension checking for ffi.h... no...

Resolve PhpStorm "Interpreter is not specified or invalid" on WAMP (Windows)

Symptom PhpStorm displays: "Interpreter is not specified or invalid. Press ‘Fix’ to edit your project configuration." This occurs when the IDE cannot locate a valid PHP CLI executable or when the debu...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.