Fading Coder

One Final Commit for the Last Sprint

Home > Tech > Content

Internal Mechanics of Java HashMap Implementation

Tech May 7 6

Class Definition and Core Fields

The HashMap class extends AbstractMap and implements the Map interface. It is designed to store key-value pairs using a hash table structure.

public class HashMap<K, V> extends AbstractMap<K, V>
    implements Map<K, V>, Cloneable, Serializable {
    
    // The default initial capacity - MUST be a power of two.
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16

    // The maximum capacity, used if a higher value is implicitly specified
    // by any of the constructors with arguments.
    // MUST be a power of two <= 2^30.
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // The load factor used when none specified in constructor.
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // The bin count threshold for using a tree rather than list for a bin.
    // Bins are converted to trees when adding an element to a bin with
    // at least this many nodes.
    static final int TREEIFY_THRESHOLD = 8;

    // The bin count threshold for untreeifying a (bin) tree back to list.
    // Values less than this trigger conversion back to linked lists.
    static final int UNTREEIFY_THRESHOLD = 6;

    // The smallest table capacity for which bins may be treeified.
    // (Otherwise the table is resized if too many nodes in a bin.)
    static final int MIN_TREEIFY_CAPACITY = 64;

    // The table, initialized on first use, and resized as necessary.
    // When allocated, length is always a power of two.
    transient Node<K, V>[] table;

    // Holds cached entrySet(). Note that AbstractMap fields are used
    // for keySet() and values().
    transient Set<Map.Entry<K, V>> entrySet;

    // The number of key-value mappings contained in this map.
    transient int size;

    // The number of times this HashMap has been structurally modified
    transient int modCount;

    // The next size value at which to resize (capacity * load factor).
    int threshold;

    // The load factor for the hash table.
    final float loadFactor;
}

Capacity must be a power of two to allow the use of bitwise operators for index calculation. The formula h & (length - 1) is mathematically equivalent to h % length but significantly faster.

Hash Calculation

The hash(Object key) method computes a hash value for the key. It handles null keys by returning 0. For non-null keys, it spreads the higher bits of the original hash code to the lower bits to reduce collisions.

static final int hash(Object key) {
    int h;
    // XORs the hash code with its unsigned right shift by 16 bits
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

Basic Node Structure

Before JDK 1.8, HashMap used an array + linked list. Since JDK 1.8, it uses an array + linked list + red-black tree. The basic unit of storage is the Node.

static class Node<K, V> implements Map.Entry<K, V> {
    final int hash;
    final K key;
    V value;
    Node<K, V> next;

    Node(int hash, K key, V value, Node<K, V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey() { return key; }
    public final V getValue() { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?, ?> e = (Map.Entry<?, ?>) o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}

Insertion Logic: putVal

The putVal method handles the insertion of key-value pairs. It calculates the index, checks for collisions, and updates existing values or appends new nodes. If the bucket size exceeds TREEIFY_THRESHOLD, the linked list is converted to a red-black tree.

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K, V>[] tab = table;
    int n;
    if (tab == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    
    int i = (n - 1) & hash;
    Node<K, V> p = tab[i];
    
    if (p == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K, V> e;
        K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K, V>) p).putTreeVal(this, tab, hash, key, value);
        else {
            int binCount = 0;
            for (;;) {
                e = p.next;
                if (e == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for first
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
                binCount++;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

Capacity Calculation

The tableSizeFor method ensures the capacity is a power of two. It takes a generic integer and returns the smallest power of two greater than or equal to it.

static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

Resizing the Table

The resize method creates a new array with double the capacity (or to the initial capacity if the table was null) and rehashes all existing entries into the new array.

final Node<K, V>[] resize() {
    Node<K, V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
 else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    
    Node<K, V>[] newTab = (Node<K, V>[])new Node[newCap];
    table = newTab;
    
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K, V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K, V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K, V> loHead = null, loTail = null;
                    Node<K, V> hiHead = null, hiTail = null;
                    Node<K, V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

Tree Structure: TreeNode

When a bucket becomes too crowded (length > 8), the linked list is converted into a red-black tree (TreeNode). This improves lookup performance from O(n) to O(log n).

static final class TreeNode<K, V> extends LinkedHashMap.Entry<K, V> {
    TreeNode<K, V> parent;  // red-black tree links
    TreeNode<K, V> left;
    TreeNode<K, V> right;
    TreeNode<K, V> prev;    // needed to unlink next upon deletion
    boolean red;
    
    TreeNode(int hash, K key, V val, Node<K, V> next) {
        super(hash, key, val, next);
    }

    // Returns root of tree
    final TreeNode<K, V> root() {
        for (TreeNode<K, V> r = this, p;;) {
            if ((p = r.parent) == null)
                return r;
            r = p;
        }
    }
    
    // Logic to find a node within the tree
    final TreeNode<K, V> find(int h, Object k, Class<?> kc) {
        TreeNode<K, V> p = this;
        do {
            int ph, dir;
            K pk;
            TreeNode<K, V> pl = p.left, pr = p.right;
            if ((ph = p.hash) > h)
                p = pl;
            else if (ph < h)
                p = pr;
            else if ((pk = p.key) == k || (k != null && k.equals(pk)))
                return p;
            else if (pl == null)
                p = pr;
            else if (pr == null)
                p = pl;
            else {
                // Comparison logic for tie-breaking
                if ((kc == null) && (kc = comparableClassFor(k)) == null)
                    return null; // Should not happen in standard usage
                // ... (omitted complex comparison logic for brevity)
            }
        } while (p != null);
        return null;
    }
}

Retrieval Logic: getNode

The getNode method locates a specific entry. It first checks the head of the bucket. If it's a tree node, it delegates to the tree search; otherwise, it traverses the linked list.

final Node<K, V> getNode(int hash, Object key) {
    Node<K, V>[] tab = table;
    Node<K, V> first, e;
    int n;
    K k;
    
    if (tab != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K, V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == { hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

Removing Entries

The remove method deletes a key-value pair. It relies on removeNode to handle the logic of unhooking the node from a linked list or removing it from the red-black tree.

public V remove(Object key) {
    Node<K, V> e;
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
}

Functional Methods

Java 8 introduced functional methods like computeIfAbsent. This method computes the value using a mapping function if the key is not already associated with a value (or is mapped to null).

public V computeIfAbsent(K key, Function<? super K, ? extends V> mappingFunction) {
    if (mappingFunction == null)
        throw new NullPointerException();
    int hash = hash(key);
    Node<K, V>[] tab = table;
    Node<K, V> e; 
    int n; 
    int i;
    
    if (tab != null && (n = tab.length) > 0 &&
        (e = tab[i = (n - 1) & hash]) != null) {
        TreeNode<K, V> tn = null;
        Node<K, V> old = null;
        
        if (e instanceof TreeNode)
            old = (tn = (TreeNode<K, V>)e).getTreeNode(hash, key);
        else {
            do {
                if (e.hash == hash &&
                    Objects.equals(e.key, key)) {
                    old = e;
                    break;
                }
            } while ((e = e.next) != null);
        }
        
        if (old != null) {
            V oldValue = old.value;
            if (oldValue == null) {
                V newValue = mappingFunction.apply(key);
                if (newValue != null) {
                    old.value = newValue;
                    afterNodeAccess(old);
                }
                return newValue;
            }
            return oldValue;
        }
    }
    
    V newValue = mappingFunction.apply(key);
    if (newValue != null) {
        if (tab != null && (n = tab.length) > 0) {
             // Insert logic simplified for brevity
             putVal(hash, key, newValue, false, true);
        } else {
             putVal(hash, key, newValue, false, true);
        }
    }
    return newValue;
}

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.