Home > Tech > Content

Hash Table Fundamentals and Core Algorithmic Applications

Tech May 7 5

Hash tables provide O(1) average-time complexity for insertion, deletion, and lookup operations. They are the optimal choice when the primary requirement is rapidly verifying the existence of an element within a collection or tracking frequency counts. The core mechanism relies on a hash function that maps keys to specific indices in an underlying array. When multiple keys map to the same index, a hash collision occurs. Two standard resolution strategies exist:

Chaining: Each bucket stores a linked list or dynamic array of colliding elements. Lookups traverse the chain at the computed index.
Open Addressing (Linear Probing): Colliding elements are placed in the next available slot. This approach requires the table capacity to strictly exceed the number of stored elements to prevent infinite probing loops and performance degradation.

Implementation choices depend heavily on data constraints:

Fixed-range, dense integers: Direct addressing via standard arrays.
Unique element tracking without associated data: std::unordered_set.
Key-value pair storage (e.g., element-to-index mapping): std::unordered_map.

Hash-based structures inherently trade memory overhead for execution speed.

242. Valid Anagram

Determining if two strings are anagrams requires verifying identical character frequencies. Since lowercase English letters occupy a contiguous ASCII range, a fixed-size integer array of length 26 serves as a highly efficient hash table.

Algorithm:

Initialize a frequency array with zeros.
Iterate through the first string, incrementing the slot corresponding to char - 'a'.
Iterate through the second string, decrementing the corresponding slot.
If any slot drops below zero or remains non-zero after procsesing, the strings differ in composition.

class Solution {
public:
    bool isAnagram(std::string s, std::string t) {
        if (s.length() != t.length()) return false;
        int freq[26] = {0};
        for (char c : s) freq[c - 'a']++;
        for (char c : t) {
            if (--freq[c - 'a'] < 0) return false;
        }
        return true;
    }
};

349. Intersection of Two Arrays

Computing the intersection of two arrays requires returning unique common elements. When input values lack a tight upper bound or are sparsely distributed, array-based hashing wastes memory. A hash set efficiently handles deduplication and O(1) lookups.

Algorithm:

Load all eleemnts from the first array into an unordered_set for O(1) probing.
Traverse the second array. If an element exists in the lookup set, insert it into a result set to automatically handle duplicates.
Convert the result set back to a vector.

class Solution {
public:
    std::vector<int> intersection(std::vector<int>& arr1, std::vector<int>& arr2) {
        std::unordered_set<int> lookup(arr1.begin(), arr1.end());
        std::unordered_set<int> common_elements;
        for (int val : arr2) {
            if (lookup.count(val)) {
                common_elements.insert(val);
            }
        }
        return std::vector<int>(common_elements.begin(), common_elements.end());
    }
};

Range constructors streamline container initialization by accepting iterator pairs, eliminating manual insertion loops. This pattern applies universally across STL containers.

202. Happy Number

A number is "happy" if repeatedly replacing it with the sum of the squares of its digits eventually reaches 1. If the sequence enters a cycle that excludes 1, the number is unhappy. Detecting this cycle is the core challenge.

Approach 1: Hash Set Tracking Store each computed sum in a set. If a sum reappears before reaching 1, a cycle is confirmed.

class Solution {
public:
    bool isHappy(int n) {
        std::unordered_set<int> visited;
        while (n != 1 && !visited.count(n)) {
            visited.insert(n);
            int next_val = 0;
            while (n > 0) {
                int digit = n % 10;
                next_val += digit * digit;
                n /= 10;
            }
            n = next_val;
        }
        return n == 1;
    }
};

Approach 2: Floyd’s Cycle Detection Use two pointers moving at different speeds through the transformation sequence. The fast pointer advances two steps per iteration, while the slow pointer advances one. If they meet, a cycle exists. Verify if the meeting point is 1. This approach reduces space complexity to O(1).

class Solution {
private:
    int transform(int num) {
        int total = 0;
        while (num > 0) {
            int d = num % 10;
            total += d * d;
            num /= 10;
        }
        return total;
    }
public:
    bool isHappy(int n) {
        int slow = n;
        int fast = transform(n);
        while (fast != 1 && slow != fast) {
            slow = transform(slow);
            fast = transform(transform(fast));
        }
        return fast == 1;
    }
};

1. Two Sum

Finding two indices where values sum to a target requires tracking previously visited elements alongside their positions. A simple set only stores keys, making it insufficient. An unordered_map mapping values to indices solves this efficiently.

Algorithm:

Initialize an empty hash map.
Iterate through the array with index i.
Calculate complement = target - nums[i].
Check if complement exists in the map. If yes, return the stored index and i.
Otherwise, store nums[i] and i in the map for future lookups.

class Solution {
public:
    std::vector<int> twoSum(std::vector<int>& nums, int target) {
        std::unordered_map<int, int> index_map;
        for (int i = 0; i < nums.size(); ++i) {
            int needed = target - nums[i];
            auto match = index_map.find(needed);
            if (match != index_map.end()) {
                return {match->second, i};
            }
            index_map[nums[i]] = i;
        }
        return {};
    }
};

Key considerations for hash-based solutions:

Arrays work only for bounded, dense integer ranges. Sparse data causes severe memory waste.
Sets track existence but cannot store auxiliary data like indices.
Maps are necessary when associating keys with metadata (e.g., original positions) is required. Unordered variants provide optimal average-case performance when key ordering is irrelevant.

Tags: Hash Table C++algorithms Data Structures LeetCode

Back to List

Prev: Topological Sorting for Lanqiao Cup

Next: Implementing the Factory Method Pattern in C++

Fading Coder

Hash Table Fundamentals and Core Algorithmic Applications

242. Valid Anagram

349. Intersection of Two Arrays

202. Happy Number

1. Two Sum

Related Articles

Understanding Strong and Weak References in Java

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Leave a Comment

Copyright © fadingcoder.top

Fading Coder

Hash Table Fundamentals and Core Algorithmic Applications

242. Valid Anagram

349. Intersection of Two Arrays

202. Happy Number

1. Two Sum

Related Articles

Understanding Strong and Weak References in Java

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Leave a CommentCancel Reply

Copyright © fadingcoder.top

Leave a Comment