Fading Coder

One Final Commit for the Last Sprint


Redis, Kafka, and ZooKeeper Core Concepts and Common Interview Questions


Redis Performance and Persistence

Redis achieves high read/write speeds through:

  • In-memory storage: Data resides entirely in RAM, enabling faster access compared to disk-based systems.
  • Single-threaded model: Eliminates context-switching and lock contention overhead, simplifying concurrency control.
  • Non-blocking I/O: Uses I/O multiplexing (e.g., epoll) to handle many connections efficiently within a single event loop.
  • Optimized data structures: Provides specialized structures (strings, lists, hashes, sets, sorted sets) with efficient operations.
  • Persistence strategies: Offers snapshotting (RDB) and append-only file (AOF) mechanisms to balance durability and performance.

Redis Persistence: RDB vs. AOF

  • RDB: Creates point-in-time snapshots at configured intervals. Benefits include compact file size, suitability for backups/disaster recovery, and minimal performance impact during saves (via fork). Drawbacks include potential data loss between snapshots and fork delays with large datasets.
  • AOF: Logs every write operation. Advantages include configurable fsync policies (e.g., per-second), durability (max 1-second data loss), and crash-safe append-only writes. Disadvantages include larger file sizes and potentially slower write speeds compared to RDB.

Redis 4.x+ optimizes AOF rewriting by persisting data as RDB snapshots first, then appending incremental logs, enhancing recovery efficiency.
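The settings above map to a handful of redis.conf directives. A minimal sketch (the save thresholds are illustrative values, not recommendations):

```conf
# RDB: snapshot if >=1 key changed in 900s, >=10 in 300s, >=10000 in 60s
save 900 1
save 300 10
save 60 10000

# AOF: log every write, fsync once per second (at most ~1s of data loss)
appendonly yes
appendfsync everysec

# Redis 4.x+ hybrid rewrite: RDB snapshot preamble + incremental AOF entries
aof-use-rdb-preamble yes
```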

Distributed Locking with Redis

Use SET key value NX EX seconds for atomic lock acquisition. Release locks by verifying ownership with a Lua script:

if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
else
    return 0
end

Prevent deadlocks by:

  1. Always releasing locks after critical sections.
  2. Setting key expiration to handle process failures.
  3. Implementing lock renewal (watchdog) for long operations.
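The acquire/release logic above can be sketched without a live server. The `FakeRedis` class below is a hypothetical in-memory stand-in for the two Redis operations involved (`SET NX EX` and the Lua check-and-delete); the unique token per client is what makes ownership verification work:

```python
import time
import uuid

class FakeRedis:
    """Tiny in-memory stand-in for the two Redis commands used below."""
    def __init__(self):
        self.store = {}  # key -> (value, expiry timestamp)

    def set_nx_ex(self, key, value, ttl):
        """SET key value NX EX ttl: succeed only if key is absent or expired."""
        entry = self.store.get(key)
        if entry and entry[1] > time.time():
            return False
        self.store[key] = (value, time.time() + ttl)
        return True

    def check_and_delete(self, key, value):
        """GET/compare/DEL in one step, as the Lua script does server-side."""
        entry = self.store.get(key)
        if entry and entry[0] == value and entry[1] > time.time():
            del self.store[key]
            return 1
        return 0

r = FakeRedis()
token = str(uuid.uuid4())                        # unique value proves ownership
assert r.set_nx_ex("lock:order", token, 10)      # acquired
assert not r.set_nx_ex("lock:order", "other", 10)  # second client is blocked
assert r.check_and_delete("lock:order", "other") == 0  # wrong owner: no-op
assert r.check_and_delete("lock:order", token) == 1    # owner releases
```

Against a real Redis, the check-and-delete must run as a Lua script (or via `SET`'s successor commands) so the compare and the delete are atomic.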

Redis as Async/Delayed Queue

  • Async Queue: Use LPUSH/BRPOP on Lists. Producers push messages; consumers block-pop messages.
  • Delayed Queue: Leverage Sorted Sets with timestamps as scores. Producers add messages with target execution times. Consumers poll for expired messages (score ≤ current time).
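The delayed-queue pattern can be sketched in plain Python, using a dict to play the role of the sorted set (member → score); `add` corresponds to `ZADD` and `poll_due` to `ZRANGEBYSCORE` + `ZREM`. The member names are illustrative:

```python
import time

class DelayedQueue:
    """Sorted-set-style delayed queue: score = target execution timestamp."""
    def __init__(self):
        self.zset = {}  # member -> score, like a Redis sorted set

    def add(self, member, run_at):
        self.zset[member] = run_at  # ZADD queue run_at member

    def poll_due(self, now):
        """ZRANGEBYSCORE queue -inf now, then ZREM each claimed member."""
        due = sorted(m for m, s in self.zset.items() if s <= now)
        for m in due:
            del self.zset[m]
        return due

q = DelayedQueue()
t0 = time.time()
q.add("send-email:42", t0 + 0.5)
q.add("expire-order:7", t0 - 1)          # already due
assert q.poll_due(t0) == ["expire-order:7"]
assert q.poll_due(t0 + 1) == ["send-email:42"]
```

In production the claim step should be atomic (a Lua script combining `ZRANGEBYSCORE` and `ZREM`) so two consumers cannot process the same message.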

Key Expiration and Eviction

  • Expired Key Deletion: Redis combines passive (lazy) deletion (on key access) and active periodic deletion (sampling keys).
  • Eviction Policies: Configured via maxmemory-policy, including:
    • noeviction: Deny writes when memory is full.
    • allkeys-lru/volatile-lru: Evict least recently used keys.
    • allkeys-random/volatile-random: Random eviction.
    • volatile-ttl: Evict keys with shortest TTL first.
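An `allkeys-lru` policy can be sketched with an `OrderedDict` (note that real Redis approximates LRU by sampling keys rather than keeping an exact recency list; `maxmemory` here counts keys, not bytes, for simplicity):

```python
from collections import OrderedDict

class LRUCache:
    """allkeys-lru in miniature: evict the least recently used key when full."""
    def __init__(self, maxmemory):
        self.maxmemory = maxmemory
        self.data = OrderedDict()   # order tracks recency, oldest first

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # touch: now most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.maxmemory:
            self.data.popitem(last=False)  # evict the LRU entry

cache = LRUCache(maxmemory=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")          # "a" is now most recently used
cache.set("c", 3)       # evicts "b", the least recently used key
assert cache.get("b") is None
assert cache.get("a") == 1 and cache.get("c") == 3
```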

Redis High Availability

  • Replication: Master handles writes; replicas asynchronously sync data. Supports read scalability and failover.
  • Sentinel: Monitors master/replica health, manages automatic failover, and provides configuration updates.
  • Cluster: Shards data across nodes using hash slots (16384 slots). Each shard has master/replicas. Supports horizontal scaling and automatic failover.
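Slot assignment in Redis Cluster is `CRC16(key) mod 16384`, where only the substring inside `{...}` (a hash tag) is hashed if one is present. A sketch of that mapping, using the CRC16-CCITT (XModem) variant Redis Cluster specifies:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XModem), the variant Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021 if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """CLUSTER KEYSLOT: hash only the {tag} part if present, mod 16384 slots."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:                  # non-empty tag only
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

assert 0 <= key_slot("user:1000") < 16384
# keys sharing a hash tag land in the same slot (multi-key ops require this)
assert key_slot("{user:1000}.orders") == key_slot("{user:1000}.cart")
```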

Cache Consistency

Ensure database-cache consistency with strategies:

  1. Cache Aside (Lazy Loading):
    • Read: Check cache; populate from DB on miss.
    • Write: Update DB, then invalidate cache.
  2. Write Through: Write to cache and DB simultaneously.
  3. Delayed Double Delete:
    • Delete cache.
    • Update DB.
    • Sleep (e.g., 1s), then delete cache again to handle stale reads during replication lag.
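The first and third strategies can be sketched with two dicts standing in for the database and the cache (key names and the 10ms delay are illustrative):

```python
import time

db, cache = {"user:1": "Alice"}, {}

def read(key):
    """Cache Aside read: hit cache first, fall back to DB and populate."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value

def write(key, value):
    """Cache Aside write: update DB, then invalidate (not update) the cache."""
    db[key] = value
    cache.pop(key, None)

def write_double_delete(key, value, delay=0.01):
    """Delayed double delete: the second delete clears stale entries written
    by reads that raced with the DB update (e.g., off a lagging replica)."""
    cache.pop(key, None)
    db[key] = value
    time.sleep(delay)
    cache.pop(key, None)

assert read("user:1") == "Alice"
assert "user:1" in cache          # populated on miss
write("user:1", "Bob")
assert "user:1" not in cache      # invalidated, not updated
assert read("user:1") == "Bob"    # next read repopulates from DB
```

Invalidation (rather than updating the cache on write) avoids writing values that are never read again and sidesteps most update-ordering races.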

Kafka Architecture

  • Brokers: Kafka server instances forming a cluster.
  • Topics: Logical categories for messages, partitioned for scalability.
  • Producers: Publish messages to topics.
  • Consumers: Subscribe to topics and process messages.
  • Consumer Groups: Enable parallel consumption; within a group, each partition is consumed by at most one member.
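The partition-to-consumer relationship can be sketched with a round-robin assignor (a simplified version of what the group coordinator does on rebalance; consumer names are illustrative):

```python
def assign_round_robin(partitions, consumers):
    """Round-robin assignment: each partition goes to exactly one consumer
    in the group; consumers beyond the partition count sit idle."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

a = assign_round_robin([0, 1, 2, 3, 4, 5], ["c1", "c2"])
assert a == {"c1": [0, 2, 4], "c2": [1, 3, 5]}

# more consumers than partitions: c3 receives nothing
b = assign_round_robin([0, 1], ["c1", "c2", "c3"])
assert b == {"c1": [0], "c2": [1], "c3": []}
```

This is why a topic's partition count caps the useful parallelism of a consumer group.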

Kafka Reliability

  • Replication: Partitions have leader (handles reads/writes) and followers (replicate data).
  • ISR (In-Sync Replicas): Subset of replicas caught up with the leader. Only ISR members can become leaders.
  • Acknowledgements:
    • acks=0: No guarantee.
    • acks=1: Leader write confirmation.
    • acks=all: All ISR replicas write confirmation.
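A simplified model of when a produce request is acknowledged under each setting (this ignores `min.insync.replicas` and retries for brevity):

```python
def is_committed(acks, leader_ok, follower_acks, isr_size):
    """When is a produce request acknowledged?
    acks=0: fire-and-forget; acks=1: leader write; acks=all: full ISR."""
    if acks == "0":
        return True                 # no confirmation awaited at all
    if acks == "1":
        return leader_ok            # leader's local write suffices
    if acks == "all":
        # the leader plus every other in-sync replica must have the write
        return leader_ok and follower_acks >= isr_size - 1
    raise ValueError(f"unknown acks setting: {acks}")

assert is_committed("0", leader_ok=False, follower_acks=0, isr_size=3)
assert is_committed("1", leader_ok=True, follower_acks=0, isr_size=3)
assert not is_committed("all", leader_ok=True, follower_acks=1, isr_size=3)
assert is_committed("all", leader_ok=True, follower_acks=2, isr_size=3)
```

In real deployments `acks=all` is paired with `min.insync.replicas` to prevent the ISR shrinking to just the leader, which would silently weaken the guarantee.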

Kafka Performance

  • Batching: Producers send messages in batches.
  • Sequential I/O: Writes messages to disk sequentially.
  • Zero-Copy: Transfers data directly from the OS page cache to the network socket (e.g., via sendfile), bypassing user space.
  • Partitioning: Distributes load across brokers.

ZooKeeper in Kafka

  • Manages cluster metadata (brokers, topics, partitions).
  • Coordinates leader election (for partitions and cluster controller).
  • Tracks consumer group offsets (pre-0.9).

ZooKeeper Fundamentals

  • ZNodes: Hierarchical data nodes (persistent, ephemeral, sequential).
  • Watch Mechanism: Clients receive notifications for ZNode changes.
  • ZAB Protocol: Ensures atomic broadcast and consistency across nodes.
  • Use Cases: Distributed coordination, naming service, configuration management, cluster state synchronization.

RabbitMQ Guarantees

  • Message Persistence: Mark messages as persistent and use mirrored queues.
  • Publisher Confirms: Asynchronous acknowledgements (ConfirmCallback).
  • Consumer Acknowledgements: Manual ACKs after processing.
  • Dead Letter Exchanges (DLX): Route unprocessable messages.
  • TTL/Delayed Queues: Implement message deferral via per-message TTL or plugins.
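The interaction between manual consumer ACKs and dead-lettering can be sketched with plain lists standing in for the work queue and the DLX-routed queue (the retry limit is an illustrative policy, not a RabbitMQ default):

```python
def consume(queue, dead_letters, handler, max_retries=3):
    """Manual ACK loop: a message leaves the queue only after the handler
    succeeds; after max_retries failed deliveries it is dead-lettered."""
    retries = {}
    while queue:
        msg = queue.pop(0)
        try:
            handler(msg)                       # ACK on normal return
        except Exception:
            retries[msg] = retries.get(msg, 0) + 1
            if retries[msg] >= max_retries:
                dead_letters.append(msg)       # route to the DLX queue
            else:
                queue.append(msg)              # NACK + requeue

processed = []
def handler(msg):
    if msg == "poison":
        raise ValueError("cannot process")
    processed.append(msg)

q, dlx = ["ok-1", "poison", "ok-2"], []
consume(q, dlx, handler)
assert processed == ["ok-1", "ok-2"]
assert dlx == ["poison"] and q == []
```

A dead-letter queue like `dlx` can then be inspected or replayed separately, so one unprocessable message never blocks the rest of the stream.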
