Redis, Kafka, and ZooKeeper Core Concepts and Common Interview Questions
Redis Performance and Persistence
Redis achieves high read/write speeds through:
- In-memory storage: Data resides entirely in RAM, enabling faster access compared to disk-based systems.
- Single-threaded model: Commands execute on a single thread, which avoids context-switching and lock contention overhead and simplifies concurrency control.
- Non-blocking I/O: Uses I/O multiplexing (event polling) to handle many connections efficiently within a single thread.
- Optimized data structures: Provides specialized structures (strings, lists, hashes, sets, sorted sets) with efficient operations.
- Persistence strategies: Offers snapshotting (RDB) and append-only file (AOF) mechanisms to balance durability and performance.
Redis Persistence: RDB vs. AOF
- RDB: Creates point-in-time snapshots at configured intervals. Benefits include compact file size, suitability for backups and disaster recovery, and minimal performance impact during saves (the snapshot is written by a forked child process). Drawbacks include potential loss of data written since the last snapshot and fork delays with large datasets.
- AOF: Logs every write operation. Advantages include configurable fsync policies (e.g., per-second), stronger durability (roughly one second of data loss at most with the per-second policy), and crash-safe append-only writes. Disadvantages include larger file sizes and potentially slower writes than RDB.
Redis 4.0+ optimizes AOF rewriting with a hybrid format (the aof-use-rdb-preamble option): the rewritten file starts with an RDB snapshot of the data, followed by the incremental AOF commands, which makes recovery faster.
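The snapshot-plus-log idea above can be illustrated with a small in-memory model. This is only a sketch: the ToyStore class, its JSON snapshot, and its method names are invented for illustration and do not mirror Redis's actual file formats.

```python
import json


class ToyStore:
    """Toy key-value store persisting a snapshot plus an append-only log."""

    def __init__(self):
        self.data = {}
        self.snapshot = "{}"  # stands in for the RDB preamble
        self.log = []         # stands in for AOF entries written after the snapshot

    def set(self, key, value):
        self.data[key] = value
        self.log.append(("SET", key, value))  # AOF-style: record every write

    def rewrite(self):
        # Hybrid-style rewrite: dump the full state, then start a fresh log.
        self.snapshot = json.dumps(self.data)
        self.log = []

    def recover(self):
        # Load the compact snapshot first, then replay only the newer writes.
        restored = json.loads(self.snapshot)
        for op, key, value in self.log:
            if op == "SET":
                restored[key] = value
        return restored
```

After a rewrite, recovery replays only the writes made since the snapshot, rather than the full write history.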
Distributed Locking with Redis
Use SET key value NX EX seconds for atomic lock acquisition, where value is a unique token identifying the lock holder. Release the lock with a Lua script that verifies ownership before deleting:
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
else
    return 0
end
Prevent deadlocks by:
- Always releasing locks after critical sections.
- Setting key expiration to handle process failures.
- Implementing lock renewal (watchdog) for long operations.
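The acquire and release steps described in this section might be wrapped as follows. This is a minimal sketch that assumes client is a redis-py style connection exposing set(name, value, nx=..., ex=...) and eval(script, numkeys, *keys_and_args); the function names and the default TTL are illustrative, and lock renewal is not shown.

```python
import uuid

# Ownership-checked release, matching the Lua script shown earlier in this section.
RELEASE_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
else
    return 0
end
"""


def acquire_lock(client, name, ttl_seconds=10):
    """Try once to take the lock; return the owner token, or None if it is held."""
    token = uuid.uuid4().hex  # unique per holder so only the owner can release
    # Equivalent to: SET name token NX EX ttl_seconds
    if client.set(name, token, nx=True, ex=ttl_seconds):
        return token
    return None


def release_lock(client, name, token):
    """Delete the lock only if it still holds our token (atomic via Lua)."""
    return client.eval(RELEASE_SCRIPT, 1, name, token) == 1
```

Callers would typically wrap the critical section in try/finally so that release_lock runs even when the work raises; the expiration still covers the case where the process dies before reaching it.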
Redis as Async/Delayed Queue
- Async Queue: Use LPUSH/BRPOP on Lists. Producers push messages; consumers block-pop messages.
- Delayed Queue: Leverage Sorted Sets with timestamps as scores. Producers add messages with target execution times. Consumers poll for expired messages (score ≤ current time).
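The delayed-queue pattern can be sketched like this, assuming client is a redis-py style connection with zadd, zrangebyscore, and zrem. The function names and the optional now parameter are illustrative additions, not part of any library.

```python
import time


def schedule(client, queue, message, delay_seconds, now=None):
    """Add a message whose score is the time at which it becomes due."""
    now = time.time() if now is None else now
    client.zadd(queue, {message: now + delay_seconds})


def pop_due(client, queue, now=None):
    """Return one due message, or None if nothing is ready yet."""
    now = time.time() if now is None else now
    due = client.zrangebyscore(queue, 0, now, start=0, num=1)
    if not due:
        return None
    message = due[0]
    # ZREM returns 1 only for the caller that actually removed the member,
    # so two pollers cannot both claim the same message.
    if client.zrem(queue, message) == 1:
        return message
    return None
```

A consumer would call pop_due in a loop and sleep briefly whenever it returns None. Using the ZREM result as the claim is one simple design; a Lua script that reads and removes in a single step is a common alternative.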
Key Expiration and Eviction
- Expired Key Deletion: Redis combines passive (lazy) deletion (on key access) and active periodic deletion (sampling keys).
- Eviction Policies: Configured via maxmemory-policy, including:
  - noeviction: Deny writes when memory is full.
  - allkeys-lru/volatile-lru: Evict least recently used keys.
  - allkeys-random/volatile-random: Random eviction.
  - volatile-ttl: Evict keys with shortest TTL first.
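The two expiration paths (lazy deletion on access and periodic sampling) can be modeled in a few lines. This is a toy model, not Redis's implementation: the ExpiringDict class, the injected clock, and the sample size are all assumptions made for illustration.

```python
import random


class ExpiringDict:
    """Toy store showing passive (lazy) and active (sampled) expiration."""

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time
        self._values = {}
        self._expires_at = {}    # key -> absolute expiry time

    def set(self, key, value, ttl=None):
        self._values[key] = value
        if ttl is None:
            self._expires_at.pop(key, None)
        else:
            self._expires_at[key] = self._clock() + ttl

    def _expired(self, key):
        deadline = self._expires_at.get(key)
        return deadline is not None and deadline <= self._clock()

    def _delete(self, key):
        self._values.pop(key, None)
        self._expires_at.pop(key, None)

    def get(self, key):
        # Passive deletion: expiry is checked only when the key is accessed.
        if self._expired(key):
            self._delete(key)
            return None
        return self._values.get(key)

    def active_expire_cycle(self, sample_size=20):
        # Active deletion: sample keys that have a TTL and drop the expired ones.
        candidates = list(self._expires_at)
        sample = random.sample(candidates, min(sample_size, len(candidates)))
        removed = 0
        for key in sample:
            if self._expired(key):
                self._delete(key)
                removed += 1
        return removed
```

The model shows why both paths are needed: a key that is never read again is only reclaimed by the periodic cycle.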
Redis High Availability
- Replication: Master handles writes; replicas asynchronously sync data. Supports read scalability and failover.
- Sentinel: Monitors master/replica health, manages automatic failover, and provides configuration updates.
- Cluster: Shards data across nodes using hash slots (16384 slots). Each shard has master/replicas. Supports horizontal scaling and automatic failover.
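The mapping from a key to one of the 16384 hash slots can be sketched as below. Redis Cluster takes CRC16 of the key modulo 16384 and, when the key contains a non-empty {...} hash tag, hashes only the tag. The function name key_slot is illustrative.

```python
import binascii

SLOT_COUNT = 16384


def key_slot(key):
    """Compute the cluster hash slot for a key, honoring {hash tags}."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # A non-empty hash tag means only the text inside {...} is hashed.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC-16/XMODEM variant.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT
```

Keys that share a hash tag, such as {user1000}.following and {user1000}.followers, land in the same slot, which is what makes multi-key operations on them possible in a cluster.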
Cache Consistency
Ensure database-cache consistency with strategies:
- Cache Aside (Lazy Loading):
  - Read: Check cache; populate from DB on miss.
  - Write: Update DB, then invalidate cache.
- Write Through: Write to the cache and the DB in the same synchronous operation, so the cache does not lag behind the DB.
- Delayed Double Delete:
  - Delete cache.
  - Update DB.
  - Sleep (e.g., 1s), then delete cache again to evict stale values that concurrent reads, or reads from a lagging replica, may have re-cached.
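The Cache Aside read and write paths can be sketched with plain dictionaries standing in for the database and the cache. The CacheAsideStore class is a hypothetical illustration; a real setup would use a database client and a Redis client in place of the dicts.

```python
class CacheAsideStore:
    """Cache Aside over two dict-like stores (stand-ins for a DB and a cache)."""

    def __init__(self, db, cache):
        self.db = db
        self.cache = cache

    def read(self, key):
        # Read path: serve from the cache, fall back to the DB on a miss.
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value  # populate the cache for later reads
        return value

    def write(self, key, value):
        # Write path: update the DB first, then invalidate the cached copy.
        self.db[key] = value
        self.cache.pop(key, None)
```

Invalidating rather than updating the cache on write keeps the write path simple: the next read repopulates the entry from the database.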
Kafka Architecture
- Brokers: Kafka server instances forming a cluster.
- Topics: Logical categories for messages, partitioned for scalability.
- Producers: Publish messages to topics.
- Consumers: Subscribe to topics and process messages.
- Consumer Groups: Enable parallel consumption; within a group, each partition is consumed by exactly one member, while a member may own several partitions.
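The one-partition-one-member rule for consumer groups can be illustrated with a toy round-robin assignment. This is not one of Kafka's built-in assignors; the function name and the member naming are assumptions made for illustration.

```python
def assign_partitions(partitions, members):
    """Spread partitions over group members so each partition has one owner."""
    if not members:
        raise ValueError("a consumer group needs at least one member")
    ordered = sorted(members)
    assignment = {member: [] for member in ordered}
    for index, partition in enumerate(sorted(partitions)):
        # Each partition goes to exactly one member; members may get several.
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment
```

With more members than partitions, the extra members receive nothing, which is why adding consumers beyond the partition count does not increase parallelism.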
Kafka Reliability
- Replication: Each partition has a leader (handles reads/writes) and followers (replicate data from the leader).
- ISR (In-Sync Replicas): Subset of replicas caught up with the leader. Only ISR members can become leaders, unless unclean leader election is enabled.
- Acknowledgements:
  - acks=0: No guarantee.
  - acks=1: Leader write confirmation.
  - acks=all: All ISR replicas write confirmation.
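The three acknowledgement levels can be read as a small decision table. The sketch below is a toy model of when a producer would treat a write as acknowledged; the function and its parameters are hypothetical and not part of any Kafka client.

```python
def is_acknowledged(acks, leader_written, in_sync_written, isr_size):
    """Decide whether a producer would see the write as acknowledged.

    in_sync_written counts ISR members (leader included) holding the record.
    """
    if acks == "0":
        return True  # fire and forget: the producer does not wait at all
    if acks == "1":
        return leader_written  # only the leader's write is confirmed
    if acks == "all":
        # Every current in-sync replica must have the record.
        return leader_written and in_sync_written >= isr_size
    raise ValueError(f"unsupported acks value: {acks!r}")
```

The model makes the trade-off visible: acks=0 can report success for a record that was never written, while acks=all waits for the whole ISR at the cost of latency.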
Kafka Performance
- Batching: Producers send messages in batches.
- Sequential I/O: Writes messages to disk sequentially.
- Zero-Copy: Transfers data from the OS page cache directly to the network socket (e.g., via sendfile), bypassing user space.
- Partitioning: Distributes load across brokers.
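The zero-copy idea can be tried from Python with socket.sendfile, which uses os.sendfile where the platform supports it and otherwise falls back to ordinary send calls. This is only a sketch of the system-call pattern, not of Kafka's own code (Kafka runs on the JVM); the helper names and the socket pair used for the demo are assumptions.

```python
import socket
import tempfile


def send_file(sock, file_obj):
    """Send a whole binary file over a stream socket; return bytes sent."""
    file_obj.seek(0)
    # With os.sendfile available, the kernel moves the data from the file
    # to the socket without copying it through a user-space buffer.
    return sock.sendfile(file_obj)


def demo_zero_copy(payload):
    """Write payload to a temp file, ship it over a socket pair, read it back."""
    with tempfile.TemporaryFile() as segment:
        segment.write(payload)
        segment.flush()
        left, right = socket.socketpair()
        with left, right:
            sent = send_file(left, segment)
            left.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
            chunks = []
            while True:
                chunk = right.recv(65536)
                if not chunk:
                    break
                chunks.append(chunk)
    return sent, b"".join(chunks)
```

The demo keeps the payload small so it fits in the socket buffer; a real consumer would read concurrently rather than after the send completes.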
ZooKeeper in Kafka
- Manages cluster metadata (brokers, topics, partitions).
- Coordinates leader election (for partitions and cluster controller).
- Tracks consumer group offsets (before 0.9; later versions store offsets in the internal __consumer_offsets topic).
ZooKeeper Fundamentals
- ZNodes: Hierarchical data nodes (persistent, ephemeral, sequential).
- Watch Mechanism: Clients register watches and receive notifications for ZNode changes; a standard watch fires once and must be re-registered.
- ZAB Protocol: Ensures atomic broadcast and consistency across nodes.
- Use Cases: Distributed coordination, naming service, configuration management, cluster state synchronization.
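A common coordination recipe built on these primitives is leader election with ephemeral sequential ZNodes: each candidate creates one, and the lowest sequence number leads. The in-memory model below is a toy, not a ZooKeeper client; the ToyZooKeeper class and its methods are invented for illustration, and its delete callbacks are simpler than real per-node watches.

```python
import itertools


class ToyZooKeeper:
    """In-memory stand-in for ephemeral sequential nodes and delete events."""

    def __init__(self):
        self._seq = itertools.count()
        self._nodes = {}     # path -> owning session id
        self._watchers = []  # callbacks fired when a node is deleted

    def create_ephemeral_sequential(self, prefix, session_id):
        # A monotonically increasing, zero-padded suffix makes paths sortable.
        path = f"{prefix}{next(self._seq):010d}"
        self._nodes[path] = session_id
        return path

    def watch_deletes(self, callback):
        self._watchers.append(callback)

    def close_session(self, session_id):
        # Ephemeral nodes disappear together with the session that made them.
        gone = [p for p, s in self._nodes.items() if s == session_id]
        for path in gone:
            del self._nodes[path]
            for callback in self._watchers:
                callback(path)

    def leader(self):
        # The session owning the lowest-numbered node is the leader.
        return self._nodes[min(self._nodes)] if self._nodes else None
```

When the leader's session closes, its node is removed, the remaining candidates are notified, and the next-lowest node takes over without any explicit handoff.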
RabbitMQ Guarantees
- Message Persistence: Mark messages as persistent and use mirrored queues.
- Publisher Confirms: Asynchronous acknowledgements (ConfirmCallback).
- Consumer Acknowledgements: Manual ACKs after processing.
- Dead Letter Exchanges (DLX): Route unprocessable messages.
- TTL/Delayed Queues: Implement message deferral via per-message TTL or plugins.
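The dead-lettering and manual acknowledgement points can be sketched as follows. The sketch assumes channel is a pika-style channel exposing exchange_declare, queue_declare, basic_ack, and basic_nack. The function names are illustrative, and the TTL shown is the queue-level x-message-ttl argument rather than a per-message expiration.

```python
def declare_queue_with_dlx(channel, queue, dlx, ttl_ms):
    """Declare a durable queue whose expired or rejected messages go to dlx."""
    channel.exchange_declare(exchange=dlx, exchange_type="direct", durable=True)
    channel.queue_declare(
        queue=queue,
        durable=True,  # the queue definition survives a broker restart
        arguments={
            "x-dead-letter-exchange": dlx,  # where dead messages are routed
            "x-message-ttl": ttl_ms,        # queue-level TTL in milliseconds
        },
    )


def handle_delivery(channel, method, body, process):
    """Acknowledge only after processing; dead-letter the message on failure."""
    try:
        process(body)
    except Exception:
        # requeue=False lets the broker route the message to the DLX.
        channel.basic_nack(delivery_tag=method.delivery_tag, requeue=False)
    else:
        channel.basic_ack(delivery_tag=method.delivery_tag)
```

Acknowledging after process returns means a consumer crash mid-processing leaves the message unacknowledged, so the broker can redeliver it.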