Redis Sentinel Mode: Automatic Failover for High Availability
Overview
Manual master-slave failover requires human intervention: stop the application, reconfigure a slave as the new master, update configuration files, and restart. This process is time-consuming and introduces service downtime. Redis Sentinel, introduced in version 2.8, provides an automated solution to this problem.
Sentinel monitors masters in the background and automatically promotes a slave to master when the primary fails, based on a voting mechanism.
Sentinel operates as an independent daemon process. It communicates with Redis instances through commands and monitors their health status by analyzing responses.
Single Sentinel Architecture
A single sentinel performs two key functions:
- Queries Redis instances and retrieves their operational status, covering both master and slave nodes (see the example query below).
- Detects master failures and automatically promotes a slave, then uses the pub/sub mechanism to notify the remaining slaves to update their configuration files.
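Assuming a sentinel is already running on port 26379 and monitoring a master named master-node (the names used in the configuration section below), its view of the topology can be queried directly with redis-cli (on Redis versions before 5.0, sentinel slaves replaces sentinel replicas):
# Address of the master this sentinel currently considers authoritative
redis-cli -p 26379 sentinel get-master-addr-by-name master-node
# Details of every slave attached to that master
redis-cli -p 26379 sentinel replicas master-node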
Multi-Sentinel Architecture
A single sentinel creates a single point of failure. Multi-sentinel deployments solve this by having sentinels monitor each other, forming a distributed monitoring system.
When a master goes down, the sentinels move through two stages of failure detection:
Subjective Downtime: Sentinel 1 detects the failure and marks the master as unavailable locally. This alone does not trigger failover.
Objective Downtime: Additional sentinels confirm the failure. When the configured quorum threshold is met, sentinels vote on failover. A designated leader sentinel executes the failover, promotes the best slave, and broadcasts updates via pub/sub. All monitored slaves then reconfigure to follow the new master.
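Both states can be observed from any sentinel. A quick check, assuming the sentinel port and master name used in the configuration below:
# The flags field changes from master to master,s_down and then to
# master,s_down,o_down as subjective and objective downtime are reached
redis-cli -p 26379 sentinel master master-node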
Configuration and Testing
Prerequisites
Assume a standard master-slave setup with one master and two slaves is already operational.
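For reference, a minimal sketch of that topology, assuming the master listens on 127.0.0.1:6379 and the two slaves on ports 6380 and 6381 (the ports are only illustrative):
# redis-6380.conf (first slave)
port 6380
replicaof 127.0.0.1 6379

# redis-6381.conf (second slave)
port 6381
replicaof 127.0.0.1 6379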
Sentinel Configuration
Create a sentinel configuration file:
# sentinel-26379.conf
port 26379
dir /tmp
sentinel monitor master-node 127.0.0.1 6379 2
sentinel down-after-milliseconds master-node 30000
sentinel parallel-syncs master-node 1
sentinel failover-timeout master-node 180000
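For a multi-sentinel deployment, the same file is duplicated with only the port changed; with a quorum of 2, at least three sentinels should run so that a majority can still elect a leader if one of them fails. For example (ports 26380 and 26381 are illustrative):
# sentinel-26380.conf and sentinel-26381.conf differ only in the port line
port 26380
dir /tmp
sentinel monitor master-node 127.0.0.1 6379 2
sentinel down-after-milliseconds master-node 30000
sentinel parallel-syncs master-node 1
sentinel failover-timeout master-node 180000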
Starting the Sentinel
Launch the sentinel using the dedicated binary:
redis-sentinel sentinel-26379.conf
After startup, verify the master-slave relationships are correctly established on all nodes.
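A quick way to do that, assuming the ports used in this article:
# The master should report role:master and connected_slaves:2
redis-cli -p 6379 info replication
# The sentinel should already know the master's address
redis-cli -p 26379 sentinel get-master-addr-by-name master-node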
Simulating Master Failure
Gracefully stop the master Redis instance:
redis-cli -p 6379 shutdown
Examine the sentinel logs to observe the failover process. The sentinel first marks the master as subjectively down, then coordinates with the other sentinels to confirm objective downtime. Once quorum is reached, a leader is elected and initiates the failover. The selected slave receives the replicaof no one command to stop replicating and become the new master.
Post-Failover Verification
Confirm the promoted slave accepts write commands and the remaining slave successfully replicates from the new master. Check sentinel logs for confirmation messages indicating successful failover completion.
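A possible verification sequence, assuming the slave on port 6380 was promoted (which node wins promotion is not deterministic, so adjust the port accordingly):
# Sentinel should now report the promoted node's address
redis-cli -p 26379 sentinel get-master-addr-by-name master-node
# The promoted node reports role:master and accepts writes
redis-cli -p 6380 info replication
redis-cli -p 6380 set failover-test ok
# The remaining slave should report role:slave with the new master's address
redis-cli -p 6381 info replication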
Restoring the Original Master
Restart the former master instance. Sentinel detects it rejoining and automatically reconfigures it as a slave to the new master, maintaining the replicated topology.
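For example, assuming the original master's configuration file is redis-6379.conf (the filename is only illustrative):
redis-server redis-6379.conf
# After the sentinel reconfigures it, expect role:slave pointing at the promoted node
redis-cli -p 6379 info replication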
Key Configuration Parameters
# Connection port
port 26379
# Working directory for temp files
dir /tmp
# Monitored master definition
# quorum determines how many sentinels must agree before failover proceeds
sentinel monitor master-node 127.0.0.1 6379 2
# Authentication password (if master requires auth)
sentinel auth-pass master-node mypassword
# Timeout before marking master as unavailable
sentinel down-after-milliseconds master-node 30000
# Concurrent replications during failover (lower value = longer failover)
sentinel parallel-syncs master-node 1
# Failover timeout (3 minutes default)
sentinel failover-timeout master-node 180000
# Notification script for alerting on failures
sentinel notification-script master-node /opt/redis/alerts.sh
# Script to notify applications of master address changes
sentinel client-reconfig-script master-node /opt/redis/notify-apps.sh
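As a minimal sketch of what a client-reconfig-script might look like: Sentinel invokes it with the arguments <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>, and the script below simply appends the new master address to a log file (the log path is an assumption for this example). Both configured scripts must exist and be executable, otherwise the sentinel refuses to start.
#!/bin/sh
# /opt/redis/notify-apps.sh -- sketch of a client-reconfig-script
# Arguments supplied by Sentinel:
#   $1 master-name  $2 role  $3 state  $4 from-ip  $5 from-port  $6 to-ip  $7 to-port
MASTER_NAME="$1"
NEW_IP="$6"
NEW_PORT="$7"
# Record the new master address somewhere the application can read it
echo "$(date) ${MASTER_NAME} failed over to ${NEW_IP}:${NEW_PORT}" >> /var/log/redis-failover.log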
Advantages
- Inherits all benefits of master-slave replication while adding automatic failover capability.
- Improves system availability through automatic master election and slave promotion.
- Upgrades failover from a manual to an automatic operation, reducing downtime during failures.
Limitations
- Cluster capacity is fixed at startup; online scaling requires Redis Cluster, which Sentinel alone does not provide.
- Configuration complexity increases with the number of sentinels, quorum settings, and notification scripts.
- Network partitioning can trigger false positives, requiring careful quorum tuning for specific environments.
Failover Decision Logic
When selecting a new master, Sentinel prioritizes candidates by:
- Slave priority, configured via replica-priority in each replica's configuration (lower values are preferred).
- Replication offset from the old master (a higher offset means the slave has processed more data and is more up to date).
- Run ID (the lexicographically smaller value wins, though this rarely matters).
This ensures the highest-priority, most up-to-date slave becomes the new master.
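The priority is set on each replica, not on the sentinel. A minimal sketch in a replica's redis.conf:
# Lower values are preferred during promotion; 0 means this replica
# is never promoted by Sentinel
replica-priority 100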