Building High-Availability LVS Clusters with Keepalived
Overview
Keepalived provides high availability for LVS load balancers through VRRP (Virtual Router Redundancy Protocol). The solution enables automatic failover of the virtual IP address between master and backup nodes, ensuring uninterrupted service when the primary scheduler fails.
Core Concepts
VIP Failover Mechanism
The virtual IP address remains active only on the master node during normal operation. When the master becomes unavailable, the VIP automatically migrates to the backup server. Once the master recovers and rejoins the cluster, the VIP returns based on priority settings.
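A simple way to watch this behavior during a failover test is to monitor the interface on each node; the sketch below assumes the VIP and interface used later in this guide (192.168.10.100 on ens33):

# Run on both nodes; the VIP should be listed only on the current master
watch -n 1 "ip -4 addr show dev ens33 | grep 192.168.10.100"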
Key Configuration Requirements
- Priority settings must be properly configured on both nodes
- Virtual IP address and real server definitions must match
- Virtual router IDs must be identical across master and backup
- Each node requires a unique identifier (router_id); a summary of which fields differ versus match follows this list
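For quick reference, the fragment below marks which settings differ per node and which must be identical, using the example values from the configurations later in this guide:

# Differ per node:
router_id LVS_MASTER          # LVS_BACKUP on the backup node
state MASTER                  # BACKUP on the backup node
priority 120                  # 100 on the backup node (lower than the master)
# Identical on both nodes:
virtual_router_id 51
auth_pass cluster_pass
virtual_ipaddress {
    192.168.10.100/24 dev ens33
}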
VRRP Operation
Keepalived implements high availability using the VRRP protocol with these characteristics:
- Multicast Communication: Nodes communicate via multicast address 224.0.0.18 to verify peer availability (see the capture example after this list)
- Priority-Based Selection: The node with higher priority becomes master
- Automatic Failover: When master fails, backup takes over; when master recovers, it reassumes control
- VIP Migration: Only the VIP address switches between nodes
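To confirm that VRRP advertisements are actually being exchanged, you can capture the multicast traffic on either node (the interface name matches the configuration used below):

# VRRP advertisements are sent by the current master to 224.0.0.18 (IP protocol 112)
tcpdump -i ens33 -n host 224.0.0.18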
Health Check and Failover
To ensure proper failover when nginx fails, the monitoring script must stop keepalived whenever nginx is no longer running. With keepalived stopped on the master, VRRP advertisements cease and the VIP migrates to the backup node.
Implementation
Installing Keepalived
yum -y install keepalived
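To confirm the package installed correctly, check the installed version:

keepalived --version
rpm -q keepalived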
Creating the Health Check Script
Create /healthcheck/check_nginx.sh with the following contents:
#!/bin/bash
# Stop keepalived when nginx is not running so the VIP fails over to the backup
nginx_pid=$(pgrep -x nginx)
if [ -z "$nginx_pid" ]; then
    /usr/bin/systemctl stop keepalived
fi
Make the script executable:
chmod +x /healthcheck/check_nginx.sh
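Once keepalived is configured and running (next sections), it is worth exercising the script by hand before relying on it; a minimal manual test:

# Simulate an nginx failure and confirm the script stops keepalived
systemctl stop nginx
/healthcheck/check_nginx.sh
systemctl is-active keepalived    # should now report "inactive"
systemctl start nginx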
Master Node Configuration
Edit /etc/keepalived/keepalived.conf:
global_defs {
    router_id LVS_MASTER              # unique identifier for this node
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
    vrrp_iptables
}
vrrp_script monitor_nginx {
    script "/healthcheck/check_nginx.sh"
    interval 5                        # run the check every 5 seconds
    weight -20                        # lower priority by 20 if the check fails
}
vrrp_instance VI_CLUSTER {
    state MASTER
    interface ens33
    virtual_router_id 51              # must match the backup node
    priority 120                      # higher than the backup (100)
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass cluster_pass        # must match the backup node
    }
    virtual_ipaddress {
        192.168.10.100/24 dev ens33
    }
    track_script {
        monitor_nginx
    }
}
Start keepalived:
systemctl enable keepalived
systemctl start keepalived
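As an optional sanity check, newer keepalived releases (2.x) can validate the configuration file, and the VRRP state transitions are visible in the journal; older builds may lack the config-test flag:

# Syntax-check the configuration (supported by keepalived 2.x)
keepalived -t -f /etc/keepalived/keepalived.conf
# Confirm keepalived entered the MASTER state and added the VIP
journalctl -u keepalived -n 20 --no-pager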
Backup Node Configuration
Edit /etc/keepalived/keepalived.conf on the backup node:
global_defs {
    router_id LVS_BACKUP
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
    vrrp_iptables
}
vrrp_script monitor_nginx {
    script "/healthcheck/check_nginx.sh"
    interval 5
    weight -20
}
vrrp_instance VI_CLUSTER {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass cluster_pass
    }
    virtual_ipaddress {
        192.168.10.100/24 dev ens33
    }
    track_script {
        monitor_nginx
    }
}
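Enable and start keepalived on the backup node as well:
systemctl enable keepalived
systemctl start keepalived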
Verification
Check Cluster Status
On the master node:
ip addr show ens33
systemctl status keepalived
Test Failover
- Stop nginx on the master:
systemctl stop nginx
- Within about 5 seconds (the script interval), verify keepalived has stopped on the master and the VIP has migrated to the backup
- Verify from a client that requests to the VIP are now served by the backup node (see the example below)
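A simple client-side check is to poll the VIP continuously while stopping nginx on the master; the loop below assumes the VIP from the configuration above:

# Poll the VIP once per second; after a brief interruption while the VIP moves,
# responses should continue, now answered via the backup node
while true; do curl -s -o /dev/null -w "%{http_code}\n" --max-time 2 http://192.168.10.100/; sleep 1; done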
Recovery Test
- Restart nginx on the master, then start keepalived again (the health check script stopped it, so it does not come back on its own):
systemctl start nginx
systemctl start keepalived
- Verify the VIP returns to the master node once it resumes advertising with the higher priority
Notes
Keepalived was designed specifically for LVS but works with other load balancing solutions. The architecture handles scheduler-level failures only; actual traffic distribution still depends on the underlying LVS configuration.
Split-Brain Prevention: Ensure network connectivity between nodes is reliable. Network partitions can cause both nodes to claim master status. Implement redundant network paths or use external monitoring to detect and resolve split-brain scenarios (a minimal detection sketch follows).
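As an illustration only, the sketch below checks from a third monitoring host whether more than one node currently holds the VIP; the hostnames lvs-master and lvs-backup and the use of passwordless SSH are assumptions, not part of the configuration above:

#!/bin/bash
# Hypothetical split-brain check: run from a monitoring host with SSH access to both nodes
VIP="192.168.10.100"
NODES="lvs-master lvs-backup"      # assumed hostnames
holders=0
for node in $NODES; do
    if ssh "$node" "ip -4 addr show dev ens33" | grep -q "$VIP"; then
        holders=$((holders + 1))
    fi
done
if [ "$holders" -gt 1 ]; then
    echo "WARNING: split brain - $holders nodes currently hold $VIP" >&2
fi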