Building High-Availability LVS Clusters with Keepalived

Tech May 14 1

Overview

Keepalived provides high availability for LVS load balancers through VRRP (Virtual Router Redundancy Protocol). The solution enables automatic failover of the virtual IP address between master and backup nodes, ensuring uninterrupted service when the primary scheduler fails.

Core Concepts

VIP Failover Mechanism

The virtual IP address is active only on the master node during normal operation. When the master becomes unavailable, the VIP automatically migrates to the backup server. Once the master recovers and rejoins the cluster, it reclaims the VIP because it has the higher priority (keepalived preempts by default unless nopreempt is set).

Key Configuration Requirements

  1. Priority must be set on both nodes, with the master higher than the backup
  2. Virtual IP address and real server definitions must match on both nodes
  3. Virtual router IDs (virtual_router_id) must be identical across master and backup
  4. Each node requires a unique identifier (router_id)
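
The requirements above can be checked mechanically before starting the cluster. The following is a minimal sketch, not part of keepalived itself: the function names conf_value, vip_list, and check_vrrp_pair are hypothetical, and it only inspects the VRRP settings (real server definitions are not compared).

```shell
# Print the first value of a keepalived directive (e.g. "priority") in a file.
conf_value() {
    awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# Print the addresses listed inside virtual_ipaddress { ... }, sorted.
vip_list() {
    awk '/^[ \t]*virtual_ipaddress[ \t]*\{/ { inside = 1; next }
         inside && /\}/ { inside = 0 }
         inside && NF { print $1 }' "$1" | sort
}

# check_vrrp_pair MASTER_CONF BACKUP_CONF
# Prints one line per problem and returns non-zero if any were found.
check_vrrp_pair() (
    master=$1 backup=$2 problems=0
    fail() { echo "PROBLEM: $*"; problems=$((problems + 1)); }

    [ "$(conf_value priority "$master")" -gt "$(conf_value priority "$backup")" ] \
        || fail "master priority is not higher than backup priority"
    [ "$(vip_list "$master")" = "$(vip_list "$backup")" ] \
        || fail "virtual_ipaddress differs"
    [ "$(conf_value virtual_router_id "$master")" = "$(conf_value virtual_router_id "$backup")" ] \
        || fail "virtual_router_id differs"
    [ "$(conf_value router_id "$master")" != "$(conf_value router_id "$backup")" ] \
        || fail "router_id is not unique"

    [ "$problems" -eq 0 ]
)
```

A typical call, assuming both files have been copied to one machine, would be: check_vrrp_pair master.conf backup.conf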

VRRP Operation

Keepalived implements high availability using the VRRP protocol with these characteristics:

  • Multicast Communication: Nodes communicate via multicast address 224.0.0.18 to verify peer availability
  • Priority-Based Selection: The node with higher priority becomes master
  • Automatic Failover: When master fails, backup takes over; when master recovers, it reassumes control
  • VIP Migration: Only the VIP address switches between nodes
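
The priority-based selection described above can be illustrated with a small sketch. This is only a model of the rule "highest priority wins", not keepalived's implementation: the function name elect_master is hypothetical, and ties (which VRRP resolves by comparing the nodes' primary IP addresses) are not modeled.

```shell
# elect_master NAME:PRIORITY...
# Prints the name of the live node with the highest priority.
elect_master() {
    printf '%s\n' "$@" | sort -t: -k2,2nr | head -n 1 | cut -d: -f1
}
```

With both nodes alive, elect_master LVS_MASTER:120 LVS_BACKUP:100 prints LVS_MASTER. If the master is down and only LVS_BACKUP:100 is passed, the backup is selected.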

Health Check and Failover

In this setup, failover on an nginx failure works by having the monitoring script stop keepalived whenever nginx is no longer running. Once keepalived stops, the node no longer sends VRRP advertisements, and the VIP migrates to the backup node.

Implementation

Installing Keepalived

yum -y install keepalived

Creating the Health Check Script

#!/bin/bash
# /healthcheck/check_nginx.sh
# Stop keepalived when nginx is not running so the VIP moves to the backup node.
nginx_pid=$(pgrep -x nginx)
if [ -z "$nginx_pid" ]; then
    /usr/bin/systemctl stop keepalived
fi

Make the script executable:

chmod +x /healthcheck/check_nginx.sh
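
Before wiring the script into keepalived, it can help to dry-run the same decision without actually stopping anything. The sketch below mirrors the script's logic but takes the process name and the action as arguments; the function name check_and_act is hypothetical and not part of the original script.

```shell
# check_and_act PROCESS ACTION...
# Runs ACTION only when no process named PROCESS exists, mirroring the
# health check script. Passing a harmless ACTION makes this a dry run.
check_and_act() {
    proc_name=$1
    shift
    if pgrep -x "$proc_name" >/dev/null 2>&1; then
        echo "ok: $proc_name is running"
    else
        echo "down: $proc_name not found, running: $*"
        "$@"
    fi
}
```

For example, check_and_act nginx echo "would stop keepalived" reports what the real script would do on this node without touching the keepalived service.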

Master Node Configuration

Edit /etc/keepalived/keepalived.conf:

global_defs {
   router_id LVS_MASTER
   vrrp_skip_check_adv_addr
   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
   vrrp_iptables
}

vrrp_script monitor_nginx {
   script "/healthcheck/check_nginx.sh"
   interval 5
   weight -20
}

vrrp_instance VI_CLUSTER {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 120
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass cluster_pass
    }
    virtual_ipaddress {
        192.168.10.100/24 dev ens33
    }
    track_script {
        monitor_nginx
    }
}

Start keepalived:

systemctl enable keepalived
systemctl start keepalived

Backup Node Configuration

global_defs {
   router_id LVS_BACKUP
   vrrp_skip_check_adv_addr
   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
   vrrp_iptables
}

vrrp_script monitor_nginx {
   script "/healthcheck/check_nginx.sh"
   interval 5
   weight -20
}

vrrp_instance VI_CLUSTER {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass cluster_pass
    }
    virtual_ipaddress {
        192.168.10.100/24 dev ens33
    }
    track_script {
        monitor_nginx
    }
}
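
The two configurations above differ only in router_id, state, and priority. As a hedged sketch, the backup file can be derived from the master file with a small sed filter rather than edited by hand. The function name master_to_backup is hypothetical, and the values it writes are the ones used in this article.

```shell
# master_to_backup
# Reads the master keepalived.conf on stdin and writes the backup variant
# to stdout, changing only router_id, state, and priority.
master_to_backup() {
    sed -e 's/router_id LVS_MASTER/router_id LVS_BACKUP/' \
        -e 's/state MASTER/state BACKUP/' \
        -e 's/^\([[:space:]]*priority\)[[:space:]].*/\1 100/'
}
```

Usage, assuming the master file has been copied locally as master.conf: master_to_backup < master.conf > backup.conf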

Verification

Check Cluster Status

On the master node:

ip addr show ens33
systemctl status keepalived
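
Reading the ip output by eye works, but the check can also be scripted. The sketch below assumes the VIP and interface from this article; the function name holds_vip is hypothetical. It compares the address exactly, so 192.168.10.10 is not mistaken for 192.168.10.100.

```shell
# holds_vip VIP
# Reads `ip addr` output on stdin and succeeds if VIP is among the
# IPv4 addresses listed (exact match on the address, prefix length ignored).
holds_vip() {
    awk -v vip="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), parts, "/")
                    if (parts[1] == vip) found = 1
                }
        }
        END { exit !found }'
}
```

On either node, this could be used as: ip -o -4 addr show dev ens33 | holds_vip 192.168.10.100 && echo "VIP is here" || echo "VIP is elsewhere"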

Test Failover

  1. Stop nginx on the master:
systemctl stop nginx
  2. Within about 5 seconds (the check interval), verify that keepalived has stopped and the VIP has migrated
  3. Verify from a client that requests to the VIP are now served by the backup node
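
To watch the failover from a client, a simple probe loop is often enough. The following is a sketch only: the function name probe_loop is hypothetical, and the curl example assumes nginx answers plain HTTP on port 80 of the VIP, which the article does not state.

```shell
# probe_loop COUNT DELAY CMD...
# Runs CMD COUNT times, DELAY seconds apart, and prints whether each
# run succeeded. CMD is any command whose exit status signals health.
probe_loop() (
    count=$1 delay=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "probe $i: ok"
        else
            echo "probe $i: FAILED"
        fi
        i=$((i + 1))
        if [ "$i" -le "$count" ]; then
            sleep "$delay"
        fi
    done
)
```

Started before stopping nginx on the master, a call such as probe_loop 30 1 curl -fsS --max-time 2 http://192.168.10.100/ should show a short run of failures around the switchover, followed by successful probes once the backup holds the VIP.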

Recovery Test

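  1. Restart nginx on the master, then start keepalived again (the health check script stopped it, and it does not restart on its own):
systemctl start nginx
systemctl start keepalived
  2. Verify that the VIP returns to the master node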

Notes

Keepalived was designed specifically for LVS but works with other load balancing solutions. The architecture handles scheduler-level failures only; actual traffic distribution still depends on the underlying LVS configuration.

Split-Brain Prevention: Ensure network connectivity between nodes is reliable. Network partitions can cause both nodes to claim master status. Implement redundant network paths or use external monitoring to detect and resolve split-brain scenarios.
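
As one possible form of external monitoring, a third host can collect each node's state and flag the case where more than one node claims to be master. The sketch below assumes the states have already been gathered as "NODE STATE" lines (for example, by checking over SSH which nodes hold the VIP); the function name detect_split_brain and the node names are hypothetical.

```shell
# detect_split_brain
# Reads "NODE STATE" lines on stdin. Fails and names the nodes if more
# than one reports MASTER; otherwise prints the number of masters seen.
detect_split_brain() {
    awk '$2 == "MASTER" { n++; names = names " " $1 }
         END {
             if (n > 1) { print "split-brain:" names; exit 1 }
             printf "ok: %d master(s)\n", n
         }'
}
```

For instance, feeding it the lines "lb1 MASTER" and "lb2 MASTER" prints "split-brain: lb1 lb2" and returns a non-zero status that an alerting job could act on.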