
May 10, 2023 CentOS Linux RHEL

Keepalived: Configuring High Availability with IP Failover on CentOS/RHEL

In this article, we’ll build a highly available failover configuration of two Squid proxy servers (on Linux) that provide Internet access from a corporate LAN. To build the failover configuration, we will create an HA cluster using keepalived.
An HA cluster is a group of servers with built-in redundancy that minimizes application downtime if any server in the group has a hardware or software issue. According to this definition, an HA cluster must implement the following to operate correctly:

The server state check;
The automatic switching of resources in case of a server failure.

Keepalived provides both. Keepalived is a Linux system daemon that enables service failover and load balancing. Failover is provided by a floating IP address that switches to another server if the main one fails. To switch the IP address between the servers automatically, keepalived uses VRRP (Virtual Router Redundancy Protocol, https://www.ietf.org/rfc/rfc2338.txt).

Contents:

Principles of VRRP
Install and Configure keepalived on CentOS
How to Perform a Healthcheck of App or Interface with Keepalived?
Testing Keepalived Failover in Case of a Failure

Principles of VRRP

First of all, let’s consider some theory and the main VRRP definitions.

VIP — Virtual IP, a virtual IP address able to automatically switch between the servers in case of a failure;
Master — a server the VIP is currently active on;
Backup — servers the VIP will switch to in case of a Master failure;
VRID — Virtual Router ID, the servers that share a virtual IP (VIP) form a so-called virtual router and its unique identifier may have a value between 1 and 255. A server may belong to multiple VRIDs at a time, but every VRID must have a unique virtual IP address.

Basic operation algorithm:

At fixed intervals, the Master server sends VRRP packets (heartbeats) to the dedicated multicast address 224.0.0.18, and all Backup servers listen on this address. Multicast means that there is one sender and multiple recipients;
Hint. For the servers to work in multicast mode, your network equipment must support multicast traffic.
If a Backup server does not receive any heartbeat packets, it starts the Master election procedure. If the server becomes the Master by priority, it activates the VIP and sends a gratuitous ARP. The gratuitous ARP is a special type of ARP response that updates the MAC tables on the network switches to announce the new owner of the virtual IP address and the MAC address that traffic must be redirected to.
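How long a server waits before it decides that the heartbeats have stopped can be estimated from the VRRPv2 specification: Master_Down_Interval = 3 × Advertisement_Interval + Skew_Time, where Skew_Time = (256 − Priority) / 256 seconds. The sketch below only does this arithmetic; the function name master_down_interval is made up for illustration, and the advertisement interval is assumed to be keepalived's default of 1 second (the advert_int option).

```shell
# Estimate the VRRPv2 master-down interval in seconds.
# $1 = priority of the waiting server, $2 = advertisement interval (default 1)
master_down_interval() {
  local priority=$1 advert_int=${2:-1}
  # 3 missed advertisements plus a priority-dependent skew
  awk -v p="$priority" -v a="$advert_int" \
    'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

master_down_interval 100   # a server with priority 100
master_down_interval 255   # a server with priority 255
```

The skew term is what lets the server with the highest priority time out first, so it is the first to claim the Master role when the heartbeats stop.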

Install and Configure keepalived on CentOS

We will install and configure keepalived on two Linux servers (proxy-serv01 and proxy-serv02) running CentOS 7 with Squid installed. In our scheme, we will use the simplest load-balancing method, Round Robin DNS. With this method, a single DNS name has multiple registered IP addresses, and clients get these addresses in turn. So we will need two virtual IP addresses registered for one DNS name (proxy-serv). Here is the network diagram:

Each Linux server has two physical network interfaces: eth1 with the public (white) IP address and eth0 in the local network.

The following server IP addresses are used as real ones:

192.168.2.251 — for proxy-serv01
192.168.2.252 — for proxy-serv02

The following IP addresses will be used as virtual ones that are automatically switched between servers in case of a failure:

192.168.2.101
192.168.2.102

Important. When you configure VRRP, never use a real server IP address as a virtual one: if a server fails, its address moves to the next server, and after failback the first server may be isolated from the network. The reason is that to get the IP address back, a server has to send VRRP packets to the network, but it has no IP address to send them from.

You can install keepalived on both servers using the yum package manager (or dnf on CentOS 8):

# yum install keepalived

After the installation is complete, edit the keepalived configuration file on both servers:

# nano /etc/keepalived/keepalived.conf

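The parameters that differ between the two servers are the following (the complete files are given at the end of the article):

proxy-serv01: state MASTER and priority 255 for proxy_ip1; state BACKUP and priority 100 for proxy_ip2
proxy-serv02: state BACKUP and priority 100 for proxy_ip1; state MASTER and priority 255 for proxy_ip2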

Let’s describe the options in more detail:

vrrp_instance <name> — is the section that defines a VRRP instance;
state <MASTER|BACKUP> — is the initial node state at the startup;
interface <interface name> — is the interface VRRP is running on;
virtual_router_id <number from 1 to 255> — is the unique VRRP instance identifier, it must be the same on all servers;
priority <number from 1 to 255> — sets the server priority, a server with a higher priority becomes a MASTER;
virtual_ipaddress — is a block of virtual IP addresses active on a server in the MASTER state. They must be the same on all servers inside the VRRP instance.

Note. Many examples of VRRP configuration use the authentication option. However, the keepalived documentation mentions that authentication was removed from VRRPv2 in the RFC 3768 specification (https://tools.ietf.org/html/rfc3768) in 2004, because it did not provide real security. Using this configuration option is not recommended.

If the current network configuration doesn’t allow multicast, keepalived provides a unicast option, i.e. VRRP heartbeat packets are sent directly to the servers in a list. To use unicast, you will need the following options:

unicast_src_ip — is the source address for VRRP packets;
unicast_peer — is the block of server IP addresses, to which VRRP packets will be sent.
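As an illustration of how these two options fit together, here is a small shell helper that prints the fragment to place inside a vrrp_instance block. It is only a sketch: the function name unicast_fragment is hypothetical, and the addresses are the real server addresses from this article.

```shell
# Print a unicast fragment for one node.
# $1 = this node's real IP, remaining arguments = real IPs of the other nodes
unicast_fragment() {
  local src=$1
  shift
  printf ' unicast_src_ip %s\n' "$src"
  printf ' unicast_peer {\n'
  printf '  %s\n' "$@"      # one peer address per line
  printf ' }\n'
}

unicast_fragment 192.168.2.251 192.168.2.252   # fragment for proxy-serv01
```

On proxy-serv02 the arguments are swapped, since each node lists itself as the source and every other node as a peer.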

Thus, our configuration defines two VRRP instances, proxy_ip1 and proxy_ip2. In normal operation, proxy-serv01 is the MASTER for the virtual IP 192.168.2.101 and the BACKUP for 192.168.2.102, and vice versa, proxy-serv02 is the MASTER for the virtual IP 192.168.2.102 and the BACKUP for 192.168.2.101.

If a firewall is enabled on a server, you will have to add rules that allow multicast traffic and VRRP using iptables:

# iptables -A INPUT -i eth0 -d 224.0.0.0/8 -j ACCEPT
# iptables -A INPUT -p vrrp -i eth0 -j ACCEPT
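CentOS 7 uses firewalld by default, so the same permission may have to be granted there instead of with raw iptables rules. The following is a sketch, not part of the original setup: the function name allow_vrrp_firewalld and the RUN dry-run switch are made up for illustration, and the rule is added to the default zone.

```shell
# Dry run by default: commands are only printed.
# Set RUN= (empty) in the environment and run as root to actually apply them.
RUN="${RUN-echo}"

allow_vrrp_firewalld() {
  # Permanently accept the VRRP protocol, then reload the rules
  $RUN firewall-cmd --permanent --add-rich-rule='rule protocol value="vrrp" accept'
  $RUN firewall-cmd --reload
}
```

Calling allow_vrrp_firewalld first in dry-run mode lets you review the commands before they change the firewall.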

Enable the keepalived service to start automatically at boot and start it on both servers:

# systemctl enable keepalived
# systemctl start keepalived

After keepalived has been started, virtual IP addresses will be assigned to the interfaces from your configuration file. Let’s view the current eth0 IP addresses of the servers:

# ip a show eth0

[Screenshots: output of ip a show eth0 on proxy-serv01 and on proxy-serv02]
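To check from a script whether a node currently holds a virtual address, the output of the ip command can be filtered. This is a minimal sketch; the function name holds_vip is hypothetical, and it expects the one-line-per-address format produced by ip -o -4 addr show on its standard input.

```shell
# Succeeds (exit code 0) if the address $1 appears in the
# `ip -o -4 addr show` output read from stdin.
holds_vip() {
  # Fixed-string match on " inet <address>/" avoids partial matches
  # such as 192.168.2.10 against 192.168.2.101
  grep -qF " inet $1/"
}

# Typical use on a node (interface and address are the ones from this article):
#   ip -o -4 addr show dev eth0 | holds_vip 192.168.2.101 && echo "VIP is here"
```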

How to Perform a Healthcheck of App or Interface with Keepalived?

VRRP itself monitors the server state. For example, this is useful when a physical server, a switch, or a server NIC port fails. However, other issues may also occur:

A proxy server (or other app) error — the clients accessing the virtual address of the server will get an error message in their browsers that the proxy server is unavailable;
The Internet access failure of the second interface — the clients accessing the virtual address of the server will get an error message in their browsers that the connection could not be established.

To handle the situations described above, use the following options:

track_interface — monitors the interface state, and sets the FAULT state for a VRRP instance if one of the listed interfaces is DOWN;
track_script — performs a healthcheck of your HA app using a script that returns 0 if the check has been successful, or 1 if the check failed.

Update the configuration by adding eth1 interface monitoring (by default, the VRRP instance will check the interface it is bound to: it is eth0 in the current configuration).

track_interface {
  eth1
}

The track_script directive runs a script with the parameters determined by the vrrp_script block in the following format:

vrrp_script <name> {
 script <"path to the executable file">
 interval <number, seconds> - the periodicity of running the script, 1 second by default
 fall <number> - the number of times when the script has returned a value different from zero to switch to the FAULT state
 rise <number> - the number of times when the script has returned a zero value to get out of the FAULT state (failback)
 timeout <number> - the time to wait for the script to return a result (if the time is up, the script is considered to have returned a non-zero value)
 weight <number> - the value, by which the server priority will be decreased in case it gets the FAULT state. The default value is 0, which means that the server gets the FAULT state after the script has failed for a number of times set in the fall parameter
}

Let’s configure Squid proxy health checking. Using this command, you can check if the squid process is active:

# squid -k check

Create the vrrp_script running every 3 seconds. This block is defined outside the vrrp_instance blocks.

vrrp_script chk_squid_service {
 script "/usr/sbin/squid -k check"
 interval 3
}

Add this script to the monitoring in both vrrp_instance blocks:

track_script {
 chk_squid_service
}

If Squid fails, the virtual IP address will be switched to another server.
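Any executable that returns 0 on success and a non-zero code on failure can be used in a vrrp_script block. As a sketch of an alternative check, the script below succeeds only if a TCP connection to the proxy port can be opened. The function name check_tcp_port, the 2-second limit, and the host and port passed to it are assumptions for illustration.

```shell
#!/bin/bash
# Succeeds only if a TCP connection to host:port can be opened.
check_tcp_port() {
  local host=$1 port=$2
  # /dev/tcp is a bash feature; timeout (coreutils) bounds the attempt to 2 seconds
  timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# When called with a host and a port, exit 0 if the port answers, 1 otherwise
if [ "$#" -eq 2 ]; then
  check_tcp_port "$1" "$2" && exit 0
  exit 1
fi
```

Saved, for example, as /usr/local/bin/check_squid_port.sh (a hypothetical path), it could be referenced as script "/usr/local/bin/check_squid_port.sh 127.0.0.1 3128", provided that Squid listens on that address and port.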

You can specify any additional actions to be done if the server state changes.

If Squid is configured to accept connections on any interface, i.e. http_port 0.0.0.0:3128, no problems occur when the virtual IP address is switched, and Squid accepts connections to the new address. However, if specific IP addresses are configured, e.g.:

http_port 192.168.2.101:3128
http_port 192.168.2.102:3128

Squid won’t know that a new address to listen on for client requests has appeared in the system. For situations where additional actions are necessary after the virtual IP address has been switched, keepalived can run a script when the server state changes, for example, from MASTER to BACKUP or vice versa. This is implemented using the following option:

notify "path to the executable file"
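Keepalived calls the notify script with three arguments: the object type (GROUP or INSTANCE), its name, and the new state (MASTER, BACKUP or FAULT). The script below is a sketch of what such a handler might look like. The function name action_for_state is hypothetical, and it is an assumption that squid -k reconfigure is enough for Squid to start listening on the newly added address; a full restart may be needed in your setup.

```shell
#!/bin/bash
# Hypothetical notify handler.
# $1 = GROUP or INSTANCE, $2 = name, $3 = new state (MASTER, BACKUP or FAULT)

# Print the action to take for a given state
action_for_state() {
  case "$1" in
    MASTER) echo reconfigure ;;    # a virtual IP has just been added on this node
    BACKUP|FAULT) echo none ;;     # the virtual IP has left; nothing to do in this sketch
    *) echo unknown; return 1 ;;
  esac
}

# Runs only when keepalived passes its three arguments
if [ "$#" -ge 3 ]; then
  action=$(action_for_state "$3") || exit 1
  logger -t keepalived-notify "$1 $2 -> $3 (action: $action)"
  if [ "$action" = reconfigure ]; then
    /usr/sbin/squid -k reconfigure   # assumption: makes Squid reopen its http_port sockets
  fi
fi
```

Keeping the decision in a separate function makes it possible to test the state handling without touching the running Squid service.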

Testing Keepalived Failover in Case of a Failure

After you have configured the virtual IPs, make sure that failures are handled correctly. The first check is a server failure simulation. Disable eth0 on proxy-serv01 so that it stops sending VRRP heartbeat packets. proxy-serv02 must then activate the virtual IP address 192.168.2.101. Check it with this command:

# ip a show eth0

[Screenshots: output of ip a show eth0 on proxy-serv01 and on proxy-serv02]

As expected, proxy-serv02 activated the virtual IP address 192.168.2.101. Let’s see what has been written in the logs using the following command:

# grep -i keepalived /var/log/messages

On proxy-serv01:

Keepalived_vrrp[xxxxx]: Kernel is reporting: interface eth0 DOWN
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Entering FAULT STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) removing protocol VIPs.
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Now in FAULT state

Keepalived receives a signal that eth0 is in the DOWN state and sets the FAULT state for the proxy_ip1 VRRP instance, thus freeing up the virtual IP addresses.

On proxy-serv02:

Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Transition to MASTER STATE

Keepalived sets the MASTER state for the proxy_ip1 VRRP instance, activates the IP address 192.168.2.101 on eth0 and sends the gratuitous ARP.

Then make sure that when you enable eth0 on proxy-serv01 again, the virtual IP address 192.168.2.101 is switched back.

On proxy-serv01:

Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) forcing a new MASTER election
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Transition to MASTER STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Entering MASTER STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) setting protocol VIPs.
Keepalived_vrrp[xxxxx]: Sending gratuitous ARP on eth0 for 192.168.2.101

Keepalived receives a signal that eth0 is back and starts the MASTER election for the proxy_ip1 VRRP instance. After it gets the MASTER state, the server activates the IP address 192.168.2.101 on eth0 and sends the gratuitous ARP.

On proxy-serv02:

Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Received advert with higher priority 255, ours 100
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Entering BACKUP STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) removing protocol VIPs.

Keepalived gets the packet with the higher priority for the proxy_ip1 VRRP instance, switches proxy_ip1 to the BACKUP state and frees up the IP addresses.

The second check is the simulation of the external network interface failure. To do it, disable the external network interface eth1 on proxy-serv01. View the results in the logs.

On proxy-serv01:

Keepalived_vrrp[xxxxx]: Kernel is reporting: interface eth1 DOWN
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Entering FAULT STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) removing protocol VIPs.
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Now in FAULT state

Keepalived gets a signal that eth1 is DOWN and sets the FAULT state for the proxy_ip1 VRRP instance, thus freeing up the virtual IP addresses.

On proxy-serv02:

Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Transition to MASTER STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Entering MASTER STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) setting protocol VIPs.
Keepalived_vrrp[xxxxx]: Sending gratuitous ARP on eth0 for 192.168.2.101

Keepalived sets the MASTER state for the proxy_ip1 VRRP instance, activates the IP address 192.168.2.101 on eth0 and sends the gratuitous ARP.

The third check is the simulation of a Squid failure. To do it, stop the service manually using this command:

# systemctl stop squid

View the results in the logs:

On proxy-serv01:

Keepalived_vrrp[xxxxx]: VRRP_Script(chk_squid_service) failed
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Entering FAULT STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) removing protocol VIPs.
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Now in FAULT state

The script that checks the activity of the Squid proxy service returns an error. Keepalived sets the FAULT state for the proxy_ip1 VRRP instance, thus freeing up the virtual IP addresses.

On proxy-serv02:

Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Transition to MASTER STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) Entering MASTER STATE
Keepalived_vrrp[xxxxx]: VRRP_Instance(proxy_ip1) setting protocol VIPs.
Keepalived_vrrp[xxxxx]: Sending gratuitous ARP on eth0 for 192.168.2.101

Keepalived sets the MASTER state for the proxy_ip1 VRRP instance, activates the IP address 192.168.2.101 on eth0 and sends the gratuitous ARP.

All three checks passed successfully, so keepalived is configured correctly. Later we’ll configure an HA cluster using Pacemaker and describe its features.

The final configuration file /etc/keepalived/keepalived.conf for proxy-serv01:

vrrp_script chk_squid_service {
 script "/usr/sbin/squid -k check"
 interval 3
}
vrrp_instance proxy_ip1 {
 state MASTER
 interface eth0
 virtual_router_id 1
 priority 255
 virtual_ipaddress {
  192.168.2.101/24 dev eth0 label eth0:1
 }
 track_interface {
  eth1
 }
 track_script {
  chk_squid_service
 }
}
vrrp_instance proxy_ip2 {
 state BACKUP
 interface eth0
 virtual_router_id 2
 priority 100
 virtual_ipaddress {
  192.168.2.102/24 dev eth0 label eth0:2
 }
 track_interface {
  eth1
 }
 track_script {
  chk_squid_service
 }
}

The final configuration file /etc/keepalived/keepalived.conf for proxy-serv02:

vrrp_script chk_squid_service {
 script "/usr/sbin/squid -k check"
 interval 3
}
vrrp_instance proxy_ip1 {
 state BACKUP
 interface eth0
 virtual_router_id 1
 priority 100
 virtual_ipaddress {
  192.168.2.101/24 dev eth0 label eth0:1
 }
 track_interface {
  eth1
 }
 track_script {
  chk_squid_service
 }
}
vrrp_instance proxy_ip2 {
 state MASTER
 interface eth0
 virtual_router_id 2
 priority 255
 virtual_ipaddress {
  192.168.2.102/24 dev eth0 label eth0:2
 }
 track_interface {
  eth1
 }
 track_script {
  chk_squid_service
 }
}
