In this article I will give an overview of what fencing is, followed by a step-by-step tutorial to configure cluster fencing (pacemaker fencing) using fence_xvm on my KVM HA Cluster running RHEL and CentOS 8.
I assume you are already familiar with the High Availability Cluster architecture.
What is Fencing?
- As the number of nodes in a cluster increases, its availability increases, but so does the chance of one of them failing at some point.
- If communication with a single node in the cluster fails, then other nodes in the cluster must be able to restrict or release access to resources that the failed cluster node may have access to.
- This cannot be accomplished by contacting the cluster node itself as the cluster node may not be responsive.
- Instead, you must provide an external method; this is called fencing and is performed with a fence agent.
- By definition, cluster fencing is the process of isolating, or separating, a node from the rest of the cluster so that it cannot use resources or start services it should not have access to.
- Without a fence device configured you do not have a way to know that the resources previously used by the disconnected cluster node have been released, and this could prevent the services from running on any of the other cluster nodes.
- Without a fence device configured data integrity cannot be guaranteed and the cluster configuration will be unsupported.
- When fencing is in progress, no other cluster operation is allowed to run.
- Fencing is performed using a mechanism known as STONITH.
- STONITH is an acronym for "Shoot The Other Node In The Head", and it protects your data from being corrupted by rogue nodes or concurrent access.
Setup KVM HA Cluster
In our previous article I configured a KVM High Availability Cluster using the Pacemaker GUI.
I will use the same pacemaker cluster setup here to configure fencing using fence_xvm.
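Before we add fencing, it is worth confirming that the existing cluster is healthy. A quick check from any cluster node (the node names are from my setup):
[root@centos8-2 ~]# pcs status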
Install Stonith Device on KVM Host
Now that you are familiar with what fencing is, we must install the fence-related rpms on the KVM host to configure cluster fencing for the KVM Virtual Machines.
Install the below list of rpms on your KVM host to configure pacemaker fencing using fence_xvm:
[root@rhel-8 ~]# yum install fence-virt fence-virtd fence-virtd-libvirt fence-virtd-multicast fence-virtd-serial
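Once the installation completes you can quickly confirm that the packages are present (a simple rpm query; the versions will depend on your repositories):
[root@rhel-8 ~]# rpm -q fence-virt fence-virtd fence-virtd-libvirt fence-virtd-multicast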
Install fence_xvm on KVM Virtual Machines
Install the fence-virt package on every cluster node:
[root@centos8-2 ~]# dnf -y install fence-virt
[root@centos8-3 ~]# dnf -y install fence-virt
[root@centos8-4 ~]# dnf -y install fence-virt
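You can verify that the fence_xvm agent is now available on a node (a quick check; repeat on the other nodes if you like):
[root@centos8-2 ~]# rpm -q fence-virt
[root@centos8-2 ~]# which fence_xvm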
To list the available fence agents, execute the below command on any of the cluster nodes:
# pcs stonith list
fence_amt_ws - Fence agent for AMT (WS)
fence_apc - Fence agent for APC over telnet/ssh
fence_apc_snmp - Fence agent for APC, Tripplite PDU over SNMP
fence_bladecenter - Fence agent for IBM BladeCen
<Output trimmed>
This will give you a long list of fence agents which you can use to configure cluster fencing
To get more details about the respective fence agent you can use:
[root@centos8-3 ~]# pcs stonith describe fence_xvm
fence_xvm - Fence agent for virtual machines

fence_xvm is an I/O Fencing agent which can be used with virtual machines.

Stonith options:
  debug: Specify (stdin) or increment (command line) debug level
  ip_family: IP Family ([auto], ipv4, ipv6)
  multicast_address: Multicast address (default=225.0.0.12 / ff05::3:1)
  ipport: TCP, Multicast, or VMChannel IP port (default=1229)
  retrans: Multicast retransmit time (in 1/10sec; default=20)
<Output trimmed>
Create fence key
To set up pacemaker fencing we must create a fence key on the KVM host under /etc/cluster. By default the /etc/cluster directory is not present on the KVM host, so we will create it manually:
[root@rhel-8 ~]# mkdir -p /etc/cluster
Next create the fence key using the dd command. We will name our key fence_xvm.key:
[root@rhel-8 ~]# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4k count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000187547 s, 21.8 MB/s
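Optionally you can tighten the permissions on the key so that only root can read it; fence_virtd and fence_xvm both run as root, so this does not break anything (an optional hardening step, not strictly required):
[root@rhel-8 ~]# chmod 600 /etc/cluster/fence_xvm.key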
Next copy this key to all the KVM HA Cluster nodes under /etc/cluster
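The /etc/cluster directory may also be missing on the cluster nodes. If so, create it on each node first; a small sketch assuming passwordless SSH as root to the nodes (adjust the node names to your environment):
[root@rhel-8 ~]# for node in centos8-2 centos8-3 centos8-4; do ssh $node mkdir -p /etc/cluster; done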
[root@rhel-8 ~]# scp /etc/cluster/fence_xvm.key centos8-2:/etc/cluster/
[root@rhel-8 ~]# scp /etc/cluster/fence_xvm.key centos8-3:/etc/cluster/
[root@rhel-8 ~]# scp /etc/cluster/fence_xvm.key centos8-4:/etc/cluster/
Configure Cluster Fencing
To configure cluster fencing on the KVM host we will use fence_virtd. This tool creates the /etc/fence_virt.conf configuration file.
It will prompt for certain values; you can leave most of them at their defaults or change them as per your environment:
[root@rhel-8 ~]# fence_virtd -c
Module search path [/usr/lib64/fence-virt]:

Available backends:
    libvirt 0.3
Available listeners:
    multicast 1.2

Listener modules are responsible for accepting requests
from fencing clients.

Listener module [multicast]:

The multicast listener module is designed for use environments
where the guests and hosts may communicate over a network using
multicast.

The multicast address is the address that a client will use to
send fencing requests to fence_virtd.

Multicast IP Address [225.0.0.12]:        <-- Leave to default

Using ipv4 as family.

Multicast IP Port [1229]:                 <-- If you change this then remember to allow this port in firewall

Setting a preferred interface causes fence_virtd to listen only
on that interface.  Normally, it listens on all interfaces.
In environments where the virtual machines are using the host
machine as a gateway, this *must* be set (typically to virbr0).
Set to 'none' for no interface.

Interface [virbr0]:                       <-- I am using virbr0. You can change based on your interface used for Cluster nodes

The key file is the shared key information which is used to
authenticate fencing requests.  The contents of this file must
be distributed to each physical host and virtual machine within
a cluster.

Key File [/etc/cluster/fence_xvm.key]:    <-- Leave to default

Backend modules are responsible for routing requests to
the appropriate hypervisor or management layer.

Backend module [libvirt]:                 <-- Leave to default

The libvirt backend module is designed for single desktops or
servers.  Do not use in environments where virtual machines
may be migrated between hosts.

Libvirt URI [qemu:///system]:             <-- Leave to default

Configuration complete.

=== Begin Configuration ===
backends {
	libvirt {
		uri = "qemu:///system";
	}

}

listeners {
	multicast {
		port = "1229";
		family = "ipv4";
		interface = "virbr0";
		address = "225.0.0.12";
		key_file = "/etc/cluster/fence_xvm.key";
	}

}

fence_virtd {
	module_path = "/usr/lib64/fence-virt";
	backend = "libvirt";
	listener = "multicast";
}

=== End Configuration ===
Replace /etc/fence_virt.conf with the above [y/N]? y    <-- Give confirmation
Start fence_virtd Service
Next, start the fence_virtd service to enable cluster fencing:
[root@rhel-8 ~]# systemctl enable fence_virtd --now
Created symlink /etc/systemd/system/multi-user.target.wants/fence_virtd.service → /usr/lib/systemd/system/fence_virtd.service.
Check the status of fence_virtd to make sure it is running successfully:
[root@rhel-8 ~]# systemctl status fence_virtd
● fence_virtd.service - Fence-Virt system host daemon
Loaded: loaded (/usr/lib/systemd/system/fence_virtd.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2020-05-01 12:00:35 IST; 6s ago
Process: 24945 ExecStart=/usr/sbin/fence_virtd $FENCE_VIRTD_ARGS (code=exited, status=0/SUCCESS)
Main PID: 24946 (fence_virtd)
Tasks: 1 (limit: 26213)
Memory: 2.8M
CGroup: /system.slice/fence_virtd.service
└─24946 /usr/sbin/fence_virtd -w
May 01 12:00:35 rhel-8.example.com systemd[1]: Starting Fence-Virt system host daemon...
May 01 12:00:35 rhel-8.example.com fence_virtd[24946]: fence_virtd starting. Listener: libvirt Backend: multicast
May 01 12:00:35 rhel-8.example.com systemd[1]: Started Fence-Virt system host daemon.
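You can also check that the multicast listener has bound its port; a quick check with ss, assuming the default UDP port 1229 (the exact output and binding details can differ with your fence-virt version):
[root@rhel-8 ~]# ss -ulnp | grep 1229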
Fencing must be enabled on the cluster; verify that the stonith-enabled property is set to true:
[root@centos8-2 ~]# pcs -f stonith_cfg property
Cluster Properties:
stonith-enabled: true
If the stonith-enabled cluster property is set to false, you can manually set it to true (this is a cluster-wide property, so it only needs to be set once):
[root@centos8-2 ~]# pcs -f stonith_cfg property set stonith-enabled=true
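To double check the value on the running cluster (and not just in the stonith_cfg file), you can list the live cluster properties; on newer pcs versions the equivalent sub-command is pcs property config:
[root@centos8-2 ~]# pcs property list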
Configure Firewall
Since we are using the default port 1229 for fence_virtd, we must allow this port in the firewall. As we are using firewalld, we will allow the port in our firewalld zone.
To get the list of active zones along with their interface details:
[root@rhel-8 ~]# firewall-cmd --get-active-zones
libvirt
  interfaces: virbr0
public
  interfaces: eno49 eno50 nm-bridge
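If you are not sure which zone manages a particular interface, you can also query it directly (virbr0 is the interface from my setup):
[root@rhel-8 ~]# firewall-cmd --get-zone-of-interface=virbr0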
Since my cluster nodes are connected through the virbr0 interface, I must add my firewall rules to the zone which manages the virbr0 interface. By default we would apply all rules to the public zone, but that was not working for me and I was getting:
[root@centos8-2 ~]# fence_xvm -o list
Timed out waiting for response
Operation failed
But if you are using an interface from the default zone, then you can apply these firewall rules to your default zone. I will use the libvirt firewall zone; you can modify the firewall command based on your active zone:
[root@rhel-8 ~]# firewall-cmd --add-port=1229/udp --permanent --zone=libvirt
[root@rhel-8 ~]# firewall-cmd --add-port=1229/tcp --permanent --zone=libvirt
Reload the firewall rule to activate the changes
[root@rhel-8 ~]# firewall-cmd --reload
success
List the currently allowed ports in firewall
[root@rhel-8 ~]# firewall-cmd --list-ports --zone=libvirt
1229/udp 1229/tcp
To list all the services and ports allowed in the "libvirt" zone:
[root@rhel-8 ~]# firewall-cmd --list-all --zone=libvirt
libvirt (active)
  target: ACCEPT
  icmp-block-inversion: no
  interfaces: virbr0
  sources:
  services: dhcp dhcpv6 dns ssh tftp
  ports:
  protocols: icmp ipv6-icmp
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
	rule priority="32767" reject
Verify Pacemaker fencing on Cluster Nodes
To check the fence status from the cluster nodes, run fence_xvm on any of the cluster nodes as shown below. This should show the list of virtual machines managed by the KVM host:
[root@centos8-2 ~]# fence_xvm -o list
centos8-2            a0c0680a-5655-48ae-9752-fda306e015ed on
centos8-3            3ee94484-bf3b-4636-8d64-f4e59a8c5a6d on
centos8-4            638841fe-82c6-4fbb-a79a-780c4675b4e6 on
rhel-iscsi           e0a7fd5f-3b53-4a7c-9a5c-3d2ca4b9c4f6 on
This means that our KVM host is configured to fence all of these VMs.
The output of this list should match the set of virtual machines reported by the virsh command:
[root@rhel-8 ~]# virsh list
 Id    Name          State
----------------------------------------------------
 75    rhel-iscsi    running
 80    centos8-3     running
 81    centos8-2     running
 83    centos8-4     running
 91    centos8-5     running
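If you want to cross check the UUIDs reported by fence_xvm, you can also query a domain's UUID directly on the KVM host (centos8-4 is one of the VM names from this setup):
[root@rhel-8 ~]# virsh domuuid centos8-4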
Create Stonith Resource
We will create a stonith resource for cluster fencing for each of our cluster nodes:
[root@centos8-2 ~]# pcs stonith create fence-centos8-4 fence_xvm port=centos8-4 pcmk_host_list=centos8-4.example.com
[root@centos8-2 ~]# pcs stonith create fence-centos8-3 fence_xvm port=centos8-3 pcmk_host_list=centos8-3.example.com
[root@centos8-2 ~]# pcs stonith create fence-centos8-2 fence_xvm port=centos8-2 pcmk_host_list=centos8-2.example.com
pcmk_host_list format: This attribute takes a list of nodes separated by space, comma, or semicolon. The names should exactly match what pacemaker refers to them as, which is derived from the base configuration in /etc/corosync/corosync.conf and is also reflected in the pcs status output.
The port value must match the virtual machine name as shown in the virsh output, or else the pacemaker fencing would fail.
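To review the attributes of a stonith resource you have created, including pcmk_host_list, you can dump its configuration from any cluster node (on older pcs versions the equivalent command is pcs stonith show --full):
[root@centos8-2 ~]# pcs stonith config fence-centos8-2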
Verify Stonith Resource Health
After creating the stonith resources on the KVM HA Cluster nodes, verify the resource status using crm_mon:
[root@centos8-2 ~]# crm_mon
Stack: corosync
Current DC: centos8-2 (version 2.0.2-3.el8_1.2-744a30d655) - partition with quorum
Last updated: Fri May  1 13:08:25 2020
Last change: Fri May  1 13:08:14 2020 by root via cibadmin on centos8-2

3 nodes configured
3 resources configured

Online: [ centos8-2 centos8-3 centos8-4 ]

Active resources:

fence-centos8-4 (stonith:fence_xvm):    Started centos8-2
fence-centos8-3 (stonith:fence_xvm):    Started centos8-3
fence-centos8-2 (stonith:fence_xvm):    Started centos8-4
So all our stonith resources have started successfully. You can also check the stonith resource status using pcs:
[root@centos8-2 ~]# pcs stonith status
 fence-centos8-4 (stonith:fence_xvm):    Started centos8-2
 fence-centos8-3 (stonith:fence_xvm):    Started centos8-3
 fence-centos8-2 (stonith:fence_xvm):    Started centos8-4
Verify Cluster Fencing
To actually fence a node, you will have to use the UUID listed by the list command instead of the VM name. In this example I am triggering fencing for centos8-4:
[root@centos8-2 ~]# fence_xvm -o off -H 638841fe-82c6-4fbb-a79a-780c4675b4e6
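Alternatively, instead of calling fence_xvm by hand, you can ask pacemaker itself to fence a node through the stonith resource we configured. Only run this against a node you can afford to take down, since it will actually fence it (the node name below is the pacemaker node name from pcs status in my setup):
[root@centos8-2 ~]# pcs stonith fence centos8-4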
Next check the status of the KVM Cluster
[root@centos8-2 ~]# pcs cluster status
Cluster Status:
Stack: corosync
Current DC: centos8-2 (version 2.0.2-3.el8_1.2-744a30d655) - partition with quorum
Last updated: Fri May 1 13:29:15 2020
Last change: Fri May 1 13:08:14 2020 by root via cibadmin on centos8-2
3 nodes configured
3 resources configured
PCSD Status:
centos8-2: Online
centos8-3: Online
centos8-4: Offline
As expected, our centos8-4 cluster node has gone offline. You can also check the log on the KVM host using journalctl:
May 1 13:29:34 rhel-8 systemd-machined[1877]: Machine qemu-74-centos8-4 terminated.
So our cluster fencing is working as expected.
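To bring the fenced node back into the cluster, start the VM again from the KVM host and then start the cluster services on it once it has booted (a sketch using the names from this example):
[root@rhel-8 ~]# virsh start centos8-4
[root@centos8-4 ~]# pcs cluster start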
Lastly, I hope the steps in this article to understand what fencing is and to configure cluster fencing (pacemaker fencing) using fence_xvm on a KVM HA Cluster with RHEL/CentOS 8 Linux were helpful. Let me know your suggestions and feedback using the comment section.
Hi, and thanks for the info provided. I did all the steps and things are working as expected; I am able to fence nodes by executing the fence_xvm command. However, I have a 2-node cluster and I was expecting that when connectivity is lost between the cluster members, fencing would automatically take place and one machine would reset, but this is not happening.
Can you please let me know what is missing?
To reboot a node with fencing would require a different fence agent, such as fence_bladecenter, which can be used with blade servers. In that case the fencing agent controls the power of the blades to reboot the server, but that is not possible here with the KVM fence agent.
Thank you for your reply. So what will be the real benefit of fencing in this case if I can't use fence_xvm to reset my problematic VM automatically?
The idea of using fencing is to make sure the impacted node has no access to the resources shared by your cluster nodes. Fencing will stop the access for the problematic cluster node; how this is done differs based on the fencing agent.
Thank you so much, you helped me a lot.