In this article, I will share a step-by-step tutorial to set up a KVM HA Cluster, i.e. a KVM High Availability Cluster, using the Pacemaker GUI (Web UI) in RHEL or CentOS 8.
With Pacemaker 2.0 in RHEL and CentOS 8, many of the commands and steps required to configure a KVM HA Cluster have changed. Before we start with the steps to configure our KVM High Availability Cluster using the Pacemaker GUI, let us first understand some basic terminology.
Cluster Terminologies
If this is your first cluster, I would recommend reading Understanding High Availability Cluster and Architecture before setting up the KVM High Availability Cluster.
Pacemaker
- This is a cluster resource manager that runs scripts at boot time, when individual nodes go up or down or when related resources fail.
- In addition, it can be configured to periodically check the health status of each cluster member.
- In other words, pacemaker will be in charge of starting and stopping services (such as a web or database server, to name a classic example) and will implement the logic to ensure that all of the necessary services are running in only one location at the same time, in order to avoid data loss or corruption.
Corosync
- This is a messaging service that will provide a communication channel between nodes.
- Manages quorum rules and determination.
- Provides messaging capabilities for applications that coordinate or operate across multiple members of the cluster and thus must communicate stateful or other information between instances.
- Uses the `kronosnet` library as its network transport to provide multiple redundant links and automatic failover.
PCS
- This is a `corosync` and `pacemaker` configuration tool that will allow you to easily view, modify, and create pacemaker-based clusters.
- The `pcs` daemon can create and configure a Pacemaker/Corosync cluster.
- The `pcs` daemon can also modify the configuration of the cluster while it is running.
- It can remotely configure both Pacemaker and Corosync, as well as start, stop, and display status information of the cluster.
Quorum
- In order to maintain cluster integrity and availability, cluster systems use a concept known as quorum to prevent data corruption and loss.
- A cluster has quorum when more than half of the cluster nodes are online.
- To mitigate the chance of data corruption due to failure, Pacemaker by default stops all resources if the cluster does not have quorum.
- Quorum is established using a voting system.
- When a cluster node does not function as it should or loses communication with the rest of the cluster, the majority of working nodes can vote to isolate and, if needed, fence the node for servicing.
- For example, in a 6-node cluster, quorum is established when at least 4 cluster nodes are functioning.
- If the majority of nodes go offline or become unavailable, the cluster no longer has quorum and Pacemaker stops clustered services.
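To make the voting rule concrete: the minimum number of votes needed for quorum in an n-node cluster is floor(n/2) + 1. A tiny shell sketch of that arithmetic:

```bash
# Minimum votes needed for quorum: more than half of the n nodes must be up.
for n in 3 4 5 6; do
    echo "nodes=$n -> votes needed for quorum=$(( n / 2 + 1 ))"
done
# For n=6 this prints 4, matching the 6-node example above.
```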
Lab Environment
- I have a physical server installed with RHEL 8.1 Linux. On this server, I have installed the KVM-related libraries and rpms to enable virtualization.
- Also, instead of manually creating all the KVM Virtual Machines, I have set up a KVM PXE server on the physical server to perform one-click installations.
- So, using `virt-install` with kickstart, I created three KVM Virtual Machines and installed CentOS 8.1 on them via network PXE boot without any manual intervention.
- I will set up the KVM HA Cluster on these three Virtual Machines using the Pacemaker GUI.
| | KVM Host | KVM VM1 | KVM VM2 | KVM VM3 |
|---|---|---|---|---|
| Hostname | rhel-8 | centos8-2 | centos8-3 | centos8-4 |
| FQDN | rhel-8.example.com | centos8-2.example.com | centos8-3.example.com | centos8-4.example.com |
| NIC 1 (Public IP) | 10.43.138.12 | NA | NA | NA |
| NIC 2 (Private IP) | 192.168.122.1 (NAT) | 192.168.122.10 | 192.168.122.11 | 192.168.122.12 |
| OS | RHEL 8.1 | CentOS 8.1 | CentOS 8.1 | CentOS 8.1 |
| Memory (GB) | 128 | 10 | 10 | 10 |
| Storage (GB) | 500 | 40 | 40 | 40 |
| Role | PXE Server | HA Cluster Node | HA Cluster Node | HA Cluster Node |
Pre-requisites
Setup Installation Repositories
We need access to software repositories to download the rpms required to set up the KVM High Availability Cluster. These rpms are not part of the default `BaseOS` and `AppStream` repositories that ship with the RHEL/CentOS 8 ISO.
Setup repo on RHEL 8
To set up the KVM HA Cluster, you must register your RHEL 8 server to RHN with a proper pool ID that contains the High Availability Cluster related packages. If your server is behind a proxy, then you must also configure the proxy to be able to download packages.
You can use the below command to list all the available repositories and then subscribe to the respective pool containing "Red Hat Enterprise Linux High Availability for x86_64":
```
[root@rhel-8 ~]# subscription-manager list --all --available
```
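Once you have identified the pool that provides the High Availability add-on, attach it and enable the repository. A hedged sketch: the pool ID below is a placeholder you must copy from your own list output, and the repository name may differ slightly on your release:

```bash
# <POOL_ID> is a placeholder -- use the real pool ID from the list output above
subscription-manager attach --pool=<POOL_ID>

# Enable the HA repository (name may vary by release)
subscription-manager repos --enable=rhel-8-for-x86_64-highavailability-rpms
```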
Additionally, you would also need access to the EPEL repo, so install it as well:
```
[root@rhel-8 ~]# rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
```
Setup repo on CentOS 8
To set up the KVM HA Cluster on CentOS 8, you get access to all the required rpms from the CentOS repositories by default; you just need Internet access to download and install them.
You must enable PowerTools and HighAvailability repositories.
```
[root@centos-8 ~]# dnf config-manager --set-enabled HighAvailability
[root@centos-8 ~]# dnf config-manager --set-enabled PowerTools
```
Additionally, also install and enable the EPEL repo:
```
[root@centos-8 ~]# rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
```
In my lab environment, since my KVM Virtual Machines are on a private network, I did not have Internet access on these VMs.
Hence, I set up an offline repository by downloading the packages from the CentOS and EPEL repositories; this repository is shared over an NFSv4 server configured on my KVM host.
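For illustration, here is a minimal sketch of what a client-side repo definition pointing at such an NFS share could look like; the mount point `/mnt/nfs-repo` and the file name are assumptions, not the exact paths from my lab:

```bash
# Hypothetical offline repo definition; assumes the NFS share from the
# KVM host is already mounted at /mnt/nfs-repo on the cluster node.
cat > /etc/yum.repos.d/offline-ha.repo <<'EOF'
[HighAvailability]
name=CentOS-8 - HA
baseurl=file:///mnt/nfs-repo/HighAvailability
enabled=1
gpgcheck=0
EOF
```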
Below is the list of offline repositories I have configured for the cluster nodes to set up the KVM HA Cluster on RHEL/CentOS 8 using the Pacemaker GUI:
```
[root@centos8-2 ~]# dnf repolist
Last metadata expiration check: 0:16:17 ago on Fri 01 May 2020 11:00:08 AM IST.
repo id           repo name                                      status
AppStream         CentOS-8 - AppStream                            5,331
BaseOS            CentOS-8 - Base                                 2,231
Gluster           CentOS-8 - Gluster                                 58
HighAvailability  CentOS-8 - HA                                     133
MediaAppStream    CentOS-8 - Media AppStream                      4,755
MediaBaseOS       CentOS-8 - Media BaseOS                         1,659
PowerTools        CentOS-8 - PowerTools                           2,002
epel              Extra Packages for Enterprise Linux Server 8    5,373
extras            CentOS-8 - Extras                                  20
```
Configure Chrony (NTP)
For our KVM HA Cluster, it is very important that the clocks on all cluster nodes are synchronized. Again, since the VMs are on a private network, I have configured my KVM host (rhel-8), which has Internet access, as the `chrony` server, and the remaining cluster nodes will use the KVM host as their NTP server.
I have already written another article with the steps to configure a Chrony server and client, so I will not repeat them here.
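For quick reference only (the linked article has the full steps), the client-side configuration boils down to pointing chrony at the KVM host and restarting the service; a minimal sketch:

```bash
# On each cluster node: replace the default pool entry with the KVM host
sed -i 's/^pool .*/server 192.168.122.1 iburst/' /etc/chrony.conf
systemctl enable chronyd --now

# Verify that the KVM host is listed as a time source
chronyc sources
```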
Configure Host Name Resolution
You can either configure a BIND DNS server to perform hostname resolution or update the `/etc/hosts` file on your cluster nodes so that they can communicate with each other using hostnames.
In my KVM HA Cluster setup, I have configured `/etc/hosts` with these values on all the cluster nodes (and also on the KVM host):
```
[root@centos8-3 ~]# cat /etc/hosts
127.0.0.1       localhost localhost.localdomain localhost4 localhost4.localdomain4
::1             localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.122.10  centos8-2 centos8-2.example.com
192.168.122.11  centos8-3 centos8-3.example.com
192.168.122.12  centos8-4 centos8-4.example.com
```
Verify Networking
Make sure all the cluster nodes and the KVM host can communicate with each other over the TCP network. Since I am using a bridged NAT network for my cluster nodes, all of them can communicate with each other over the `192.168.122.0/24` (private) and `10.43.138.0/27` (public) subnets.
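A quick loop run from any node (or the KVM host) can confirm reachability, using the hostnames from the `/etc/hosts` entries above:

```bash
# Verify that every cluster node responds over the private network
for host in centos8-2 centos8-3 centos8-4; do
    ping -c 1 -W 2 "$host" >/dev/null && echo "$host reachable" || echo "$host UNREACHABLE"
done
```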
Install Pacemaker
To set up the KVM HA Cluster using the Pacemaker GUI, we need the `pcs` and `pacemaker` rpms. Install these rpms on all the cluster nodes. We will also set up fencing in the next part of this article, so we will install `fence-agents` later.
```
[root@centos8-2 ~]# dnf install pcs pacemaker -y
[root@centos8-3 ~]# dnf install pcs pacemaker -y
[root@centos8-4 ~]# dnf install pcs pacemaker -y
```
Setup hacluster password
- Now set the password for the `hacluster` Linux account on all the cluster nodes; this account was created automatically when PCS was installed.
- This account is used by the PCS daemon to set up communication between nodes, and it is easiest to manage when the password is identical on all nodes.
- To set the password for `hacluster`, type the following command:
```
[root@centos8-2 ~]# echo Passw0rd | passwd --stdin hacluster
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.

[root@centos8-3 ~]# echo Passw0rd | passwd --stdin hacluster
[root@centos8-4 ~]# echo Passw0rd | passwd --stdin hacluster
```
Start and enable pcsd
Next, start and enable the `pcsd` daemon on all the cluster nodes:
```
[root@centos8-2 ~]# systemctl enable pcsd.service --now
[root@centos8-3 ~]# systemctl enable pcsd.service --now
[root@centos8-4 ~]# systemctl enable pcsd.service --now
```
Configure Firewall
We are using the `firewalld` daemon to control our firewall. Add a new rule to allow the high-availability service through the firewall on all the cluster nodes:
```
[root@centos8-2 ~]# firewall-cmd --permanent --add-service=high-availability
[root@centos8-2 ~]# firewall-cmd --reload
[root@centos8-3 ~]# firewall-cmd --permanent --add-service=high-availability
[root@centos8-3 ~]# firewall-cmd --reload
[root@centos8-4 ~]# firewall-cmd --permanent --add-service=high-availability
[root@centos8-4 ~]# firewall-cmd --reload
```
Setup KVM HA Cluster using Pacemaker GUI
- Next, we will use the Pacemaker GUI to configure the KVM HA Cluster.
- To navigate to the PCS web interface, go to `https://<server-FQDN>:2224` or `https://<server-IP>:2224` (note that it is https, not http).
- Accept the security exceptions, and then log in using the credentials that were previously set for `hacluster`, as shown in the following screenshot:
Next click on "Create New" to create new KVM HA Cluster on the Pacemaker GUI
Next, provide the name you want to give to your cluster. I will use `ha-cluster`; this can be any name. Then provide the hostnames of your cluster nodes and click on "Create cluster".
- The next screen will ask for `hacluster` authentication.
- By default, all nodes are authenticated to each other, and thus PCS can talk to itself from one cluster member to the rest.
- Since we have used the same password for the `hacluster` user on all the cluster nodes, I will select "Use same password for all nodes".
- Next, click on Authenticate to start building the KVM HA Cluster using the Pacemaker GUI.
- This is precisely where the `hacluster` user comes in handy, as it is the account that is used for this purpose.
- This stage creates the cluster configuration file `/etc/corosync/corosync.conf` on all the nodes.
- At this point, the `/etc/corosync/corosync.conf` file on all three cluster nodes should be identical; a quick way to verify this is shown below.
If there were no errors reported, you will get the message "Cluster has been successfully created". Next, you can choose to start the cluster by checking the box, then click on Finish.
As you can see, our KVM HA Cluster is now visible in the Pacemaker GUI (Web UI). You can manage the cluster nodes using this Web UI.
There are separate options to Start, Stop, Restart, Standby, and many more in the Pacemaker GUI to manage individual KVM High Availability Cluster nodes.
Setup KVM HA Cluster using Pacemaker CLI
The steps to configure the KVM HA Cluster using the pacemaker CLI are already explained at: 10 easy steps to setup High Availability Cluster CentOS 8. You can refer to the steps from that article, so I will not repeat them here.
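For orientation, with pcs 0.10 on RHEL/CentOS 8 the GUI steps above map roughly to the following commands (a sketch; see the linked article for the complete walk-through):

```bash
# Authenticate the nodes to each other using the hacluster account
pcs host auth centos8-2 centos8-3 centos8-4 -u hacluster -p Passw0rd

# Create and start the same three-node cluster that the GUI built
pcs cluster setup ha-cluster centos8-2 centos8-3 centos8-4 --start
```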
Check Cluster Health
Once the cluster has been started, you can check its status from any of the nodes (remember that PCS makes it possible for you to manage the cluster from any node):
You will see the warning `No stonith devices and stonith-enabled is not false`; this is because we have not yet configured fencing. I will share all about fencing, its usage, and the steps to configure fencing in a KVM HA Cluster in my next article.
```
[root@centos8-3 ~]# pcs status
Cluster name: ha-cluster

WARNINGS:
No stonith devices and stonith-enabled is not false

Stack: corosync
Current DC: centos8-2 (version 2.0.2-3.el8_1.2-744a30d655) - partition with quorum
Last updated: Fri May 1 11:25:17 2020
Last change: Fri May 1 11:15:45 2020 by hacluster via crmd on centos8-2

3 nodes configured
0 resources configured

Online: [ centos8-2 centos8-3 centos8-4 ]

No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
```
We should also enable the `corosync` and `pacemaker` services to auto-start after a reboot (on all the cluster nodes):
```
[root@centos8-2 ~]# systemctl enable pacemaker
[root@centos8-2 ~]# systemctl enable corosync
[root@centos8-3 ~]# systemctl enable pacemaker
[root@centos8-3 ~]# systemctl enable corosync
[root@centos8-4 ~]# systemctl enable pacemaker
[root@centos8-4 ~]# systemctl enable corosync
```
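Alternatively, `pcs` can enable both services on every node in one shot:

```bash
# Enable the cluster services (corosync and pacemaker) on all nodes at once
pcs cluster enable --all
```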
- The node that is marked as DC (Designated Controller) is the node where the cluster was originally started and from where cluster-related commands will typically be issued.
- If, for some reason, the current DC fails, a new Designated Controller is chosen automatically from the remaining nodes.
You can see which node is the current DC with:
```
[root@centos8-3 ~]# pcs status | grep -i dc
Current DC: centos8-2 (version 2.0.2-3.el8_1.2-744a30d655) - partition with quorum
```
The `pcs status nodes` command shows the state of every cluster node (online, standby, maintenance, or offline):
```
[root@centos8-3 ~]# pcs status nodes
Pacemaker Nodes:
 Online: centos8-2 centos8-3 centos8-4
 Standby:
 Maintenance:
 Offline:
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:
```
Check Quorum Configuration
- You can use `corosync-quorumtool` to check the existing quorum configuration on the cluster nodes.
- Since we have a 3-node KVM HA Cluster, the highest expected vote count is 3.
- As all 3 cluster nodes are currently active, the total vote count is also 3.
- As the quorum value is 2, we need at least 2 live cluster nodes for the cluster to function.
- If 2 or more cluster nodes go down, the vote count drops below the quorum of 2 and the cluster will not function.
```
[root@centos8-2 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Fri May 1 11:36:37 2020
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          1
Ring ID:          1/12
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 centos8-2 (local)
         2          1 centos8-3
         3          1 centos8-4
```
Create Cluster Floating IP
- Our first resource will be a unique IP address that the cluster can bring up on any of the nodes.
- This IP will represent the cluster node on which the respective service is running.
- We can add a resource constraint to run this resource only on the DC, so that we can always connect to the DC using this IP address.
- Regardless of where any cluster services are running, end users need a consistent address to contact them on.
- So every time we have to connect to the cluster, instead of using an individual cluster node IP, we will use the floating IP.
- Here, I will choose `192.168.122.100` as the floating address.
- We will configure the cluster floating IP using our Pacemaker GUI.
Using Pacemaker GUI
Log in to the Pacemaker GUI using `https://<server-fqdn>:2224` or `https://<server-ip>:2224`. Then select RESOURCES from the top panel and click on Add to create a new resource.
- Another important piece of information here is `ocf:heartbeat:IPaddr2`. This tells Pacemaker three things about the resource you want to add:
- The first field (`ocf` in this case) is the standard to which the resource script conforms and where to find it.
- The second field (`heartbeat` in this case) is standard-specific; for OCF resources, it tells the cluster which OCF namespace the resource script is in.
- The third field (`IPaddr2` in this case) is the name of the resource script.
- Provide a resource ID. I am using `ClusterIP`; this can be any name.
- Under Required Arguments, provide the cluster floating IP. This IP address must be reachable by the cluster nodes.
- Under Optional Arguments, provide the `cidr_netmask` value, i.e. the subnet prefix length; for our subnet (192.168.122.0/24) it is 24.
- Under Optional Arguments, you can also provide a monitor timeout value for `monitor_retries`.
Click on Create Resource to create your cluster floating IP. If there were no errors, you should have your first successfully running resource. I have highlighted some other resources that were part of my cluster; I will explain them in later articles.
Using Pacemaker CLI
You can also create your cluster floating IP using the pacemaker CLI tool. Execute the command in the below format:
```
[root@centos8-3 ~]# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.122.100 cidr_netmask=24 op monitor interval=30s
```
- You can change the values of `ip=` and `cidr_netmask=` based on your floating IP and netmask.
- The `monitor interval` is optional but recommended so that the availability of the resource is checked regularly; it can also be changed later, as shown in the sketch below.
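If you want to tune these values later, the resource can be modified in place; for example (the 15-second interval is just illustrative):

```bash
# Change the monitoring interval of the existing ClusterIP resource
pcs resource update ClusterIP op monitor interval=15s

# Review the resource configuration to confirm the change
pcs resource config ClusterIP
```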
To obtain a list of the available resource standards (the `ocf` part of `ocf:heartbeat:IPaddr2`), run:
```
[root@centos8-2 ~]# pcs resource standards
lsb
ocf
service
systemd
```
To obtain a list of the available OCF resource providers (the `heartbeat` part of `ocf:heartbeat:IPaddr2`), run:
```
[root@centos8-2 ~]# pcs resource providers
heartbeat
openstack
pacemaker
```
Finally, if you want to see all the resource agents available for a specific OCF provider (the `IPaddr2` part of `ocf:heartbeat:IPaddr2`), run:
```
[root@centos8-2 ~]# pcs resource agents ocf:heartbeat
```
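You can also ask pcs to describe a specific agent, which lists the parameters it accepts (such as the `ip` and `cidr_netmask` arguments used above):

```bash
# Show all parameters supported by the IPaddr2 resource agent
pcs resource describe ocf:heartbeat:IPaddr2
```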
Check Cluster and Resource Health
We already checked the resource health in the Pacemaker GUI, but you can also check it using the `pcs status` or `pcs resource status` command:
```
[root@centos8-4 ~]# pcs status
Cluster name: ha-cluster
Stack: corosync
Current DC: centos8-3 (version 2.0.2-3.el8_1.2-744a30d655) - partition with quorum
Last updated: Fri May 1 21:09:49 2020
Last change: Fri May 1 20:58:16 2020 by hacluster via crmd on centos8-3

3 nodes configured
1 resources configured

Online: [ centos8-2 centos8-3 centos8-4 ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
```
You can also use `pcs status resources` to check the status of your resource:
```
[root@centos8-2 ~]# pcs status resources
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4
```
Verify Cluster Floating IP
Now we can use the virtual IP to connect to whichever node is running the resource. Since the `ClusterIP` resource is running on `centos8-4`, the floating IP connects us to the `centos8-4` node:
```
[root@rhel-8 ~]# ssh 192.168.122.100
Activate the web console with: systemctl enable --now cockpit.socket

Last login: Fri May 1 16:18:27 2020 from 192.168.122.1
[root@centos8-4 ~]#
```
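You can additionally confirm on the node itself that the floating address is bound to an interface:

```bash
# On the node currently running ClusterIP, the floating address
# should appear on the private-network interface
ip -o addr show | grep 192.168.122.100
```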
Verify KVM HA Cluster Failover
We will also verify a KVM HA Cluster failover scenario to make sure the `ClusterIP` resource is highly available.
Since `ClusterIP` is running on `centos8-4`, I will put this node into standby using `pcs node standby <cluster_node>`:
```
[root@centos8-4 ~]# pcs node standby centos8-4
```
Next, check the status of the `ClusterIP` resource. As you can see, the resource is now running on a different cluster node, i.e. `centos8-2`. This means our KVM HA Cluster failover is working as expected:
```
[root@centos8-4 ~]# pcs status resources
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-2
```
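Once the failover test is complete, remember to bring the node out of standby so it can host resources again:

```bash
# Clear the standby flag that was set during the failover test
pcs node unstandby centos8-4

# Verify all three nodes are back online
pcs status nodes
```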
Lastly, I hope the steps from this article to configure a KVM HA Cluster using the Pacemaker GUI on RHEL/CentOS 8 Linux were helpful. Let me know your suggestions and feedback in the comments section.
References
Create and Manage High Available Clusters with Pacemaker GUI