In this article, I will share a step-by-step tutorial to set up a KVM HA Cluster, i.e. a KVM High Availability Cluster, using the Pacemaker GUI (Web UI) in RHEL or CentOS 8.
With Pacemaker 2.0 in RHEL and CentOS 8, many of the commands and steps required to configure a KVM HA Cluster have changed. Before we start with the steps to configure our KVM High Availability Cluster using the Pacemaker GUI, let us first understand some basic terminology.
Cluster Terminologies
If this is your first cluster, I would recommend reading Understanding High Availability Cluster and Architecture before setting up the KVM High Availability Cluster.
Pacemaker
- This is a cluster resource manager that runs scripts at boot time, when individual nodes go up or down or when related resources fail.
- In addition, it can be configured to periodically check the health status of each cluster member.
- In other words, pacemaker will be in charge of starting and stopping services (such as a web or database server, to name a classic example) and will implement the logic to ensure that all of the necessary services are running in only one location at the same time, in order to avoid data loss or corruption.
Corosync
- This is a messaging service that will provide a communication channel between nodes.
- Manages quorum rules and determination.
- Provides messaging capabilities for applications that coordinate or operate across multiple members of the cluster and thus must communicate stateful or other information between instances.
- Uses the `kronosnet` library as its network transport to provide multiple redundant links and automatic failover.
PCS
- This is a `corosync` and `pacemaker` configuration tool that will allow you to easily view, modify, and create pacemaker-based clusters.
- The `pcs` daemon can create and configure a Pacemaker/Corosync cluster.
- The `pcs` daemon can also modify the configuration of the cluster while it is running.
- It can remotely configure both Pacemaker and Corosync, as well as start, stop, and display status information of the cluster.
Quorum
- In order to maintain cluster integrity and availability, cluster systems use a concept known as quorum to prevent data corruption and loss.
- A cluster has quorum when more than half of the cluster nodes are online.
- To mitigate the chance of data corruption due to failure, Pacemaker by default stops all resources if the cluster does not have quorum.
- Quorum is established using a voting system.
- When a cluster node does not function as it should or loses communication with the rest of the cluster, the majority of working nodes can vote to isolate and, if needed, fence the node for servicing.
- For example, in a 6-node cluster, quorum is established when at least 4 cluster nodes are functioning.
- If the majority of nodes go offline or become unavailable, the cluster no longer has quorum and Pacemaker stops clustered services.
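To make the voting rule concrete: the minimum number of votes needed for quorum in an n-node cluster is floor(n/2) + 1. A tiny shell sketch of that arithmetic:

```bash
# Minimum votes needed for quorum: more than half of the n nodes must be up.
for n in 3 4 5 6; do
    echo "nodes=$n -> votes needed for quorum=$(( n / 2 + 1 ))"
done
# For n=6 this prints 4, matching the 6-node example above.
```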
Lab Environment
- I have a physical server installed with RHEL 8.1 Linux. On this server, I have installed the KVM-related libraries and rpms to enable virtualization.
- Also, instead of manually creating all the KVM Virtual Machines, I have set up a KVM PXE server on the physical server to perform one-click installations.
- So, using `virt-install` with kickstart, I created three KVM Virtual Machines and installed CentOS 8.1 on them via network PXE boot without any manual intervention.
- I will set up the KVM HA Cluster on these three Virtual Machines using the Pacemaker GUI.
| | KVM Host | KVM VM1 | KVM VM2 | KVM VM3 |
|---|---|---|---|---|
| Hostname | rhel-8 | centos8-2 | centos8-3 | centos8-4 |
| FQDN | rhel-8.example.com | centos8-2.example.com | centos8-3.example.com | centos8-4.example.com |
| NIC 1 (Public IP) | 10.43.138.12 | NA | NA | NA |
| NIC 2 (Private IP) | 192.168.122.1 (NAT) | 192.168.122.10 | 192.168.122.11 | 192.168.122.12 |
| OS | RHEL 8.1 | CentOS 8.1 | CentOS 8.1 | CentOS 8.1 |
| Memory (GB) | 128 | 10 | 10 | 10 |
| Storage (GB) | 500 | 40 | 40 | 40 |
| Role | PXE Server | HA Cluster Node | HA Cluster Node | HA Cluster Node |
Pre-requisites
Setup Installation Repositories
We need access to software repositories to download the rpms required to set up the KVM High Availability Cluster. These rpms are not part of the default `BaseOS` and `AppStream` repositories that ship with the RHEL/CentOS 8 ISO.
Setup repo on RHEL 8
To set up the KVM HA Cluster, you must register your RHEL 8 server to RHN with a proper pool ID that contains the High Availability Cluster related packages. If your server is behind a proxy, then you must also configure the proxy to be able to download packages.
You can use the below command to list all the available repositories and then subscribe to the respective pool containing "Red Hat Enterprise Linux High Availability for x86_64":
```
[root@rhel-8 ~]# subscription-manager list --all --available
```
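Once you have identified the pool that provides the High Availability add-on, attach it and enable the repository. A hedged sketch: the pool ID below is a placeholder you must copy from your own list output, and the repository name may differ slightly on your release:

```bash
# <POOL_ID> is a placeholder -- use the real pool ID from the list output above
subscription-manager attach --pool=<POOL_ID>

# Enable the HA repository (name may vary by release)
subscription-manager repos --enable=rhel-8-for-x86_64-highavailability-rpms
```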
Additionally, you would also need access to the EPEL repo, so install it as well:
```
[root@rhel-8 ~]# rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
```
Setup repo on CentOS 8
To set up the KVM HA Cluster on CentOS 8, you get access to all the required rpms from the CentOS repositories by default; you just need Internet access to download and install them.
You must enable PowerTools and HighAvailability repositories.
```
[root@centos-8 ~]# dnf config-manager --set-enabled HighAvailability
[root@centos-8 ~]# dnf config-manager --set-enabled PowerTools
```
Additionally, also install and enable the EPEL repo:
```
[root@centos-8 ~]# rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
```
In my lab environment, since my KVM Virtual Machines are on a private network, I did not have Internet access on these VMs.
Hence, I set up an offline repository by downloading the packages from the CentOS and EPEL repositories; this repository is shared over an NFSv4 server configured on my KVM host.
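For illustration, here is a minimal sketch of what a client-side repo definition pointing at such an NFS share could look like; the mount point `/mnt/nfs-repo` and the file name are assumptions, not the exact paths from my lab:

```bash
# Hypothetical offline repo definition; assumes the NFS share from the
# KVM host is already mounted at /mnt/nfs-repo on the cluster node.
cat > /etc/yum.repos.d/offline-ha.repo <<'EOF'
[HighAvailability]
name=CentOS-8 - HA
baseurl=file:///mnt/nfs-repo/HighAvailability
enabled=1
gpgcheck=0
EOF
```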
Below is the list of offline repositories I have configured for the cluster nodes to set up the KVM HA Cluster on RHEL/CentOS 8 using the Pacemaker GUI:
```
[root@centos8-2 ~]# dnf repolist
Last metadata expiration check: 0:16:17 ago on Fri 01 May 2020 11:00:08 AM IST.
repo id           repo name                                      status
AppStream         CentOS-8 - AppStream                            5,331
BaseOS            CentOS-8 - Base                                 2,231
Gluster           CentOS-8 - Gluster                                 58
HighAvailability  CentOS-8 - HA                                     133
MediaAppStream    CentOS-8 - Media AppStream                      4,755
MediaBaseOS       CentOS-8 - Media BaseOS                         1,659
PowerTools        CentOS-8 - PowerTools                           2,002
epel              Extra Packages for Enterprise Linux Server 8    5,373
extras            CentOS-8 - Extras                                  20
```
Configure Chrony (NTP)
For our KVM HA Cluster, it is very important that the clocks on all cluster nodes are synchronized. Again, since the VMs are on a private network, I have configured my KVM host (rhel-8), which has Internet access, as the `chrony` server, and the remaining cluster nodes will use the KVM host as their NTP server.
I have already written another article with the steps to configure a Chrony server and client, so I will not repeat them here.
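For quick reference only (the linked article has the full steps), the client-side configuration boils down to pointing chrony at the KVM host and restarting the service; a minimal sketch:

```bash
# On each cluster node: replace the default pool entry with the KVM host
sed -i 's/^pool .*/server 192.168.122.1 iburst/' /etc/chrony.conf
systemctl enable chronyd --now

# Verify that the KVM host is listed as a time source
chronyc sources
```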
Configure Host Name Resolution
You can either configure a BIND DNS server to perform hostname resolution or update the `/etc/hosts` file on your cluster nodes so that they can communicate with each other using hostnames.
In my KVM HA Cluster setup, I have configured `/etc/hosts` with these values on all the cluster nodes (and also on the KVM host):
```
[root@centos8-3 ~]# cat /etc/hosts
127.0.0.1       localhost localhost.localdomain localhost4 localhost4.localdomain4
::1             localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.122.10  centos8-2 centos8-2.example.com
192.168.122.11  centos8-3 centos8-3.example.com
192.168.122.12  centos8-4 centos8-4.example.com
```
Verify Networking
Make sure all the cluster nodes and the KVM host can communicate with each other over the TCP network. Since I am using a bridged NAT network for my cluster nodes, all of them can communicate with each other over the `192.168.122.0/24` (private) and `10.43.138.0/27` (public) subnets.
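A quick loop run from any node (or the KVM host) can confirm reachability, using the hostnames from the `/etc/hosts` entries above:

```bash
# Verify that every cluster node responds over the private network
for host in centos8-2 centos8-3 centos8-4; do
    ping -c 1 -W 2 "$host" >/dev/null && echo "$host reachable" || echo "$host UNREACHABLE"
done
```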
Install Pacemaker
To set up the KVM HA Cluster using the Pacemaker GUI, we need the `pcs` and `pacemaker` rpms. Install these rpms on all the cluster nodes. We will also set up fencing in the next part of this article, so we will install `fence-agents` later.
```
[root@centos8-2 ~]# dnf install pcs pacemaker -y
[root@centos8-3 ~]# dnf install pcs pacemaker -y
[root@centos8-4 ~]# dnf install pcs pacemaker -y
```
Setup hacluster password
- Now set the password for the `hacluster` Linux account on all the cluster nodes; this account was created automatically when PCS was installed.
- This account is used by the PCS daemon to set up communication between nodes, and it is easiest to manage when the password is identical on all nodes.
- To set the password for `hacluster`, type the following command:
```
[root@centos8-2 ~]# echo Passw0rd | passwd --stdin hacluster
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.

[root@centos8-3 ~]# echo Passw0rd | passwd --stdin hacluster
[root@centos8-4 ~]# echo Passw0rd | passwd --stdin hacluster
```
Start and enable pcsd
Next, start and enable the `pcsd` daemon on all the cluster nodes:
```
[root@centos8-2 ~]# systemctl enable pcsd.service --now
[root@centos8-3 ~]# systemctl enable pcsd.service --now
[root@centos8-4 ~]# systemctl enable pcsd.service --now
```
Configure Firewall
We are using the `firewalld` daemon to control our firewall. Add a new rule to allow the high-availability service through the firewall on all the cluster nodes:
```
[root@centos8-2 ~]# firewall-cmd --permanent --add-service=high-availability
[root@centos8-2 ~]# firewall-cmd --reload
[root@centos8-3 ~]# firewall-cmd --permanent --add-service=high-availability
[root@centos8-3 ~]# firewall-cmd --reload
[root@centos8-4 ~]# firewall-cmd --permanent --add-service=high-availability
[root@centos8-4 ~]# firewall-cmd --reload
```
Setup KVM HA Cluster using Pacemaker GUI
- Next, we will use the Pacemaker GUI to configure the KVM HA Cluster.
- To navigate to the PCS web interface, go to `https://<server-FQDN>:2224` or `https://<server-IP>:2224` (note that it is https, not http).
- Accept the security exceptions, and then log in using the credentials that were previously set for `hacluster`, as shown in the following screenshot:
Next click on "Create New" to create new KVM HA Cluster on the Pacemaker GUI
Next, provide the name you want to give to your cluster. I will use `ha-cluster`; this can be any name. Then provide the hostnames of your cluster nodes and click on "Create cluster".
- The next screen will ask for `hacluster` authentication.
- By default, all nodes are authenticated to each other, and thus PCS can talk to itself from one cluster member to the rest.
- Since we have used the same password for the `hacluster` user on all the cluster nodes, I will select "Use same password for all nodes".
- Next, click on Authenticate to start building the KVM HA Cluster using the Pacemaker GUI.
- This is precisely where the `hacluster` user comes in handy, as it is the account that is used for this purpose.
- This stage creates the cluster configuration file `/etc/corosync/corosync.conf` on all the nodes.
- At this point, the `/etc/corosync/corosync.conf` file on all three cluster nodes should be identical; a quick way to verify this is shown below.
If there were no errors reported, you will get the message "Cluster has been successfully created". Next, you can choose to start the cluster by checking the box, then click on Finish.
As you can see, our KVM HA Cluster is now visible in the Pacemaker GUI (Web UI). You can manage the cluster nodes using this Web UI.
There are separate options to Start, Stop, Restart, Standby, and many more in the Pacemaker GUI to manage individual KVM High Availability Cluster nodes.
Setup KVM HA Cluster using Pacemaker CLI
The steps to configure the KVM HA Cluster using the pacemaker CLI are already explained at: 10 easy steps to setup High Availability Cluster CentOS 8. You can refer to the steps from that article, so I will not repeat them here.
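For orientation, with pcs 0.10 on RHEL/CentOS 8 the GUI steps above map roughly to the following commands (a sketch; see the linked article for the complete walk-through):

```bash
# Authenticate the nodes to each other using the hacluster account
pcs host auth centos8-2 centos8-3 centos8-4 -u hacluster -p Passw0rd

# Create and start the same three-node cluster that the GUI built
pcs cluster setup ha-cluster centos8-2 centos8-3 centos8-4 --start
```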
Check Cluster Health
Once the cluster has been started, you can check its status from any of the nodes (remember that PCS makes it possible for you to manage the cluster from any node):
You will see the warning `No stonith devices and stonith-enabled is not false`; this is because we have not yet configured fencing. I will share all about fencing, its usage, and the steps to configure fencing in a KVM HA Cluster in my next article.
```
[root@centos8-3 ~]# pcs status
Cluster name: ha-cluster

WARNINGS:
No stonith devices and stonith-enabled is not false

Stack: corosync
Current DC: centos8-2 (version 2.0.2-3.el8_1.2-744a30d655) - partition with quorum
Last updated: Fri May 1 11:25:17 2020
Last change: Fri May 1 11:15:45 2020 by hacluster via crmd on centos8-2

3 nodes configured
0 resources configured

Online: [ centos8-2 centos8-3 centos8-4 ]

No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
```
We should also enable the `corosync` and `pacemaker` services to auto-start after a reboot (on all the cluster nodes):
```
[root@centos8-2 ~]# systemctl enable pacemaker
[root@centos8-2 ~]# systemctl enable corosync
[root@centos8-3 ~]# systemctl enable pacemaker
[root@centos8-3 ~]# systemctl enable corosync
[root@centos8-4 ~]# systemctl enable pacemaker
[root@centos8-4 ~]# systemctl enable corosync
```
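Alternatively, `pcs` can enable both services on every node in one shot:

```bash
# Enable the cluster services (corosync and pacemaker) on all nodes at once
pcs cluster enable --all
```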
- The node that is marked as DC (Designated Controller) is the node where the cluster was originally started and from where cluster-related commands will typically be issued.
- If, for some reason, the current DC fails, a new Designated Controller is chosen automatically from the remaining nodes.
You can see which node is the current DC with:
```
[root@centos8-3 ~]# pcs status | grep -i dc
Current DC: centos8-2 (version 2.0.2-3.el8_1.2-744a30d655) - partition with quorum
```
The `pcs status nodes` command shows the state of every cluster node (online, standby, maintenance, or offline):
```
[root@centos8-3 ~]# pcs status nodes
Pacemaker Nodes:
 Online: centos8-2 centos8-3 centos8-4
 Standby:
 Maintenance:
 Offline:
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:
```
Check Quorum Configuration
- You can use `corosync-quorumtool` to check the existing quorum configuration on the cluster nodes.
- Since we have a 3-node KVM HA Cluster, the highest expected vote count is 3.
- As all 3 cluster nodes are currently active, the total vote count is also 3.
- As the quorum value is 2, we need at least 2 live cluster nodes for the cluster to function.
- If 2 or more cluster nodes go down, the vote count drops below the quorum of 2 and the cluster will not function.
```
[root@centos8-2 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Fri May 1 11:36:37 2020
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          1
Ring ID:          1/12
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 centos8-2 (local)
         2          1 centos8-3
         3          1 centos8-4
```
Create Cluster Floating IP
- Our first resource will be a unique IP address that the cluster can bring up on any of the nodes.
- This IP will represent the cluster node on which the respective service is running.
- We can add a resource constraint to run this resource only on the DC, so that we can always connect to the DC using this IP address.
- Regardless of where any cluster services are running, end users need a consistent address to contact them on.
- So every time we have to connect to the cluster, instead of using an individual cluster node IP, we will use the floating IP.
- Here, I will choose `192.168.122.100` as the floating address.
- We will configure the cluster floating IP using our Pacemaker GUI.
Using Pacemaker GUI
Log in to the Pacemaker GUI using `https://<server-fqdn>:2224` or `https://<server-ip>:2224`. Then select RESOURCES from the top panel and click on Add to create a new resource.
- Another important piece of information here is `ocf:heartbeat:IPaddr2`. This tells Pacemaker three things about the resource you want to add:
- The first field (`ocf` in this case) is the standard to which the resource script conforms and where to find it.
- The second field (`heartbeat` in this case) is standard-specific; for OCF resources, it tells the cluster which OCF namespace the resource script is in.
- The third field (`IPaddr2` in this case) is the name of the resource script.
- Provide a resource ID. I am using `ClusterIP`; this can be any name.
- Under Required Arguments, provide the cluster floating IP. This IP address must be reachable by the cluster nodes.
- Under Optional Arguments, provide the `cidr_netmask` value, i.e. the subnet prefix length; for our subnet (192.168.122.0/24) it is 24.
- Under Optional Arguments, you can also provide a monitor timeout value for `monitor_retries`.
Click on Create Resource to create your cluster floating IP. If there were no errors, you should have your first successfully running resource. I have highlighted some other resources that were part of my cluster; I will explain them in later articles.
Using Pacemaker CLI
You can also create your cluster floating IP using the pacemaker CLI tool. Execute the command in the below format:
```
[root@centos8-3 ~]# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.122.100 cidr_netmask=24 op monitor interval=30s
```
- You can change the values of `ip=` and `cidr_netmask=` based on your floating IP and netmask.
- The `monitor interval` is optional but recommended so that the availability of the resource is checked regularly; it can also be changed later, as shown in the sketch below.
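If you want to tune these values later, the resource can be modified in place; for example (the 15-second interval is just illustrative):

```bash
# Change the monitoring interval of the existing ClusterIP resource
pcs resource update ClusterIP op monitor interval=15s

# Review the resource configuration to confirm the change
pcs resource config ClusterIP
```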
To obtain a list of the available resource standards (the `ocf` part of `ocf:heartbeat:IPaddr2`), run:
```
[root@centos8-2 ~]# pcs resource standards
lsb
ocf
service
systemd
```
To obtain a list of the available OCF resource providers (the `heartbeat` part of `ocf:heartbeat:IPaddr2`), run:
```
[root@centos8-2 ~]# pcs resource providers
heartbeat
openstack
pacemaker
```
Finally, if you want to see all the resource agents available for a specific OCF provider (the `IPaddr2` part of `ocf:heartbeat:IPaddr2`), run:
```
[root@centos8-2 ~]# pcs resource agents ocf:heartbeat
```
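You can also ask pcs to describe a specific agent, which lists the parameters it accepts (such as the `ip` and `cidr_netmask` arguments used above):

```bash
# Show all parameters supported by the IPaddr2 resource agent
pcs resource describe ocf:heartbeat:IPaddr2
```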
Check Cluster and Resource Health
We already checked the resource health in the Pacemaker GUI, but you can also check it using the `pcs status` or `pcs resource status` command:
```
[root@centos8-4 ~]# pcs status
Cluster name: ha-cluster
Stack: corosync
Current DC: centos8-3 (version 2.0.2-3.el8_1.2-744a30d655) - partition with quorum
Last updated: Fri May 1 21:09:49 2020
Last change: Fri May 1 20:58:16 2020 by hacluster via crmd on centos8-3

3 nodes configured
1 resources configured

Online: [ centos8-2 centos8-3 centos8-4 ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
```
You can also use `pcs status resources` to check the status of your resource:
```
[root@centos8-2 ~]# pcs status resources
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4
```
Verify Cluster Floating IP
Now we can use the virtual IP to connect to whichever node is running the resource. Since the `ClusterIP` resource is running on `centos8-4`, the floating IP connects us to the `centos8-4` node:
```
[root@rhel-8 ~]# ssh 192.168.122.100
Activate the web console with: systemctl enable --now cockpit.socket

Last login: Fri May 1 16:18:27 2020 from 192.168.122.1
[root@centos8-4 ~]#
```
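You can additionally confirm on the node itself that the floating address is bound to an interface:

```bash
# On the node currently running ClusterIP, the floating address
# should appear on the private-network interface
ip -o addr show | grep 192.168.122.100
```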
Verify KVM HA Cluster Failover
We will also verify a KVM HA Cluster failover scenario to make sure the `ClusterIP` resource is highly available.
Since `ClusterIP` is running on `centos8-4`, I will put this node into standby using `pcs node standby <cluster_node>`:
```
[root@centos8-4 ~]# pcs node standby centos8-4
```
Next, check the status of the `ClusterIP` resource. As you can see, the resource is now running on a different cluster node, i.e. `centos8-2`. This means our KVM HA Cluster failover is working as expected:
```
[root@centos8-4 ~]# pcs status resources
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-2
```
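Once the failover test is complete, remember to bring the node out of standby so it can host resources again:

```bash
# Clear the standby flag that was set during the failover test
pcs node unstandby centos8-4

# Verify all three nodes are back online
pcs status nodes
```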
Lastly, I hope the steps from this article to configure a KVM HA Cluster using the Pacemaker GUI on RHEL/CentOS 8 Linux were helpful. Let me know your suggestions and feedback in the comments section.
References
Create and Manage High Available Clusters with Pacemaker GUI