In this article I will share a step-by-step tutorial to configure DRBD Cluster File System using Pacemaker 2.0 in RHEL/CentOS 8. We will cover the following topics:
- Setup DRBD Cluster File System
- Adding DRBD as Master/Slave Resource
- Adding DRBD as a Highly Available Resource
- Verifying Cluster Failover
Pre-requisites
I hope you are familiar with High Availability Cluster architecture and its basic terminology. Below are the mandatory prerequisites which must be performed in the provided order before you set up a DRBD Cluster File System using Pacemaker in RHEL/CentOS 8. You may have already completed some of these steps:
Install KVM
Bring up your physical server with RHEL/CentOS 8 and install KVM on it.
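If you still need to install KVM, a minimal sketch looks like the commands below, assuming the RHEL/CentOS 8 virtualization packages are available in your configured repositories; virt-host-validate is an optional check that the host supports hardware virtualization:
# dnf install -y qemu-kvm libvirt virt-install virt-viewer
# systemctl enable --now libvirtd
# virt-host-validate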
Create KVM Virtual Machines
Since we are creating the DRBD Cluster File System on KVM Virtual Machines, we must create multiple KVM Virtual Machines.
You can create KVM Virtual Machines using any of the following methods:
⇒ Cockpit Web Console GUI
⇒ Virtual Machine Manager (deprecated starting RHEL/CentOS 8)
⇒ virt-install Command Line Tool
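For example, one of the lab VMs could be created with virt-install roughly as shown below. The disk path, ISO path and vCPU count are placeholders for your environment, while the VM name, memory and disk size match the lab table further down:
# virt-install \
  --name centos8-2 \
  --memory 10240 \
  --vcpus 2 \
  --disk path=/var/lib/libvirt/images/centos8-2.qcow2,size=40 \
  --os-variant centos8 \
  --network network=default \
  --cdrom /path/to/CentOS-8.iso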
Setup KVM High Availability Cluster
Since we intend to set up a DRBD Cluster File System, we need a High Availability Cluster, so you must configure a KVM HA Cluster on your RHEL/CentOS 8 Linux server.
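As a reminder, the cluster bootstrap on RHEL/CentOS 8 (pcs 0.10 / Pacemaker 2.0) looks roughly like the sketch below. The cluster name ha_cluster and the hacluster password are placeholders, and the High Availability repository must already be enabled.
On all three cluster nodes:
# dnf install -y pcs pacemaker fence-agents-all
# systemctl enable --now pcsd
# passwd hacluster
On any one node:
# pcs host auth centos8-2 centos8-3 centos8-4 -u hacluster
# pcs cluster setup ha_cluster centos8-2 centos8-3 centos8-4
# pcs cluster start --all
# pcs cluster enable --all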
Install and Configure DRBD
Next you need to install and configure DRBD (Distributed Replicated Block Device) on these KVM Virtual Machines to set up Linux disk replication. Only once you have a working DRBD configuration can we use it for the DRBD Cluster File System on our HA Cluster.
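For reference, a working three-node DRBD 9 resource definition for this lab might look like the sketch below (in /etc/drbd.d/drbd1.res). The backing disk /dev/sdb1 and the port 7789 are assumptions; the resource name drbd1, the DRBD device /dev/drbd1, and the node names and IPs match what is used throughout this article:
resource drbd1 {
   device /dev/drbd1;
   disk /dev/sdb1;
   meta-disk internal;
   on centos8-2.example.com {
      address 192.168.122.10:7789;
      node-id 0;
   }
   on centos8-3.example.com {
      address 192.168.122.11:7789;
      node-id 1;
   }
   on centos8-4.example.com {
      address 192.168.122.12:7789;
      node-id 2;
   }
   connection-mesh {
      hosts centos8-2.example.com centos8-3.example.com centos8-4.example.com;
   }
}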
Lab Environment
I will use my existing KVM High Availability Cluster where I have installed and configured DRBD Storage
| | KVM Host | KVM VM1 | KVM VM2 | KVM VM3 |
|---|---|---|---|---|
| Hostname | rhel-8 | centos8-2 | centos8-3 | centos8-4 |
| FQDN | rhel-8.example.com | centos8-2.example.com | centos8-3.example.com | centos8-4.example.com |
| NIC 1 (Public IP) | 10.43.138.12 | NA | NA | NA |
| NIC 2 (Private IP) | 192.168.122.1 (NAT) | 192.168.122.10 | 192.168.122.11 | 192.168.122.12 |
| OS | RHEL 8.1 | CentOS 8.1 | CentOS 8.1 | CentOS 8.1 |
| Memory (GB) | 128 | 10 | 10 | 10 |
| Storage (GB) | 500 | 40 | 40 | 40 |
| Role | PXE Server | HA Cluster Node | HA Cluster Node | HA Cluster Node |
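For name resolution between the cluster nodes I rely on /etc/hosts entries on each VM (an assumption; a DNS setup would work just as well), matching the private IPs from the table above:
192.168.122.10   centos8-2.example.com   centos8-2
192.168.122.11   centos8-3.example.com   centos8-3
192.168.122.12   centos8-4.example.com   centos8-4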
How to configure DRBD Cluster File System?
There are a few ways in which you can add DRBD to your High Availability Cluster:
- Add DRBD as a background service
- Configure Master/Slave DRBD resource
- Configure a highly available DRBD resource
I will share detailed examples for all these scenarios in this DRBD Tutorial
Method 1: Using DRBD as a background service in a Pacemaker cluster
In this section you will see that autonomous DRBD storage can be used like local storage, so integrating it into a Pacemaker cluster is simply a matter of pointing your mount points at the DRBD device.
1.1: DRBD Configuration to auto-promote
First of all, we will use the auto-promote feature of DRBD, so that DRBD automatically sets itself Primary when needed. This will probably apply to all of your resources, so setting it as a default in the common section makes sense. Append the following to your existing /etc/drbd.d/global_common.conf, as shown below:
global {
   usage-count no;
}
common {
   options {
      auto-promote yes;
   }
   net {
      protocol C;
   }
}
Next, copy this file manually to all the cluster nodes:
[root@centos8-2 ~]# scp /etc/drbd.d/global_common.conf centos8-3:/etc/drbd.d/
[root@centos8-2 ~]# scp /etc/drbd.d/global_common.conf centos8-4:/etc/drbd.d/
Execute drbdadm adjust to refresh the DRBD configuration on all the cluster nodes:
# drbdadm adjust <resource>
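If you want to confirm that the option is now active, drbdadm dump prints the configuration as DRBD has parsed it; here I assume the resource is named drbd1, as used later in this article:
# drbdadm dump drbd1 | grep -A 2 options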
1.2: Create DRBD Cluster File System Resource
Next we will create a heartbeat Filesystem resource for the DRBD Cluster File System. I have already explained the meaning of the individual fields of this command.
[root@centos8-2 ~]# pcs resource create fs_drbd ocf:heartbeat:Filesystem device=/dev/drbd1 directory=/share fstype=ext4
1.3: Verify DRBD Resource and Device Status
Next check the DRBD resource status
[root@centos8-2 ~]# pcs resource status
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4
 fs_drbd        (ocf::heartbeat:Filesystem):    Started centos8-2
Our DRBD Cluster File System resource has started successfully on the centos8-2 cluster node.
On centos8-2 you can check the DRBD status; it shows this node as "Primary" while the other two cluster nodes are Secondary:
[root@centos8-2 ~]# drbdadm status drbd1
drbd1 role:Primary
  disk:UpToDate
  centos8-3.example.com role:Secondary
    peer-disk:UpToDate
  centos8-4.example.com role:Secondary
    peer-disk:UpToDate
The DRBD device is also mounted on the centos8-2 cluster node:
[root@centos8-2 ~]# df -h /share/
Filesystem Size Used Avail Use% Mounted on
/dev/drbd1 2.0G 6.0M 1.9G 1% /share
1.4: Verify DRBD Failover
Next we will perform a DRBD failover to verify our DRBD Cluster File System. Since the resource was active on centos8-2, I will put this KVM cluster node into standby:
[root@centos8-2 ~]# pcs node standby centos8-2
Next, check the status of the DRBD Cluster File System resource. It has started successfully on a different KVM cluster node, i.e. centos8-3, so our DRBD failover is working as expected:
[root@centos8-2 ~]# pcs resource status
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4
 fs_drbd        (ocf::heartbeat:Filesystem):    Started centos8-3
You can verify the same on centos8-3:
[root@centos8-3 ~]# df -h /share/
Filesystem Size Used Avail Use% Mounted on
/dev/drbd1 2.0G 6.0M 1.9G 1% /share
The drbdadm command also shows centos8-3 as "Primary":
[root@centos8-3 ~]# drbdadm status drbd1
drbd1 role:Primary
  disk:UpToDate
  centos8-2.example.com role:Secondary
    peer-disk:UpToDate
  centos8-4.example.com role:Secondary
    peer-disk:UpToDate
Method 2: Adding DRBD as a PCS Master/Slave cluster resource
2.1: Create DRBD Clone Resource
- Next we will add the DRBD storage as Master/Slave using a clone resource in the KVM Cluster.
- We will retain the Cluster File System resource we created earlier, as we need it here for our Master/Slave configuration.
- Since we have a three-node KVM Cluster, I am using clone-max as 3; you can change these values based on your environment.
pcs resource master is deprecated; Pacemaker 2.0 in RHEL 8 renamed master/slave resources to "promotable clone" resources, and some command syntax changed as a result. For more details check: How do I create a promotable clone resource in a Pacemaker cluster?
[root@centos8-2 ~]# pcs resource create web_drbd ocf:linbit:drbd drbd_resource=drbd1 promotable promoted-max=1 promoted-node-max=1 clone-max=3 clone-node-max=1 notify=true
Here we are creating a promotable resource web_drbd using our DRBD resource. I have already explained the meaning of the individual fields of this command.
2.2: Verify DRBD Resource and Device Status
Check the DRBD resource status
[root@centos8-2 ~]# pcs resource status
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4
 fs_drbd        (ocf::heartbeat:Filesystem):    Started centos8-3
 Clone Set: web_drbd-clone [web_drbd] (promotable)
     Masters: [ centos8-3 ]
     Slaves: [ centos8-2 centos8-4 ]
Now we have a single Master node where the web_drbd-clone resource is running, while the other two cluster nodes are Slaves.
The drbdadm command shows the same status, i.e. centos8-3 is our Primary server:
[root@centos8-2 ~]# drbdadm status drbd1
drbd1 role:Secondary
  disk:UpToDate
  centos8-3.example.com role:Primary
    peer-disk:UpToDate
  centos8-4.example.com role:Secondary
    peer-disk:UpToDate
To get more details of the clone resource
[root@centos8-2 ~]# pcs resource config web_drbd-clone
 Clone: web_drbd-clone
  Meta Attrs: clone-max=3 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1
  Resource: web_drbd (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=drbd1
   Operations: demote interval=0s timeout=90 (web_drbd-demote-interval-0s)
               monitor interval=20 role=Slave timeout=20 (web_drbd-monitor-interval-20)
               monitor interval=10 role=Master timeout=20 (web_drbd-monitor-interval-10)
               notify interval=0s timeout=90 (web_drbd-notify-interval-0s)
               promote interval=0s timeout=90 (web_drbd-promote-interval-0s)
               reload interval=0s timeout=30 (web_drbd-reload-interval-0s)
               start interval=0s timeout=240 (web_drbd-start-interval-0s)
               stop interval=0s timeout=100 (web_drbd-stop-interval-0s)
2.3: Configure Resource Constraint
I hope you are familiar with resource constraints. Currently the DRBD clone resource (web_drbd-clone) and the DRBD Cluster File System resource (fs_drbd) are independent of each other.
We must make sure that both resources are colocated on the same node:
[root@centos8-2 ~]# pcs constraint colocation add fs_drbd with web_drbd-clone INFINITY with-rsc-role=Master
Also, the DRBD clone resource must be promoted before the DRBD Cluster File System resource can start:
[root@centos8-2 ~]# pcs constraint order promote web_drbd-clone then start fs_drbd
Adding web_drbd-clone fs_drbd (kind: Mandatory) (Options: first-action=promote then-action=start)
Check the list of applied constraints:
[root@centos8-2 ~]# pcs constraint show
Location Constraints:
Ordering Constraints:
  promote web_drbd-clone then start fs_drbd (kind:Mandatory)
Colocation Constraints:
  fs_drbd with web_drbd-clone (score:INFINITY) (with-rsc-role:Master)
Ticket Constraints:
2.4: Verify DRBD Failover
Since our DRBD Cluster File System and clone resources are running on centos8-3, I will put this cluster node into standby to verify DRBD failover:
[root@centos8-2 ~]# pcs node standby centos8-3
Verify the cluster resource status
[root@centos8-2 ~]# pcs resource status
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4
 fs_drbd        (ocf::heartbeat:Filesystem):    Started centos8-2
 Clone Set: web_drbd-clone [web_drbd] (promotable)
     Masters: [ centos8-2 ]
     Slaves: [ centos8-4 ]
     Stopped: [ centos8-3 ]
So, as expected, our DRBD resources have migrated to the centos8-2 cluster node, while centos8-3 has been marked as "Stopped".
You can also verify the resource status using drbdadm. Our DRBD failover is working as expected:
[root@centos8-2 ~]# drbdadm status drbd1
drbd1 role:Primary
  disk:UpToDate
  centos8-3.example.com connection:Connecting
  centos8-4.example.com role:Secondary
    peer-disk:UpToDate
Since centos8-3 is marked as stopped, drbdadm is unable to connect to that cluster node. Once the KVM cluster node centos8-3 becomes active again, drbdadm will re-establish the connection.
I will change the status of centos8-3 back to active:
[root@centos8-2 ~]# pcs node unstandby centos8-3
We changed the status of our cluster node centos8-3 to active, and we can see that drbdadm has successfully re-established the connection with centos8-3:
[root@centos8-2 ~]# drbdadm status drbd1
drbd1 role:Primary
  disk:UpToDate
  centos8-3.example.com role:Secondary
    peer-disk:UpToDate
  centos8-4.example.com role:Secondary
    peer-disk:UpToDate
Method 3: Adding DRBD as a highly available resource
- Clone resources in a High Availability Pacemaker cluster are those that can run on multiple nodes, usually on all of them, simultaneously.
- This can be useful for starting daemons like dlm_controld (via a controld resource), or clvmd and cmirrord (via a clvm resource), that are needed by other highly available or load-balanced resources (see the sketch just after this list).
- For a DRBD cluster, however, this is not very useful, because at any point in time the resource is active on only one KVM cluster node.
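For illustration only, cloning a daemon such as dlm_controld across all nodes would look roughly like this; the resource name dlm is a placeholder, and this resource is not needed for the DRBD examples in this article:
[root@centos8-2 ~]# pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s clone interleave=true ordered=true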
I will delete the existing DRBD resources to demonstrate this example:
[root@centos8-2 ~]# pcs resource delete fs_drbd
Attempting to stop: fs_drbd... Stopped

[root@centos8-2 ~]# pcs resource delete web_drbd
Attempting to stop: web_drbd... Stopped
3.1: Create DRBD Clone resource
In this example I have configured a single master node with a maximum of 3 clones, because we have a 3-node KVM HA Cluster. The resource name will be web_drbd.
Create the DRBD clone resource; I have already explained the meaning of the individual fields of this command:
[root@centos8-2 ~]# pcs resource create web_drbd ocf:linbit:drbd drbd_resource=drbd1 clone master-max=1 master-node-max=1 clone-max=3 clone-node-max=1 notify=true
Next we will create the DRBD Cluster File System resource as we did in the earlier examples:
[root@centos8-2 ~]# pcs resource create fs_drbd ocf:heartbeat:Filesystem device=/dev/drbd1 directory=/share fstype=ext4
3.2: Verify DRBD Resource and Device Status
Check the cluster resource status to make sure both resources have started successfully:
[root@centos8-2 ~]# pcs resource status
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4
 Clone Set: web_drbd-clone [web_drbd]
     Started: [ centos8-2 centos8-3 centos8-4 ]
 fs_drbd        (ocf::heartbeat:Filesystem):    Started centos8-2
Since the DRBD Cluster File System is running on centos8-2, this node is considered Primary:
[root@centos8-2 ~]# drbdadm status drbd1
drbd1 role:Primary
  disk:UpToDate
  centos8-3.example.com role:Secondary
    peer-disk:UpToDate
  centos8-4.example.com role:Secondary
    peer-disk:UpToDate
3.3: Configure Resource Constraint
I hope you are familiar with resource constraints. We will configure the constraints the same way as we did for the Master/Slave configuration.
We will link both cluster resources to make sure they are started on the same cluster node:
[root@centos8-2 ~]# pcs constraint colocation add fs_drbd with web_drbd-clone
The clone resource must be started before the file system resource:
[root@centos8-2 ~]# pcs constraint order start web_drbd-clone then fs_drbd
Adding web_drbd-clone fs_drbd (kind: Mandatory) (Options: first-action=start then-action=start)
Check the applied constraint rules
[root@centos8-2 ~]# pcs constraint
Location Constraints:
Ordering Constraints:
  start web_drbd-clone then start fs_drbd (kind:Mandatory)
Colocation Constraints:
  fs_drbd with web_drbd-clone (score:INFINITY)
Ticket Constraints:
3.4: Verify DRBD Failover
The cluster resources have started successfully, so we must also verify the DRBD failover scenario to make sure the resources are highly available.
Since our cluster resources are running on centos8-2, we will put it into standby:
[root@centos8-2 ~]# pcs node standby centos8-2
Next, check the cluster resource status. As expected, our DRBD resources have switched to a different KVM cluster node and have started on centos8-3:
[root@centos8-2 ~]# pcs resource status
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4
 Clone Set: web_drbd-clone [web_drbd]
     Started: [ centos8-3 centos8-4 ]
     Stopped: [ centos8-2 ]
 fs_drbd        (ocf::heartbeat:Filesystem):    Started centos8-3
Also, the state of centos8-2 is marked as "Stopped".
Check the status of the drbd1 device using drbdadm. Here also we see that drbd1 considers centos8-3 as Primary, while it is unable to connect to centos8-2:
[root@centos8-3 ~]# drbdadm status drbd1
drbd1 role:Primary
  disk:UpToDate
  centos8-2.example.com connection:Connecting
  centos8-4.example.com role:Secondary
    peer-disk:UpToDate
Let us bring our standby cluster node back to the active state:
[root@centos8-2 ~]# pcs node unstandby centos8-2
Next, check the drbdadm status for the drbd1 device. Once the cluster node is active, the drbd1 device re-connects to the respective cluster node, hence the DRBD failover is working as expected:
[root@centos8-2 ~]# drbdadm status drbd1
drbd1 role:Secondary
  disk:UpToDate
  centos8-3.example.com role:Primary
    peer-disk:UpToDate
  centos8-4.example.com role:Secondary
    peer-disk:UpToDate
We can also check that all our cluster nodes are considered for the DRBD resource
[root@centos8-2 ~]# pcs resource status
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos8-4
 Clone Set: web_drbd-clone [web_drbd]
     Started: [ centos8-2 centos8-3 centos8-4 ]
 fs_drbd        (ocf::heartbeat:Filesystem):    Started centos8-3
So now we have a working DRBD Cluster File System configured using three different methods.
Lastly, I hope the steps from this article to configure a KVM DRBD Cluster File System using Pacemaker 2.0 on RHEL/CentOS 8 Linux were helpful. Let me know your suggestions and feedback using the comment section.
References:
Red Hat: How can I make my highly available resources dependent upon clone resources in RHEL 7 and RHEL 8 with pacemaker?
Red Hat: How do I create a promotable clone resource in a Pacemaker cluster?