How to set up GFS2 with clustering on Linux ( RHEL / CentOS 7 )



Before starting with the GFS2 file system setup on a Red Hat or CentOS cluster, you must be familiar with

What is a Cluster, its architecture and types?
What is a Cluster resource and constraint?
How to set up a Red Hat or CentOS 7 Cluster?
⇒ If you only have two nodes in your cluster, then you need to follow some additional steps to set up a two-node cluster.
⇒ If your requirement is only to share an ext4 or XFS based file system, then you can also share LVM across the cluster without a GFS2 file system.
⇒ The GFS2 file system requires shared storage, so if none is available you must manually create shared storage using an iSCSI target (targetcli) on a RHEL or CentOS Linux machine (a minimal sketch follows below).
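If you need to carve out such a target yourself, a minimal targetcli sketch run on the storage node could look like the following. The backing device, target IQN and initiator IQN below are placeholders for illustration only, not values from this setup:

# targetcli /backstores/block create name=cluster-disk dev=/dev/sdb
# targetcli /iscsi create iqn.2018-12.com.example:storage
# targetcli /iscsi/iqn.2018-12.com.example:storage/tpg1/luns create /backstores/block/cluster-disk
# targetcli /iscsi/iqn.2018-12.com.example:storage/tpg1/acls create iqn.2018-12.com.example:node1
# targetcli saveconfig

Refer to the dedicated iSCSI target (targetcli) article for the full procedure.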

 


 

I had written a much older article on setting up a cluster with the GFS2 file system on RHEL 6, but those steps are not valid for RHEL/CentOS 7, so if you are using CentOS/RHEL 6 you can refer to that article instead.

 

For the demonstration in this article I am using Linux virtual machines created with Oracle VirtualBox running on a Windows 10 laptop. I configured my shared storage using an iSCSI target (targetcli) in my previous article, so I will reuse the same storage target for this cluster setup. You can follow my older articles if you do not already have a cluster setup ready with you.

In this article we will create multiple cluster resources and order their start-up sequence using constraints, as it is very important that these resources start in a pre-defined order; otherwise they will fail to start.

So let us start with the steps to configure the GFS2 file system on a Red Hat or CentOS 7 cluster.

 

Why do we need a cluster file system?

  • In some cases, it makes sense to use a cluster-aware file system.
  • The purpose of a cluster-aware file system is to allow multiple nodes to write to the file system simultaneously.
  • The default cluster-aware file system on the SUSE Linux Enterprise Server is OCFS2, and on Red Hat, it is Global File System (GFS) 2.
  • The file system does this by immediately synchronizing caches between the nodes on which the file system resource is running, which means that every node always sees the current state of the file system.
  • Typically, you’ll need them in active/active scenarios, where multiple instances of the same resource are running on multiple nodes and are all active.
  • You don’t have to create a cluster file system if you only want to run one instance of a resource at a time.

 

Any disadvantages of using a cluster file system?

Apart from the benefits, there are also disadvantages to using a cluster file system. The most important one is that the cache has to be synchronized between all nodes involved. This makes a cluster file system slower than a stand-alone file system in many cases, especially for workloads that involve a lot of metadata operations. Because cluster file systems also create much stronger coupling between the nodes, it becomes harder for the cluster to prevent faults from spreading.

It is often believed that a cluster file system provides an advantage in failover time compared to a local-node file system because it is already mounted. However, this is not true; the file system is still paused until fencing/STONITH and journal recovery for the failed node have completed, and this freezes the clustered file system on all nodes. It is actually a set of independent local file systems that provides higher availability! Clustered file systems should be used where they are required, but only after careful planning.

 

Prerequisites to set up the GFS2 file system

Below are the mandatory requirements on your cluster before you start working with the GFS2 file system:

  • CLVM (Clustered Logical Volume manager)
  • DLM (Distributed Lock Manager)

 

It is important that your cluster setup is configured with fencing/STONITH.

We have enabled fencing here on our cluster. You can enable it using "pcs property set stonith-enabled=true"

[root@node1 ~]# pcs property show
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: mycluster
 dc-version: 1.1.18-11.el7_5.3-2b07d5c5a9
 have-watchdog: false
 last-lrm-refresh: 1546059766
 no-quorum-policy: freeze
 stonith-enabled: true

Below you can see the cluster status; here I have three fencing devices configured.

[root@node1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node1.example.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sat Dec 29 10:33:16 2018
Last change: Sat Dec 29 10:33:01 2018 by root via cibadmin on node1.example.com

3 nodes configured
3 resources configured

Online: [ node1.example.com node2.example.com node3.example.com ]

Full list of resources:
 fence-vm1      (stonith:fence_xvm):    Started node2.example.com
 fence-vm2      (stonith:fence_xvm):    Started node1.example.com
 fence-vm3      (stonith:fence_xvm):    Started node3.example.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@node1 ~]# pcs stonith show
 fence-vm1      (stonith:fence_xvm):    Started node2.example.com
 fence-vm2      (stonith:fence_xvm):    Started node2.example.com
 fence-vm3      (stonith:fence_xvm):    Started node2.example.com

Install gfs2-utils, lvm2-cluster, dlm on all your cluster nodes if not already installed

NOTE:
On a RHEL system you must have an active subscription to RHN, or you can configure a local offline repository from which the "yum" package manager can install the provided rpms and their dependencies.
# yum -y install gfs2-utils lvm2-cluster dlm
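You can quickly confirm that the packages are present on each node, for example:

# rpm -q gfs2-utils lvm2-cluster dlm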

Change the pcs property no-quorum-policy to freeze. This property is necessary because it means that cluster nodes will do nothing after losing quorum, and this is required for GFS2.

# pcs property set no-quorum-policy=freeze

If you left the default setting of stop, nodes that lose quorum would try to stop their resources, but a mounted GFS2 file system cannot be stopped cleanly without quorum, which would result in fencing of the entire cluster.
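To double-check the current value (assuming the pcs version shipped with RHEL/CentOS 7):

# pcs property show no-quorum-policy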

 

Configure DLM Resource

The Distributed Lock Manager (DLM), managed through the controld resource agent, is a mandatory part of the cluster. If it fails a monitor operation after starting, the node on which it fails needs to be fenced to keep the cluster clean. This is especially important in combination with the no-quorum-policy, which is set to freeze.

NOTE:
As with the GFS2 file system itself, these resources have to be started on all nodes that require access to the file system. Pacemaker provides the clone resource for this purpose. Clone resources can be applied to any resource that has to be activated on multiple nodes simultaneously.
[root@node1 ~]# pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s on-fail=fence clone interleave=true ordered=true

Check the pcs cluster status

[root@node1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node1.example.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sat Dec 29 10:57:58 2018
Last change: Sat Dec 29 10:57:52 2018 by root via cibadmin on node1.example.com

3 nodes configured
6 resources configured

Online: [ node1.example.com node2.example.com node3.example.com ]

Full list of resources:

 Clone Set: dlm-clone [dlm]
     Started: [ node1.example.com node2.example.com node3.example.com ]
 fence-vm1      (stonith:fence_xvm):    Started node2.example.com
 fence-vm2      (stonith:fence_xvm):    Started node2.example.com
 fence-vm3      (stonith:fence_xvm):    Started node2.example.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

So our dlm resource and its dlm-clone clone set have started properly on all our cluster nodes.
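If you want to inspect the clone definition, including the interleave and ordered meta attributes we set above, a quick check with the pcs shipped on RHEL/CentOS 7 is:

# pcs resource show dlm-clone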

 

Configure CLVMD resource

  • If multiple nodes of the cluster require simultaneous read/write access to LVM volumes in an active/active system, then you must use CLVMD.
  • CLVMD provides a system for coordinating activation of and changes to LVM volumes across nodes of a cluster concurrently.
  • CLVMD's clustered-locking service provides protection to LVM metadata as various nodes of the cluster interact with volumes and make changes to their layout.

To enable clustered locking, set locking_type=3 in /etc/lvm/lvm.conf:

[root@node1 ~]# grep locking_type /etc/lvm/lvm.conf | egrep -v '#'
    locking_type = 3
IMPORTANT NOTE:
This is why HA-LVM and CLVMD are not compatible with each other: HA-LVM requires locking_type=1 while CLVMD requires locking_type=3.

You can dynamically change this by using the below command

# lvmconf --enable-cluster

Disable and stop lvm2-lvmetad service

# systemctl disable lvm2-lvmetad --now
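You can verify on each node that the service is both disabled and stopped (a simple sanity check, nothing GFS2-specific):

# systemctl is-enabled lvm2-lvmetad
# systemctl is-active lvm2-lvmetad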

Next, create the clvmd resource:

[root@node1 ~]# pcs resource create clvmd ocf:heartbeat:clvm op monitor interval=30s on-fail=fence clone interleave=true ordered=true

Validate the resource status:

[root@node1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node1.example.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sat Dec 29 10:57:58 2018
Last change: Sat Dec 29 10:57:52 2018 by root via cibadmin on node1.example.com

3 nodes configured
9 resources configured

Online: [ node1.example.com node2.example.com node3.example.com ]

Full list of resources:

 Clone Set: dlm-clone [dlm]
     Started: [ node1.example.com node2.example.com node3.example.com ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ node1.example.com node2.example.com node3.example.com ]
 fence-vm1      (stonith:fence_xvm):    Started node2.example.com
 fence-vm2      (stonith:fence_xvm):    Started node2.example.com
 fence-vm3      (stonith:fence_xvm):    Started node2.example.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

 

Change resource start-up order

Now we need an ordering constraint and a colocation constraint as well. The ordering constraint makes sure that dlm-clone is started before clvmd-clone, and the colocation constraint makes sure that the clvmd clone is always kept together with the dlm clone.

[root@node1 ~]# pcs constraint order start dlm-clone then clvmd-clone
Adding dlm-clone clvmd-clone (kind: Mandatory) (Options: first-action=start then-action=start)
[root@node1 ~]# pcs constraint colocation add clvmd-clone with dlm-clone

 

Set up shared storage on cluster nodes

As covered in my previous article, I am using an iSCSI target from my storage node on all of my cluster nodes, which I will use to set up my cluster file system (GFS2).
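If the cluster nodes are not yet logged in to the target, the connection from each node typically looks like the following; the portal IP address below is a placeholder for illustration:

# iscsiadm -m discovery -t sendtargets -p 192.168.0.100
# iscsiadm -m node --login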

So after connecting to my storage node, I have /dev/sdc available on all my cluster nodes.

[root@node2 ~]# ls -l /dev/sd*
brw-rw---- 1 root disk 8,  0 Dec 29 09:47 /dev/sda
brw-rw---- 1 root disk 8,  1 Dec 29 09:47 /dev/sda1
brw-rw---- 1 root disk 8,  2 Dec 29 09:47 /dev/sda2
brw-rw---- 1 root disk 8, 16 Dec 29 09:47 /dev/sdb
brw-rw---- 1 root disk 8, 17 Dec 29 09:47 /dev/sdb1
brw-rw---- 1 root disk 8, 32 Dec 29 10:30 /dev/sdc

I will set up a logical volume on /dev/sdc on one of my cluster nodes. The same configuration will automatically be synced to all other cluster nodes.

[root@node1 ~]# pvcreate /dev/sdc
  Physical volume "/dev/sdc" successfully created.
[root@node1 ~]# vgcreate -Ay -cy --shared vgclvm /dev/sdc
  Clustered volume group "vgclvm" successfully created

Here

  • -A|--autobackup y|n : Specifies if metadata should be backed up automatically after a change.
  • -c|--clustered y|n : Create a clustered VG using clvmd if LVM is compiled with cluster support. This allows multiple hosts to share a VG on shared devices. clvmd and a lock manager must be configured and running.

Display the available volume groups

[root@node1 ~]# vgs
  VG     #PV #LV #SN Attr   VSize   VFree
  centos   2   2   0 wz--n- <17.52g 1020.00m
  vgclvm   1   0   0 wz--nc 992.00m  992.00m

Create new logical volume using our shared volume group

[root@node1 ~]# lvcreate -l 100%FREE -n lvcluster vgclvm
  Logical volume "lvcluster" created.

Create a GFS2 file system on our logical volume.

[root@node1 ~]# mkfs.gfs2 -j3 -p lock_dlm -t mycluster:gfs2fs /dev/vgclvm/lvcluster
/dev/vgclvm/lvcluster is a symbolic link to /dev/dm-2
This will destroy any data on /dev/dm-2
Are you sure you want to proceed? [y/n] y
Discarding device contents (may take a while on large devices): Done
Adding journals: Done
Building resource groups: Done

Creating quota file: Done
Writing superblock and syncing: Done
Device:                    /dev/vgclvm/lvcluster
Block size:                4096
Device size:               0.97 GB (253952 blocks)
Filesystem size:           0.97 GB (253951 blocks)
Journals:                  3
Journal size:              8MB
Resource groups:           7
Locking protocol:          "lock_dlm"
Lock table:                "mycluster:gfs2fs"
UUID:                      da1e5aa6-51a3-4512-ba79-3e325455007e

Here

  • -t clustername:fsname : is used to specify the name of the locking table
  • -j nn : specifies how many journals (one per node that will mount the file system) are created
  • -J : allows specification of the journal size. If not specified, a journal has a default size of 128 MB. The minimum size is 8 MB (NOT recommended)
NOTE:
In the command, clustername must be the Pacemaker cluster name; I have used mycluster, which is my cluster's name.
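The journal count also limits how many nodes can mount the file system at the same time. If you later add a fourth node, you can add another journal to the mounted file system with gfs2_jadd; a minimal example, assuming the default journal size:

# gfs2_jadd -j1 /clusterfs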

 

Create mount point and validate

Now that our logical volume has been created successfully, let us create a mount point for our file system.

NOTE:
Manually create this mount point on all the cluster nodes
# mkdir /clusterfs

 

Before we create a resource for GFS2, let us manually mount the file system on lvcluster to validate that it is working properly.

[root@node1 ~]# mount /dev/vgclvm/lvcluster /clusterfs/

Validate the same

[root@node2 ~]# mount | grep clusterfs
/dev/mapper/vgclvm-lvcluster on /clusterfs type gfs2 (rw,noatime)

So it looks like the logical volume was mounted successfully.
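Since Pacemaker will manage this mount through a Filesystem resource in the next step, you may prefer to unmount the manual test mount first so that the cluster starts from a clean state (an optional precaution, depending on your setup):

# umount /clusterfs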

 

Create GFS2FS cluster resource

Now we can create the gfs2fs resource for our GFS2 file system.

[root@node1 ~]# pcs resource create gfs2fs Filesystem device="/dev/vgclvm/lvcluster" directory="/clusterfs" fstype=gfs2 options=noatime op monitor interval=10s on-fail=fence clone interleave=true
Assumed agent name 'ocf:heartbeat:Filesystem' (deduced from 'Filesystem')

Validate the cluster status

[root@node1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node1.example.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sat Dec 29 10:58:08 2018
Last change: Sat Dec 29 10:57:52 2018 by root via cibadmin on node1.example.com

3 nodes configured
12 resources configured

Online: [ node1.example.com node2.example.com node3.example.com ]

Full list of resources:

 Clone Set: dlm-clone [dlm]
     Started: [ node1.example.com node2.example.com node3.example.com ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ node1.example.com node2.example.com node3.example.com ]
 fence-vm1      (stonith:fence_xvm):    Started node2.example.com
 fence-vm2      (stonith:fence_xvm):    Started node2.example.com
 fence-vm3      (stonith:fence_xvm):    Started node2.example.com
 Clone Set: gfs2fs-clone [gfs2fs]
     Started: [ node1.example.com node2.example.com node3.example.com ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

So our gfs2fs resource has started automatically on all our cluster nodes.

 

Now arrange the resource start-up order for GFS2 and CLVMD so that after a node reboot the resources are started in the proper order; otherwise they will fail to start.

[root@node1 ~]# pcs constraint order start clvmd-clone then gfs2fs-clone
Adding clvmd-clone gfs2fs-clone (kind: Mandatory) (Options: first-action=start then-action=start)

[root@node1 ~]# pcs constraint colocation add gfs2fs-clone with clvmd-clone
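You can review the complete set of ordering and colocation constraints created so far with:

# pcs constraint show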

 

Validate our Cluster with GFS2 file system

Now that our resources are running properly on all cluster nodes, let us create a file on one of the cluster nodes.

[root@node1 ~]# cd /clusterfs/
[root@node1 clusterfs]# touch file

Now connect to any other cluster node, and this file should exist there as well

[root@node2 ~]# ls /clusterfs/
file
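As a further sanity check you can write to the file from one node and read it back from another; the content should be visible immediately on every node that has the file system mounted. For example, on node3:

# echo "written from node3" >> /clusterfs/file

Then on node1:

# cat /clusterfs/file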

So our Cluster with GFS2 file system configuration is working as expected.

 



23 thoughts on “How to set up GFS2 with clustering on Linux ( RHEL / CentOS 7 )”

  1. One of our environments has been established with RHEL 7.9 GFS2 cluster – 2node. While manual backups taking place on Shared Volume of 500 GB (by 2nodes), the Read/Write traffic increases to 45 MB/sec and 250 IOPS. This is almost impacting other servers in the network. Any suggestions please?

  2. OK no worries, I’ll see what I can find. This post def helps too: https://www.golinuxcloud.com/create-cluster-resource-in-ha-cluster-examples/

    Was thinking I could create a CRON job to run ever 10 seconds to confirm process health, then if it fails, put the cluster node in standby.

    1 last thing on clusters – nearly every post I have found on clustering is for CentOS, my company initially tried pcs on ubuntu but found it unstable. Is there anything about CentOS that makes it better for clustering?

    • A cron job would be a nice hack, but I would also suggest asking in a larger forum such as Stack Overflow. Maybe there are more possible ways to achieve what you are trying to do.

      I have used Pacemaker clusters in the past with Red Hat. You may already know that Red Hat ships its own builds of open source software such as Pacemaker, Kubernetes and OpenStack with the added advantage of support. This is why we mostly see it used on CentOS, which was a downstream project of Red Hat, so we could expect stable, already-tested code in CentOS. Now that CentOS is becoming an upstream project, we can't say that any more; end users may have to wait longer to get a fix with CentOS Stream.

  3. Thanks so much for this post and this incredible site. I have found everything I needed! I was able to setup a proper 2 node cluster with fencing, stonith & a gfs2 shared disk from scratch on CentOS 7.9 by following these 4 posts:

    1) https://www.golinuxcloud.com/ste-by-step-configure-high-availability-cluster-centos-7/

    2) https://www.golinuxcloud.com/how-to-install-configure-two-node-cluster-linux-centos-7/

    3) https://www.golinuxcloud.com/what-is-fencing-configure-kvm-cluster-fencing/#Setup_KVM_HA_Cluster

    4) https://www.golinuxcloud.com/configure-gfs2-setup-cluster-linux-rhel-centos-7/

    I am creating an HA implementation of Apache ActiveMQ, evaluating & testing both Classic & Artemis. So far shared storage and failover works great. Had a couple questions:

    – Can you add a service/process as a Cluster Resource so if the process fails it will stonith the bad node and move all resources including IP to the healthy node?

    – Do you have a donate section bc I am def using an ad blocker 😀

    • Thank you for your kind words and I am glad these articles helped.

      Which resource are you asking about? As that would depend.
      I have created a buymeacoffee page, so you can make any donations here 🙂

      • I think I saw a post about adding apache as a cluster resource, was wondering if this could apply to other processes. If possible, I would like to have the AMQ process be a cluster resource. On the offchance that the process fails on node1 but everything else about node1 is healthy, the vIP would stay with node1 but node2 would be promoted from passive to active.

        In Windows clustering, if the monitored process fails, the whole node is essentially stonith’d and the 2nd node becomes active with Cluster IP & with the process. Just wondering if the same is possible with pcs.

        • I am not sure if that is possible because if a cluster resource fails for some reason then that wouldn’t necessarily mean that the respective node is also faulty and would continue to be active

          But I am afraid it is quite some time since I worked on cluster so I can’t recall if any such config is possible with pacemaker. You will have to test and try.

    • Do you mean the filesystem fails or the cluster node? As due to high availability, the cluster node failure will cause GFS2 to mount on next available cluster node

  4. I am Getting error – dlm: close connection to node 1 & 2 error than gfs2 shared partition getting hanged on both nodes.
    Please help me.

  5. error taks gfs2_quotad:3349 blocked for more than 120 second
    Shared partition is getting hanged , please help me

  6. showing taks gfs2_quotad:3349 blocked for more than 120 second and shared partition is getting hanged on both nodes

