How to install and configure a two-node cluster in Linux (CentOS / RHEL 7)



Before starting, I hope you are familiar with the different cluster types and their architecture. In this article I will explain the steps to configure a two-node cluster on CentOS / RHEL 7 Linux nodes. For the sake of this article, I am using Oracle VirtualBox installed on my Linux server.


 

How is a two-node cluster different from a cluster with three or more nodes?

Quorum is the minimum number of cluster member votes required to perform a cluster operation. Without quorum, the cluster cannot operate. Quorum is achieved when the majority of cluster members vote to execute a specific cluster operation. If the majority of the cluster members do not vote, the cluster operation will not be performed.

In a two-node cluster configuration, the maximum number of expected votes is two, with each cluster node having one vote. In a failure scenario where one of the nodes goes down, only one node is active and it has only one vote. In such a configuration, quorum cannot be reached, since a majority of the votes cannot be delivered. The single cluster node is stuck at 50 percent of the votes and will never get past it, so the cluster can never operate normally this way.
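
To make the arithmetic concrete, votequorum uses the usual majority formula (integer division), which is why a lone surviving node can never be quorate in a plain two-node setup but can be in a three-node one:

quorum = expected_votes / 2 + 1    (integer division)

2-node cluster: 2 / 2 + 1 = 2  -> both votes are required, a single node is inquorate
3-node cluster: 3 / 2 + 1 = 2  -> one node can fail and the cluster stays quorate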

 

2-Node Cluster Challenges

  • Quorum problems: more than half of the votes cannot be achieved after a failure in a 2-node cluster.
  • Split brain can happen. With fencing enabled, both nodes may try to fence one another.
  • The cluster won't start until all nodes are available. This behaviour can easily be disabled using the wait_for_all parameter.
NOTE:
It is recommended to create a two-node cluster using wait_for_all=0, as shown in the example below. When creating a 2-node cluster, the two_node mode will be enabled in corosync.conf and will automatically be removed if you add more nodes to the cluster.
pcs cluster setup --start --enable --name cluster_name --wait_for_all=0 node1.example.com node2.example.com
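
If you want to double-check what the setup wrote out, you can simply inspect the quorum section of the generated configuration on any cluster node (a quick check, assuming the default configuration path):

# grep -A5 'quorum {' /etc/corosync/corosync.conf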

 

Earlier I shared a step-by-step article to configure a three-node HA cluster. On that same setup I have removed node3.example.com so that I can reuse it to demonstrate this article.
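
If you are reproducing the same starting point from an existing three-node cluster, the extra node can be dropped with pcs. A minimal sketch, using the node name from my earlier setup:

[root@node1 ~]# pcs cluster node remove node3.example.com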

[root@node1 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Wed Dec 26 16:14:02 2018
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          1
Ring ID:          1/368
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1.example.com (local)
         2          1 node2.example.com

Here I need a minimum of two votes to keep my cluster alive and functioning.

 

What if one of my cluster nodes goes down?

Let us manually try to stop one of my cluster nodes:

[root@node2 ~]# pcs cluster stop node2.example.com
Error: Stopping the node(s) will cause a loss of the quorum, use --force to override

Now, since I only have two nodes in my cluster, the tool will not easily allow me to shut down a cluster node; hence I need to use --force:

[root@node2 ~]# pcs cluster stop node2.example.com --force
node2.example.com: Stopping Cluster (pacemaker)...
node2.example.com: Stopping Cluster (corosync)...

So let us now check the status of our cluster

[root@node1 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Wed Dec 26 16:15:54 2018
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          1
Ring ID:          1/372
Quorate:          No

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1.example.com (local)

So, as expected, our cluster is no longer in a quorate state, since the expected votes are higher than the total votes.

So we must do some additional configuration to get a working two-node cluster. With CentOS 6 this could have been achieved using a quorum disk.

 

Why is a quorum disk not possible with a cluster on CentOS 7?

  • The quorum provider in the CentOS 7 cluster stack is corosync.
  • The CentOS 7 cluster stack, as opposed to the CentOS 6 cluster stack, provides only one option to work around the quorum issue, which is a two-node-specific cluster configuration.
  • The CentOS 7 cluster stack lacks the quorum disk workaround option, mainly because of the additional quorum configuration options provided by Corosync version 2.
  • These additional Corosync version 2 options make the quorum disk unnecessary in a two-node or multi-node cluster configuration.
  • The new quorum features of Corosync version 2 are definitely welcome, well thought out, and can replace the need for a quorum disk in every way.

As already mentioned, the quorum provider in the CentOS 7 cluster stack is Corosync version 2. Therefore, the cluster quorum configuration is provided in the corosync.conf configuration file. With the previous Corosync version (version 1), the quorum capabilities were provided by CMAN; with Corosync version 2 included in the CentOS 7 cluster stack, the quorum capabilities are provided by Corosync itself, specifically by the votequorum process.
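
If you are curious about what the votequorum service is actually running with, corosync-cmapctl can dump corosync's in-memory database. A quick check, filtering the output with grep (the exact key names may vary between corosync 2.x releases):

[root@node1 ~]# corosync-cmapctl | grep -i quorum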

 

How to configure a two-node cluster on CentOS / RHEL 7 Linux?

If you are configuring a two-node cluster on the CentOS 7 cluster stack, you should enable the two_node cluster option. Before starting with the configuration changes, stop your cluster services:

[root@node1 ~]# pcs cluster stop --all
node1.example.com: Stopping Cluster (pacemaker)...
node2.example.com: Stopping Cluster (pacemaker)...
node1.example.com: Stopping Cluster (corosync)...
node2.example.com: Stopping Cluster (corosync)...

Next, add the following parameters to corosync.conf under the quorum section:

# vim /etc/corosync/corosync.conf
quorum {
    provider: corosync_votequorum
    two_node: 1
    wait_for_all: 0
}

By enabling the two_node cluster option, the quorum is artificially set to 1, which means that the cluster will be quorate and continue to operate even in the event of a failure of one cluster node.

NOTE:
Enabling the two_node cluster option automatically enables the wait_for_all option as well, which is why wait_for_all is explicitly set to 0 above.
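
Keep in mind that corosync.conf must be identical on both nodes and the cluster has to be started again before checking the status. A minimal sketch using pcs (pcs cluster sync pushes the local corosync.conf to the other nodes, assuming pcsd is running on both of them):

[root@node1 ~]# pcs cluster sync
[root@node1 ~]# pcs cluster start --all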

Let us check the cluster status. As you can see, we now have additional flags enabled for our two-node cluster:

[root@node1 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Wed Dec 26 16:08:19 2018
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          1
Ring ID:          1/356
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           1
Flags:            2Node Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1.example.com (local)
         2          1 node2.example.com

Now let us try to stop one of the cluster nodes:

[root@node1 ~]# pcs cluster stop node2.example.com
node2.example.com: Stopping Cluster (pacemaker)...
node2.example.com: Stopping Cluster (corosync)...

As you can see, this time the tool did not prevent us from stopping the cluster node as it did earlier.

Let us check the quorum status

[root@node1 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Wed Dec 26 16:09:30 2018
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          1
Ring ID:          1/360
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           1
Flags:            2Node Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1.example.com (local)

So our cluster keeps functioning even with only one node active.
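
You can cross-check the same thing from the pcs side as well; on reasonably recent pcs versions the quorum runtime information is also exposed directly (output not shown here):

[root@node1 ~]# pcs status
[root@node1 ~]# pcs quorum status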

 

Let us also understand some other basic terminology associated with the corosync quorum configuration:

  • wait_for_all (default: 0): The general behavior of the votequorum process is to switch from inquorate to quorate as soon as possible; as soon as the majority of nodes are visible to each other, the cluster becomes quorate. The wait_for_all option (WFA) allows you to configure the cluster to become quorate for the first time only after all the nodes have become visible. If the two_node option is enabled, the wait_for_all option is automatically enabled as well.
  • last_man_standing (default: 0) / last_man_standing_window (default: 10 seconds): The general behavior of the votequorum process is to set the expected_votes parameter and quorum at startup. Enabling the last_man_standing option (LMS) allows the cluster to dynamically recalculate the expected_votes parameter and quorum under specific circumstances. It is important to enable the WFA option when using the LMS option in high-availability clusters.
  • auto_tie_breaker (default: 0): When the auto_tie_breaker option (ATB) is enabled, the cluster can survive the failure of up to 50 percent of its nodes at the same time. The cluster partition, or the set of nodes that are still in contact with the node that has the lowest nodeid parameter, will remain quorate; the other nodes will be inquorate. A sketch of a quorum section combining some of these options follows this list.
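
As mentioned at the end of the list above, here is a hypothetical quorum section sketching last_man_standing together with wait_for_all, as recommended for LMS. The window value is in milliseconds and the values are illustrative only, not taken from the two-node setup used in this article:

# vim /etc/corosync/corosync.conf
quorum {
    provider: corosync_votequorum
    last_man_standing: 1
    last_man_standing_window: 10000
    wait_for_all: 1
}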

 

NOTE:
You must always disable fencing in a two-node cluster configuration without the Quorum disk to avoid fence race scenarios, where the two cluster nodes kill each other.
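
If you do follow this note and run your two-node cluster without fencing, stonith can be switched off at the Pacemaker level (keep the usual data-integrity trade-off in mind):

[root@node1 ~]# pcs property set stonith-enabled=false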

 

Lastly, I hope the steps from this article to configure a two-node cluster on Linux (CentOS / RHEL 7) were helpful. Let me know your suggestions and feedback using the comment section.


7 thoughts on “How to install and configure two node cluster in Linux ( CentOS / RHEL 7 )”

  1. On my test machine I set wait_for_all=0. If I turn off one node, the state remains quorate, and the same happens if I turn on only one node. But the resources remain stopped even when the cluster state becomes quorate.

    Resources only start when both nodes are up.

  2. Hi, a small addition:
    I have the same corosync quorum settings following the article and no quorum disk, but I added two fence_ipmilan devices to the cluster. Every time one node is restarted, the other one gets fenced to power off. The stonith action is "off" and fence_ipmilan has the default settings. Is this caused by the fence/stonith
    setting without a quorum disk?

    • Two-node clusters are always a little tricky. wait_for_all requires that a node see all other nodes at least once before becoming quorate. This helps prevent a split-brain scenario in which multiple cluster partitions claim quorum independently of one another. Together, the two_node and wait_for_all options allow one node to maintain quorum if the other node fails. However, if the healthy node reboots or otherwise leaves the cluster and has to rejoin, it cannot form quorum until it sees the failed node. Since the failed node is down, it is necessary to bypass this wait_for_all requirement in order to resume resource management.

      So you may configure your cluster without the wait_for_all parameter, and in terms of fencing, you may add a delay for a certain node so that the fencing device waits before removing a node from the cluster.

      • Hi, thanks for your reply.
        So do you mean that in this case we don't need a separate quorum disk and can still add fence devices to the cluster?
        Does "configure your cluster without the wait_for_all parameter" mean setting wait_for_all=0?
        BR

  3. Hi admin, this is a wonderful article. I followed the steps to set up a 2-node cluster, but now I have an issue and hope you can help.
    I have a 2-node cluster with Red Hat Linux 7.5 and FC SAN shared storage.
    Does a 2-node cluster not need a quorum device to fail over? If so, can I not add fence resources to it either?
    Now I want to use fencing, so must I enable a quorum device for the cluster? If I have a shared storage disk to use as a quorum disk (/dev/mapper/mpatha), how do I configure it in the cluster?

    Do you have such a solution, or could you give more detail about this? Thank you in advance!
    shall

