Before starting, I hope you are familiar with the different cluster types and their architecture. In this article I will explain the steps to configure a two-node cluster on CentOS / RHEL 7 Linux nodes. For the sake of this article I am using Oracle VirtualBox, but you can use any other virtualization platform to test the steps.

How to install and configure a two-node cluster in Linux ( CentOS / RHEL 7 )

 

How is a two-node cluster different from a cluster with three or more nodes?

Quorum is the minimum number of cluster member votes required to perform a cluster operation. Without quorum, the cluster cannot operate. Quorum is achieved when a majority of the cluster members vote to execute a specific cluster operation; if a majority of the cluster members do not vote, the cluster operation is not performed.

In a two-node cluster configuration, the maximum number of expected votes is two, with each cluster node having one vote. In a failure scenario where one of the nodes goes down, only one node is active and it holds only one vote. In such a situation quorum cannot be reached, since a majority of the votes cannot be delivered: the single remaining cluster node is stuck at 50 percent of the votes and will never get past it. Therefore, the cluster cannot operate normally this way.
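To make the arithmetic concrete, here is the standard majority rule that corosync's votequorum applies by default (this is just a worked example of the calculation, not output from any tool):

quorum = floor(expected_votes / 2) + 1

3-node cluster: floor(3/2) + 1 = 2  --> one node can fail and quorum survives
2-node cluster: floor(2/2) + 1 = 2  --> both votes are needed, so a single node failure immediately breaks quorum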

 

2-Node Cluster Challenges

  • Quorum problems: more than half of the votes is not achievable after a node failure in a 2-node cluster.
  • Split brain can happen: with fencing enabled, both nodes will try to fence one another.
  • The cluster won’t start until all nodes are available. This behavior can easily be disabled using the wait_for_all parameter.
NOTE:
It is recommended to create a two-node cluster with wait_for_all=0, as shown in the example below. When you create a 2-node cluster, the two_node mode is enabled in corosync.conf and automatically disappears if you later add more nodes to your cluster.
pcs cluster setup --start --enable --name cluster_name --wait_for_all=0 node1.example.com node2.example.com
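If you create the cluster with a command like the one above, you can verify that the two_node mode really got enabled by checking the quorum section of the generated corosync.conf (the file path and node names are the ones used throughout this article):

[root@node1 ~]# grep -A 4 '^quorum' /etc/corosync/corosync.conf
# you should see two_node: 1 (and wait_for_all: 0 if you passed --wait_for_all=0)
# listed inside the quorum { } block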

 

Earlier I had shared a step-by-step article to configure a three-node HA Cluster. For this article I have removed node3.example.com from that setup, so the same environment is used here to demonstrate a two-node cluster.

[root@node1 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Wed Dec 26 16:14:02 2018
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          1
Ring ID:          1/368
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1.example.com (local)
         2          1 node2.example.com

Here I need a minimum of two votes to keep my cluster alive and functioning.

 

What if one of my cluster nodes goes down?

Let us manually try to stop one of the cluster nodes:

[root@node2 ~]# pcs cluster stop node2.example.com
Error: Stopping the node(s) will cause a loss of the quorum, use --force to override

Since I only have two nodes in my cluster, pcs will not easily allow me to shut down a cluster node, because doing so would cause a loss of quorum. Hence I need to use --force:

[root@node2 ~]# pcs cluster stop node2.example.com --force
node2.example.com: Stopping Cluster (pacemaker)...
node2.example.com: Stopping Cluster (corosync)...

So let us now check the status of our cluster

[root@node1 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Wed Dec 26 16:15:54 2018
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          1
Ring ID:          1/372
Quorate:          No

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1.example.com (local)

So, as expected, our cluster is no longer in a quorate state, since the expected votes are higher than the total votes.
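Before moving on, I start node2.example.com again so that both nodes are back in the cluster; the pcs cluster stop --all output in the configuration section below assumes that both nodes are running:

[root@node1 ~]# pcs cluster start node2.example.com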

So we must do some additional configuration to run a two-node cluster. With CentOS 6 this could have been achieved using a Quorum Disk.

 

Why is a quorum disk not possible with a cluster on CentOS 7?

  • The quorum provider in the CentOS 7 cluster stack is corosync.
  • The CentOS 7 cluster stack, as opposed to the CentOS 6 cluster stack, provides only one option to work around the quorum issue, which is a two-node-specific cluster configuration.
  • The CentOS 7 cluster stack lacks the Quorum disk workaround option, mainly because of the additional quorum configuration options provided by Corosync version 2.
  • These additional Corosync version 2 options actually make the Quorum disk unnecessary in a two node or multinode cluster configuration.
  • The new Quorum features of Corosync version 2 are definitely welcome, are well thought out, and can replace the need for a Quorum disk in every way.

As already mentioned, the quorum provider in the CentOS 7 cluster stack is Corosync version 2. Therefore, the cluster quorum configuration is provided in the corosync.conf configuration file. With the previous Corosync version (version 1), the quorum capabilities were provided by CMAN; with Corosync version 2 included in the CentOS 7 cluster stack, the quorum capabilities are provided by Corosync itself, specifically by the votequorum process.
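If you want to confirm on a running cluster that votequorum is the active quorum provider, you can also query corosync's runtime object database with corosync-cmapctl; the exact keys shown depend on your corosync 2.x version, so take this only as a generic example:

[root@node1 ~]# corosync-cmapctl | grep quorum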

 

How to configure a two-node cluster on CentOS / RHEL 7 Linux?

If you are configuring a two-node cluster on the CentOS 7 cluster stack, you should enable the two_node cluster option. Before starting with the configuration changes, stop your cluster services:

[root@node1 ~]# pcs cluster stop --all
node1.example.com: Stopping Cluster (pacemaker)...
node2.example.com: Stopping Cluster (pacemaker)...
node1.example.com: Stopping Cluster (corosync)...
node2.example.com: Stopping Cluster (corosync)...

Next, add the following parameters to the quorum section of corosync.conf:

# vim /etc/corosync/corosync.conf
quorum {
    provider: corosync_votequorum
    two_node: 1
    wait_for_all: 0
}

By enabling the two_node cluster option, the quorum is artificially set to 1, which means that the cluster will remain quorate and continue to operate even if one cluster node fails.

NOTE:
Enabling the two_node cluster option automatically enables the additional wait_for_all option as well.
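Since corosync.conf was edited manually on node1, the same change has to reach node2 before the cluster is started again. One way to do this (a sketch, assuming pcsd is running on both nodes as in the earlier steps; a plain scp of the file would work too) is to push the configuration with pcs and then start both nodes:

[root@node1 ~]# pcs cluster sync
[root@node1 ~]# pcs cluster start --all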

Let us check the cluster status. As you can see, additional flags are now enabled for our two-node cluster:

[root@node1 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Wed Dec 26 16:08:19 2018
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          1
Ring ID:          1/356
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           1
Flags:            2Node Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1.example.com (local)
         2          1 node2.example.com

Now let us try to stop one of the cluster nodes:

[root@node1 ~]# pcs cluster stop node2.example.com
node2.example.com: Stopping Cluster (pacemaker)...
node2.example.com: Stopping Cluster (corosync)...

As you can see, this time the tool did not prevent us from stopping the cluster node as it did earlier.

Let us check the quorum status

[root@node1 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Wed Dec 26 16:09:30 2018
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          1
Ring ID:          1/360
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           1
Flags:            2Node Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1.example.com (local)

So our cluster keeps functioning even with only one node active.
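You can cross-check this from pacemaker's side as well; pcs status on the surviving node should still report it as part of a partition with quorum (the full output is omitted here because it depends on the resources configured in your cluster):

[root@node1 ~]# pcs status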

 

Let us also understand some other basic terminology associated with the corosync configuration:

  • wait_for_all (default: 0): The general behavior of the votequorum process is to switch from inquorate to quorate as soon as possible. As soon as the majority of nodes are visible to each other, the cluster becomes quorate. The wait_for_all option, or WFA, allows you to configure the cluster to become quorate for the first time only after all the nodes have become visible. If the two_node option is enabled, the wait_for_all option is automatically enabled as well.
  • last_man_standing (default: 0) / last_man_standing_window (default: 10 seconds): The general behavior of the votequorum process is to set the expected_votes parameter and quorum at startup. Enabling the last_man_standing option, or LMS, allows the cluster to dynamically recalculate the expected_votes parameter and quorum under specific circumstances. It is important to enable the WFA option when using the LMS option in high-availability clusters (see the example quorum section after this list).
  • auto_tie_breaker (default: 0): When the auto_tie_breaker option, or ATB, is enabled, the cluster can suffer the failure of up to 50 percent of the nodes at the same time. The cluster partition, or the set of nodes that are still in contact with the node that has the lowest nodeid parameter, will remain quorate. The other nodes will be inquorate.
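For reference, here is what a quorum section enabling WFA together with LMS might look like on a cluster with more than two nodes. This is only an illustrative sketch and not the configuration used in this article; last_man_standing_window is given in milliseconds, and 10000 (10 seconds) is simply the default value:

quorum {
    provider: corosync_votequorum
    wait_for_all: 1
    last_man_standing: 1
    last_man_standing_window: 10000
}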

 

NOTE:
You must always disable fencing in a two-node cluster configuration without the Quorum disk to avoid fence race scenarios, where the two cluster nodes kill each other.
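If you follow this advice and run the two-node cluster without fencing, one way to switch it off at the pacemaker level (shown only as a sketch; keep the trade-offs of running without fencing in mind) is:

[root@node1 ~]# pcs property set stonith-enabled=false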

 

Lastly, I hope the steps in this article to configure a two-node cluster on Linux ( CentOS / RHEL 7 ) were helpful. Let me know your suggestions and feedback using the comment section.
