Setting up Corosync and Pacemaker

tested on Slackware 14.2 and current

Introduction

Although you could still plug Pacemaker into Heartbeat instead, Heartbeat has been superseded by Corosync.

Alternatively, you might prefer the old-school approach and do it all with full-fledged Heartbeat.

Packages

https://slackbuilds.org/repository/14.2/libraries/libqb/

https://slackbuilds.org/repository/14.2/system/corosync/

https://slackbuilds.org/repository/14.2/system/pacemaker/

https://slackbuilds.org/repository/14.2/system/resource-agents/

https://slackbuilds.org/repository/14.2/system/fence-agents/

https://slackbuilds.org/repository/14.2/system/cluster-glue/

You will also need the following, and possibly a few Python modules, for crmsh to run properly (they are not needed to build it).

https://slackbuilds.org/repository/14.2/development/check/

https://slackbuilds.org/repository/14.2/system/crmsh/

Installation

ON EVERY NODE

Corosync

ls -lF /var/log/packages/check-*
ls -lF /var/log/packages/libqb-*
ls -lF /var/log/packages/corosync-*

removepkg check
removepkg libqb
removepkg corosync

#sbopkg -i check
#sbopkg -i libqb
#sbopkg -i corosync

slackpkg install check #sbo/slackonly
slackpkg install libqb #sbo/slackonly
slackpkg install corosync #sbo/slackonly

corosync -v

Pacemaker

grep ^hac /etc/group
grep ^hac /etc/passwd

groupadd -g 226 haclient
useradd -u 226 -g 226 -c "heartbeat" -d / -s /bin/false hacluster
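
The check-then-create sequence above can be folded into one helper that only prints the commands still needed, so it is safe to re-run. This is a minimal sketch: the function name ha_accounts_needed and its two optional file arguments are assumptions made for illustration, while the names and IDs are the ones used above.

```shell
# Print the account-creation commands that are still needed on this node.
# Both arguments are optional and exist only so the check can be tried
# against copies of the files (hypothetical helper, not part of Pacemaker).
ha_accounts_needed() {
    group_file=${1:-/etc/group}
    passwd_file=${2:-/etc/passwd}
    grep -q '^haclient:' "$group_file" ||
        echo 'groupadd -g 226 haclient'
    grep -q '^hacluster:' "$passwd_file" ||
        echo 'useradd -u 226 -g 226 -c "heartbeat" -d / -s /bin/false hacluster'
    return 0
}

ha_accounts_needed
```

No output means both the group and the user already exist.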

ls -lF /var/log/packages/gmp-* #official

ls -lF /var/log/packages/pacemaker-*
removepkg pacemaker
#sbopkg -i pacemaker
slackpkg install pacemaker #sbo/slackonly

pacemakerd --version

Resource Agents

ls -l /var/log/packages/resource-agents*
#sbopkg -i resource-agents
installpkg resource-agents-*_SBo.tgz 

To test the resource executables manually, you first have to define OCF_ROOT and possibly OCF_RESKEY_<param>. Alternatively, this symlink works around the need for OCF_ROOT,

ls -l /lib/heartbeat
ln -s ../usr/lib/ocf/lib/heartbeat /lib/heartbeat

but you will still have to define the OCF_RESKEY_<param> variables anyway.

#export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_<param>=...

/usr/lib/ocf/resource.d/heartbeat/IPaddr2
/usr/lib/ocf/resource.d/heartbeat/iSCSILogicalUnit
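
Exporting the variables and calling the agent can be wrapped in one small function. The following is only a sketch: run_ra is a hypothetical helper name, the address in the example is a placeholder, and it assumes the agent lives under resource.d/heartbeat as the two paths above do.

```shell
# Run one action of a heartbeat-class OCF agent with its parameters exported.
# Usage: run_ra AGENT ACTION [param=value ...]
# run_ra is a hypothetical helper; OCF_ROOT defaults to the path used above.
run_ra() {
    agent=$1
    action=$2
    shift 2
    (
        # Subshell: the exported variables do not leak into the caller.
        export OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"
        for kv in "$@"; do
            # ip=192.0.2.10 becomes OCF_RESKEY_ip=192.0.2.10
            export "OCF_RESKEY_${kv%%=*}=${kv#*=}"
        done
        "$OCF_ROOT/resource.d/heartbeat/$agent" "$action"
    )
}

# Example (placeholder address): run_ra IPaddr2 monitor ip=192.0.2.10
```

The exit status of the function is the exit status of the agent, so it can be checked with $? right after the call.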

Check and fix the Xen resource,

ls -l /usr/local/sbin/xl
ls -l /usr/lib/ocf/resource.d/heartbeat/Xen

cd /usr/lib/ocf/resource.d/heartbeat
cp -pi Xen Xen.dist
vi Xen

xentool=/usr/local/sbin/xl

CRMSH v2

CRMSH v2 is provided by a different maintainer at SBo,

ls -lF /var/log/packages/pip-*
ls -lF /var/log/packages/crmsh-*

sbopkg -i pip
#pip install --upgrade pip
#pip uninstall pika
#pip install 'pika<0.11,>=0.9'

removepkg crmsh
sbopkg -i crmsh

For CRMSH to run (it builds fine without these), you will also have to install,

sbopkg -i BeautifulSoup4
sbopkg -i html5lib
sbopkg -i lxml

pip install modules
sbopkg -i pexpect
sbopkg -i python-requests
pip install parallax
sbopkg -i PyYAML

CRMSH v3

If using CRMSH v3, you might also need,

sbopkg -i six
sbopkg -i setuptools-scm
sbopkg -i python-dateutil
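
Since crmsh builds without these modules but fails at run time, it can help to test the imports directly. This is a rough sketch: missing_py_modules is a hypothetical helper, and the import names in the example (bs4 for BeautifulSoup4, requests for python-requests, yaml for PyYAML, dateutil for python-dateutil) are the usual ones for those packages. Pass whichever interpreter crmsh actually runs under.

```shell
# Print every module from the argument list that the given interpreter
# cannot import. missing_py_modules is a hypothetical helper.
# Usage: missing_py_modules INTERPRETER module...
missing_py_modules() {
    py=$1
    shift
    for m in "$@"; do
        "$py" -c "import $m" >/dev/null 2>&1 || echo "$m"
    done
    return 0
}

# Example:
#   missing_py_modules python bs4 html5lib lxml pexpect requests parallax yaml six dateutil
```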

Fence Agents

ls -l /var/log/packages/fence-agents*
#sbopkg -i fence-agents
installpkg fence-agents-*_SBo.tgz

Cluster Glue (optional)

grep ^haclient /etc/group
grep ^hacluster /etc/passwd

ls -l /var/log/packages/cluster-glue*
#sbopkg -i cluster-glue
installpkg cluster-glue-*_SBo.tgz

Cluster Setup

ON EVERY NODE

Set the nodeid (a number) of every node to your liking, e.g. 1, 2, 3. If you do not set the nodeid, Pacemaker will work but complain in the logs, which are noisy enough already. It will also generate the node identifiers from the nodes' IP addresses. You can still grab them from the logs and add them to the configuration retroactively.

Do not use bindnetaddr when a nodelist is defined.

Set the quorum provider; this setup is meant for at least three cluster nodes.
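
The three-node minimum follows from simple arithmetic. Assuming one vote per node, quorum is a strict majority of the votes, so a two-node cluster cannot lose a node without losing quorum. The function names below are made up for this sketch.

```shell
# Votes needed for quorum with one vote per node: a strict majority.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# Nodes that can fail while the remaining ones still hold quorum.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 5; do
    echo "$n nodes: quorum=$(quorum_votes "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With three nodes the cluster needs two votes and survives the loss of one node; with two nodes it needs both.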

cd /etc/corosync/
sed '/^[[:space:]]*#/d' corosync.conf.example > corosync.conf
vi corosync.conf

totem {
    interface {
        #bindnetaddr: 192.168.1.0
    }
}

nodelist {
    node {
        ring0_addr: x.x.x.x
        nodeid: 1
        name: node1
    }
    node {
        ring0_addr: x.x.x.x
        nodeid: 2
        name: node2
    }
    node {
        ring0_addr: x.x.x.x
        nodeid: 3
        name: node3
    }
}

quorum {
    provider: corosync_votequorum
}
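
Because the same nodelist has to be present on every node, generating it from a list of names and addresses avoids copy-and-paste mistakes. This is a sketch only: gen_nodelist is a hypothetical helper, the addresses are placeholders, and the output still has to be merged into corosync.conf by hand.

```shell
# Emit a corosync nodelist block from "name address" pairs, numbering the
# nodes 1, 2, 3... in the order given. gen_nodelist is a hypothetical helper.
gen_nodelist() {
    id=0
    echo 'nodelist {'
    while [ "$#" -ge 2 ]; do
        id=$((id + 1))
        printf '    node {\n'
        printf '        ring0_addr: %s\n' "$2"
        printf '        nodeid: %d\n' "$id"
        printf '        name: %s\n' "$1"
        printf '    }\n'
        shift 2
    done
    echo '}'
}

# The addresses are placeholders from the documentation range.
gen_nodelist node1 192.0.2.11 node2 192.0.2.12 node3 192.0.2.13
```

Keep the argument order identical on all nodes so that each node gets the same nodeid everywhere.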

Double-check

ls -l /var/log/packages/corosync*
ls -l /var/log/packages/pacemaker*
ls -l /var/log/packages/resource-agents*
ls -l /var/log/packages/crmsh*
ls -l /var/log/packages/fence-agents*
ls -l /var/log/packages/cluster-glue*

pacemakerd --version
cibadmin -!

Pacemaker 2.0.0 (Build: 63ff11d357):  generated-manpages agent-manpages ascii-docs ncurses libqb-logging libqb-ipc lha-fencing nagios  corosync-native atomic-attrd acls
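
The feature list in that output can be checked from a script, for instance to confirm that the build has native Corosync support. A minimal sketch, with has_feature being a made-up helper name that simply matches whole words on standard input:

```shell
# Succeed if the text on stdin contains the given feature as a whole word.
has_feature() {
    tr -s ' \t' '\n' | grep -x -- "$1" >/dev/null
}

# Example:
#   cibadmin -! | has_feature corosync-native || echo "no native corosync support"
```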

ls -l /etc/corosync/corosync.conf
ls -ldF /usr/lib/ocf/resource.d/heartbeat
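
The package checks can also be done in one loop that reports only what is missing. This sketch assumes the usual Slackware entry naming of name-version-arch-build in /var/log/packages; pkg_installed is a hypothetical helper, not a pkgtools command.

```shell
# Succeed if a package with exactly this name is recorded in the package
# database. Entries are named name-version-arch-build, so the last three
# dash-separated fields are stripped before comparing.
pkg_installed() {
    name=$1
    dir=${2:-/var/log/packages}
    for f in "$dir/$name"-*; do
        [ -e "$f" ] || continue
        base=${f##*/}
        base=${base%-*}   # drop build
        base=${base%-*}   # drop arch
        base=${base%-*}   # drop version
        [ "$base" = "$name" ] && return 0
    done
    return 1
}

for p in libqb corosync pacemaker resource-agents crmsh fence-agents cluster-glue; do
    pkg_installed "$p" || echo "missing: $p"
done
```

Unlike a plain glob, the exact-name comparison does not mistake a package such as check-something for check.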

Ready to go

In case the permissions are messed up,

#touch /var/log/pacemaker/pacemaker.log
#chown hacluster:hacluster /var/log/pacemaker/pacemaker.log
#chmod o+x /var/log/pacemaker/
#chown -R hacluster:haclient /var/lib/pacemaker
#chown -R hacluster:haclient /var/log/pacemaker

Get all the logs

cat > /root/log <<-'EOF'
#!/bin/sh
tail -n0 -F /var/log/* /var/log/xen/* /var/log/cluster/* /var/log/pacemaker/*
EOF
chmod +x /root/log
/root/log

and proceed

/etc/init.d/pacemaker stop
/etc/init.d/corosync stop

/etc/init.d/corosync start
/etc/init.d/corosync status

/etc/init.d/pacemaker start
/etc/init.d/pacemaker status

#/etc/init.d/logd start
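
The start order above (Corosync first, then Pacemaker) can be scripted so that Pacemaker is only started once Corosync reports a good status. This is a sketch under assumptions: start_stack is a hypothetical helper, the ten-second limit is arbitrary, and it relies on the init scripts returning a non-zero status on failure.

```shell
# Start corosync, wait until its status check succeeds, then start pacemaker.
# The optional argument is the directory holding the init scripts.
start_stack() {
    initdir=${1:-/etc/init.d}
    "$initdir/corosync" start || return 1
    tries=0
    until "$initdir/corosync" status >/dev/null 2>&1; do
        tries=$((tries + 1))
        # Give up after about ten seconds (arbitrary limit).
        [ "$tries" -ge 10 ] && return 1
        sleep 1
    done
    "$initdir/pacemaker" start
}

# Example: start_stack || echo "cluster stack did not come up"
```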

Optionally, enable at boot time,

cat >> /etc/rc.d/rc.local <<-EOF
/etc/init.d/corosync start
/etc/init.d/pacemaker start
EOF
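
Running that append a second time would add the two lines to rc.local again. A small guard avoids the duplicates; append_once is a made-up helper name for this sketch.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example:
#   append_once /etc/rc.d/rc.local '/etc/init.d/corosync start'
#   append_once /etc/rc.d/rc.local '/etc/init.d/pacemaker start'
```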

Troubleshooting

while compiling Pacemaker

logsys_qb_init: Assertion `"implicit callsite section is populated, otherwise target's build is at fault, preventing reliable logging" && QB_ATTR_SECTION_START != QB_ATTR_SECTION_STOP' failed.

[ClusterLabs] Upgrade corosync problem https://lists.clusterlabs.org/pipermail/users/2018-July/015328.html

Re: [ClusterLabs] Upgrade corosync problem https://www.mail-archive.com/users@clusterlabs.org/msg07277.html

# bash -c 'paste <(ld --version) <(ld.bfd --version) | head -n1'
GNU ld version 2.26.20160125    GNU ld version 2.26.20160125

# readelf -s `which corosync` | grep ___verbose
41: 00000000006f64c0    40 OBJECT  GLOBAL DEFAULT bad section index[ 35] __start___verbose
62: 00000000006f64c0    40 OBJECT  GLOBAL DEFAULT bad section index[ 35] __stop___verbose
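
The readelf check can be turned into a filter that lists only the affected symbols, which makes it easy to run against several binaries. A minimal sketch, with bad_section_symbols being a hypothetical name; it only looks for the "bad section index" text shown above and makes no claim about the cause.

```shell
# Read `readelf -s` output on stdin and print the names of the symbols whose
# section index is reported as bad.
bad_section_symbols() {
    awk '/bad section index/ { print $NF }'
}

# Example: readelf -s "$(command -v corosync)" | bad_section_symbols
```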

Resources

SourceInstall https://wiki.clusterlabs.org/wiki/SourceInstall

https://www.howtoforge.com/installation-and-setup-guide-for-drbd-openais-pacemaker-xen-on-opensuse-11.1

https://www.digitalocean.com/community/tutorials/how-to-create-a-high-availability-setup-with-corosync-pacemaker-and-floating-ips-on-ubuntu-14-04

https://www.cloudera.com/documentation/enterprise/5-8-x/topics/admin_cm_ha_failover.html

https://wiki.clusterlabs.org/wiki/FAQ

https://wiki.clusterlabs.org/wiki/Debian_Lenny_HowTo

homepages

Corosync https://github.com/corosync/corosync http://corosync.github.io/corosync/

Pacemaker https://github.com/ClusterLabs/pacemaker/ https://wiki.clusterlabs.org/wiki/Pacemaker

Resource Agents https://github.com/ClusterLabs/resource-agents

CRMSH https://github.com/ClusterLabs/crmsh http://crmsh.github.io

Fence Agents https://github.com/ClusterLabs/fence-agents

(optional) Cluster Glue http://www.linux-ha.org/wiki/Cluster_Glue

corosync

http://corosync.github.io/corosync/

Why is UDP unicast (UDPU) not recommended for use in a cluster with GFS2 in a RHEL 6 or 7 Resilient Storage cluster? https://access.redhat.com/solutions/162193

https://www.suse.com/documentation/sle-ha-12/book_sleha/data/sec_ha_manual_config_crm_corosync.html

pacemaker

https://wiki.clusterlabs.org/wiki/Pacemaker

https://wiki.clusterlabs.org/wiki/Releases

https://wiki.clusterlabs.org/wiki/Install

https://clusterlabs.org/pacemaker/man/

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-node-name.html

with-stonith https://gist.github.com/sammcj/9a8be565b29032bc2a9e

crmsh

http://crmsh.github.io/

https://github.com/ClusterLabs/crmsh/issues/83

resources

Debugging Resource Failures https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures

http://clusterlabs.org/pacemaker/doc/

http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-resource-supported.html

http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/primitive-resource.html

http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_resource_instance_attributes.html

Chapter 10. Advanced Resource Types https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch10.html

Chapter 11. Reusing Parts of the Configuration https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch11.html

Appendix B. More About OCF Resource Agents http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ap-ocf.html

https://wiki.clusterlabs.org/wiki/Example_configurations

https://geekpeek.net/linux-cluster-resources/

https://www.unixmen.com/adding-deleting-cluster-resources-corosync-pacemaker-2/

https://www.suse.com/documentation/sle_ha/book_sleha/data/sec_ha_config_crm_resources.html

glue

https://github.com/ClusterLabs/cluster-glue

cluster-glue bug https://pastebin.com/4xm1NF40

stonith

http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_configuring_stonith.html

troubles

http://lists.corosync.org/pipermail/discuss/2014-September/003317.html

https://lists.clusterlabs.org/pipermail/users/2015-March/000069.html

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-node-name.html

https://serverfault.com/questions/236325/linux-ha-cluster-w-xen-heartbeat-pacemaker-domu-does-not-failover-to-secondar

tosort

https://clusterlabs.org/components.html

http://linux-ha.org/doc/man-pages/re-ra-Xen.html

https://oss.clusterlabs.org/pipermail/pacemaker/2012-January/012808.html

https://publications.jbfavre.org/virtualisation/cluster-xen-corosync-pacemaker-drbd-ocfs2.en


Copyright © 2024 Pierre-Philipp Braun