Setting up Corosync and Pacemaker

Tested on Slackware 14.2

Homepages

Although you could still use Heartbeat and plug Pacemaker into it, Heartbeat has been superseded by Corosync.

Packages

You will also need these, and possibly a few Python modules that crmsh requires at run time (not at build time).

Installation

ON EVERY NODE

Corosync

ls -lF /var/log/packages/check-*
ls -lF /var/log/packages/libqb-*
ls -lF /var/log/packages/corosync-*

removepkg check
removepkg libqb
removepkg corosync

#sbopkg -i check
#sbopkg -i libqb
#sbopkg -i corosync

slackpkg install check #sbo/slackonly
slackpkg install libqb #sbo/slackonly
slackpkg install corosync #sbo/slackonly

corosync -v

gives

Corosync Cluster Engine, version '3.0.2'
Copyright (c) 2006-2018 Red Hat, Inc.

Pacemaker

grep ^hac /etc/group
grep ^hac /etc/passwd

groupadd -g 226 haclient
useradd -u 226 -g 226 -c "Pacemaker" -d / -s /bin/false hacluster
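
The two grep checks and the account creation above can be combined so that the accounts are only created when they are missing. This is a minimal sketch; has_entry is a hypothetical helper name, not something shipped with Pacemaker.

```shell
# has_entry is a hypothetical helper: it succeeds if <name> is the first
# colon-separated field of some line in a passwd- or group-style <file>.
# Usage: has_entry <name> <file>
has_entry() {
    grep -q "^$1:" "$2"
}

# Intended use on a node, as root (left commented so that pasting this
# block does not modify the system):
# has_entry haclient /etc/group   || groupadd -g 226 haclient
# has_entry hacluster /etc/passwd || useradd -u 226 -g 226 -c "Pacemaker" -d / -s /bin/false hacluster
```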

ls -lF /var/log/packages/gmp-* #official

ls -lF /var/log/packages/pacemaker-*
removepkg pacemaker
#sbopkg -i pacemaker
slackpkg install pacemaker #sbo/slackonly

pacemakerd --version

gives

Pacemaker 2.0.2
Written by Andrew Beekhof

Resource Agents

ls -l /var/log/packages/resource-agents*
#sbopkg -i resource-agents
installpkg resource-agents-*_SBo.tgz 

To test the resource agents manually, you first have to define OCF_ROOT and possibly OCF_RESKEY_<param>. Alternatively, this symlink gets around OCF_ROOT,

ls -l /lib/heartbeat
ln -s ../usr/lib/ocf/lib/heartbeat /lib/heartbeat

but you will still have to define the OCF_RESKEY_<param> variables anyway.

#export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_<param>=...

/usr/lib/ocf/resource.d/heartbeat/IPaddr2
/usr/lib/ocf/resource.d/heartbeat/iSCSILogicalUnit
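
To avoid leaving the OCF variables exported in your shell after a manual test, the agent can be called from a small wrapper. This is only a sketch: run_agent is a hypothetical name, and the parameter names you pass must be the ones the agent actually documents.

```shell
# run_agent is a hypothetical wrapper, not part of resource-agents.
# It exports OCF_ROOT and one OCF_RESKEY_<param> per argument inside a
# subshell, then calls the agent with the given action.
# Usage: run_agent <agent> <action> [<param>=<value> ...]
run_agent() {
    agent=$1
    action=$2
    shift 2
    (
        # Keep a caller-supplied OCF_ROOT, else use the path from this page
        export OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"
        for kv in "$@"; do
            export "OCF_RESKEY_${kv%%=*}=${kv#*=}"
        done
        "$OCF_ROOT/resource.d/heartbeat/$agent" "$action"
    )
}
```

For example, run_agent IPaddr2 meta-data prints the agent's metadata, including the parameters it accepts. The exit status is that of the agent itself.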

Check and fix the Xen resource,

ls -l /usr/local/sbin/xl
ls -l /usr/lib/ocf/resource.d/heartbeat/Xen

cd /usr/lib/ocf/resource.d/heartbeat
cp -pi Xen Xen.dist
vi Xen

xentool=/usr/local/sbin/xl

Debugging Resource Failures https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures

CRMSH v2

CRMSH v2 is made available by someone else at SBo,

ls -lF /var/log/packages/pip-*
ls -lF /var/log/packages/crmsh-*

sbopkg -i pip
#pip install --upgrade pip
#pip uninstall pika
#pip install 'pika<0.11,>=0.9'

removepkg crmsh
sbopkg -i crmsh

For CRMSH to run (it builds fine without them), you will also have to install,

sbopkg -i BeautifulSoup4
sbopkg -i html5lib
sbopkg -i lxml

pip install modules
sbopkg -i pexpect
sbopkg -i python-requests
pip install parallax
sbopkg -i PyYAML
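
Since these modules are only needed at run time, a quick import test shows which ones are still missing. This is a sketch: missing_modules is a hypothetical helper, and the import names in the usage note (bs4 for BeautifulSoup4, requests for python-requests, yaml for PyYAML) are the usual ones but should be checked against your packages.

```shell
# missing_modules is a hypothetical helper; it prints each module name
# that the given Python interpreter fails to import.
# Usage: missing_modules <python> <module> [<module> ...]
missing_modules() {
    py=$1
    shift
    for mod in "$@"; do
        "$py" -c "import $mod" >/dev/null 2>&1 || echo "$mod"
    done
}
```

For example: missing_modules python bs4 html5lib lxml pexpect requests parallax yaml. No output means every module was imported successfully.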

CRMSH v3

and if using CRMSH v3 you might also need,

sbopkg -i six
sbopkg -i setuptools-scm
sbopkg -i python-dateutil

Fence Agents

ls -l /var/log/packages/fence-agents*
#sbopkg -i fence-agents
installpkg fence-agents-*_SBo.tgz

Cluster Glue (optional)

grep ^haclient /etc/group
grep ^hacluster /etc/passwd

ls -l /var/log/packages/cluster-glue*
#sbopkg -i cluster-glue
installpkg cluster-glue-*_SBo.tgz

Cluster Setup

ON EVERY NODE

Set the nodeid (a number) for every node as you see fit, e.g. 1, 2, 3. If you do not set the nodeid, Pacemaker will still work but will complain in the logs, and those are noisy enough already. The node identifiers will then be generated from the nodes' IP addresses. You can still grab them from the logs and add them to the configuration afterwards.

The bindnetaddr should not be used when defining nodelist.

Set up the quorum provider as well; this configuration is meant for at least three cluster nodes.
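
The reason for three nodes is simple arithmetic. The sketch below assumes the votequorum defaults (one vote per node, no two_node or similar options), where the cluster is quorate only with more than half of the votes. The function names are hypothetical.

```shell
# quorum_votes is a hypothetical helper: it prints the number of votes
# needed for a strict majority among <total> one-vote nodes.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# failures_tolerated prints how many nodes can be lost before the
# remaining ones no longer hold a majority.
failures_tolerated() {
    echo $(( $1 - ($1 / 2 + 1) ))
}
```

Under these assumptions two nodes need both votes and tolerate no failure, whereas three nodes need two votes and tolerate the loss of one.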

cd /etc/corosync/
sed '/^[[:space:]]*#/d' corosync.conf.example > corosync.conf
vi corosync.conf

totem {
    interface {
        #bindnetaddr: 192.168.1.0
    }
}

nodelist {
    node {
        ring0_addr: x.x.x.x
        nodeid: 1
        name: node1
    }
    node {
        ring0_addr: x.x.x.x
        nodeid: 2
        name: node2
    }
    node {
        ring0_addr: x.x.x.x
        nodeid: 3
        name: node3
    }
}

quorum {
    provider: corosync_votequorum
}
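
Because the nodelist must be identical on every node, it can help to generate it once and paste the result everywhere. This is a minimal sketch: gen_nodelist is a hypothetical helper, and it simply numbers the nodes 1, 2, 3, ... in argument order.

```shell
# gen_nodelist is a hypothetical helper; it prints a nodelist block with
# node IDs assigned sequentially in argument order.
# Usage: gen_nodelist <name>=<address> [<name>=<address> ...]
gen_nodelist() {
    id=1
    echo "nodelist {"
    for pair in "$@"; do
        printf '    node {\n'
        printf '        ring0_addr: %s\n' "${pair#*=}"
        printf '        nodeid: %d\n' "$id"
        printf '        name: %s\n' "${pair%%=*}"
        printf '    }\n'
        id=$((id + 1))
    done
    echo "}"
}
```

The output goes to standard output, so it can be reviewed before being merged into corosync.conf by hand.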

Double-check

ls -l /var/log/packages/corosync*
ls -l /var/log/packages/pacemaker*
ls -l /var/log/packages/resource-agents*
ls -l /var/log/packages/crmsh*
ls -l /var/log/packages/fence-agents*
ls -l /var/log/packages/cluster-glue*

pacemakerd --version
cibadmin -!

Pacemaker 2.0.0 (Build: 63ff11d357):  generated-manpages agent-manpages ascii-docs ncurses libqb-logging libqb-ipc lha-fencing nagios  corosync-native atomic-attrd acls

ls -l /etc/corosync/corosync.conf
ls -ldF /usr/lib/ocf/resource.d/heartbeat
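
The package checks above can also be done in one pass that only reports what is missing. This is a sketch: missing_pkgs is a hypothetical helper, and it assumes package database entries are named <name>-<version>-... with a version that starts with a digit.

```shell
# missing_pkgs is a hypothetical helper; it prints each package name that
# has no matching entry in the package database directory.
# Usage: missing_pkgs <pkgdir> <name> [<name> ...]
missing_pkgs() {
    dir=$1
    shift
    for name in "$@"; do
        # Assumption: entries look like <name>-<version>-<arch>-<build>
        ls "$dir"/"$name"-[0-9]* >/dev/null 2>&1 || echo "$name"
    done
}
```

For example: missing_pkgs /var/log/packages corosync pacemaker resource-agents crmsh fence-agents cluster-glue. No output means all of them are installed.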

Ready to go

In case the permissions are messed up,

#touch /var/log/pacemaker/pacemaker.log
#chown hacluster:haclient /var/log/pacemaker/pacemaker.log
#chmod o+x /var/log/pacemaker/
#chown -R hacluster:haclient /var/lib/pacemaker
#chown -R hacluster:haclient /var/log/pacemaker

Get all the logs

cat > /root/log <<-EOF
tail -n0 -F /var/log/* /var/log/xen/* /var/log/cluster/* /var/log/pacemaker/*
EOF
chmod +x /root/log
/root/log

and proceed

/etc/init.d/pacemaker stop
/etc/init.d/corosync stop

/etc/init.d/corosync start
/etc/init.d/corosync status

/etc/init.d/pacemaker start
/etc/init.d/pacemaker status

#/etc/init.d/logd start
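
The order matters here: Pacemaker is stopped before Corosync and started after it. The sequence above can be wrapped in a function so that the order is never mixed up. This is a sketch only; cluster_restart is a hypothetical name, and the init script directory is a parameter that defaults to the path used on this page.

```shell
# cluster_restart is a hypothetical helper that replays the sequence
# above: stop pacemaker, stop corosync, start corosync, start pacemaker.
# Usage: cluster_restart [<initdir>]
cluster_restart() {
    initdir=${1:-/etc/init.d}
    # Stop failures are ignored: the daemons may simply not be running
    "$initdir/pacemaker" stop
    "$initdir/corosync" stop
    # Do not start pacemaker if corosync failed to start
    "$initdir/corosync" start || return 1
    "$initdir/pacemaker" start
}
```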

Optionally, enable both services at boot time,

cat >> /etc/rc.d/rc.local <<-EOF
/etc/rc.d/init.d/corosync start
/etc/rc.d/init.d/pacemaker start
EOF
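
Note that running the append above twice adds the lines to rc.local twice. A guarded variant only appends a line that is not already there. This is a sketch; append_once is a hypothetical helper name.

```shell
# append_once is a hypothetical helper: it appends <line> to <file>
# unless the file already contains exactly that line.
# Usage: append_once <line> <file>
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}
```

For example: append_once "/etc/rc.d/init.d/corosync start" /etc/rc.d/rc.local, followed by the same call for pacemaker.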

Troubleshooting

while compiling Pacemaker

logsys_qb_init: Assertion `"implicit callsite section is populated, otherwise target's build is at fault, preventing reliable logging" && QB_ATTR_SECTION_START != QB_ATTR_SECTION_STOP' failed.

[ClusterLabs] Upgrade corosync problem https://lists.clusterlabs.org/pipermail/users/2018-July/015328.html

Re: [ClusterLabs] Upgrade corosync problem https://www.mail-archive.com/users@clusterlabs.org/msg07277.html

# bash -c 'paste <(ld --version) <(ld.bfd --version) | head -n1'
GNU ld version 2.26.20160125    GNU ld version 2.26.20160125

# readelf -s `which corosync` | grep ___verbose
41: 00000000006f64c0    40 OBJECT  GLOBAL DEFAULT bad section index[ 35] __start___verbose
62: 00000000006f64c0    40 OBJECT  GLOBAL DEFAULT bad section index[ 35] __stop___verbose
