XEN & DRBD Highly Available Convergent Farm

tested on slackware-current (aug 2021)

Introduction

The goal is to get a few cluster nodes running XEN + DRBD + old-school Linux-HA.

For the purpose of a PoC, we’re assuming a KVM host with nested virtualization enabled, to run the few storage nodes and XEN hypervisors within.

Binaries

grab the binary packages from here

wget -r -N --no-if-modified-since --relative -l 1 -nH https://lab.nethence.com/slackbuild-pkgs/
# --mirror == -r -N -l inf --no-remove-listing
# --progress=dot:mega --tries=0
rm -f slackbuild-pkgs/index.html

# rsync -avz --delete xc:/root/slackbuild-pkgs/ slackbuild-pkgs/
# rsync -avz --delete slackbuild-pkgs/ slack2:/root/slackbuild-pkgs/
# rsync -avz --delete slackbuild-pkgs/ slack3:/root/slackbuild-pkgs/

    groupadd -g 226 haclient
    useradd -u 226 -g 226 -d / -s /sbin/nologin hacluster

installpkg slackbuild-pkgs/*.tgz
# upgradepkg slackbuild-pkgs/*.tgz

for pkg in \
    acpica \
    cluster-glue \
    drbd-kernel \
    drbd-utils \
    fence-agents \
    heartbeat \
    keepalived \
    libaal \
    pexpect \
    ptyprocess \
    python3-ninja \
    python3-skbuild \
    reiser4progs \
    resource-agents \
    sbopkg \
    syslinux \
    ucl \
    upx \
    wheel \
    yajl \
    ; do
    echo "$pkg"
    ls -lF "/var/log/packages/$pkg"-*
    echo
done; unset pkg
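
The loop above lists what is installed. A variant that only reports what is missing could look like the sketch below. The function name `missing_pkgs` and the `PKGDB` override are assumptions made for illustration, they are not part of the Slackware tooling. As in the original loop, the match is a plain `name-*` prefix glob.

```shell
#!/bin/sh
# Hypothetical helper: print every package name that has no entry in the
# package database directory (default: /var/log/packages).
PKGDB=${PKGDB:-/var/log/packages}

missing_pkgs() {
    for pkg in "$@"; do
        # a package counts as installed if at least one entry matches "$pkg-*"
        found=no
        for f in "$PKGDB/$pkg"-*; do
            [ -e "$f" ] && found=yes && break
        done
        [ "$found" = yes ] || echo "$pkg"
    done
    unset pkg f found
}

# usage: missing_pkgs drbd-utils heartbeat keepalived
```

An empty output means every listed package has an entry in the database.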

Libraries

slackpkg update

which xl
ldd /usr/sbin/xl | grep not
slackpkg install libnl3 lzo pixman fuse3

ldd /usr/lib/xen/bin/qemu-system-i386 | grep not
    slackpkg install gnutls glib2-2 ndctl liburing nettle libssh p11-kit

# no need for those
# SDL2 SDL2_image gtk+3
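
To turn the `ldd ... | grep not` checks into a plain list of library names, something like the following sketch may help. `missing_libs` is a hypothetical name, and the parsing assumes the usual `name => not found` lines printed by glibc's `ldd`.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print only the names of
# the shared objects that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }' | sort -u
}

# usage: ldd /usr/lib/xen/bin/qemu-system-i386 | missing_libs
```

The names still have to be mapped to Slackware packages by hand, as done above.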

Kernels

boot-loader will look for kernels at the root of the file-system

ln -s boot/xen.gz /xen.gz

make sure to use at least 5.12 or 5.13, as REISER4 got broken around 5.10

cd /
wget https://lab.nethence.com/nunux/5.12.6.xenreiser4.vmlinuz -O vmlinuz

cd /lib/modules/
wget https://lab.nethence.com/nunux/5.12.6.xenreiser4.modules.tar.gz
tar xzf 5.12.6.xenreiser4.modules.tar.gz
rm -f 5.12.6.xenreiser4.modules.tar.gz
depmod -a 5.12.6.xenreiser4

Booting

install the boot-loader and setup serial console mode


don’t forget to tune the system init for both Linux and XEN entries

vi /etc/inittab

s1:12345:respawn:/sbin/agetty --noclear --flow-control --keep-baud --local-line 115200,38400,9600 ttyS0 linux
s2:12345:respawn:/sbin/agetty --noclear --flow-control --keep-baud --local-line 115200,38400,9600 hvc0 linux

make sure the default kernel still boots with syslinux

reboot

uname -r

then SWITCH TO XEN AS DEFAULT BOOT ENTRY, reboot again and check

vi /boot/syslinux/syslinux.cfg

default XEN

reboot

/etc/init.d/xencommons start
xl info

enable at boot-time

cat >> /etc/rc.d/rc.local <<-EOF

# self-verbose
/etc/rc.d/init.d/xencommons start
EOF
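
Appending with `cat >>` adds the lines again every time the step is repeated. A minimal sketch of an idempotent append, assuming a hypothetical helper named `append_once`:

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line is
# not already there, so re-running the setup does not duplicate entries.
append_once() {
    file=$1 line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# usage: append_once /etc/rc.d/rc.local '/etc/rc.d/init.d/xencommons start'
```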

DRBD

after XEN & booting – in case we need the kernel source

slack1

build the kernel module

cd ~/
wget https://lab.nethence.com/nunux/5.10.54.xenreiser4.src.tar.gz
tar xzf 5.10.54.xenreiser4.src.tar.gz -C /usr/src/
rm -f 5.10.54.xenreiser4.src.tar.gz

#sbopkg -i drbd-kernel

cd ~/slackbuilds/
tar xzf drbd-kernel.tar.gz
cd drbd-kernel/
wget https://pkg.linbit.com/downloads/drbd/9/drbd-9.1.3.tar.gz
md5sum drbd-9.1.3.tar.gz # f2ff06d1225ef1c21f8803069e13b7b3
./drbd-kernel.SlackBuild
installpkg /tmp/drbd-kernel-9.1.3-x86_64-1_SBo.tgz
find /lib/modules/`uname -r`/ | grep drbd
modprobe drbd_transport_tcp
lsmod | grep drbd
# depmod -a
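
The `md5sum` line above leaves the comparison to the eye. A sketch that fails on a mismatch could look like this, with `verify_md5` being a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE has the expected MD5 digest.
verify_md5() {
    file=$1 expected=$2
    actual=$(md5sum "$file" | awk '{ print $1 }')
    [ "$actual" = "$expected" ]
}

# usage:
# verify_md5 drbd-9.1.3.tar.gz f2ff06d1225ef1c21f8803069e13b7b3 \
#     || echo "checksum mismatch" >&2
```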

share the customized package binary with the other nodes

scp /tmp/drbd-kernel-9.1.3-x86_64-1_SBo.tgz slack2:~/
scp /tmp/drbd-kernel-9.1.3-x86_64-1_SBo.tgz slack3:~/

now that you’ve got the right kernel already

slack2 and slack3

installpkg ~/drbd-kernel-9.1.3-x86_64-1_SBo.tgz
# depmod -a

check drbd-utils

drbdadm status
cat /proc/drbd

/etc/rc.d/rc.drbd start
ls -lF /etc/rc.d/rc.drbd # enabled already

vi /etc/rc.d/rc.local

# self-verbose
/etc/rc.d/rc.drbd start

you’re now ready to proceed with a distributed thin volume such as

ls -lF /dev/thin1/slack
ls -lF /dev/drbd/by-res/slack/0

Ready to go

    scp /etc/hosts slack2:/etc/
    scp /etc/hosts slack3:/etc/

    scp /etc/drbd.conf slack2:/etc/
    scp /etc/drbd.conf slack3:/etc/
rsync -avz --delete /etc/drbd.d/ slack2:/etc/drbd.d/
rsync -avz --delete /etc/drbd.d/ slack3:/etc/drbd.d/

    scp /etc/rc.d/rc.local slack2:/etc/rc.d/
    scp /etc/rc.d/rc.local slack3:/etc/rc.d/

    scp /root/RESTART-DRBD slack2:/root/
    scp /root/RESTART-DRBD slack3:/root/
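
With more than two peers the per-node copies get repetitive. The sketch below only prints the same commands for each peer so they can be reviewed first. `NODES` and `sync_plan` are illustrative names, not part of DRBD or Slackware.

```shell
#!/bin/sh
# Hypothetical helper: print (rather than run) the copy commands for each peer.
NODES=${NODES:-"slack2 slack3"}

sync_plan() {
    for node in $NODES; do
        for file in /etc/hosts /etc/drbd.conf /etc/rc.d/rc.local /root/RESTART-DRBD; do
            # copy each file to the same directory on the peer
            echo "scp $file $node:$(dirname "$file")/"
        done
        echo "rsync -avz --delete /etc/drbd.d/ $node:/etc/drbd.d/"
    done
    unset node file
}

# review with: sync_plan
# run with:    sync_plan | sh -x
```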

now create a XEN guest against the thin volume

Diskless-mode acceptance

with no primary on any node, start the slack guest –> the local node becomes primary

OK

migrate slack guest from slack1 to slack3 –> primary moves

OK

migrate slack guest from slack3 to slack2 (diskless)

OK

write some change on slack while diskless

echo POUET > POUET_OK

migrate slack guest back to slack1 and check the data

cat POUET_OK

migrate slack guest to slack2 yet again and write something else on it and power off

echo OK > CHECK2

start slack guest on another node e.g. slack3 and look for the second data checkup, then also write a third test

OK
echo OK > CHECK3

shutdown slack guest on slack3 and try to start it on slack2 (diskless directly)

STARTUP OK
CHECK3 OK

write something else then destroy slack2 from the KVM host

echo BAD_TIMES > DESTRUCTION
sync

^]
^D
virsh destroy slack2

then try to bring up the slack guest on slack1

root@slack1:~/guests/slack# drbdadm primary slack
slack: State change failed: (-7) Refusing to be Primary while peer is not outdated
Command 'drbdsetup primary slack' terminated with exit code 11

bring back slack2

virsh create /data/guests/slack2/slack.xml
virsh console slack2
^]

and try again from slack1 (assuming there are REISER4 file-system tools within the guest)

#fsck.reiser4 /dev/drbd/by-res/slack/0
xl create /data/guests/slack/slack -c

data should remain

cat DESTRUCTION
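
Each acceptance step comes down to which node currently holds the Primary role. A sketch for reading it from `drbdadm status` output follows. `local_role` is a hypothetical name, and the parsing assumes the DRBD 9 layout where the first line of a resource reads `<resource> role:<Role>`.

```shell
#!/bin/sh
# Hypothetical helper: read `drbdadm status` output on stdin and print the
# local role (Primary or Secondary) of the given resource.
local_role() {
    awk -v res="$1" '$1 == res && $2 ~ /^role:/ {
        sub(/^role:/, "", $2)   # keep only the role name
        print $2
        exit
    }'
}

# usage: drbdadm status slack | local_role slack
```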

protoA –> protoC

the resource can be adjusted from protocol A to protocol C

drbdadm primary slack
drbdsetup show slack --show-defaults | grep proto # A
vi /etc/drbd.d/global_common.conf

    common {
        net {
            protocol C;
            allow-two-primaries yes;
        }
    }

    rsync -avz --delete /etc/drbd.d/ slack2:/etc/drbd.d/
    rsync -avz --delete /etc/drbd.d/ slack3:/etc/drbd.d/

drbdadm adjust-with-progress slack
drbdsetup show slack --show-defaults | grep proto # C

ssh slack2 drbdsetup show slack --show-defaults | grep proto # C/A
ssh slack3 drbdsetup show slack --show-defaults | grep proto # C/A
ssh slack2 drbdadm adjust-with-progress slack
ssh slack3 drbdadm adjust-with-progress slack
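
To compare the protocol across nodes without reading the grep output line by line, a small filter such as the following may do. `show_protocol` is a hypothetical name, and the parsing assumes `drbdsetup show` prints lines of the form `protocol <letter>;`.

```shell
#!/bin/sh
# Hypothetical helper: read `drbdsetup show` output on stdin and print the
# distinct protocol letters found, one per line.
show_protocol() {
    awk '$1 == "protocol" { sub(/;.*/, "", $2); print $2 }' | sort -u
}

# usage:
# drbdsetup show slack --show-defaults | show_protocol
# ssh slack2 drbdsetup show slack --show-defaults | show_protocol
```

A single `C` on every node means the adjustment went through everywhere.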

HELPER SCRIPTS FOR DRBD DEVICE STATIC PROVISIONING

Generate the ThinLVs and DRBD resource confs (here 10 vdisks per node)

[gen-guestids.bash](/bin/slack/converge/gen-guestids.bash.txt)

Create the DRBD meta-data at the end of the local ThinLVs, synchronize and discard/re-thin in turns (otherwise you will blow up your LVM thin pool).

draft

slack1

for guestid in `seq -w 001 010`; do drbdadm create-md a$guestid; done; unset guestid
for guestid in `seq -w 001 010`; do drbdadm create-md c$guestid; done; unset guestid
for guestid in `seq -w 001 010`; do drbdadm up a$guestid; done; unset guestid
for guestid in `seq -w 001 010`; do echo drbdadm primary --force a$guestid; done; unset guestid

in turns

for guestid in `seq -w 001 010`; do blkdiscard /dev/drbd/by-disk/thina/a$guestid; done; unset guestid

slack2

for guestid in `seq -w 001 010`; do drbdadm create-md b$guestid; done; unset guestid
for guestid in `seq -w 001 010`; do drbdadm create-md a$guestid; done; unset guestid
for guestid in `seq -w 001 010`; do drbdadm up b$guestid; done; unset guestid
for guestid in `seq -w 001 010`; do echo drbdadm primary --force b$guestid; done; unset guestid

in turns

for guestid in `seq -w 001 010`; do blkdiscard /dev/drbd/by-disk/thinb/b$guestid; done; unset guestid

slack3

for guestid in `seq -w 001 010`; do drbdadm create-md c$guestid; done; unset guestid
for guestid in `seq -w 001 010`; do drbdadm create-md b$guestid; done; unset guestid
for guestid in `seq -w 001 010`; do drbdadm up c$guestid; done; unset guestid
for guestid in `seq -w 001 010`; do echo drbdadm primary --force c$guestid; done; unset guestid

in turns

for guestid in `seq -w 001 010`; do blkdiscard /dev/drbd/by-disk/thinc/c$guestid; done; unset guestid
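
The per-node loops differ only in the resource prefix (a, b or c). A sketch that generates the resource names once, so the loops can share them, assuming a hypothetical helper called `guest_ids`:

```shell
#!/bin/sh
# Hypothetical helper: print zero-padded resource names such as a001 .. a010.
guest_ids() {
    prefix=$1 first=$2 last=$3
    for n in $(seq -w "$first" "$last"); do
        echo "$prefix$n"
    done
    unset n
}

# usage (echo keeps it a dry run, as in the draft above):
# for res in $(guest_ids a 001 010); do echo drbdadm create-md "$res"; done
```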

Troubleshooting

xl: /usr/lib64/libxlutil.so.4.16: version `VERS_4.15.0' not found (required by xl)

–> the xl binary and the XEN libraries it loads come from different versions, so your tools are not in sync. maybe you’ve installed the tools twice on the system and are having a PATH issue?
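
To check for a duplicate install, one can list every copy of a command found along the PATH. This is only a sketch and `find_in_path` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print every executable called NAME found in $PATH,
# in lookup order. More than one line hints at a duplicate install.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        [ -x "$dir/$name" ] && echo "$dir/$name"
    done
    IFS=$old_ifs
    unset dir old_ifs
}

# usage: find_in_path xl
```

The first line printed is the copy the shell actually runs. `ldd` on it shows which libxlutil it picks up.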

Resources

https://unix.stackexchange.com/questions/165875/resume-failed-download-using-linux-command-line-tool

https://stackoverflow.com/questions/60972064/wget-continue-on-retry

https://unix.stackexchange.com/questions/15176/using-wget-what-is-the-right-command-to-get-gzipped-version-instead-of-the-actu

https://superuser.com/questions/901962/what-is-the-correct-mime-type-for-a-tar-gz-file

