DRBD ON SPARSE LVM FOR A CONVERGENT XEN FARM

no Linstor required

tested on ubuntu and slackware current (aug 2021)

INTRODUCTION

dual-primary mode is only required temporarily, during a XEN live migration. Otherwise the replicated vdisk does not come up as primary by itself and remains in secondary state unless promoted manually.
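
for illustration, a hypothetical dual-primary migration flow (the guest name doubles as the resource name here, and the target host is made up):

    # on the target node (e.g. slack2): promote while the source is still primary
    drbdadm primary slack
    # on the source node: migrate the guest, then demote
    xl migrate slack slack2
    drbdadm secondary slack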

DRBD v8 does not seem to support the diskless feature, hence we go for v9.

SYSPREP & REQUIREMENTS

make sure the nodes can reach each other by short hostnames, and validate the SSH host-key fingerprints (not sure they need to talk to themselves by SSH as well, but let's allow it anyway)
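
for instance, a minimal sketch, assuming the 192.168.122.0/24 addresses that show up in drbd.conf below:

    cat >> /etc/hosts <<'EOF'
    192.168.122.91 slack1
    192.168.122.92 slack2
    192.168.122.93 slack3
    EOF
    # accept and verify the host key fingerprints once, incl. to oneself
    for node in slack1 slack2 slack3; do ssh $node true; done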

NTP is probably not mandatory for this type of cluster, but let's set it up anyway, with the peer feature
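
e.g. a sketch for /etc/ntp.conf on slack1, assuming classic ntpd (adapt the peer lines on the other nodes):

    server pool.ntp.org iburst
    peer slack2
    peer slack3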

if needed, build drbd9 from source on ubuntu or slackware, or prepare RPMs for RHEL/centos. a from-source sketch, assuming git and the kernel headers are installed:
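
    git clone --recursive https://github.com/LINBIT/drbd.git
    cd drbd
    make
    make install

and check that the v9 modules are loading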

/sbin/modprobe drbd_transport_tcp
/bin/lsmod | grep drbd

also build the userspace tools (drbd-utils)
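
a sketch, assuming the autotools stack is available:

    git clone https://github.com/LINBIT/drbd-utils.git
    cd drbd-utils
    ./autogen.sh
    ./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc
    make && make install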

and if needed, build the LVM2 thin provisioning tools
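
a sketch, in case the distribution does not ship thin_check and friends:

    git clone https://github.com/jthornber/thin-provisioning-tools.git
    cd thin-provisioning-tools
    autoreconf -fi
    ./configure
    make && make install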

BLOCK DEVICE ARCHITECTURE

the design we're attempting to PoC here, as a minimal working example, is as follows:

/data/ as NFS share for the XEN guest configurations
/dev/thinX/GUEST-NAME logical volumes for the XEN guest file-systems
/dev/thinX/GUEST-NAME the same on the mirror node

assuming additional block devices for DRBD to live upon

/dev/vdb
/dev/vdc

three nodes, on which we are going to use two additional 100G disks each

slack1      slack2      slack3
vda         vda         vda
vdb         vdb         vdb
vdc         vdc         vdc

and those will be set up as follows

slack1:vdb + slack2:vdc --> drbd1 (slack guest)
slack2:vdb + slack3:vdc --> drbd2 (ubuntu guest)
slack3:vdb + slack1:vdc --> drbd3 (centos guest)

note that all the nodes have access to all the DRBD devices anyhow:

slack1 reaches drbd2 diskless
slack2 reaches drbd3 diskless
slack3 reaches drbd1 diskless

BLOCK DEVICE SETUP (LVM2 THIN PROVISIONING)

we don't need CLVM nor lvmlockd/sanlock/dlm, since every volume group lives on a node-local disk and only that node activates it; DRBD replicates above the LVs, so no VG metadata is ever shared between nodes.

we need to enable discards (and there is no need for udev)

mv /etc/lvm/lvm.conf /etc/lvm/lvm.conf.dist
sed -r '/^[[:space:]]*#.*/d; /^$/d;' /etc/lvm/lvm.conf.dist > /etc/lvm/lvm.conf
vi /etc/lvm/lvm.conf

(both settings live in the devices section)

    obtain_device_list_from_udev = 0
    issue_discards = 1

all nodes

pvcreate /dev/vdb
pvcreate /dev/vdc

slack1

vgcreate thin1 /dev/vdb
vgcreate thin3 /dev/vdc
pvs

slack2

vgcreate thin2 /dev/vdb
vgcreate thin1 /dev/vdc
pvs

slack3

vgcreate thin3 /dev/vdb
vgcreate thin2 /dev/vdc
pvs

create a dedicated LV acting as a thin pool

slack1

    lvcreate --extents 100%FREE --thin thin1/pool
    lvcreate --extents 100%FREE --thin thin3/pool

slack2

    lvcreate -l 100%FREE --thin thin2/pool
    lvcreate -l 100%FREE --thin thin1/pool

slack3

    lvcreate -l 100%FREE --thin thin3/pool
    lvcreate -l 100%FREE --thin thin2/pool

you can now proceed with the usual LVs within the thin pool

e.g. on slack1 + slack2

    lvcreate --virtualsize 25G --thin -n slack thin1/pool
    ls -lF /dev/thin1/slack
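
likewise for the other two guests, following the mapping above (assuming the same 25G virtual size):

slack2 + slack3

    lvcreate --virtualsize 25G --thin -n ubuntu thin2/pool

slack3 + slack1

    lvcreate --virtualsize 25G --thin -n centos thin3/pool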

DRBD v9 SETUP

slack1

mv /etc/drbd.conf /etc/drbd.conf.dist
vi /etc/drbd.conf

global {
    usage-count yes;
    udev-always-use-vnr;
}

common {
    net {
        protocol C;
        # v9
        fencing resource-only;
        allow-two-primaries yes;
    }
    disk {
        read-balancing when-congested-remote;
    }
}

resource slack {
    device /dev/drbd1;
    meta-disk internal;
    on slack1 {
        # v9
        node-id   1;
        address   192.168.122.91:7701;
        disk      /dev/thin1/slack;
    }
    on slack2 {
        node-id   2;
        address   192.168.122.92:7701;
        disk      /dev/thin1/slack;
    }
    on slack3 {
        node-id   3;
        address   192.168.122.93:7701;
        disk none;
    }
    # v9
    connection-mesh {
        hosts slack1 slack2 slack3;
    }
}
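
the other two resources follow the same pattern; e.g. a sketch for the ubuntu guest, assuming /dev/drbd2 and port 7702:

resource ubuntu {
    device /dev/drbd2;
    meta-disk internal;
    on slack1 {
        node-id   1;
        address   192.168.122.91:7702;
        disk      none;
    }
    on slack2 {
        node-id   2;
        address   192.168.122.92:7702;
        disk      /dev/thin2/ubuntu;
    }
    on slack3 {
        node-id   3;
        address   192.168.122.93:7702;
        disk      /dev/thin2/ubuntu;
    }
    connection-mesh {
        hosts slack1 slack2 slack3;
    }
}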

scp /etc/hosts slack2:/etc/
scp /etc/hosts slack3:/etc/
scp /etc/drbd.conf slack2:/etc/
scp /etc/drbd.conf slack3:/etc/
scp /etc/rc.d/rc.local slack2:/etc/rc.d/
scp /etc/rc.d/rc.local slack3:/etc/rc.d/
scp /etc/rc.d/rc.local_shutdown slack2:/etc/rc.d/
scp /etc/rc.d/rc.local_shutdown slack3:/etc/rc.d/

INITIALIZE THE MIRROR

all nodes (including the diskless one)

initialize the DRBD meta-data and bring the replicated volume up

drbdadm create-md slack
drbdadm up slack
ls -lF /dev/drbd1

slack1

the state is currently Inconsistent on both sides of the mirror. This is why we need to force and mark a valid state somewhere to begin with.

drbdadm primary --force slack

and so forth on the other nodes hosting the primary of their respective DRBD device.

now check that the nodes are synchronizing

drbdadm status
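
optionally watch the progress until both peer disks show UpToDate:

    watch -n2 drbdadm status slack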

KEEP THE SPARSE

you will notice on the mirror nodes that you're losing the sparseness (100% data) once the mirror has synchronized, even though it's all zeroes except for the DRBD meta-data at the end.

slack2

lvs -o+discards thin1

you can fix that by wiping out the content while staying on the current primary (works both with proto C and A)

slack1

blkdiscard /dev/drbd/by-disk/thin1/slack

the sparseness is back in place (0.01% data)

slack2

lvs -o+discards thin1

READY TO GO

enable at startup

ls -lF /etc/rc.d/rc.drbd
chmod +x /etc/rc.d/rc.drbd

vi /etc/rc.d/rc.local

# self-verbose
/etc/rc.d/rc.drbd start

ACCEPTANCE

also make sure the resources get stopped cleanly at shutdown, then check that the replicated volumes are brought up at startup

echo '/etc/rc.d/rc.drbd stop' >> /etc/rc.d/rc.local_shutdown
ls -lF /etc/rc.d/rc.local /etc/rc.d/rc.local_shutdown
reboot

check what protocol the resource is living upon

v8

cat /proc/drbd

v9

drbdsetup show all --show-defaults | grep proto

v9 (as long as debug is enabled)

cat /sys/kernel/debug/drbd/resources/<resource>/connections/<connection>/<volume>/proc_drbd

and last but not least, check that you can access the DRBD diskless device from slack3.
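
e.g. a minimal read test (a sketch; the device can only be opened while primary, which dual-primary mode allows here):

    drbdadm primary slack
    dd if=/dev/drbd1 of=/dev/null bs=1M count=10
    drbdadm secondary slack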

OPERATIONS & MONITORING

see operations

TROUBLES

root@slack1:~# drbdadm primary --force slack
slack: State change failed: (-7) Refusing to be Primary while peer is not outdated
Command 'drbdsetup primary slack --force' terminated with exit code 11

==> the slack resource on slack3 was not up (even though it's diskless, it still needs to be up, otherwise the fencing policy refuses the promotion)

TODO

RESOURCES

LINBIT DRBD kernel module https://github.com/LINBIT/drbd

DRBD userspace utilities (for 9.0, 8.4, 8.3) https://github.com/LINBIT/drbd-utils

“read-balancing” with 8.4.1+ https://www.linbit.com/en/read-balancing/

v9

DRBD 9.0 Manual Pages https://docs.linbit.com/man/v9/

drbd.conf - DRBD Configuration Files https://docs.linbit.com/man/v9/drbd-conf-5/

v8.4

DRBD 8.4 https://github.com/LINBIT/drbd-8.4

drbd.conf - Configuration file for DRBD’s devices https://docs.linbit.com/man/v84/drbd-conf-5/

How to Install DRBD on CentOS Linux https://linuxhandbook.com/install-drbd-linux/

How to Setup DRBD 9 on Ubuntu 16 https://www.globo.tech/learning-center/setup-drbd-9-ubuntu-16/

more

LINSTOR SDS server https://github.com/LINBIT/linstor-server

ops

CLI management tool for DRBD. Like top, but for DRBD resources. https://github.com/LINBIT/drbdtop

troubles

[DRBD-user] drbd-dkms fails to build under proxmox 6 https://lists.linbit.com/pipermail/drbd-user/2019-August/025208.html

[DRBD-user] Problems compiling kernel module https://lists.linbit.com/pipermail/drbd-user/2016-June/022391.html

thin

5.4.4. Creating Thinly-Provisioned Logical Volumes https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/logical_volume_manager_administration/thinly_provisioned_volume_creation

How to setup thin Provisioned Logical Volumes in CentOS 7 / RHEL 7 https://www.linuxtechi.com/thin-provisioned-logical-volumes-centos-7-rhel-7/

Thin Provisioning in LVM2 https://www.theurbanpenguin.com/thin-provisioning-lvm2/

Setup Thin Provisioning Volumes in Logical Volume Management (LVM) – Part IV https://www.tecmint.com/setup-thin-provisioning-volumes-in-lvm/

vs. ceph rbd

https://linbit.com/blog/drbd-linstor-vs-ceph/

https://linbit.com/blog/drbd-vs-ceph/

https://linbit.com/blog/how-does-linstor-compare-to-ceph/

