Troubleshooting DRBD cluster farms

The casual split-brain

disk:Inconsistent / connection:StandAlone

broken node

res=ocfs2
drbdadm disconnect $res
drbdadm -- --discard-my-data connect $res

survivor node (can be primary already)

res=ocfs2
drbdadm disconnect $res
drbdadm connect $res

you should now see the volume synchronizing

drbdadm status $res

# v8 only
cat /proc/drbd
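
The two halves of the recovery above can be wrapped in a small helper. This is only a sketch: the names `run`, `recover_victim` and `recover_survivor` and the `DRYRUN` variable are made up for illustration and are not part of DRBD. With `DRYRUN=1` the commands are printed instead of executed, which helps to double-check what would run on which node.

```shell
#!/bin/bash
# Hypothetical helpers, not part of DRBD.
# DRYRUN=1 prints the command instead of executing it.
run() {
    if [ "${DRYRUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on the broken node: its local changes are thrown away.
recover_victim() {
    local res=$1
    run drbdadm disconnect "$res"
    run drbdadm -- --discard-my-data connect "$res"
}

# Run on the survivor node: it keeps its data.
recover_survivor() {
    local res=$1
    run drbdadm disconnect "$res"
    run drbdadm connect "$res"
}
```

For example, `DRYRUN=1 recover_victim ocfs2` only prints the two commands of the broken-node procedure.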

disk:Outdated

to make it Inconsistent (the local data is discarded and fully resynced from the peer)

res=ocfs2
drbdadm invalidate $res
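
Before invalidating, it may be worth confirming that the local disk really is Outdated. A minimal sketch, assuming the DRBD 9 `drbdadm status` layout in which the local disk shows up as a `disk:` field and peers as `peer-disk:`. The name `local_disk_state` is hypothetical, and only the first volume is looked at.

```shell
# Hypothetical helper: print the local disk state found in `drbdadm status`
# output read from stdin. Peer entries use "peer-disk:", so only a field
# that starts with "disk:" is taken, and only the first one.
local_disk_state() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^disk:/) { print substr($i, 6); exit }
    }'
}

# Usage on the node in question:
#   state=$(drbdadm status "$res" | local_disk_state)
#   [ "$state" = Outdated ] && drbdadm invalidate "$res"
```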

primary/uptodate vs. standalone

that happened after a node2 reboot

node1 shows

primary/uptodate
other nodes: standalone

node2 shows

secondary/consistent
other nodes: standalone

node3 shows

secondary/diskless
other nodes: connecting

==> that one got solved with the --discard-my-data trick while the resource was up on node3 (diskless)
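
In a three-node farm it helps to see at a glance which peer connections are StandAlone from the local node's point of view. A rough sketch, assuming DRBD 9 `drbdadm status` output where a peer line reads `<peer> connection:<state>` while the link is not established. The name `standalone_peers` is made up.

```shell
# Hypothetical helper: list the peers reported as StandAlone in
# `drbdadm status` output read from stdin.
standalone_peers() {
    awk '$2 == "connection:StandAlone" { print $1 }'
}

# Usage:
#   drbdadm status "$res" | standalone_peers
```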

Misc errors

cannot disconnect

while trying to disconnect

ocfs2: State change failed: (-7) Refusing to be Primary while peer is not outdated
Command 'drbdsetup disconnect ocfs2 2' terminated with exit code 11

==> become secondary first
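
Spelled out, demoting before disconnecting could look like the sketch below. The mount point `/mnt/ocfs2` and the function name are assumptions for illustration. A mounted filesystem keeps the device open, so the demotion is expected to fail until it is unmounted.

```shell
res=ocfs2
mnt=/mnt/ocfs2   # assumed mount point, adjust to your setup

# Hypothetical helper: unmount, demote, then disconnect.
demote_and_disconnect() {
    # unmount only if something is actually mounted there
    if mountpoint -q "$mnt"; then
        umount "$mnt" || return 1
    fi
    drbdadm secondary "$res" || return 1
    drbdadm disconnect "$res"
}
```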

while trying to shut down a node

ocfs2: State change failed: (-6) Refusing to be Outdated while Connected
Command 'drbdsetup down ocfs2' terminated with exit code 11

==> this happens with a three-node setup: make sure there is only one primary in the farm before taking a node down
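
To check how many primaries the farm currently has, the roles in the status output can be counted. A small sketch, assuming DRBD 9 `drbdadm status` output where the local resource and each connected peer carry a `role:` field. The name `count_primaries` is hypothetical, and disconnected peers are not counted since their role is unknown locally.

```shell
# Hypothetical helper: count "role:Primary" fields in `drbdadm status`
# output read from stdin (local node plus connected peers).
count_primaries() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "role:Primary") n++
    } END { print n + 0 }'
}

# Usage:
#   drbdadm status "$res" | count_primaries
```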

node1 down

when trying to make a resource primary on a backup node

powerslack: State change failed: (-7) Refusing to be Primary while peer is not outdated
Command 'drbdsetup primary powerslack' terminated with exit code 11

==> force primary

res=powerslack
#drbdsetup $res primary --overwrite-data-of-peer
drbdadm -- --overwrite-data-of-peer primary $res
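
The `--overwrite-data-of-peer` spelling comes from the DRBD 8.3 documentation, while the 8.4 and 9 user's guides use `drbdadm primary --force`. The sketch below only prints the matching command for a given version string. The function name and the version matching are assumptions for illustration, so check the man page of the installed release before running anything.

```shell
# Hypothetical helper: print the force-primary command for a DRBD version.
# Nothing is executed here.
force_primary_cmd() {
    local version=$1 res=$2
    case $version in
        8.3*) echo "drbdadm -- --overwrite-data-of-peer primary $res" ;;
        *)    echo "drbdadm primary --force $res" ;;
    esac
}

# Usage:
#   force_primary_cmd 9.1.4 powerslack
```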

Resources

split-brain

The DRBD9 User’s Guide https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/

DRBD recovery from split brain https://www.sidorenko.io/post/2012/03/drbd-recovery-from-split-brain/

drbdadm - Administration tool for DRBD https://www.venea.net/man/drbdadm-8.3(8)

How to get DRBD nodes out of Connection State StandAlone (and WFConnection)? https://serverfault.com/questions/870213/how-to-get-drbd-nodes-out-of-connection-state-standalone-and-wfconnection

HOWTO: Resolve DRBD split-brain recovery manually https://www.recitalsoftware.com/blogs/29-howto-resolve-drbd-split-brain-recovery-manually

DRBD Cheat Sheet https://lzone.de/cheat-sheet/DRBD

backup left-alone

CentOS6.5 installation and configuration drbd https://titanwolf.org/Network/Articles/Article?AID=8b7a4c3f-f46c-4467-a666-538b652f2f8b


How to get DRBD nodes out of Connection State StandAlone (and WFConnection)? https://serverfault.com/questions/870213/how-to-get-drbd-nodes-out-of-connection-state-standalone-and-wfconnection

[DRBD-user] DRBD promoting inconsistent disk? https://lists.linbit.com/pipermail/drbd-user/2008-August/009922.html


Copyright © 2024 Pierre-Philipp Braun