keepalived | conntrackd | fuck-martinez
THIS IS STILL A DRAFT AS OF NOV 2022
Tested on Slackware 15.0.
This is a very special use case. We want multiple XEN hosts (dom0) to provide SNAT gateways for their respective guests, so that guest traffic does not go through a dedicated router (nor through a single internal VIP managed by VRRP, for that matter).
The goal is to have multiple gateways with the same internal IP on every node.
This assumes the XEN hosts have a public ("white") IP and are directly connected to the outside world, hence two bridges:
xenbr0 -- public bridge
guestbr0 -- internal bridge for guests
It has been tested as a XEN/PV setup nested in KVM (the PoC) as well as on real hardware, on a Slackware Linux 15.0 based XEN/DRBD convergent farm. The following instructions, however, are simplified to reproduce the former PoC. GNS3 also works and helps to sniff around with Wireshark; however, GNS3 won’t display the Linux bridges of the XEN hosts, so you will have to sniff remotely anyway.
enable IPv4 forwarding and SNAT
on all nodes - with a different external IP on xenbr0
vi /etc/rc.d/rc.inet1

echo -n xenbr0 ...
ifconfig eth0 up
brctl addbr xenbr0
brctl addif xenbr0 eth0
ifconfig xenbr0 192.168.122.11/24 up && echo done || echo FAIL

echo -n default route ...
route add default gw 192.168.122.1 && echo done || echo FAIL

echo -n guestbr0 ...
ifconfig eth1 up
brctl addbr guestbr0
brctl addif guestbr0 eth1
ifconfig guestbr0 10.1.1.254/24 up && echo done || echo FAIL

# self-verbose
sysctl -w net.ipv4.ip_forward=1

echo -n snat...
nft -f /etc/nftables.conf && echo done || echo FAIL
prepare a template to be text processed
we discard ARP packets from the other nodes on the guestbr0 interface, so that a guest only ever resolves 10.1.1.254 to the node it is running on
vi /etc/nftables.conf.tpl

define nic=xenbr0
define gst=guestbr0

table arp filter
flush table arp filter
table arp filter {
        chain input {
                type filter hook input priority filter; policy accept;
                iif $gst arp saddr ether @@@othernode1@@@ drop
                iif $gst arp saddr ether @@@othernode2@@@ drop
        }
        chain output {
                type filter hook output priority filter; policy accept;
                oif $gst arp daddr ether @@@othernode1@@@ drop
                oif $gst arp daddr ether @@@othernode2@@@ drop
        }
}

table inet filter
flush table inet filter
table inet filter {
        chain input {
                type filter hook input priority filter; policy accept;
        }
        # NAT --> accept
        chain forward {
                type filter hook forward priority 0; policy accept;
        }
        chain output {
                type filter hook output priority filter; policy accept;
        }
}

table ip nat
flush table ip nat
table ip nat {
        # SNAT (the address is the node's own external IP on xenbr0, here node1)
        chain postrouting {
                type nat hook postrouting priority srcnat;
                ip saddr 10.1.1.0/24 oif $nic snat to 192.168.122.11;
        }
        # DNAT to reach guest6 from the KVM host
        chain prerouting {
                type nat hook prerouting priority dstnat;
                iif $nic tcp dport 22 dnat to 10.1.1.6;
        }
}
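The placeholders in the template expect MAC addresses. A minimal sketch for reading one, assuming the relevant address is the one the kernel reports for the node's guestbr0 bridge in sysfs. The function name iface_mac and the SYSFS_NET override are hypothetical and only there to make the helper easy to try out.

```shell
# Hypothetical helper: print the MAC address of a network interface.
# SYSFS_NET is an assumed override for testing; by default the kernel's
# /sys/class/net/<iface>/address file is read.
iface_mac() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/address"
}
```

On each node, `iface_mac guestbr0` would then print the value to note down for that node.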
apply on node1
mac1=xx:xx...
mac2=xx:xx...
mac3=xx:xx...

sed -r "s/@@@othernode1@@@/$mac2/; s/@@@othernode2@@@/$mac3/" /etc/nftables.conf.tpl > /etc/nftables.conf && echo done
nft --check -f /etc/nftables.conf && echo done
nft -f /etc/nftables.conf && echo reloaded
and respectively for node2 and node3
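sed -r "s/@@@othernode1@@@/$mac1/; s/@@@othernode2@@@/$mac3/" /etc/nftables.conf.tpl > /etc/nftables.conf.node2 && echo done
sed -r "s/@@@othernode1@@@/$mac1/; s/@@@othernode2@@@/$mac2/" /etc/nftables.conf.tpl > /etc/nftables.conf.node3 && echo done
nft --check -f /etc/nftables.conf.node2 && echo done
nft --check -f /etc/nftables.conf.node3 && echo done

# copy the generated files into place before reloading
# (remember that the SNAT address must be each node's own external IP)
scp /etc/nftables.conf.node2 node2:/etc/nftables.conf
scp /etc/nftables.conf.node3 node3:/etc/nftables.conf
ssh node2 "nft -f /etc/nftables.conf && echo reloaded"
ssh node3 "nft -f /etc/nftables.conf && echo reloaded"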
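The three sed invocations differ only in which two MAC addresses are the "other" ones. As a rough sketch, the substitution could be wrapped in one function that skips the node's own MAC. The name render_nft_conf and its argument order are hypothetical.

```shell
# Hypothetical helper: render the template for one node on stdout.
# usage: render_nft_conf TEMPLATE OWN_MAC MAC...
# Every MAC in the list that differs from OWN_MAC fills the next
# @@@othernodeN@@@ placeholder, in the order given.
render_nft_conf() {
    tpl=$1
    own=$2
    shift 2
    i=0
    script=
    for mac in "$@"; do
        [ "$mac" = "$own" ] && continue
        i=$((i + 1))
        script="${script}s/@@@othernode${i}@@@/${mac}/;"
    done
    sed -e "$script" "$tpl"
}
```

For node2 this might be called as `render_nft_conf /etc/nftables.conf.tpl "$mac2" "$mac1" "$mac2" "$mac3" > /etc/nftables.conf.node2`, which yields the same substitutions as the manual sed line for that node.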
Optionally, set up a DNAT rule (port forwarding) as seen above to reach a guest from the underlying bridge that holds the whole PoC (here virbr0).
Then check the MAC address of the local gateway as seen from the guest and ping the public network continuously.
While doing those tests, optionally sniff the traffic with Wireshark (ideally remotely, against the network interface of the guest).
from guest6, two virtualization layers above
arp -an | grep -F '(10.1.1.254)'
ping opendns.com
==> should be the MAC of node1
from the XEN node that runs the guest, in this example, node1
xl migrate guest6 node2
check which MAC it’s using as a peer gateway
back to the guest
arp -an | grep -F '(10.1.1.254)'
==> should be the MAC of node2 already
if it still shows the old MAC, remove that ARP entry and try again
arp -d 10.1.1.254
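To compare the gateway MAC before and after the migration without reading the ARP table by eye, a small filter can extract it. This is only a sketch, assuming the BSD-style output of `arp -an` from net-tools (`? (IP) at MAC [ether] on IFACE`). The function name gw_mac is hypothetical.

```shell
# Hypothetical helper: read `arp -an` style lines on stdin and print the
# hardware address recorded for the given IP (an unresolved entry prints
# the "<incomplete>" token instead of a MAC).
gw_mac() {
    awk -v ip="($1)" '$2 == ip { print $4 }'
}
```

On the guest, `arp -an | gw_mac 10.1.1.254` can be run once before and once after `xl migrate` to see whether the value changed.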
https://wiki.nftables.org/wiki-nftables/index.php/Scripting
https://wiki.nftables.org/wiki-nftables/index.php/Simple_rule_management
https://wiki.nftables.org/wiki-nftables/index.php/Quick_reference-nftables_in_10_minutes
https://wiki.nftables.org/wiki-nftables/index.php/Configuring_chains
https://wiki.nftables.org/wiki-nftables/index.php/Performing_Network_Address_Translation_(NAT)
https://wiki.nftables.org/wiki-nftables/index.php/Main_Page
https://wiki.nftables.org/wiki-nftables/index.php/Concatenations
https://wiki.nftables.org/wiki-nftables/index.php/Sets
https://wiki.gentoo.org/wiki/Nftables
https://wiki.gentoo.org/wiki/Nftables/Examples
https://unix.stackexchange.com/questions/192313/how-do-you-clear-the-arp-cache-on-linux
https://linux-audit.com/how-to-clear-the-arp-cache-on-linux/
https://wiki.archlinux.org/title/nftables –> covers the arp family
https://wiki.polaire.nl/doku.php?id=nftables –> big example
https://paulgorman.org/technical/linux-nftables.txt.html –> arp family
https://wiki.nftables.org/wiki-nftables/index.php/Nftables_families –> arp family
https://wiki.nftables.org/wiki-nftables/index.php/Quick_reference-nftables_in_10_minutes#Arp
https://wiki.nftables.org/wiki-nftables/index.php/Data_types
https://wiki.nftables.org/wiki-nftables/index.php/Matching_packet_headers#Matching_ARP_headers
https://stackoverflow.com/questions/68089536/nftables-drop-arp-traffic-on-specific-bridge
https://www.xmodulo.com/how-to-add-or-remove-static-arp-entry-on-linux.html
https://www.oreilly.com/library/view/network-security-hacks/0596006438/ch03s03.html
https://askubuntu.com/questions/22998/add-static-arp-entries-when-network-is-brought-up
https://a.ndronic.us/set-up-static-arp-table/
https://www.redhat.com/sysadmin/arp-versus-ip –> ip6
https://serverfault.com/questions/1083698/linux-what-causes-static-arp-entries-to-flush-on-link-down –> beware it goes away