deal with inbound traffic

we now have a working poc on xen or kvm, but what about inbound-initiated traffic flows?
assuming DNS round-robin, client requests arrive on varying nodes, and not necessarily on the one hosting the service as a guest system. since the guest systems living on different nodes have differing outbound gateways, the response would leave through another node than the one the request entered, which obviously brings some problems (for TCP at least, and apparently even for UDP). there are three solutions for this:
FULL-NAT: we do not attempt to optimize the route of the TCP responses and let those find their way back through the node they entered on (the approach we discuss here)
CT-SYNC: we use conntrackd to synchronize the connection states so the answers can go right through the host gateway, just like for outbound-initiated traffic in the previous pocs (but in that case, we probably need to mangle the source IP of the answer); a minimal config sketch follows this list
STATELESS-NAT: we rebuild the DNAT state based on per-node tags (this is what became part4)
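we won't pursue CT-SYNC here, but for the record, a minimal conntrackd.conf sketch for node1 (the FTFW mode, the multicast group and syncing over xenbr0 are assumptions, not something this poc ships; node2 takes 192.168.122.12):
Sync {
    Mode FTFW {
    }
    Multicast {
        IPv4_address 225.0.0.50
        Group 3780
        IPv4_interface 192.168.122.11
        Interface xenbr0
    }
}
General {
    Syslog on
    LockFile /var/lock/conntrack.lock
    UNIX {
        Path /var/run/conntrackd.ctl
    }
}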
we need to test a few use-cases:
on guestbr0 we differentiate the node IP (10.5.5.251 on node1, 10.5.5.252 on node2) from the duplicated outbound gateway IP (10.5.5.254), and then we do full-nat instead of plain dnat, so the outbound packet finds its route back to the node the DNAT-ed inbound connection came through (you won't have the issue if you are already using a reverse-proxy)
the trick is to define what IP you want to filter out of ARP, instead of matching on the mac address, and to carefully craft a custom subnet-wide snat rule that goes along with the port-specific dnat rules
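this relies on each node carrying both its own IP and the duplicated gateway IP on guestbr0, as set up in the previous parts; a minimal iproute2 sketch for node1 (node2 takes 10.5.5.252 instead):
# node IP, unique per node
ip addr add 10.5.5.251/24 dev guestbr0
# gateway IP, duplicated on every node
ip addr add 10.5.5.254/24 dev guestbr0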
flush ruleset
table ip nat {
    # SNAT
    chain postrouting {
        type nat hook postrouting priority srcnat;
        # node1
        # casual outbound
        ip saddr 10.5.5.0/24 oif xenbr0 snat to 192.168.122.11
        # full-nat inbound
        ip daddr 10.5.5.0/24 oif guestbr0 snat to 10.5.5.251
        # node2
        # casual outbound
        #ip saddr 10.5.5.0/24 oif xenbr0 snat to 192.168.122.12
        # full-nat inbound
        #ip daddr 10.5.5.0/24 oif guestbr0 snat to 10.5.5.252
    }
    # DNAT
    chain prerouting {
        type nat hook prerouting priority dstnat;
        # node1
        iif xenbr0 tcp dport 80 dnat to 10.5.5.202
        # node2
        #iif xenbr0 tcp dport 80 dnat to 10.5.5.201
        # shared
        iif xenbr0 tcp dport 2201 dnat to 10.5.5.201:22
        iif xenbr0 tcp dport 2202 dnat to 10.5.5.202:22
    }
}
table netdev filter {
    chain egress {
        type filter hook egress devices = { eth1.100, eth2.100 } priority -500;
        # keep ARP for the duplicated gateway IP local to each node
        arp saddr ip 10.5.5.254 drop
        arp daddr ip 10.5.5.254 drop
    }
}
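assuming the ruleset above lives in /etc/nftables.conf (any path works with nft -f), load and verify it on each node:
# atomic replace -- the flush ruleset at the top wipes the previous rules
nft -f /etc/nftables.conf
# double-check what got loaded, with rule handles
nft -a list ruleset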
prepare two xen guests living respectively on the two nodes – or, for the purpose of this PoC, KVM guests w/o libvirt instead
vi /etc/network/interfaces
auto eth0
iface eth0 inet static
    # guest1
    address 10.5.5.201/24
    # guest2
    #address 10.5.5.202/24
    gateway 10.5.5.254
connect to the hosts and start their respective guest systems
ssh bookworm1
ssh bookworm2
screen -S guest
# node1
guest=guest1
# node2
#guest=guest2
vdisk=/data/guests/$guest/$guest.ext4
# $kernel and $initrd are assumed to be defined already, as in the previous pocs
kvm --enable-kvm -m 256 \
    -display curses -serial pty \
    -drive file=$vdisk,media=disk,if=virtio,format=raw \
    -kernel $kernel -initrd $initrd -append "ro root=/dev/vda net.ifnames=0 biosdevname=0 mitigations=off" \
    -nic bridge,br=guestbr0,model=virtio-net-pci
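note that the -nic bridge shortcut goes through qemu-bridge-helper, which only attaches taps to whitelisted bridges; if kvm complains about guestbr0, allow it in the helper ACL first (default path, may vary per distribution):
echo "allow guestbr0" >> /etc/qemu/bridge.conf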
curl -i 192.168.122.11
==> <pre>this is guest2 OK
curl -i 192.168.122.12
==> <pre>this is guest1 OK
ssh 192.168.122.11 -p 2201 -l root
ssh 192.168.122.12 -p 2201 -l root
==> guest1 all good
ssh 192.168.122.11 -p 2202 -l root
ssh 192.168.122.12 -p 2202 -l root
==> guest2 all good
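to double-check the full-nat is doing its job, the guests should see connections coming from the node IPs (10.5.5.251 or 10.5.5.252) rather than from the real clients, and each node should hold both translations in its conntrack table (conntrack-tools assumed installed on the nodes):
# on either node -- list the entries that got destination-NATed towards the guests
conntrack -L --dst-nat
# on either guest -- peers show up as the node IPs, the price of full-nat
ss -tn state established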