Convergent Gateways in Da Place - part 3


deal with inbound traffic


description

we now have a working poc on xen or kvm, but what about traffic flows that are initiated from the outside?

assuming DNS round-robin, client requests arrive on varying nodes, and not necessarily on the one where the service lives as a guest system. guest systems living on different nodes use different outbound gateways, so a reply may leave through another node than the one the request came in through. this obviously brings some problems (for TCP at least, and apparently even for UDP). there are three solutions for this:

  1. FULL-NAT: we do not attempt to optimize the TCP responses' route and let those find the way back through the entering node — the one we are discussing here

  2. CT-SYNC: we use conntrackd to synchronize the states so the answers can go right through the host gateway, just like for initiated outbound traffic in the previous pocs (but in that case, we probably need to mangle the source IP of the answer)

  3. STATELESS-NAT: we rebuild the DNAT state based on per-node tags — this is what became part4

network requirements

we need to test a few use-cases:

  1. simple reverse-proxy setup e.g. for an HTTP service pointing to a precise guest system — no problem there as it is L7 with a clear segmentation on L3 (no routing required). this use-case simply helps to validate the dual-ip setup on the guest bridge (see below)
  2. stateful TCP connections e.g. SSH or HTTP against a precise guest system — this helps to validate FULL-NAT and CT-SYNC
  3. stateless but still bi-directional UDP exchanges e.g. NTP, which doesn’t seem to tolerate a different return path either

FULL-NAT

on guestbr0 we differentiate the node IP (e.g. 10.5.5.251) from the duplicated outbound gateway IP (10.5.5.254), and we do full-nat instead of plain dnat, so that the reply packet finds its way back to the node the DNAT inbound connection came in through (you won’t have the issue if you are already using a reverse-proxy)

the trick is to filter ARP on the IP address you want to hide (the duplicated gateway IP) instead of using the mac address, and to carefully craft a custom subnet-wide snat rule that goes along with the port-specific dnat rules

flush ruleset

table ip nat {
    # SNAT
    chain postrouting {
        type nat hook postrouting priority srcnat;
        # node1
        # casual outbound
        ip saddr 10.5.5.0/24 oif xenbr0 snat 192.168.122.11
        # full-nat inbound
        ip daddr 10.5.5.0/24 oif guestbr0 snat 10.5.5.251
        # node2
        # casual outbound
        #ip saddr 10.5.5.0/24 oif xenbr0 snat 192.168.122.12
        # full-nat inbound
        #ip daddr 10.5.5.0/24 oif guestbr0 snat 10.5.5.252
    }

    # DNAT
    chain prerouting {
        type nat hook prerouting priority dstnat;
        # node1
        iif xenbr0 tcp dport 80 dnat 10.5.5.202
        # node2
        #iif xenbr0 tcp dport 80 dnat 10.5.5.201
        # shared
        iif xenbr0 tcp dport 2201 dnat 10.5.5.201:22
        iif xenbr0 tcp dport 2202 dnat 10.5.5.202:22
    }
}

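table netdev filter {
    chain egress {
        type filter hook egress devices = { eth1.100, eth2.100 } priority -500;
        # drop any ARP about the duplicated gateway IP on egress,
        # whether it is sent by that IP or asking for it
        arp saddr ip 10.5.5.254 drop
        arp daddr ip 10.5.5.254 drop
    }
}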
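
the node2 lines are commented out in the ruleset above, so each node needs its own edit of the same file. a small helper could print the node-specific pair of snat rules instead. this is only a sketch: `node_nat_rules` is a made-up name, and it assumes node N owns 192.168.122.1N on xenbr0 and 10.5.5.25N on guestbr0, which is the pattern node1 and node2 follow here.

```shell
#!/bin/sh
# hypothetical helper: print the node-specific snat rules for node number $1
# assumption: node N has 192.168.122.1N on xenbr0 and 10.5.5.25N on guestbr0
node_nat_rules() {
    n="$1"
    case "$n" in
        [1-9]) ;;
        *) echo "usage: node_nat_rules <1-9>" >&2; return 1 ;;
    esac
    cat <<EOF
# casual outbound
ip saddr 10.5.5.0/24 oif xenbr0 snat 192.168.122.1${n}
# full-nat inbound
ip daddr 10.5.5.0/24 oif guestbr0 snat 10.5.5.25${n}
EOF
}

# print the pair for node1
node_nat_rules 1
```

the output is meant to be pasted into the postrouting chain. once the full file is assembled, `nft -c -f <file>` should check it without applying anything.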

guest systems

prepare two XEN guests, one on each node, or, for the purpose of this PoC, KVM guests without libvirt

vi /etc/network/interfaces

auto eth0
iface eth0 inet static
    # guest1
    address 10.5.5.201/24
    # guest2
    #address 10.5.5.202/24
    gateway 10.5.5.254

ready to go

connect to the hosts and start their respective guest systems

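# one session per host
ssh bookworm1
#ssh bookworm2
screen -S guest
# one guest per host
guest=guest1
#guest=guest2
vdisk=/data/guests/$guest/$guest.ext4
# kernel and initrd must already point to the guest kernel and initramfs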
kvm --enable-kvm -m 256 \
    -display curses -serial pty \
    -drive file=$vdisk,media=disk,if=virtio,format=raw \
    -kernel $kernel -initrd $initrd -append "ro root=/dev/vda net.ifnames=0 biosdevname=0 mitigations=off" \
    -nic bridge,br=guestbr0,model=virtio-net-pci
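
to paste the same lines on both hosts without editing `guest=` by hand, the guest name could be derived from the hostname. a minimal sketch, assuming bookworm1 runs guest1 and bookworm2 runs guest2 (the text above does not say which guest lives where); `guest_for_host` is a hypothetical helper.

```shell
#!/bin/sh
# hypothetical mapping from node hostname to the guest it runs
# assumption: bookworm1 -> guest1, bookworm2 -> guest2
guest_for_host() {
    case "$1" in
        bookworm1) echo guest1 ;;
        bookworm2) echo guest2 ;;
        *) echo "unknown node: $1" >&2; return 1 ;;
    esac
}

# set the same variables as above, only when the host is a known node
if guest=$(guest_for_host "$(uname -n)"); then
    vdisk=/data/guests/$guest/$guest.ext4
    echo "would start $guest from $vdisk"
fi
```

on any other host the function only complains on stderr and nothing is set, so the kvm command is not meant to be run there.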

acceptance

http

curl -i 192.168.122.11

==> <pre>this is guest2 OK

curl -i 192.168.122.12

==> <pre>this is guest1 OK

ssh

ssh 192.168.122.11 -p 2201 -l root
ssh 192.168.122.12 -p 2201 -l root

==> guest1 all good

ssh 192.168.122.11 -p 2202 -l root
ssh 192.168.122.12 -p 2202 -l root

==> guest2 all good
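
the checks above can be summed up as a small table of which guest should answer behind each node and port, according to the dnat rules. the sketch below only prints that plan and contacts nothing; `expected_guest` and `acceptance_plan` are made-up names, and the port 80 mapping assumes the node2 rule is the one enabled on node2.

```shell
#!/bin/sh
# expected guest behind a (node ip, port) pair, per the dnat rules
# assumption: node2 has its own port 80 rule enabled (dnat 10.5.5.201)
expected_guest() {
    case "$1:$2" in
        192.168.122.11:80) echo guest2 ;;
        192.168.122.12:80) echo guest1 ;;
        *:2201) echo guest1 ;;
        *:2202) echo guest2 ;;
        *) return 1 ;;
    esac
}

# print one "node_ip port guest" line per check
acceptance_plan() {
    for node_ip in 192.168.122.11 192.168.122.12; do
        for port in 80 2201 2202; do
            echo "$node_ip $port $(expected_guest "$node_ip" "$port")"
        done
    done
}

acceptance_plan
```

each line could then drive a real check, for instance comparing the curl output on port 80 against "this is" followed by the guest name.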


Licensed under MIT