consider two bridges connected to each other, and two groups of clients, some connected to bridge1 and some to bridge2. the bridges also act as outbound gateways, and both use the same internal interface address, 10.5.5.254 (a deliberate duplicate). blocking the gateways' ARP traffic from one bridge to the other lets each client use its closest gateway, while the clients can still communicate with each other, including with the clients connected to the other bridge.
this PoC goes with only two systems/bridges but the concept can be extended to a larger cluster.
the nftables arp family apparently only deals with ARP traffic of the local system, so I should use the netdev family instead
the netdev hook didn't seem to work when the device is a linux bridge, but it works when I tackle the interface within the bridge (a bridge port such as eth1)
gns3 with two casual KVM debian nodes
make sure the GNS3 guest setting "Base MAC address" is crystal clear to yourself and the audience. notice those are private (locally administered) MAC addresses
node1 0a:00:00:00:00:00
node2 0e:00:00:00:00:00
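what makes a MAC address "private" is the second-lowest bit of its first octet (the locally administered bit), while the lowest bit marks multicast. a minimal sketch to check the two base addresses above; mac_kind is a hypothetical helper name, not part of any tool:

```shell
#!/bin/sh
# hypothetical helper: classify a MAC address from the two low bits
# of its first octet (0x02 = locally administered, 0x01 = multicast)
mac_kind() {
    first_octet=${1%%:*}          # keep everything before the first colon
    value=$((0x$first_octet))     # hex octet to integer
    if [ $((value & 2)) -ne 0 ]; then scope=local; else scope=universal; fi
    if [ $((value & 1)) -ne 0 ]; then cast=multicast; else cast=unicast; fi
    echo "$scope $cast"
}

mac_kind 0a:00:00:00:00:00   # node1 base MAC
mac_kind 0e:00:00:00:00:00   # node2 base MAC
```

both addresses should come out as locally administered unicast, which is what you want for a lab.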
assuming a bootstrapped debian, you need to pimp it up a little bit
ip addr add 192.168.122.9/24 dev eth0
ip route add default via 192.168.122.1
ping 192.168.122.1
cat /etc/resolv.conf
apt update
apt install bridge-utils net-tools openssh-server tcpdump ethtool
and proceed with the poc1 setup
on node1
vi /etc/network/interfaces

auto eth0
iface eth0 inet manual

auto eth1
iface eth1 inet manual

auto eth2
iface eth2 inet manual

auto xenbr0
iface xenbr0 inet static
    bridge_ports eth0
    bridge_hw eth0
    # node1
    address 192.168.122.11/24
    gateway 192.168.122.1

auto guestbr0
iface guestbr0 inet static
    # poc1 needs both
    bridge_ports eth1 eth2
    bridge_hw eth1
    # duplicate on both nodes
    address 10.5.5.254/24
on node2
    # node2
    address 192.168.122.12/24
static name resolution is always a good thing to have
vi /etc/hosts

# communicate through the front door
# as we are filtering the back door
192.168.122.11 bookworm1
192.168.122.12 bookworm2
prevent the guest bridge from sending its own ARP replies to the other guest bridges
node1
cp -pi /etc/sysctl.conf /etc/sysctl.conf.dist
echo net.ipv4.ip_forward = 1 >>/etc/sysctl.conf
sysctl -p

vi /etc/nftables.conf

flush ruleset

table ip nat {
    chain postrouting {
        type nat hook postrouting priority srcnat;
        ip saddr 10.5.5.0/24 oif xenbr0 snat 192.168.122.11;
    }
}

table netdev filter {
    chain egress {
        type filter hook egress device eth1 priority -500; policy accept;
        arp saddr ether 0a:00:00:00:00:01 drop
    }
}

chmod -x /etc/nftables.conf
systemctl status nftables
systemctl restart nftables
systemctl enable nftables
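the rule matches 0a:00:00:00:00:01 rather than the base MAC 0a:00:00:00:00:00. a plausible reading, which is an assumption and not something these notes state, is that GNS3 hands out base MAC + adapter index, so eth1 gets ...:01 and guestbr0 inherits it through bridge_hw eth1. a small sketch of that arithmetic; nth_mac is a hypothetical helper, and the real value is best confirmed with ip link show eth1:

```shell
#!/bin/sh
# hypothetical helper: MAC of adapter N for a given base MAC,
# ASSUMING the hypervisor simply assigns base + N
nth_mac() {
    base_hex=$(printf '%s' "$1" | tr -d ':')     # 0a:00:... -> 0a00...
    printf '%012x\n' $((0x$base_hex + $2)) |
        sed 's/../&:/g; s/:$//'                  # re-insert the colons
}

nth_mac 0a:00:00:00:00:00 1   # node1, eth1
nth_mac 0e:00:00:00:00:00 1   # node2, eth1
```

if the printed addresses differ from what ip link reports on your nodes, trust ip link and adjust the nftables rule accordingly.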
node2
        ip saddr 10.5.5.0/24 oif xenbr0 snat 192.168.122.12;
        arp saddr ether 0e:00:00:00:00:01 drop
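since the two nodes differ only in the SNAT address and the MAC to drop, the ruleset could be generated per node, which may help when extending to a larger cluster. a minimal sketch; render_nft_conf is a hypothetical helper and its output is meant to be reviewed before it goes into /etc/nftables.conf:

```shell
#!/bin/sh
# hypothetical helper: print the ruleset for one node
# $1 = uplink address used for SNAT
# $2 = MAC address of the guest bridge to hide from the other bridges
render_nft_conf() {
    cat <<EOF
flush ruleset

table ip nat {
    chain postrouting {
        type nat hook postrouting priority srcnat;
        ip saddr 10.5.5.0/24 oif xenbr0 snat $1;
    }
}

table netdev filter {
    chain egress {
        type filter hook egress device eth1 priority -500; policy accept;
        arp saddr ether $2 drop
    }
}
EOF
}

# node2 values taken from the notes above
render_nft_conf 192.168.122.12 0e:00:00:00:00:01
```

the interface names (xenbr0, eth1) and the 10.5.5.0/24 network are kept hardcoded here because they are identical on both nodes in this PoC.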
VPCS
ip 10.5.5.201/24 10.5.5.254
ip 10.5.5.202/24 10.5.5.254
save
sniff the link between node1 and vpcs1 – you should see a single ARP reply. even better, now sniff the link between node1 and node2 – you should not see any ARP reply at all
from some vpcs
clear arp
ping 10.5.5.254
meanwhile the users can still reach each other
from vpcs2
ping 10.5.5.201
https://wiki.nftables.org/wiki-nftables/index.php/Nftables_families
https://netfilter.org/projects/nftables/manpage.html
http://superuser.com/questions/907827/private-mac-address
https://stackoverflow.com/questions/14710389/reserved-mac-addresses-some-are-assigned-anyway
https://en.wikipedia.org/wiki/MAC_address#Ranges_of_group_and_locally_administered_addresses