
consider two bridges connected to each other, and two groups of clients, some connected to bridge1 and some to bridge2. the bridges also act as outbound gateways, each with the same (duplicate) internal address 10.5.5.254. blocking the bridges' own ARPs from one bridge to the other lets each client use its closest gw, while still being able to communicate with all the other clients, incl. those connected to the other bridge.
this PoC goes with only two systems/bridges but the concept can be extended to a larger cluster.
the arp table family only deals with the local system apparently, so I should use the netdev family instead
netdev didn't seem to work when the device is a linux bridge, but it works when I tackle the interface within the bridge
gns3 with two casual KVM debian nodes
make sure the GNS3 guest setting "Base MAC address" is crystal clear to yourself and the audience. notice those are private (locally administered) MAC addresses
node1 0a:00:00:00:00:00
node2 0e:00:00:00:00:00
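what makes these addresses "private" is the locally administered bit, the second-least-significant bit of the first octet (0x02). a minimal sketch to check it from the shell; the function name is made up for illustration and is not part of the setup:

```shell
# hypothetical helper, not part of the original setup:
# succeeds when the MAC has the locally administered bit (0x02) set
is_locally_administered() {
    first_octet=${1%%:*}
    [ $(( 0x$first_octet & 2 )) -ne 0 ]
}

# the two base MAC addresses used in this lab
for mac in 0a:00:00:00:00:00 0e:00:00:00:00:00; do
    if is_locally_administered "$mac"; then
        echo "$mac is locally administered"
    fi
done
```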
assuming a bootstrapped debian, you need to pimp it up a little bit
ip addr add 192.168.122.9/24 dev eth0
ip route add default via 192.168.122.1
ping 192.168.122.1
cat /etc/resolv.conf
apt update
apt install bridge-utils net-tools openssh-server tcpdump ethtool
and proceed with the poc1 setup
on node1
vi /etc/network/interfaces
auto eth0
iface eth0 inet manual

auto eth1
iface eth1 inet manual

auto eth2
iface eth2 inet manual

auto xenbr0
iface xenbr0 inet static
    bridge_ports eth0
    bridge_hw eth0
    # node1
    address 192.168.122.11/24
    gateway 192.168.122.1

auto guestbr0
iface guestbr0 inet static
    # poc1 needs both
    bridge_ports eth1 eth2
    bridge_hw eth1
    # duplicate on both nodes
    address 10.5.5.254/24
on node2, the same file, except for the xenbr0 address
    # node2
    address 192.168.122.12/24
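once the interfaces are up, it may be worth checking that each bridge got the expected ports and MAC address. a small sketch reading sysfs; the function name and the SYSFS override are assumptions for illustration, and `brctl show` gives much the same information:

```shell
# hypothetical helper: print the MAC address and the ports of a bridge
# SYSFS can be overridden for testing, it defaults to /sys
bridge_summary() {
    dir="${SYSFS:-/sys}/class/net/$1"
    echo "mac: $(cat "$dir/address")"
    # brif holds one entry per enslaved port; xargs joins them on one line
    echo "ports: $(ls "$dir/brif" | xargs)"
}

# on node1, guestbr0 should report eth1's MAC and the ports eth1 eth2:
# bridge_summary guestbr0
```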
static name resolution is always a good thing to have
vi /etc/hosts
# communicate through the front door
# as we are filtering the back door
192.168.122.11 bookworm1
192.168.122.12 bookworm2
prevent the guest bridge from sending its own ARP replies to the other guest bridges
node1
cp -pi /etc/sysctl.conf /etc/sysctl.conf.dist
echo net.ipv4.ip_forward = 1 >>/etc/sysctl.conf
sysctl -p
vi /etc/nftables.conf
flush ruleset

table ip nat {
    chain postrouting {
        type nat hook postrouting priority srcnat;
        ip saddr 10.5.5.0/24 oif xenbr0 snat 192.168.122.11;
    }
}

table netdev filter {
    chain egress {
        type filter hook egress device eth1 priority -500; policy accept;
        arp saddr ether 0a:00:00:00:00:01 drop
    }
}
chmod -x /etc/nftables.conf
systemctl status nftables
systemctl restart nftables
systemctl enable nftables
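the file can be parsed without touching the running ruleset using nft's check mode (-c), and once the service has been restarted the loaded netdev table can be listed. a minimal sketch, assuming it is run as root on the node; the function name is made up:

```shell
# hypothetical wrapper around two nft calls, run as root
check_and_list() {
    # -c only checks the file, nothing is applied
    nft -c -f "${1:-/etc/nftables.conf}" || return 1
    # show what is actually loaded in the netdev table
    nft list table netdev filter
}

# check_and_list /etc/nftables.conf
```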
node2: same steps and same file, except for these two lines
ip saddr 10.5.5.0/24 oif xenbr0 snat 192.168.122.12;
arp saddr ether 0e:00:00:00:00:01 drop
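the MAC address in each drop rule is the one guestbr0 takes from eth1 through bridge_hw. assuming GNS3 numbers the adapters by adding the adapter index to the Base MAC address (an assumption worth verifying with `ip link` on each node), it can be derived as sketched here; the helper name is hypothetical and it only touches the last octet:

```shell
# hypothetical helper: add an adapter index to a base MAC address
# only the last octet is changed, so last octet + index must stay below 256
adapter_mac() {
    base=$1
    index=$2
    prefix=${base%:*}
    last=${base##*:}
    printf '%s:%02x\n' "$prefix" $(( 0x$last + index ))
}

adapter_mac 0a:00:00:00:00:00 1   # node1 eth1
adapter_mac 0e:00:00:00:00:00 1   # node2 eth1
```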
VPCS (first line on vpcs1, second line on vpcs2, then save on both)
ip 10.5.5.201/24 10.5.5.254
ip 10.5.5.202/24 10.5.5.254
save
sniff the link between node1 and vpcs1 – you should see a single ARP reply. even better, now sniff the link between node1 and node2 – you should not see any ARP reply at all
from some vpcs
clear arp
ping 10.5.5.254
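the same observation can be made from the nodes themselves with the tcpdump installed earlier, while the ping above runs. a minimal sketch, assuming eth1 is the inter-node link (where the egress rule sits) and eth2 faces the clients; the function name is made up:

```shell
# hypothetical wrapper: print ARP frames seen on one interface (run as root)
watch_arp() {
    # -e shows link-level headers (MAC addresses), -n skips name resolution
    tcpdump -e -n -i "$1" arp
}

# watch_arp eth2   # client side: the reply from the local guestbr0 shows up
# watch_arp eth1   # inter-node link: no reply from the local guestbr0 expected
```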
meanwhile the users can still reach each other
from vpcs2
ping 10.5.5.201
https://wiki.nftables.org/wiki-nftables/index.php/Nftables_families
https://netfilter.org/projects/nftables/manpage.html
http://superuser.com/questions/907827/private-mac-address
https://stackoverflow.com/questions/14710389/reserved-mac-addresses-some-are-assigned-anyway
https://en.wikipedia.org/wiki/MAC_address#Ranges_of_group_and_locally_administered_addresses