the “disable stp” mesh attempt with Etherswitches


try to survive without stp

warning // lessons learned

the ESW offers neither LACP nor VSS nor MLAG/mLACP. load balancing without HA is the best we can get here.

we tried balance-rr mode but saw packet loss. that mode expects an aggregate on the switch end, and we are not sure which kind. either way:

  1. only PAgP is available.
  2. we’re in a mesh, so the aggregate would have to be multi-chassis.

so we go for static balance-tlb (the default is dynamic, but we don’t need that since all links are equal).

for acceptance testing we rely on the layer 4 distribution algorithm, with iperf3 running on different tcp ports.

architecture & lessons learned

IMAGE HERE

the architecture shown above does not necessarily provoke a full-blown storm, because the leaf-nodes do not forward packets, not even broadcast packets. the worst you would see is duplicate broadcast packets on the leaf-nodes because of the multiple paths. beware, however: if you add yet another link between ESW1 and ESW2, you will indeed provoke a storm.

2 x Etherswitch (c3725 with NM-16ESW)

3 x Slackware Linux (15.0 64-bit)

cli setup

disable STP completely on the default vlan (vlan 1)

on both etherswitches

no banner exec

no spanning-tree vlan 1

bonding setup

on node1

echo alias bond0 bonding > /etc/modprobe.d/bonding.conf

modprobe bonding
# mode and options are set while bond0 is still down and has no slaves
echo balance-tlb > /sys/class/net/bond0/bonding/mode
echo layer3+4 > /sys/class/net/bond0/bonding/xmit_hash_policy
# 0 = static distribution, based on xmit_hash_policy alone
echo 0 > /sys/class/net/bond0/bonding/tlb_dynamic_lb
ifconfig bond0 10.5.5.1/24 up
# enslave both links, then bring them up
echo +eth1 > /sys/class/net/bond0/bonding/slaves
echo +eth2 > /sys/class/net/bond0/bonding/slaves
ifconfig eth1 up
ifconfig eth2 up
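
to confirm the settings took effect, the same sysfs files can be read back. a minimal sketch, assuming the bond is named bond0 as above. the function name `show_bond_settings` and its optional directory argument are made up for illustration (the argument only exists so the function can be pointed at a copy of the files).

```shell
#!/bin/sh
# show_bond_settings [DIR]
# print the bonding options set above, one "name: value" pair per line.
# hypothetical helper, not part of the kernel tooling.
# DIR defaults to the sysfs directory of bond0.
show_bond_settings() {
    dir=${1:-/sys/class/net/bond0/bonding}
    for opt in mode xmit_hash_policy tlb_dynamic_lb slaves; do
        if [ -r "$dir/$opt" ]; then
            printf '%s: %s\n' "$opt" "$(cat "$dir/$opt")"
        else
            printf '%s: (not readable)\n' "$opt"
        fi
    done
}
```

run `show_bond_settings` without arguments on a node. the mode line should start with balance-tlb, tlb_dynamic_lb should read 0, and slaves should list eth1 and eth2.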

on node2,3

...
ifconfig bond0 10.5.5.2/24 up    # on node2
ifconfig bond0 10.5.5.3/24 up    # on node3
...

ready to go

show interfaces status
show etherchannel summary

cat /proc/net/bonding/bond0
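
the /proc output is fairly long, so a short filter helps to see at a glance whether both slaves are up. a minimal sketch, assuming the usual "Slave Interface:" and "MII Status:" lines of the bonding driver. the function name `bond_slave_status` is hypothetical.

```shell
#!/bin/sh
# bond_slave_status [FILE]
# list each slave of the bond with its MII status, e.g. "eth1 up".
# hypothetical helper; FILE defaults to /proc/net/bonding/bond0.
bond_slave_status() {
    file=${1:-/proc/net/bonding/bond0}
    awk -F': ' '
        /^Slave Interface:/ { slave = $2; next }
        # the first "MII Status" line belongs to the bond itself, so only
        # report the ones that follow a "Slave Interface" line
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "$file"
}
```

on a healthy node this should print one line per slave (eth1 and eth2 here), each followed by its link state.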

acceptance

no full-blown storm

sniff the link between the two switches. there should be no storm unless you add a secondary link between the switches.

# from node2 or 3
    ping -c1 10.5.5.1
    ping -b -c1 10.5.5.255

leaf-node packet duplicates

since there are multiple paths, the end nodes should see duplicate broadcast packets.

# node1
tcpdump -i any

# from node2 or 3
    ping -c1 10.5.5.1
    ping -b -c1 10.5.5.255
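
spotting duplicates by eye gets tedious, so the capture can be counted instead. a minimal sketch that reads tcpdump text output on stdin and counts how often each echo request shows up. the function name `count_echo_requests` is hypothetical, and it only assumes tcpdump's usual "ICMP echo request, id N, seq M" wording. beware that a capture covering both a slave and bond0 may list the same packet once per interface, so capturing on the slaves (eth1, eth2) is probably the cleaner input.

```shell
#!/bin/sh
# count_echo_requests
# read tcpdump text output on stdin and print how many times each ICMP echo
# request (keyed by its "id" and "seq" fields) was captured.
# hypothetical helper; a count above 1 means the request was seen more than once.
count_echo_requests() {
    grep -o 'echo request, id [0-9]*, seq [0-9]*' |
        sort | uniq -c |
        awk '{ gsub(",", ""); print "id=" $5, "seq=" $7, "count=" $1 }'
}

# possible usage on node1, with captures saved beforehand, e.g.
#   tcpdump -n -i eth1 icmp > /tmp/eth1.txt    (stop with ctrl-c)
#   tcpdump -n -i eth2 icmp > /tmp/eth2.txt
#   cat /tmp/eth1.txt /tmp/eth2.txt | count_echo_requests
```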

ha acceptance test

well, this is a fail, even with the dynamic setting. the balance-tlb bonding mode is not designed to handle a failover.
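
one way to put a number on such a fail is to keep a ping running while a link gets pulled and read the loss from the summary. a minimal sketch, not necessarily how the test above was carried out: the function name `ping_loss` is hypothetical, and pulling eth1 on node1 is just one assumed scenario.

```shell
#!/bin/sh
# ping_loss
# read ping output on stdin and print the packet loss percentage taken
# from the summary line ("... 30% packet loss ..." prints 30).
# hypothetical helper for the failover test.
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# possible usage (assumed scenario):
#   on node2:  ping -c 20 10.5.5.1 | ping_loss
#   on node1, while the ping is running:  ifconfig eth1 down
```

anything well above 0 once the link is gone would confirm that the bond did not fail over.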

throughput acceptance

    # on node1
    ( iperf3 -s & ); iperf3 -s -p 5202

    # on node2 or 3 - single pipe
    iperf3 -c 10.5.5.1

==> e.g. 100 Mbit/s

    # on node2 or 3 - multiple tcp/ports
    ( iperf3 -c 10.5.5.1 & ); iperf3 -c 10.5.5.1 -p 5202

==> should be about double the previous amount (for two pipes within the balance-tlb bond)
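
if the total does not double, it helps to check whether both slaves actually carried traffic, as two flows may well hash onto the same link. a minimal sketch for the sending node, based on the per-interface tx_bytes counters in sysfs. the function names `tx_snapshot` and `tx_delta` and the `SYSNET` variable are hypothetical.

```shell
#!/bin/sh
# where the per-interface counters live (overridable, hypothetical variable)
SYSNET=${SYSNET:-/sys/class/net}

# tx_snapshot IFACE...
# print "iface tx_bytes" for each given interface
tx_snapshot() {
    for ifc in "$@"; do
        printf '%s %s\n' "$ifc" "$(cat "$SYSNET/$ifc/statistics/tx_bytes")"
    done
}

# tx_delta BEFORE AFTER
# given two snapshot files, print the bytes sent per interface in between
tx_delta() {
    awk 'NR == FNR { before[$1] = $2; next }
         { printf "%s %.0f\n", $1, $2 - before[$1] }' "$1" "$2"
}

# possible usage on node2 or 3:
#   tx_snapshot eth1 eth2 > /tmp/before
#   ( iperf3 -c 10.5.5.1 & ); iperf3 -c 10.5.5.1 -p 5202
#   tx_snapshot eth1 eth2 > /tmp/after
#   tx_delta /tmp/before /tmp/after
```

two deltas of similar size mean the flows were spread over both links. one large and one tiny delta means both flows went out on the same slave.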

resources

https://www.kernel.org/doc/html/latest/networking/bonding.html


Copyright © 2024 Pierre-Philipp Braun