Linux for switches

September 14th, 2015

For a long time at Mythic Beasts we’ve had a fairly healthy dislike for managed switches. Configuring a switch is akin to using a database with auto-commit on every command: you can’t batch a series of configuration changes into an atomic update. This means that you not only need to think about your starting and end configurations, but also about every intermediate configuration, to make sure you don’t accidentally explode everything with an unexpected switch loop. Switches are also expensive, and it has always rankled that we’re paying a lot of money to use a network operating system that’s user-unfriendly. Some are often less stable than the servers they connect to, and they seem to manage excellent vendor lock-in: there is no end of advice that you shouldn’t plug standards-compliant switches from different manufacturers into each other because you risk interoperability issues.

We’ve recently started trying Linux switches — commodity switches running Cumulus Linux.

Cumulus Linux makes your switch appear like a standard-ish Debian server with a lot of NICs.
The interfaces on our “1G” model are:

eth0 management interface
swp1 – swp48 1G switch ports
swp49 – swp52 10G switch ports

The switch is configured via /etc/network/interfaces, and uses bridges, VLANs and bonds to set up the configuration.

Linux has lots of advantages as a switch operating system. For a start, if you need to patch ssh under Linux, you download a replacement digitally signed openssh package and restart the process. On a traditional switch you download a whole new firmware image over insecure tftp and reboot the switch, which is unlucky for the people connected to it.

The first obvious difference when configuring these switches is that by default, the switch doesn’t switch any traffic until some configuration is put in.

We can set up a simple network:

 # The primary network interface
 auto eth0
 iface eth0 inet static
        address xxx
        gateway xxx

 auto br0
 iface br0
         bridge-ports glob swp1-48
         bridge-stp on
         setmcsnoop 0

 auto br1
 iface br1
         bridge-ports glob swp49-52
         bridge-stp on
         setmcsnoop 0

This sets up the 1G ports (1-48) as a single VLAN, the 10G ports (49-52) as a second VLAN, and a management interface on the management port (eth0).

In this case we have an uplink on port 48 to a different network. To migrate the uplink from our 1G network to our 10G network, we would write out a new configuration file:

 auto br0
 iface br0
         bridge-ports glob swp1-47
         bridge-stp on
         setmcsnoop 0

 auto br1
 iface br1
         bridge-ports glob swp48-52
         bridge-stp on
         setmcsnoop 0

then bring the interfaces up with

 ifup -a

Note that ifup under Cumulus is different to standard Debian. It links to ifupdown2, which can inspect the current running state and apply only the changes, rather than having to take an interface down and up as you would on a standard server.
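To make the "apply only changes" idea concrete, here is a minimal Python sketch that compares the bridge membership in the two configurations above and reports which ports would have to move. It is purely illustrative and is not how ifupdown2 is implemented: the function names (expand_ports, bridge_members, membership_changes) are hypothetical, and the parser only understands the `bridge-ports` lines used in this post.

```python
import re


def expand_ports(spec):
    """Expand 'glob swp1-48' style ranges (or plain names) into a set of ports."""
    ports = set()
    for token in spec.split():
        if token == "glob":
            continue
        match = re.fullmatch(r"([A-Za-z]+)(\d+)-(\d+)", token)
        if match:
            prefix, low, high = match.group(1), int(match.group(2)), int(match.group(3))
            ports.update(f"{prefix}{n}" for n in range(low, high + 1))
        else:
            ports.add(token)
    return ports


def bridge_members(config_text):
    """Map each interface that has a bridge-ports line to its member ports."""
    members = {}
    current = None
    for line in config_text.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] == "iface":
            current = words[1]
        elif words[0] == "bridge-ports" and current is not None:
            members[current] = expand_ports(" ".join(words[1:]))
    return members


def membership_changes(old_text, new_text):
    """Return, per bridge, the ports to add and remove to get from old to new."""
    old, new = bridge_members(old_text), bridge_members(new_text)
    changes = {}
    for bridge in sorted(set(old) | set(new)):
        before, after = old.get(bridge, set()), new.get(bridge, set())
        added, removed = sorted(after - before), sorted(before - after)
        if added or removed:
            changes[bridge] = {"add": added, "remove": removed}
    return changes


OLD = """
auto br0
iface br0
    bridge-ports glob swp1-48
auto br1
iface br1
    bridge-ports glob swp49-52
"""

NEW = """
auto br0
iface br0
    bridge-ports glob swp1-47
auto br1
iface br1
    bridge-ports glob swp48-52
"""

CHANGES = membership_changes(OLD, NEW)
print(CHANGES)
# {'br0': {'add': [], 'remove': ['swp48']}, 'br1': {'add': ['swp48'], 'remove': []}}
```

Only swp48 changes bridge, so the other 51 ports would not need to be touched during the migration.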

One deeply troubling thing about Cumulus Linux is it includes a minimal vi, but not a full implementation of vim.

But there are many other advantages that make up for this inexplicable oversight: being Debian-ish, it has sudo, so you can give arbitrary permissions to multiple users rather than just show / enable / configure. You can easily update things with ssh. You can configure your switch with puppet. You can easily back up the entire configuration with rsync, and version control it with etckeeper and bzr (sadly no git!). You can write code and run it directly on the switch, which allows all kinds of options for monitoring and configuration.
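As an example of the kind of code you could run on the switch itself, here is a minimal monitoring sketch in Python. It assumes the switch ports show up as ordinary Linux network devices under /sys/class/net, each with the standard operstate file, and that a Python 3 interpreter is available on the switch; neither assumption has been checked against a particular Cumulus release. The function names port_states and ports_down are hypothetical.

```python
import os


def port_states(sysfs_net="/sys/class/net", prefix="swp"):
    """Return {interface: operstate} for every interface starting with prefix."""
    states = {}
    if not os.path.isdir(sysfs_net):
        return states
    for name in sorted(os.listdir(sysfs_net)):
        if not name.startswith(prefix):
            continue
        try:
            with open(os.path.join(sysfs_net, name, "operstate")) as handle:
                states[name] = handle.read().strip()
        except OSError:
            # The interface vanished or has no operstate file.
            states[name] = "unknown"
    return states


def ports_down(states):
    """List the interfaces whose operstate is anything other than 'up'."""
    return [name for name, state in states.items() if state != "up"]


if __name__ == "__main__":
    for port in ports_down(port_states()):
        print(f"{port} is not up")
```

A script like this could be run from cron or wrapped as a check for whatever monitoring system you already use for your servers.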

We now have a few Cumulus Linux switches in production for private client networks, including one providing lots and lots of bandwidth.

Even complex configurations can be handled relatively easily. For example, we have a customer with a private cloud who wants to run 20Gbps into each host, exposing 10 different VLANs to their virtual servers, and then routing between them. This can be done on a 10G switch by bonding pairs of interfaces together, and then bridging the required VLANs on each of the bonded interfaces.

This config turns out to be nice and simple to write, and has the advantage of looking very similar on the switch and the server:

auto bond13
iface bond13
  bond-slaves swp1 swp2
  bond-mode 802.3ad
  bond-miimon 100
  bond-use-carrier 1
  bond-lacp-rate 1
  bond-min-links 1
  bond-xmit-hash-policy layer3+4

auto bond14
iface bond14
  bond-slaves swp3 swp4
  ....

auto br-tag130
iface br-tag130
  bridge-ports bond13.130 bond14.130 ...

auto br-tag2544
iface br-tag2544
  bridge-ports bond0.2544 bond1.2544  ...