Selling hardware into the cloud

September 22nd, 2015 by

A Cambridge start-up approached us with an interesting problem. In this age of virtualisation, they have a new and important service, but one which can’t be virtualised as it relies on trusted hardware. They know other companies will want to use their service from within their private networks within the big cloud providers, but they can’t co-locate their hardware within Amazon or Azure.

This picture is a slight over simplification of the process

This picture is a slight over simplification of the process

The interesting thing here is that the solution is simple. It is possible to link directly into Amazon via Direct Connect and to Azure via Express Route. To use Direct Connect or Express Route within the UK you need to have a telco circuit terminating in a Telecity data centre, or to physically colocate your servers. As many of you will know, Mythic Beasts are physically present in three such data centres, the most important of which is Telecity Sovereign House, the main UK point of presence for both Amazon and Microsoft.

So our discussion here is nice and straightforward. Our future customer can co-locate their prototype service with Mythic Beasts in our Telecity site in Docklands. They can then connect to Express Route and Direct Connect over dedicated fibre within the datacentre when they’re ready to take on customers. Their customers then have to set up a VPC Peering connection and the service is ready to use. This is dedicated specialised hardware from the inside of ‘the cloud’, and it’s something we can offer to all manner of companies, start-up or not, from any dedicated or colocated service. You only need ask.

Ethernet Speeds: expect 2.5Gbps on copper, 25Gbps on fibre

September 18th, 2015 by


Recently we went to UKNOF where Alcatel Lucent gave a helpful presentation on new ethernet speeds.

Currently most network connectivity is 1Gbps ethernet over Cat5e copper which stretches up to 100m. There is an infrequently used standard for 10Gbps over Cat6 copper to 55m for higher speeds.

Now demand is starting to appear for faster than 1Gbps speeds, and it’s very attractive to do this without replacing the installed base of Cat5e and Cat6 cabling. There are new standards in the pipeline for 2.5Gbps and 5Gbps ethernet over Cat5e/Cat6 cabling.

In the data centre it’s common to have 10Gbps over SFP+ direct attach for short interconnects (up to 10m) and 1Gbps/10Gbps/40Gbps/100Gbps over fibre for longer distances. 1Gbps and 10Gbps are widely adopted. 40Gbps and 100Gbps are a different design, implemented by combining multiple lanes of traffic at 10Gbps to act as a single link. 100Gbps has changed to be 4 lanes at 25Gbps rather than 10 at 10Gbps.

The more lanes you have in use, the more switches and switching chips you need – but effectively this means that 40Gbps has the same cost in port count as 100Gbps. The next generation of 100Gbps switching hardware will consist of a large number of lanes that run at either 10Gbps or 25Gbps. With current interfaces, you’d use 4 lanes for 100Gbps, 4 lanes for 40Gbps or 1 lane for 10Gbps. The obvious gap is using a single lane for 25Gbps standard so you can connect vastly more devices at greater than 10Gbps speeds.

So in the near future, we’re expecting to see 2.5Gbps and 25Gbps ethernet becoming available, and in the longer term work has now started on 400Gbps standards.

Linux for switches

September 14th, 2015 by

For a long time at Mythic Beasts we’ve had a fairly healthy dislike for managed switches. The configuration method of switches is akin to a database with auto-commit on every command – you can’t batch a series of configuration changes into an atomic update. This means that you not only need to think about your starting and end configurations, but you also need to think about all the intermediate configuration too and make sure you don’t accidentally explode everything with an unexpected switch loop. Switches are also expensive and it’s always rankled that we’re paying a lot of money in order to use a network operating system that’s user unfriendly. Some of them are often less stable than the servers they connect to and they seem to manage excellent vendor lock-in – there is no end of advice that you can’t plug standards compatible switches from different manufacturers into each other because you risk inter-operability issues.

We’ve recently started trying Linux switches — commodity switches running Cumulus Linux.

Cumulus Linux makes your switch appear like a standard-ish debian server, with a lot of NICs.
The interfaces on our “1G” model are:

eth0 management interface
swp1 – swp48 1G switch ports
swp49 – swp52 10G switch ports

The switch is configured via /etc/network/interfaces, and uses bridges, VLANs and bonds to set up the configuration.

Linux has lots of advantages as a switch operating system. For a start if you need to patch ssh, under linux you download a replacement digitally signed openssh package and restart the process, on a traditional switch you download a whole new firmware over insecure tftp and reboot the switch – unlucky for the people connected to the switch.

The first obvious difference when configuring these switches is that by default, the switch doesn’t switch any traffic until some configuration is put in.

We can set up a simple network:

 # The primary network interface
 auto eth0
 iface eth0 inet static
        address xxx
        gateway xxx

 auto br0
 iface br0
         bridge-ports glob swp1-48
         bridge-stp on
         setmcsnoop 0

 auto br1
 iface br1
         bridge-ports glob swp49-52
         bridge-stp on
         setmcsnoop 0

This sets up the 1G ports (1-48) as a single VLAN, the 10G ports (49-52) as second VLAN, a management interface on the management port (eth0).

In this case we have an uplink on port 48 to a different network. So to migrate the uplink from our 1G network to our 10G network we would write out a new configuration file:

 auto br0
 iface br0
         bridge-ports glob swp1-47
         bridge-stp on
         setmcsnoop 0

 auto br1
 iface br1
         bridge-ports glob swp48-52
         bridge-stp on
         setmcsnoop 0

then bring the interfaces up with

 ifup -a

Note that ifup under Cumulus is different to standard Debian. It links to ifupdown2 which can inspect the current running state and apply only changes, rather than having to take an interface down and up on a standard server.

One deeply troubling thing about Cumulus Linux is it includes a minimal vi, but not a full implementation of vim.

But there are many other advantages that make up for this inexplicable oversight: being Debian-ish it has sudo, so you can give arbitrary permissions to multiple users rather than just show / enable / configure. You can easily update things with ssh. You can configure your switch with puppet. You can easily back up the entire configuration with rsync, version control it with etckeeper and bzr (sadly no git!). You can write code and run it directly on the switch which allows all kinds of options for monitoring and configuration.

We now have a few Cumulus Linux switches in production for private client networks. Here’s one providing lots and lots of bandwidth:

Even complex configurations can be handled relatively easily. For example, we have a customer with a private cloud who wants to run 20Gbps into each host, exposing different 10 different VLANs to their virtual servers, and then routing between them. This can be done on a 10G switch by bonding pairs of interfaces together, and then bridging the required VLANs on each of the bonded interfaces.

This config turns out to be nice and simple to write, and has the advantage of looking very similar on the switch and the server:

auto bond13            
iface bond13           
  bond-slaves swp1 swp2         
  bond-mode 802.3ad             
  bond-miimon 100               
  bond-use-carrier 1            
  bond-lacp-rate 1              
  bond-min-links 1              
  bond-xmit-hash-policy layer3+4
                       
auto bond14            
iface bond14           
  bond-slaves swp3 swp4         
  ....

auto br-tag130
iface br-tag130
  bridge-ports bond13.130 bond14.130 ...

auto br-tag2544
iface br-tag2544
  bridge-ports bond0.2544 bond1.2544  ...

Stormy weather, the clouds are growing.

August 26th, 2015 by

Photo-2015-08-26-12-31-10_1016

A customer of ours has been extending their private cloud. This adds another 160 cores, 160Gbps, 2TB of RAM and over half a petabyte of storage. On the left you can see the black mains cable, then the serial for out of bound configuration, then red cabling for 1Gbps each to our main network, then 20Gbps per server to the very secure private LAN on SFP+ direct attach.

The out of place yellow cable is network for the serial server above, and the out of place black one is serial to the 720Gbps switch which isn’t quite long enough to route neatly.

There’s a few more bits and pieces to add, but soon it will join their OpenStack cloud and substantially increase the rate at which their data gets crunched.

 

Bandwidth Upgrades for Cambridge servers

February 16th, 2015 by

Taking a break from our usual articles about upgrades for VPS customers and mocking the hopelessly incompetent, we’d like to announce an upgrade for dedicated and colo customers in our Cambridge data centre. We’ve finally completed the upgrade of both of our links into Cambridge, so have increased bandwidth quotas, and reduced excess rates to just 7p/GB.

Details of the new specs can be found on our Dedicated Server, Colocation and Mac Mini Colo pages.

Depending on nowhere by peering with everyone everywhere

August 29th, 2014 by

We’ve been adding some more peering sessions to improve our network redundancy. We already had direct peering with every significant UK ISP, we’ve now enhanced this so that one peering session terminates at one of the Telehouse sites, and the second terminates at one of the Telecity or Equinix sites. Each peering session is on a different London Internet Exchange (LINX) network which are physically separate from each other, and where possible we’ve preferred peering sessions that remain within a single building.

We have equal capacity on both networks at LINX, so unlike many ISPs with a single peering port or unequal capacity, in the event of a severe failure (e.g. a whole network or data centre) we just automatically migrate our traffic to our other peering link, rather than falling back to burst bandwidth with our transit providers. We feel that’s a risky strategy because failures are likely to be correlated, lots of ISPs will fall back to transit all at the same time in a badly planned and uncoordinated fashion which could cause a huge traffic spike upstream.

We light our own fibre ring around our core Docklands data centres, and have full transit and peering at both of our core POPs, with dual routers in each, and can offer full or partial transit at any of our data centres.

512k routes

August 13th, 2014 by

Some ISPs have started to report that their IPv4 routing tables now exceed 512k individual routes. At present we’re only seeing 502k routes but we’re nearly there.

Now for us this isn’t going to make any difference, our routers can all easily handle the routing table of this size, and the full IPv6 routing table of 30k routes all at the same time. However, it may start to affect things within other ISPs that we connect to. The likely things that we’ll see happening are,

  • Some ISPs will just drop some IPv4 routes, or cease processing updates, which means gradually odd bits of the internet will cease to work.
  • Some ISPs may fall into a software routing mode reducing in reduced performance.
  • Some ISPs will rely on filtering to reduce the routing table in size by aggregating routes together.
  • Some ISPs will alter their configuration for Cisco CAT6500 routers and disable IPv6 in order to increase the memory available in their routers for IPv4.

So watch out for oddities, and expect them to occur more and more frequently as the growth in the routing table gradually reveals which ISPs are running into trouble having not planned ahead.

Enabling Anycast DNS with Esgob

May 15th, 2014 by

Nat Morris, UK Network Operators Forum director recently gave a presentation to DNS Operations, Analysis and Research centre, which included this remarkably nice slide:

Screen Shot 2014-05-12 at 12.25.22

What is Anycast?

Normally a server has a globally unique IP address, and the Internet knows how to send traffic from any other machine in the world to that IP address. With Anycast we share a single address across multiple machines, and your traffic is sent to the nearest machine with that address. This means that UK customers can be answered from a server in the UK and Australian customers from a server in Australia allowing you to have very fast responses to things like DNS queries because you’re always served by a server that’s close by, rather than your query having to travel half way around the world.

To set up an Anycast network, you need your own address space, your own network number (ASN), multiple BGP-aware routers that can announce your address space, and multiple servers that can answer the queries. Typically this would require a pretty hefty budget, but if you’re Nat Morris and you know what you’re doing with software routing on Linux, and you know all the right providers then you can bring up a global Anycast network with 10+ servers and sites on an annual budget of well under $1,000.

The key to doing this is finding ISPs, ideally well-connnected ISPs in key internet hubs, who will provide you with a BGP feed to your hosted server. That’s where a UK clueful hosting company comes into the picture having excellent connectivity, inexpensive virtual machines (VMs) and a willingness to support customers with more unusual configurations.

Quick introduction to BGP and routing

Normally when you have a VM you get a default route, which looks like this:

# ip route 
...
default via 93.93.128.1 dev eth1 

which says that to get to anywhere on the internet, send packets to our router at 93.93.128.1.

Over BGP, instead we send you the whole routing table:

# ip route 
...
1.0.7.0/24 via 5.57.80.128 dev eth3.4  proto zebra  metric 1 
1.0.20.0/23 via 93.93.133.46 dev eth6.220  proto zebra  metric 142080 
... 
500,000 more lines like this

For every block on the whole internet you have a different gateway depending on what you’ve decided is the preferred route. At today’s count this is about 490,000 entries in the routing table. Don’t type ‘route’ if you’re logged in over 3G!

So for this VM, instead of having a default route, Nat has four full BGP sessions, two to each of our two routers to the site. On each router, one session provides 490,000 IPv4 routes, the other provides 18,000 IPv6 routes, and the VM gets to decide which router to send data to.

The other side of the BGP relationship, and the important bit for Anycasting, is that we receive an advert from Nat’s VM for his /24 of IPv4 space and /48 of IPv6 space, which we then advertise out to the world. The 10+ other providers in this Anycast setup will do the same, and hosts will direct traffic to whichever is nearest.

Filtering

As Paul Vixie pointed out in the first question to Nat, the main customers of VMs with BGP are spammers who hijack address space for nefarious usage. At Mythic Beasts we filter our announcements and our customer routes, so if Nat messes up his configuration and accidentally announces that his VM is responsible for the whole of Youtube we’ll drop the announcement rather than expecting one very small VM to handle one fifth of the internet.

BGP on a virtual or dedicated server

If you’re a DNS provider or a content delivery network, you’ll probably want to have an Anycast setup at some point. At Mythic Beasts we remember what it was like to be the little guy which is why we offer full BGP routing (including IPv6 BGP) as an option to any virtual server, dedicated server, colocated server or router. Providing you own your own ASN and IP space we can transit it for you and we can keep the start-up costs very low and scale with you. You can locate your VM or server directly with us in Telecity, mere tens of metres from LINX and LoNAP for minimal latency and maximal available bandwidth.

If you’ve no idea what an ASN, BGP, LIR, RIPE are, we can help arrange your ASN, IP space and BGP config.

Router fails, no packets dropped.

January 29th, 2014 by

This morning one of our routers in our Cambridge data centre stopped reporting bandwidth data to our billing system. We investigated and whilst it was still routing packets without issue, it appeared to be experiencing hardware failure.

We’ve powered the router down, pending full investigation on our data centre visit this afternoon. Currently all traffic from our Cambridge site is being handled by our other router. This seamlessly failed over with no customer impact.

Depending on your choice of terminology ‘Redundancy has been reduced to N’, or ‘The network is at-risk’. In Mythic Beasts we like to speak English so this translates to, if something else fails before the router is restored to service, there is a risk of a network outage to our Cambridge data centre.

Update : Friday 31st we fully restored our network to it’s usual redundant configuration by replacing the router with a similarly over specified replacement. Customers may have received free bandwidth for some of this period.

More bits

January 10th, 2014 by

At the end of last year we took the decision to significantly upgrade our two connections to LINX – our busiest connections to the outside world.

This turned out to be a good plan as Mythic Beasts got a Christmas present in the form of a new company bandwidth record, thanks to two customers, Blinkbox Music and Raspberry Pi getting a substantial spike in hits as people unwrapped their Christmas presents.

And it seems that the excitement of all the presents hasn’t worn off, as the Christmas day record has just been toppled by a new all time high yesterday. With the Blinkbox apps very high in the free music app charts, we’re not expecting it to stand for long.