Ten years on, Chris Lightfoot looks more prescient than ever

February 13th, 2017 by

(gif from imgur via gify)

(Title shamelessly stolen from Tom Steinberg, mySociety founder, from his tribute to Chris, the Mythic Beasts founder who died ten years ago.)

Lots of people have been excited recently about this script, which allows you to remotely reinstall a Linux system with a different version of Linux by giving you a shell in a ramdisk and letting you reinstall the operating system from there.

Chris did this the hard way. Back in 2005 I remember being asked to code review ‘evil.c’, a program that allocated a lot of RAM (800MB!), read a compressed filesystem image into it, then decompressed it back onto the disk. On reboot the machine should come up running Debian instead of the FreeBSD it had before. It’s really very important not to swap during this process.

Amazingly it worked, and the first test was on a remote box and it saved us a data centre visit. Here’s the code in its full glory.

#include <sys/types.h>

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <zlib.h>

#include <sys/mman.h>
#include <sys/reboot.h>

#define SIZE        ((size_t)896058269)

#define E(...)      fprintf(stderr, __VA_ARGS__)
#define die()       do { fprintf(stderr, "%s; aborted\n", strerror(errno)); exit(1); } while (0)
#define zdie()      do { fprintf(stderr, "%s; aborted\n", Z.msg); exit(1); } while (0)

int main(void) {
    unsigned char *buf, *outbuf, *p;
    int fd;
    FILE *fp;
    z_stream Z = {0};
    unsigned int nin = 0, nout = 0;

    E("size = %lu\n", (unsigned long)SIZE);

    E("open /dev/amrd0... ");
    if (-1 == (fd = open("/dev/amrd0", O_RDWR | O_DIRECT)))
        die();
    E("done\n");
    close(fd);

    E("allocate file buffer... ");
    if (!(buf = malloc(SIZE)))
        die();
    E("done\n");

    E("allocate write buffer... ");
    if (!(outbuf = malloc(1024 * 1024)))
        die();
    E("done\n");

    E("lock into memory... ");
    if (-1 == mlockall(MCL_CURRENT | MCL_FUTURE))
        die();
    E("done\n");

    E("open file... ");
    if (!(fp = fopen("/usr/bitter-first-2100M-of-sda.gz", "rb")))
        die();
    E("done\n");

    E("read file... ");
    p = buf;
    while (nin < SIZE) {
        size_t n;
        n = fread(p, 1, 262144, fp);
        if (n == 0)
            die();
        nin += n;
        p += n;
        E("\rread file... %.2f%% ", 100 * (float)nin / (float)SIZE);
    }
    E("done\n");

    fclose(fp);
    E("zlib version = \"%s\"\n", zlibVersion());

    /* Now we need to walk through the buffer decompressing it into the
     * write buffer, then writing the results to the device. */
    E("initialise inflate object... ");
    Z.next_in = buf;
    Z.avail_in = SIZE;
    if (Z_OK != inflateInit2(&Z, 15 + 32))
        zdie();
    E("done\n");

    while (nout < 2100) {
        int i;
        size_t N;

        Z.next_out = outbuf;
        Z.avail_out = 1024 * 1024;
        i = inflate(&Z, 0);
        if (i != Z_OK && i != Z_STREAM_END)
            zdie();
        if (Z.next_out != outbuf + 1024 * 1024) {
            fprintf(stderr, "\ndidn't get 1MB of output\n");
        }

        /* this is where we'd write the data */
        N = 0;
        p = outbuf;
        while (N < 1024 * 1024) {
            ssize_t n;
            do
                n = write(fd, p, 1024 * 1024 - N);
            while (n == -1 && errno == EINTR);
            if (n == -1)
                die();
            N += n;
            p += n;
        }

        ++nout;
        fprintf(stderr, "\r%d / 2100 MB", nout);
    }

    fprintf(stderr, "\n");

    /* this is where we reboot */
    reboot(RB_NOSYNC);

    E("we should have rebooted by now -- probably best to assume we're completely\n"
      "screwed at this point\n");

    return 0;
}

Tax needn’t be taxing thanks to TaxCalc

February 1st, 2017 by

One of our customers has the lovely looking bandwidth graph on the right, which plummeted to zero this morning. Normally a huge sudden drop in activity on a customer site would be cause for alarm, but this is the excellent TaxCalc. They make software to calculate tax, and it gets very busy in the run-up to the self assessment deadline, which was midnight last night.

TaxCalc are customers of our enhanced management services with a 24/7 SLA, and run a fully mirrored setup across two of our data centres. We’re happy to report that everything went smoothly and their system scaled beautifully to handle the load.

Thankfully our elected overlords have decided to smooth out the load on our servers with new personal tax accounts and shortly we’ll all have to fill in four tax returns per year instead of one.

Managed WordPress

January 23rd, 2017 by

Analogue photo taken with film and real chemistry. Parallax Photographic Cooperative.

WordPress is an excellent content management system that is behind around 25% of all sites on the internet. Our busiest site is Raspberry Pi, which is now constructed from multiple different WordPress installations and some custom web applications, stitched together into one nearly seamless high traffic website.

We’ve taken the knowledge we’ve gained supporting this site and rolled it out as a managed service, allowing you to concentrate on your content whilst we take care of keeping the site up and secure. In addition to 24/7 monitoring, plugin security scans, and our custom security hardening, we’re also able to assist with improving site performance.

We’re now hosting a broad range of sites on this service. The simpler cases start with customers such as Ellexus, who make very impressive technology for IO profiling and need a reliable, managed platform that they can easily update.

At the other end of the spectrum we have the likes of Parallax Photographic, a co-operative in Brixton who sell photography supplies for people interested in film photography, using real chemistry to develop the photographs and a full analogue feel to the resulting prints. Parallax Photographic use WordPress to host their online shop, embedding WooCommerce into WordPress to create their fully functional e-commerce site.

Parallax were having performance and management issues with their existing self-managed installation of WordPress. We transferred it for them to our managed WordPress service, in the process adding not only faster hardware but performance improvements to their WordPress stack, custom security hardening, managed backups and 24/7 monitoring. We took one hour for the final switch-over at 9am on a Sunday morning leaving them with a faster and more manageable site. They now have more time to spend fulfilling orders and taking beautiful photographs.

Purrmetrix monitors temperature accurately and inexpensively, and, as you can see above, with excellent embeddable web analytics. In addition to hosting their website and WooCommerce site for people to place orders, we are also customers (directly, through their website!), using their site to monitor our Raspberry Pi hosting platform. The heatmap (above) is a real-time export from their system. At the time of writing, it shows a 5°C temperature difference between the cold and hot aisles across one of our shelves of 108 Pi 3s. The service provides automated alerts: if that graph goes red, indicating an over-temperature situation, alerts start firing. During the prototyping and beta phase for our Raspberry Pi hosting platform, we’ve used their graphing to demonstrate that it takes about six hours from dual fan failure to critical temperature issues. This is long enough to make maintenance straightforward.

Also embedded in our Raspberry Pi hosting platform are multiple Power over Ethernet modules from Pi Supply who make a variety of add-ons for the Raspberry Pi, including some decent high quality audio adapters. With the launch of the Raspberry Pi 3 we had to do some rapid vertical scaling of the Pi Supply managed WooCommerce platform – in thirty seconds we had four times the RAM and double the CPU cores to cope with the additional customer load.

 

We host a wide variety of WordPress sites, including Scottish comedy club Mirth of Forth, personalised embroidery for work and leisure wear, and our own blog that you’re currently reading. So if you’d like to have us run your WordPress site for you, from a simple blog to a fully managed e-commerce solution or one of the busiest sites on the Web, we’d love to hear from you at sales@mythic-beasts.com.

Don’t leave your laptop in the pub

December 9th, 2016 by

After about twenty pages of awesome beers, you discover they also have mead.

Last night we had our Christmas party. For a 24/7 operation, that means we have to have at least one laptop with us at the party. We had just one urgent customer issue, which we dealt with easily without ruining the night.

However, in addition to taking your laptop to the pub, Pete would like to remind everyone that it’s equally important to remember to take your laptop home from the pub too, as he didn’t. This means we have to have a brief security review to evaluate the risks of briefly losing a company laptop. Ten years ago when we had tens rather than thousands of servers, this would have resulted in a revocation and replacement of the company ssh key on every server under emergency conditions (and those of you with an unencrypted AWS key might worry about total company deletion).

Over the past decade we’ve put more effort into improving our security. The laptop contains an encrypted filesystem, and on that filesystem is an encrypted ssh key which will allow someone into our jump box. If they’ve worked out the password for the filesystem and the password for the ssh key, they then also need to guess the password on the jump box before they would be able to access customer or company systems. That’s three different passwords to guess, or two encryption breaks and one password to guess. The passwords are not chosen by the user; they come straight from pwgen and the random number generator. Whilst we’re not worried, we’ll do some extra monitoring of the logs on the jump box for attempts on Pete’s account.

Of course there’s also a risk that someone physically tampered with the hardware to install a key-logger in between leaving it in the pub and recovering it the next day. The laptop passes a brief physical inspection. If it has been tampered with, it has been tampered with very well. If our attacker was sat in the pub with a spare key logger kit just in case the laptop was left behind, it would have been easier and cheaper to stage a break-in at an employee’s house, or to have forced them to check their hand luggage on a flight, or to have installed the key logger before the laptop was bought, or maybe to have compromised the random number generator in any or all of our servers before they were bought. So our threat model remains relatively unchanged and we don’t think we’re under significantly more risk today than we were yesterday.

On the upside, the server room isn’t on fire.

December 8th, 2016 by
This is not the correct way to mix servers and water based fire suppressant.

One of our customers does embedded development and has some custom servers in their office as a build platform. This is hardware specific to the embedded designs they’re working on, and they can’t locate it in a data centre as it requires regular human attention. Unstable development drivers cause crashes and the root flash filesystems need to be re-imaged and replaced.

Recently they’ve moved office and their new office has a ‘server room’, ideal for putting their very expensive custom kit in, and a handful of other machines that they keep locally in the office. While doing the fit out, they noticed that their ‘server room’ is attached to the main sprinkler system. In the event of a fire in the building, whilst the bread may be saved from being overly toasted, their expensive hand-built development boards would be drowned.

They raised this with their landlords who billed them the best part of a thousand pounds to resolve the problem, see the picture on the right.

I’m not sure if it’s the belief that the plastic roof will help, the combustible struts to hold it up or the lack of guttering that really emphasises the mismatch between what a landlord thinks a server room looks like and what a real data centre actually provides.

We’re in further discussions to see if we can host their custom kit too, because our server room has non computer damaging halon as a fire suppressant and we will return the servers to them unwashed. If your office server room looks like this, please get in touch at sales@mythic-beasts.com.

Notify My Android support for monitoring

December 5th, 2016 by

Great scenery, terrible mobile coverage

If you’re anything like me, December involves a tour of parents who have retired to far flung corners of the land and who are now living in houses with perfectly serviceable wifi but absolutely no mobile phone coverage. This creates a problem if you’re supposed to be listening out for computers that go bleep in the night, as the SMS notifications don’t get through.

To address this, we’ve just implemented Notify My Android support in our monitoring service. As the name suggests, this allows us to push monitoring alerts to your Android phone. It’s pretty easy to set up:

  1. Register for an account with Notify My Android (a free account will allow 5 notifications per day, a one-off payment of $5 will get you unlimited notifications).
  2. Log in, and generate an API key
  3. Download the app on your phone, and log in
  4. Visit our control panel and add an entry to your notification list that looks like nma:API_KEY where API_KEY is the key generated above.

If you use the other kind of phone we have equivalent functionality using Prowl. This works the same, except you put prowl: before your API key.
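As a rough illustration of the `nma:API_KEY` and `prowl:API_KEY` entry format described above, here is a minimal sketch of how such an entry might be split into a service and a key. The enum, the function name and the rule that an empty key is rejected are all assumptions made for this example; this is not how our control panel is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names: the two services come from the post,
 * the enum and function are invented for this sketch. */
enum notify_service { NOTIFY_UNKNOWN = 0, NOTIFY_NMA, NOTIFY_PROWL };

/* Split "nma:API_KEY" or "prowl:API_KEY" into a service and a pointer to
 * the key inside the original string. Returns NOTIFY_UNKNOWN, leaving *key
 * NULL, if the prefix is not recognised or the key is empty (assumed rule). */
enum notify_service parse_notify_entry(const char *entry, const char **key) {
    static const struct {
        const char *prefix;
        enum notify_service svc;
    } table[] = {
        { "nma:",   NOTIFY_NMA   },
        { "prowl:", NOTIFY_PROWL },
    };
    size_t i;

    assert(entry != NULL && key != NULL);
    *key = NULL;
    for (i = 0; i < sizeof table / sizeof table[0]; ++i) {
        size_t n = strlen(table[i].prefix);
        if (strncmp(entry, table[i].prefix, n) == 0 && entry[n] != '\0') {
            *key = entry + n;   /* key is everything after the prefix */
            return table[i].svc;
        }
    }
    return NOTIFY_UNKNOWN;
}
```

Calling `parse_notify_entry("prowl:xyz", &key)` would return `NOTIFY_PROWL` with `key` pointing at `xyz`; anything without a known prefix comes back as `NOTIFY_UNKNOWN`.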

All of our dedicated and virtual servers include basic ping monitoring as standard. Comprehensive monitoring of other services is available as an add-on, or as standard with any of our managed services.

Backup Upgrade

November 25th, 2016 by
We're using AES rather than 8 rotor enigma encryption.

We’ve just completed an upgrade to our backup services. We’ve relocated the London node into Meridian Gate, which means for all London hosted virtual machines your primary backup is now in a different building to your server. We’ve kept our secondary backup service in our Cambridge data centre 60 miles distant.

To improve things further, we have taken the opportunity to enable disk encryption, so all data stored on the primary backup server is now encrypted at rest. This provides an additional layer of assurance for our clients and leaves fewer questions to answer on security questionnaires.

We’ve also restricted the set of ssh ciphers allowed to access the backup server to further improve the security of data in transit, and we’ve increased the available space and provided a performance boost in the IO layer so backups and restores will complete more quickly.

Of course we’ve kept some important features from the old backup service such as scanning our managed customers’ backups to make sure they’re up to date and making sure that we alert customers before their backups start failing due to lack of space. Obviously all traffic to and from the backup server is free and it supports both IPv6 and IPv4.

If these are the sort of boring tasks on your todo list and you’d like us to do them for you, please get in touch at sales@mythic-beasts.com.

IPv6 Update

November 1st, 2016 by

Sky completed their IPv6 rollout – any device that comes with IPv6 support will use it by default.

Yesterday we attended the annual IPv6 Council to exchange knowledge and ideas with the rest of the UK networking industry about bringing forward the IPv6 rollout.

For the uninitiated, everything connected to the internet needs an address. With IPv4 there are only 4 billion addresses available which isn’t enough for one per person – let alone one each for my phone, my tablet, my laptop and my new internet connected toaster. So IPv6 is the new network standard that has an effectively unlimited number of addresses and will support an unlimited number of devices. The hard part is persuading everyone to move onto the new network.
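The "4 billion" figure is just the number of values a 32-bit address can take. The sketch below works that out and compares it with a world population of roughly 7.4 billion, which is an assumed round figure for 2016 and not a number from the post. The function names are made up for illustration. IPv6 addresses are 128 bits wide, which does not fit in a 64-bit integer, so the helper deliberately stops at 63 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct addresses in an address space `bits` wide.
 * Limited to 0..63 so the result fits in a uint64_t. */
uint64_t address_count(unsigned bits) {
    assert(bits < 64);
    return (uint64_t)1 << bits;
}

/* Whole addresses available per person; 0 means fewer than one each. */
uint64_t addresses_per_person(uint64_t addresses, uint64_t people) {
    assert(people > 0);
    return addresses / people;
}
```

`address_count(32)` is 4,294,967,296, and dividing that between 7.4 billion people gives zero whole addresses each, which is the shortage described above.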

Two years ago when the IPv6 Council first met, roughly 1 in 400 internet connections in the UK had IPv6 support. Since then Sky have rolled out IPv6 everywhere and by default all their customers have IPv6 connectivity. BT have rolled IPv6 out to all their SmartHub customers and will be enabling IPv6 for their Homehub 5 and Homehub 4 customers in the near future. Today 1 in 6 UK devices has IPv6 connectivity and when BT complete it’ll be closer to 1 in 3. Imperial College also spoke about their network which has IPv6 enabled everywhere.

Major content sources (Google, Facebook, LinkedIn) and CDNs (Akamai, Cloudflare) are all already enabled with IPv6. This means that as soon as you turn on IPv6 on an access network, over half your traffic flows over IPv6 connections. With Amazon and Microsoft enabling IPv6 in stages on their public clouds by default, traffic will continue to grow. Already for a number of ISPs, IPv6 is the dominant protocol. The Internet Society are already predicting that IPv6 traffic will exceed IPv4 traffic around two to three years from now.

LinkedIn and Microsoft both spoke about deploying IPv6 in their corporate and data centre environments. Both companies are suffering exhaustion of private RFC1918 address space – there just aren’t enough 10.a.b.c addresses to cope with organisations of their scale so they’re moving now to IPv6-only networks.
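To put a number on "not enough 10.a.b.c addresses": RFC 1918 reserves three private blocks (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16). The sketch below, with function names invented for this example, computes the size of a block from its prefix length and checks whether an address is private.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four octets into a host-order 32-bit IPv4 address. */
uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d) {
    assert(a < 256 && b < 256 && c < 256 && d < 256);
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
}

/* Number of addresses in a block with the given prefix length (0..32). */
uint64_t block_size(unsigned prefix_len) {
    assert(prefix_len <= 32);
    return (uint64_t)1 << (32 - prefix_len);
}

/* Non-zero if addr falls inside one of the three RFC 1918 private blocks. */
int is_rfc1918(uint32_t addr) {
    return (addr & 0xFF000000u) == ipv4(10, 0, 0, 0)       /* 10.0.0.0/8     */
        || (addr & 0xFFF00000u) == ipv4(172, 16, 0, 0)     /* 172.16.0.0/12  */
        || (addr & 0xFFFF0000u) == ipv4(192, 168, 0, 0);   /* 192.168.0.0/16 */
}
```

`block_size(8)` is 16,777,216, and all three blocks together come to 17,891,328 addresses, which is not a lot once every server, container and network device in a very large organisation needs one.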

Back in 2012 we designed and deployed an IPv6-only architecture for Raspberry Pi, and have since designed other IPv6-only infrastructures including a substantial Linux container deployment. Educating the next generation of developers about how networks will work when they join the workforce is critically important.

More bandwidth

October 19th, 2016 by
We've added 476892 kitten pictures per second of capacity.

We’ve brought up some new connectivity today; we’ve added a new 10Gbps transit link out of our Sovereign House data centre. This gives not only more capacity but also some improved DDoS protection options with distance-based blackholing.

We also added a 1Gbps private peering connection to IDNet. We’ve used IDNet for ADSL connections for a long time, not least for their native IPv6 support. A quick inspection shows 17% of traffic over this private link as native IPv6.

ANAME records

October 7th, 2016 by
Company policy requires that blog posts have a picture.

We’ve just added support to our control panel and DNS API for “ANAME” records. ANAME records, also known as ALIAS records, aren’t real DNS records, but are a handy way of simulating CNAME records in places where you can’t use a real CNAME.

It works like this:

You’ve got DNS for your domain managed with Mythic Beasts, and you want to host your website with some 3rd party service provider. They’ll tell you to point DNS for your website at their server. You create a CNAME record for www.yourdomain.com and point it at server.3rdparty.com. So far so good.

You also want requests for your bare domain, e.g. http://yourdomain.com to be served by your provider, so you try to create a CNAME for yourdomain.com and get told you can’t. This is because you will already have MX, NS and SOA records for your bare domain, and CNAMEs aren’t allowed to co-exist with other records for the same name.

The usual fall back is to create A or AAAA records that point directly to the IP address of server.3rdparty.com, but this sucks because their IP is now hard coded into your zone, and if they ever want to change the IP of that server they’ve got to try and get all of their customers to update their DNS.

The nice solution would be SRV records, standardised DNS records that allow you to point different protocols at different servers. Unfortunately, they’re not supported for HTTP or HTTPS.

This is where ANAME records come in. You can create an ANAME just like a CNAME, but without the restrictions on co-existing with other records. We resolve the ANAME and substitute the corresponding IP addresses as A and AAAA records in your zone. We then regularly check for any changes, and update your zone accordingly.

Naturally, our ANAME implementation fully supports IPv6: if the hostname you point the ANAME at returns AAAA records, we’ll include those in addition to any A records returned.
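The general idea of flattening an ANAME can be sketched as follows. This is an illustration only and not our actual implementation: it uses the system resolver via `getaddrinfo` rather than a proper DNS client, and the function name, the fixed TTL argument and the zone-file style output are all assumptions made for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <stdio.h>
#include <string.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Resolve `target` and write one line per address into `out`, in the form
 * "owner TTL IN A addr" or "owner TTL IN AAAA addr".
 * Returns the number of records written, or -1 if resolution fails or
 * `out` is too small. All names and the output layout are illustrative. */
int flatten_aname(const char *owner, const char *target, unsigned ttl,
                  char *out, size_t outlen) {
    struct addrinfo hints, *res, *ai;
    size_t used = 0;
    int count = 0;

    assert(owner && target && out && outlen > 0);
    out[0] = '\0';

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* ask for both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one result per socket type */
    if (getaddrinfo(target, NULL, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        char addr[INET6_ADDRSTRLEN];
        const char *type;
        const void *raw;
        int n;

        if (ai->ai_family == AF_INET) {
            type = "A";
            raw = &((struct sockaddr_in *)ai->ai_addr)->sin_addr;
        } else if (ai->ai_family == AF_INET6) {
            type = "AAAA";
            raw = &((struct sockaddr_in6 *)ai->ai_addr)->sin6_addr;
        } else {
            continue;                 /* ignore other address families */
        }
        if (!inet_ntop(ai->ai_family, raw, addr, sizeof addr))
            continue;

        n = snprintf(out + used, outlen - used, "%s %u IN %s %s\n",
                     owner, ttl, type, addr);
        if (n < 0 || (size_t)n >= outlen - used) {
            freeaddrinfo(res);
            return -1;                /* output buffer too small */
        }
        used += (size_t)n;
        ++count;
    }
    freeaddrinfo(res);
    return count;
}
```

A real service would repeat this on a schedule and rewrite the zone only when the set of addresses changes. It would also want to honour the TTLs of the target's own records, which `getaddrinfo` does not expose.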