FRμIT: Federated RaspberryPi MicroInfrastructure Testbed

July 3rd, 2017

The participants of the FRμIT project, distributed Raspberry Pi cloud.

FRμIT is an academic project that looks at building and connecting micro-data-centres, and at what can be achieved with this kind of architecture. Currently they have hundreds of Raspberry Pis and they’re aiming for 10,000 by the end of the project. They invited us to join them: we’ve already solved the problem of building a centralised Raspberry Pi data centre, and they wanted to know if we could advise and assist their project. We recently joined them at the Cambridge University Computer Lab for their first project meeting.

Currently we centralise computing in data centres because it’s cheaper to pick up the computers and move them to the heart of the internet than it is to bring extremely fast (10Gbps+) internet everywhere. This model works brilliantly for many applications, because a central computing resource can support large numbers of users, each connecting with their own smaller connections. It works less well when the source data is large and located somewhere with poor connectivity, for example a video stream from a nature reserve. There are also other types of application, such as SETI@home, which have huge computational requirements on small datasets, where distributing work over slow links works effectively.

Gbps per GHz

At a recent UK Network Operators’ Forum meeting, Google gave a presentation about their data centre networking, where they built precisely the opposite architecture to the one proposed here. They have a flat LAN with the same bandwidth between any two points, so that all CPUs are equivalent. This involves around 1Gbps of bandwidth per 1GHz of CPU. It simplifies your software stack, as applications don’t have to try to place CPU close to the data, but it involves an extremely expensive data centre build.

This isn’t an architecture you can build with the Raspberry Pi. Our Raspberry Pi cloud is about as close as you can manage, with 100Mbps per 4×1.2GHz cores. This is roughly 1/40th of the network capacity required to run Google-style architecture applications. But that’s okay, other applications are available. As FRμIT scales geographically, the bandwidth will become much more constrained – it’s easy to imagine a cluster of 100 Raspberry Pis sharing a single low-bandwidth uplink back to the core.
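
As a back-of-the-envelope comparison using the figures above (a rough sketch, not part of the original analysis):

 # Google-style ratio:  1 Gbps per 1 GHz                     = 1.0   Gbps/GHz
 # Raspberry Pi 3:      0.1 Gbps per (4 x 1.2 GHz) = 4.8 GHz ≈ 0.021 Gbps/GHz
 #
 # i.e. a factor of forty to fifty less network bandwidth per unit of CPU.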

This immediately leads to all sorts of interesting and hard questions about how to write a scheduler, as you need to know in advance the likely CPU/bandwidth mix of your distributed application in order to work out where it can run. Local data distribution becomes important – 100+ Pis downloading updates and applications may saturate the small backbone links. They also have a variety of hardware types, from the original Pi Model B to the newer and faster Pi 3, and possibly even some Pi Zero Ws.

Our contribution

We took the members of the project through how our Raspberry Pi Cloud is built, including how a Pi is provisioned, how the network and operating system are provisioned, and the back-end for the entire process from clicking “order” to a booted Pi awaiting customer login.

In discussions of how to manage a large number of Federated Raspberry Pis, we were pleased to find considerable agreement with our method of managing lots of servers: use OpenVPN to build a private network and route a /48 of IPv6 space to it. This enables standard server management tools to work, even where the Raspberry Pis are geographically distributed behind NAT firewalls and other creative network configurations.
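
As an illustration of that approach, a minimal OpenVPN server configuration along these lines hands each remote Pi an IPv6 address and routes the managed prefix over the tunnel. This is a sketch only – the addresses are documentation prefixes, not the project’s real allocation, and the real setup will differ:

 # /etc/openvpn/server.conf – minimal sketch, illustrative addresses
 dev tun
 proto udp
 port 1194
 ca ca.crt
 cert server.crt
 key server.key
 dh dh.pem
 server 10.8.0.0 255.255.255.0        # IPv4 addressing for the tunnel itself
 server-ipv6 2001:db8:1:1::/64        # give each client a v6 address on the tunnel
 push "route-ipv6 2001:db8:1::/48"    # route the managed /48 across the VPN
 keepalive 10 60                      # keep state alive through NAT firewalls
 persist-key
 persist-tun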

Donate your old Pi

If you have an old Raspberry Pi, perhaps because you’ve upgraded to a new Pi 3, you can donate it directly to the project through PiCycle. They’ll then recycle your old Raspberry Pi into the distributed compute cluster.

We’re looking forward to their discoveries and enjoyed working with the researchers. When we build solutions for customers we’re aiming to minimise the number of unknowns to de-risk the solution. By contrast tackling difficult unsolved problems is the whole point of research. If they knew how to build the system already they wouldn’t bother trying.

IPv6 Update

November 1st, 2016

Sky completed their IPv6 rollout – any device that comes with IPv6 support will use it by default.

Yesterday we attended the annual IPv6 Council meeting to exchange knowledge and ideas with the rest of the UK networking industry about bringing forward the IPv6 rollout.

For the uninitiated: everything connected to the internet needs an address. With IPv4 there are only 4 billion addresses available, which isn’t enough for one per person – let alone one each for my phone, my tablet, my laptop and my new internet-connected toaster. IPv6 is the new network standard: it has an effectively unlimited number of addresses, so it will support an effectively unlimited number of devices. The hard part is persuading everyone to move onto the new network.

Two years ago, when the IPv6 Council first met, roughly 1 in 400 internet connections in the UK had IPv6 support. Since then Sky have rolled out IPv6 everywhere, and by default all their customers have IPv6 connectivity. BT have rolled IPv6 out to all their SmartHub customers and will be enabling IPv6 for their Homehub 5 and Homehub 4 customers in the near future. Today 1 in 6 UK devices has IPv6 connectivity, and when BT complete their rollout it’ll be closer to 1 in 3. Imperial College also spoke about their network, which has IPv6 enabled everywhere.

Major content sources (Google, Facebook, LinkedIn) and CDNs (Akamai, Cloudflare) are all already IPv6-enabled. This means that as soon as you turn on IPv6 on an access network, over half your traffic flows over IPv6 connections. With Amazon and Microsoft enabling IPv6 in stages on their public clouds, IPv6 traffic will continue to grow. For some ISPs, IPv6 is already the dominant protocol. The Internet Society predict that IPv6 traffic will exceed IPv4 traffic around two to three years from now.

LinkedIn and Microsoft both spoke about deploying IPv6 in their corporate and data centre environments. Both companies are suffering exhaustion of private RFC1918 address space – there just aren’t enough 10.a.b.c addresses to cope with organisations of their scale, so they’re now moving to IPv6-only networks.

Back in 2012 we designed and deployed an IPv6-only architecture for Raspberry Pi, and have since designed other IPv6-only infrastructures including a substantial Linux container deployment. Educating the next generation of developers about how networks will work when they join the workforce is critically important.

More bandwidth

October 19th, 2016

We’ve added 476892 kitten pictures per second of capacity.

We’ve brought up some new connectivity today; we’ve added a new 10Gbps transit link out of our Sovereign House data centre. This gives not only more capacity but also some improved DDoS protection options with distance-based blackholing.

We also added a 1Gbps private peering connection to IDNet. We’ve used IDNet for ADSL connections for a long time, not least for their native IPv6 support. A quick inspection shows that 17% of the traffic over this private link is native IPv6.

DNSSEC now in use by Raspberry Pi

May 12th, 2016

Over the past twelve months we’ve implemented DNS Security Extensions (DNSSEC): initially by allowing the necessary records to be set with the domain registries, and then in the form of a managed service which sets the records, signs the zone files, and takes care of regular key rotation.

Our beta program has been very successful: lots of domains now have DNSSEC and we’ve seen very few issues. We thought we should do some wider testing with a rather larger number of users than our own website attracts, so we asked some friends of ours with a busy website if they felt brave enough to give it a go.

Eben Upton> I think this would be worth doing.
Ben Nuttall> I'll go ahead and click the green button for each domain.
-- time passes --
Ben Nuttall> Done - for all that use HTTPS.

So now we have this lovely graph that indicates we’ve secured DNS all the way down the chain for every request. Mail servers know for definite that they have the correct address to deliver mail to, and web browsers know they’re talking to the correct webserver.
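
If you want to check a domain yourself, dig’s +dnssec option is a quick way to see whether a validating resolver accepts the chain (a general-purpose check rather than anything specific to our control panel; output trimmed, answer address illustrative):

 $ dig +dnssec raspberrypi.org A

 ;; flags: qr rd ra ad;                        <- the 'ad' (authenticated data) flag means
                                                  the resolver validated the DNSSEC chain
 raspberrypi.org.  3600  IN  A      192.0.2.1            <- answer (address illustrative)
 raspberrypi.org.  3600  IN  RRSIG  A 8 2 3600 ( ... )   <- signature covering the A record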

The only remaining task is to remove the beta label in our control panel.

Raspberry Pi DNSSEC visualisation, click for interactive version

Additional Managed Rack Capacity

March 14th, 2016

We’ve spent even more time than usual in data centres recently as we’ve been kitting out our new cage in the Meridian Gate data centre.

Much of the new capacity is being deployed as “managed racks”.  Racks are generally supplied with the bare essentials of electricity, cooling and locked doors.  At Mythic Beasts, we transform them into managed racks, adding all the features you need to administer your equipment remotely and effectively, including:

Logging serial consoles

  • Internet connectivity – we’ve got 10Gbps connections onto both LINX networks, connecting at different sites.  We’ve also got multiple transit providers, and are present on the LoNAP peering exchange.   Our network has native IPv6 support, and if you have your own address space, we can provide you with BGP feeds from our routers. We can also offer private LANs, both as VLANs or as physically separate networks.
  • Remote power management – power cycle your server immediately, at any time using our customer control panel.
  • Serial connectivity – a 115.2kbps serial connection may seem a bit old fashioned in an age when we’re wiring our switches together at 40Gbps, but they remain an extremely effective mechanism for out-of-band control of servers and other equipment, particularly when coupled with our logging serial console software.
  • On-site support – all of our London facilities have 24/7 access to the data centres’ on-site engineers.  We are also able to arrange for our own staff to carry out routine maintenance, such as replacing failed hard drives.

Meridian Gate is the third London data centre in which we have a presence, along with Sovereign House and Harbour Exchange, with the three sites connected by our own dark fibre ring.

IPv4 is so last century

November 11th, 2015

A scary beast that lives in the Fens.

Fenrir is the latest addition to the Mythic Beasts family. It’s a virtual machine in our Cambridge data centre which is running our blog. What’s interesting about it is that it has no IPv4 connectivity.

eth0 Link encap:Ethernet HWaddr 52:54:00:39:67:12
     inet6 addr: 2a00:1098:0:82:1000:0:39:6712/64 Scope:Global
     inet6 addr: fe80::5054:ff:fe39:6712/64 Scope:Link
     UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

It is fronted by our Reverse Proxy service – any connection over IPv4 or IPv6 arrives at one of our proxy servers and is forwarded on over IPv6 to fenrir which generates and serves the page. If it needs to make an outbound connection to another server (e.g. to embed our Tweets) it uses our NAT64 service which proxies the traffic for it.
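
As an illustration of what that looks like from the VM: 64:ff9b::/96 below is the well-known NAT64 prefix from RFC 6052 – our own service may use a different one – and the hostname and addresses are made up:

 # An IPv4-only destination resolves to a synthesised AAAA record via DNS64
 # (here the illustrative address 192.0.2.1 embedded in the NAT64 prefix)...
 $ dig +short AAAA legacy-host.example.com
 64:ff9b::c000:201

 # ...and outbound connections to it are carried by the NAT64 gateway:
 $ curl -6 -sI http://legacy-host.example.com/ | head -1
 HTTP/1.1 200 OK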

All of our standard management services are running: graphing, SMS monitoring, nightly backups and security patches. The firewall configuration is simpler because we only need to write a v6 configuration. In addition, we don’t have to devote an expensive IPv4 address to the VM, slightly reducing our marketing budget.
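
For a flavour of what “only a v6 configuration” means in practice, a minimal ip6tables policy for a host like this might look something like the following – a sketch, not fenrir’s actual ruleset:

 ip6tables -P INPUT DROP                                                  # default deny inbound
 ip6tables -A INPUT -i lo -j ACCEPT                                       # local traffic
 ip6tables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT  # replies to outbound
 ip6tables -A INPUT -p ipv6-icmp -j ACCEPT                                # ICMPv6 is essential on v6
 ip6tables -A INPUT -p tcp --dport 22 -j ACCEPT                           # management ssh
 ip6tables -A INPUT -p tcp --dport 80 -j ACCEPT                           # web traffic from the proxies
 # ...and no v4 ruleset needed at all.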

For any of our own services, IPv6-only is the new default. Our staff members have to justify using one of our IPv4 addresses for a service we’re building. We now also need to see how many addresses we can reclaim from existing servers by moving them to IPv6 + Proxy.

IPv6 Graphing

October 15th, 2015

it’s a server graph!

One of the outstanding tasks for full IPv6 support within Mythic Beasts was to make our graphing server support IPv6-only hosts. In theory this is trivial; in practice it required a bit more work.

Our graphing service uses munin, and we built it on munin 1.4 nearly five years ago; we scripted all the configuration and it has basically run itself ever since. When we added our first IPv6-only server it didn’t automatically get configured with graphs. On investigation we discovered that munin 1.4 simply didn’t support IPv6 at all, so the first step was to build a new munin server based on Debian Jessie with munin 2.0.

Our code generates the configuration file by printing a line for each server to monitor, which includes its IP address. For IPv4 you print the address as normal, e.g. 127.0.0.1; for IPv6 you have to enclose the address in square brackets, e.g. [2a00:1098:0:82:1000:0:1:1]. So, a small patch later to spot which type of address is which, and we have a valid configuration file.
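
A sketch of what the generated entries end up looking like (hostnames illustrative; the IPv6 address is the example from above):

 # /etc/munin/munin.conf – one stanza per monitored server
 [server1.example.com]
     address 127.0.0.1
     use_node_name yes

 [server2.example.com]
     address [2a00:1098:0:82:1000:0:1:1]      # IPv6 addresses need the square brackets
     use_node_name yes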

Lastly we needed to add the IPv6 address of our munin server into the configuration file of all the servers that might be talked to over IPv6. Once this was done, as if by magic, thousands of graphs appeared.
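
On each node that’s a small change to /etc/munin/munin-node.conf – again a sketch, with an illustrative master address:

 host ::                                   # listen on IPv6 as well as IPv4
 allow ^2a00:1098:0:82:1000:0:1:2$         # permit the munin master's IPv6 address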

Selling hardware into the cloud

September 22nd, 2015

A Cambridge start-up approached us with an interesting problem. In this age of virtualisation they have a new and important service, but one which can’t be virtualised as it relies on trusted hardware. They know other companies will want to use their service from within their private networks at the big cloud providers, but they can’t co-locate their hardware inside Amazon or Azure.

This picture is a slight over-simplification of the process.

The interesting thing here is that the solution is simple. It is possible to link directly into Amazon via Direct Connect and to Azure via Express Route. To use Direct Connect or Express Route within the UK you need to have a telco circuit terminating in a Telecity data centre, or to physically colocate your servers. As many of you will know, Mythic Beasts are physically present in three such data centres, the most important of which is Telecity Sovereign House, the main UK point of presence for both Amazon and Microsoft.

So our discussion here is nice and straightforward. Our future customer can co-locate their prototype service with Mythic Beasts in our Telecity site in Docklands. They can then connect to Express Route and Direct Connect over dedicated fibre within the data centre when they’re ready to take on customers. Their customers then have to set up a VPC peering connection and the service is ready to use. This is dedicated, specialised hardware available from inside ‘the cloud’, and it’s something we can offer to all manner of companies, start-up or not, from any dedicated or colocated service. You only need ask.

Ethernet Speeds: expect 2.5Gbps on copper, 25Gbps on fibre

September 18th, 2015


Recently we went to UKNOF, where Alcatel-Lucent gave a helpful presentation on new ethernet speeds.

Currently most network connectivity is 1Gbps ethernet over Cat5e copper, which stretches up to 100m. For higher speeds there is an infrequently-used standard for 10Gbps over Cat6 copper, which reaches up to 55m.

Now demand is starting to appear for faster than 1Gbps speeds, and it’s very attractive to do this without replacing the installed base of Cat5e and Cat6 cabling. There are new standards in the pipeline for 2.5Gbps and 5Gbps ethernet over Cat5e/Cat6 cabling.

In the data centre it’s common to have 10Gbps over SFP+ direct attach for short interconnects (up to 10m) and 1Gbps/10Gbps/40Gbps/100Gbps over fibre for longer distances. 1Gbps and 10Gbps are widely adopted. 40Gbps and 100Gbps are a different design, implemented by combining multiple lanes of traffic at 10Gbps to act as a single link. 100Gbps has changed to be 4 lanes at 25Gbps rather than 10 at 10Gbps.

The more lanes you have in use, the more switches and switching chips you need – but effectively this means that 40Gbps has the same cost in port count as 100Gbps. The next generation of 100Gbps switching hardware will consist of a large number of lanes that run at either 10Gbps or 25Gbps. With current interfaces, you’d use 4 lanes for 100Gbps, 4 lanes for 40Gbps or 1 lane for 10Gbps. The obvious gap is a single-lane 25Gbps standard, so that you can connect vastly more devices at greater-than-10Gbps speeds.
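
Putting numbers on the lane arithmetic (a restatement of the above):

 # 10 Gbps  = 1 lane  x 10 Gbps
 # 40 Gbps  = 4 lanes x 10 Gbps
 # 100 Gbps = 4 lanes x 25 Gbps   (previously 10 lanes x 10 Gbps)
 # 25 Gbps  = 1 lane  x 25 Gbps   <- the missing single-lane standard
 #
 # Same lane count means roughly the same switch port cost, whether 40G or 100G.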

So in the near future, we’re expecting to see 2.5Gbps and 25Gbps ethernet becoming available, and in the longer term work has now started on 400Gbps standards.

Linux for switches

September 14th, 2015

For a long time at Mythic Beasts we’ve had a fairly healthy dislike of managed switches. The configuration method of switches is akin to a database with auto-commit on every command – you can’t batch a series of configuration changes into an atomic update. This means that you not only need to think about your starting and end configurations, but you also need to think about all the intermediate configurations too, and make sure you don’t accidentally explode everything with an unexpected switch loop. Switches are also expensive, and it’s always rankled that we’re paying a lot of money in order to use a network operating system that’s user-unfriendly. Some of them are less stable than the servers they connect to, and they seem to manage excellent vendor lock-in – there is no end of advice that you can’t plug standards-compatible switches from different manufacturers into each other because you risk inter-operability issues.

We’ve recently started trying Linux switches — commodity switches running Cumulus Linux.

Cumulus Linux makes your switch appear like a standard-ish Debian server, with a lot of NICs.
The interfaces on our “1G” model are:

eth0 management interface
swp1 – swp48 1G switch ports
swp49 – swp52 10G switch ports

The switch is configured via /etc/network/interfaces, and uses bridges, VLANs and bonds to set up the configuration.

Linux has lots of advantages as a switch operating system. For a start, if you need to patch ssh: under Linux you download a replacement, digitally-signed openssh package and restart the process; on a traditional switch you download a whole new firmware image over insecure TFTP and reboot the switch – unlucky for the people connected to it.
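
On a Debian-based switch OS that boils down to the usual apt workflow – a sketch, with package and service names as on stock Debian:

 sudo apt-get update
 sudo apt-get install --only-upgrade openssh-server   # signed package from the repository
 sudo service ssh restart                             # existing sessions stay up, traffic keeps flowing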

The first obvious difference when configuring these switches is that by default, the switch doesn’t switch any traffic until some configuration is put in.

We can set up a simple network:

 # The primary network interface
 auto eth0
 iface eth0 inet static
        address xxx
        gateway xxx

 auto br0
 iface br0
         bridge-ports glob swp1-48
         bridge-stp on
         setmcsnoop 0

 auto br1
 iface br1
         bridge-ports glob swp49-52
         bridge-stp on
         setmcsnoop 0

This sets up the 1G ports (1-48) as a single VLAN, the 10G ports (49-52) as a second VLAN, and a management interface on the management port (eth0).

In this case we have an uplink on port 48 to a different network. So to migrate the uplink from our 1G network to our 10G network we would write out a new configuration file:

 auto br0
 iface br0
         bridge-ports glob swp1-47
         bridge-stp on
         setmcsnoop 0

 auto br1
 iface br1
         bridge-ports glob swp48-52
         bridge-stp on
         setmcsnoop 0

then bring the interfaces up with

 ifup -a

Note that ifup under Cumulus is different from standard Debian: it links to ifupdown2, which can inspect the current running state and apply only the changes, rather than having to take an interface down and up as you would on a standard server.
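
ifupdown2 also provides a couple of commands that are useful here (exact behaviour may vary between Cumulus releases):

 ifquery --running br0     # show the configuration currently applied to br0
 ifreload -a               # re-read /etc/network/interfaces and apply only the differences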

One deeply troubling thing about Cumulus Linux is it includes a minimal vi, but not a full implementation of vim.

But there are many other advantages that make up for this inexplicable oversight: being Debian-ish it has sudo, so you can give arbitrary permissions to multiple users rather than just show / enable / configure. You can easily update things with ssh. You can configure your switch with puppet. You can easily back up the entire configuration with rsync, version control it with etckeeper and bzr (sadly no git!). You can write code and run it directly on the switch which allows all kinds of options for monitoring and configuration.

We now have a few Cumulus Linux switches in production for private client networks. Here’s one providing lots and lots of bandwidth:

Even complex configurations can be handled relatively easily. For example, we have a customer with a private cloud who wants to run 20Gbps into each host, exposing 10 different VLANs to their virtual servers, and then routing between them. This can be done on a 10G switch by bonding pairs of interfaces together, and then bridging the required VLANs on each of the bonded interfaces.

This config turns out to be nice and simple to write, and has the advantage of looking very similar on the switch and the server:

auto bond13            
iface bond13           
  bond-slaves swp1 swp2         
  bond-mode 802.3ad             
  bond-miimon 100               
  bond-use-carrier 1            
  bond-lacp-rate 1              
  bond-min-links 1              
  bond-xmit-hash-policy layer3+4
                       
auto bond14            
iface bond14           
  bond-slaves swp3 swp4         
  ....

auto br-tag130
iface br-tag130
  bridge-ports bond13.130 bond14.130 ...

auto br-tag2544
iface br-tag2544
  bridge-ports bond0.2544 bond1.2544  ...