rm -rf /var

August 10th, 2017 by

Within Mythic Beasts we have an internal chat room that uses IRC (this is like Slack but free and securely stores all the history on our servers). Our monitoring system is called Ankou, named after Death’s henchman that watches over the dead, and has an IRC bot that alerts through our chat room.

This story starts with Ankoubot, who was the first to notice something was wrong with the world.

15:25:31

15:25:31 ankoubot managed vds:abcdefg-ssh [NNNN-ssh]: 46.235.N.N => bad banner from `46.235.N.N’: [46.235.N.N – VDShost:vds-hex-f]
15:25:31 ankoubot managed vds:abcdefg-web [NNNN-web]: http://www.abcdefg.co.uk/ => Status 404 (<html> <head><title>404 Not Found</title></head> <body bgcolor=”white”> <center><h1>404 Not Found</h1></center> <hr><center>nginx/1.10.3</center> </body> </html…) [46.235.N.N www.abcdefg.co.uk VDShost:vds-hex-f]

15:31:42

15:31:42pete I can’t get ssh in, I’m on the console.

15:38:16

15:38:16pete This is an extremely broken install. ssh is blocked, none of the bind mounts work

Debugging is difficult because /var/log is missing. systemd appears completely unable to function and we have no functioning logging. Unable to get ssh to start and fighting multiple broken tools due to missing mounts, we restart the server and mail the customer explaining what we’ve discovered so far. This doesn’t help and it hangs attempting to configure NFS mounts.

15:53:53

15:56:35

Boot to recovery media completes ready for restore from backup.

15:58:34

16:05:07

16:08:36

16:08:36 ankoubot managed vds:abcdefg-ssh [NNNN-ssh]: back to normal
16:08:36 ankoubot managed vds:abcdefg-web [NNNN-web]: back to normal

16:14:22

16:31:42

Customer confirms everything is restored and functional and gives permission to anonymously write up the incident for our blog including the following quote.

Mythic Beasts had come highly recommended to me for the level of support provided, and when it came to crunch time they were reacting to the problem before I’d even raised a support ticket.
This is exactly what we were looking for in a managed hosting provider, and I’m really glad we made the choice. Hopefully however, I won’t be causing quiet the same sort of problem for a looooong while.

In total the customer was offline for slightly over 30 minutes, after what can best be described as a catastrophic administrator error.

FRμIT: Federated RaspberryPi MicroInfrastructure Testbed

July 3rd, 2017 by

The participants of the FRμIT project, distributed Raspberry Pi cloud.

FRμIT is an academic project that looks at building and connecting micro-data-centres together, and what can be achieved with this kind of architecture. Currently they have hundreds of Raspberry Pis and they’re aiming for 10,000 by the project end. They invited us to join them, we’ve already solved the problem of building a centralised Raspberry Pi data centre and wanted to know if we could advise and assist their project.  We recently joined them in the Cambridge University Computer Lab for their first project meeting.

Currently we centralise computing in data centres as it’s cheaper to pick up the computers and move them to the heart of the internet than it is to bring extremely fast (10Gbps+) internet everywhere. This model works brilliantly for many applications because a central computing resource can support large numbers of users each connecting with their own smaller connections. It works less well when the source data is large and in somewhere with poor connectivity, for example a video stream from a nature reserve. There are also other types of application such as Seti@Home which have huge computational requirements on small datasets where distributing work over slow links works effectively.

Gbps per GHz

At a recent UK Network Operator Forum meeting, Google gave a presentation about their data centre networking where they built precisely the opposite architecture to the one proposed here. They have a flat LAN with the same bandwidth between any two points so that all CPUs are equivalent. This involves around 1Gbps of bandwidth per 1GHz of CPU. This simplifies your software stack as applications don’t have to try and place CPU close to the data but it involves an extremely expensive data centre build.

This isn’t an architecture you can build with the Raspberry Pi. Our Raspberry Pi cloud is as about as close as you can manage with 100Mbps per 4×1.2GHz cores. This is about 1/40th of the network capacity required to run Google architecture applications. But that’s okay, other applications are available. As FRμIT scales geographically, the bandwidth will become much more constrained – it’s easy to imagine a cluster of 100 Raspberry Pis sharing a single low bandwidth uplink back to the core.

This immediately leads to all sort of interesting and hard questions about how to write a scheduler as you need to know in advance the likely CPU/bandwidth mix of your distributed application in order to work out where it can run. Local data distribution becomes important – 100+ Pis downloading updates and applications may saturate the small backbone links. They also have a variety of hardware types, the original Pi model B to the newer and faster Pi 3, possibly even some Pi Zero W.

Our contribution

We took the members of the project through our Raspberry Pi Cloud is built, including how a Pi is provisioned, how the network and operating system are provisioned and the back-end for the entire process from clicking “order” to a booted Pi awaiting customer login.

In discussions of how to manage a large number of Federated Raspberry Pis we were pleased to find considerable agreement with our method of managing lots of servers: use OpenVPN to build a private network and route a /48 of IPv6 space to it.   This enables standard server management tools work, even where the Raspberry Pis are geographically distributed behind NAT firewalls and other creative network configurations.

Donate your old Pi

If you have an old Raspberry Pi, perhaps because you’ve upgraded to a new Pi 3, you can donate it directly to the project through PiCycle. They’ll then recycle your old Raspberry Pi into the distributed compute cluster.

We’re looking forward to their discoveries and enjoyed working with the researchers. When we build solutions for customers we’re aiming to minimise the number of unknowns to de-risk the solution. By contrast tackling difficult unsolved problems is the whole point of research. If they knew how to build the system already they wouldn’t bother trying.

Encryption is vital

June 7th, 2017 by


We refuse to bid for government IT work because we can’t handle the incompetence.

At Mythic Beasts we make use of free secure encryption all the time. Like all powerful tools such as roads, trains, aeroplanes, GPS navigation, computers, kitchen knives, vans and Casio watches; things that are very useful for day to day life are also useful for criminals and terrorists. It’s very popular for our politicians and the Home Office, especially our current prime minister King Canute Theresa May and leader of The Thick Party, to suggest that fully secure encryption should be banned and replaced with a weaker version that will reveal all of your secrets but only to the UK security services.

We disagree and think this is a terrible idea. There’s the basic technical objection that a backdoor is a backdoor and keeping knowledge of the backdoor secret is essentially impossible. There’s a recent practical demonstration of this: the NSA knew of an accidental backdoor in Windows and kept it secret.  It was leaked, resulting in the thankfully not very effective WannaCry virus which disabled substantial fractions of the NHS. The government is very good at scope creep: the Food Standards Agency refused to disclose why it needs the power to demand your entire internet history. We think it fundamentally wrong that MPs excluded themselves from the Investigatory Powers act. Then there’s simple commercial objections: it’s a slight commercial disadvantage if every UK product has an ‘Insecure By Order of The Home Office’ sticker on the front when your foreign competitors products don’t.

However, Mathematics does not care what our politicians wish and refuses to change according to their desires. Strong cryptography is free, available on every computer, and can be given away on the front of a magazine. Taking away secure cryptography is going to involve dragging a playstation out of your teenagers’ hands, quite literally stealing from children. Of course secure communications will still be available to any criminal who can illegally access some dice and a pencil.

It’s a good job you can’t build encryption machines with childrens toys.

At Mythic Beasts we make extensive use of open source free cryptography. OpenSSH protects our administrative access to the servers we run and the customers we manage. OpenSSL protects all our secure web downloads, including last month’s million or so copies of Raspbian ensuring that children with a Raspberry Pi don’t have their computer compromised. We make extensive use of free certificates through Let’s Encrypt and we’ve deployed tens of thousands of upgraded packages to customers which are securely verified by GnuPG.

Without these projects, vast quantities of the internet would be insecure. So we’ve made donations to OpenSSH, GnuPG and Let’s Encrypt to support their ongoing work. We’d like to donate to OpenSSL but we can’t see easily how to pay from a UK bank account.

Save the Black Horse

May 26th, 2017 by

The last pub in Dry Drayton has closed and is under threat of development. As a community, we’re working hard to save it.

It’s Beer Festival week in Cambridge. Suddenly official work takes a back seat compared to the importance of drinking, serving and appreciating fine beer in the sunshine. It’s great that the volunteers behind the bar include friends, colleagues, customers, suppliers and the occasional former MP.

However, for 51 weeks of the year the Cambridge Beer Festival isn’t operating and beer lovers among us have to go to a more humble establishment, the pub. Cambridge City is blessed with multiple excellent pubs, but occasionally it’s nice to take a visit to the outlying villages.

So we were very saddened to hear that the only pub in Dry Drayton, the Black Horse, was due to close. However, a community group has started assembling plans to turn it into a community pub on a similar model to the excellent Dykes End in Reach. They asked us to help with setting up their on-line presence. Mythic Beasts fully support the effort to have lovely pubs within walking and cycling distance so we’ve provided a Managed WordPress site to help their campaigning efforts. Today we’ll share a beer with them in the Beer Festival, and in the near future we hope to take a field trip to their re-opened countryside pub.

Update: They reported last night they’ve had a lot of signups to their newsletter and several interested investors. It looks like we’re going to be part of a successful pub rescue!

 

Cambridge Beer Festival, Raspberry Pi powered Apps for Beer

May 22nd, 2017 by

We drew the architecture diagram for the beer festival on a beer mat.

Today marks the first day of the Cambridge Beer Festival, the longest running CAMRA beer festival, one of the largest beer festivals in the UK and in our obviously correct opinion, by far the best. Not only have we run the web back-end for many Cambridge CAMRA websites for many years, this year we’ve been involved with Cambridge App Solutions who run the iPhone Beer Festival App. They’d been having some trouble with their existing hosting provider for the back-end. In frustration they moved it to their Cloud Raspberry Pi which worked rather better. They then suggested that we keep the production service on the Raspberry Pi, despite it being a beta service.

Preparing for production

We’ve set up all our management services for the hosted Pi in question, including 24/7 monitoring and performance graphing. We then met up with Craig, their director in the pub to discuss the app prior to launch. The Pi 3 is fronted by CloudFlare who provide SSL. However, the connection to the Pi3 from Cloudflare was initially unencrypted. We took Craig through our SSL on a Raspberry Pi hosting guide and about a minute later we had a free Let’s Encrypt certificate to enable full end-to-end data security.

 

 

The iPhone app that runs the Cambridge Beer Festival (also found at Belfast and Leestock)

 

The iPhone Beer Festival App tracks which beers are available and the ratings for how good they are. Availability is officially provided, ratings are crowd sourced.  The app is continuously talking to the back end to keep the in app data up to date. All this data is stored and served from the Raspberry Pi 3 in the cloud.

Proximity

The festival also has some Estimote beacons for proximity sensing which use Bluetooth Low Energy to provide precise location data to the phone. On entry to the beer festival the app wakes up and sends a hello message.

Raspberry Pi Cloud upgrades

May 12th, 2017 by

We’ve made some improvements to our Raspberry Pi Cloud.

  • Upgraded kernel to 4.9.24, which should offer improved performance and a fix for a rare crash in the network card.
  • Minor update to temperature logging to ease load on our monitoring server and allow faster CPU speeds.
  • Upgrade to the NFS fileserver to allow significantly improved IO performance.
  • Recent updates applied to both Debian and Ubuntu images.

Thanks to Gordon Hollingworth, Raspberry Pi Director of Engineering for his assistance.


Upgraded backups

March 31st, 2017 by

Servers need different backup strategies to Vampire Slayers.

Our backup report caught a warning from the backup on our monitoring server:

WARN - [child] mysqldump: Error 2013: Lost connection to MySQL server during query when dumping table `log` at row: 6259042
....
ERROR - mysqldump --all-databases .... exited with 3

We investigated, indeed this is an error and we’ve created a truncated backup. As we think backups are very important we investigated immediately rather than adding it to the end of a very long task list that would be ignored in favour of more user visible changes.

An initial guess was that it might be a mismatch in max_allowed_packet between the server and the dump process, a problem that we’ve seen before. We set max_allowed_packet for mysqldump to the maximum allowed value, reran the backup manually and watched it fail again. Hypothesis disproven and still no consistent backup.

Checking the system log, it quickly became apparent that we were running out of memory. The out of memory killer had kicked in and decided to kill mysqld (an unfortunate choice, really). This was what had caused the dump to terminate early.

Now we understand our problem, one solution is to configure a MySQL slave and back up from the slave, another is to move to a bigger MySQL server, another is to exclude the ephemeral data from the backup. We chose to exclude the ephmeral data and now our backup is complete and we’ve tested the restore.

While working on this, our engineer noticed that there was an easy extra check we could make to ensure the integrity of a MySQL dump. When the dump is complete we run the moral equivalent of:

zcat $dump | tail -1 | grep -q '^-- Dump completed'

to check that we have a success message at the end of the dumped file. This is an additional safety check. Previously we were relying on mysqldump to tell us if it found an error, now we require mysqldump to report success and the written file to pass automated tests for completeness.

We pushed out our updated backup package with the additional check to all managed customers yesterday. On World Backup Day, we’d like to remind the entire Internet to check that your backups work. If that sounds boring, we’ll check your backups for you.

PHP7 on a Raspberry Pi 3 in the cloud

March 22nd, 2017 by
Rasberry Pi 3

Two Raspberrys PI using PHP7 during the Pi 3 launch.

Last April we moved the main blog for Raspberry Pi to a small cluster of Raspberry Pi 3s. This went so well we made it commercially available and you can now buy your Raspberry Pi 3 in the cloud.

If you’d like to have PHP 7 running on your Raspberry Pi 3 in the cloud, this guide if for you. Click the link, buy a Pi 3 and install your ssh-key and log in. This should take no more than about a minute.

PHP 7 isn’t yet part of the standard Raspbian OS, so we need to get it from somewhere else.

A brief aside about CPU architectures, Raspbian and Debian

Debian provides three versions for ARM processors:

  • armel – 32 bit and ARMv5
  • armhf – 32 bit, ARMv7 and a floating point unit
  • arm64 – 64 bit ARMv8 and a floating point unit

The Raspberry Pi uses three different architectures:

  • Raspberry Pi A, B, Zero & Zero W – 32 bit ARMv6 with floating point
  • Raspberry Pi 2 – 32 bit ARMv7 with floating point
  • Raspberry Pi 3 – 32/64 bit ARMv8 with floating point unit

Raspbian is an unofficial port for 32bit ARMv6 and a floating point unit, which matches the hardware for an original Raspberry Pi model B. Because we’re working here with the Pi 3 – ARM8 and floating point, we can take official debian armhf packages and run them directly on our Pi 3.

Ondřej Surý is the Debian PHP maintainer who also has a private repository with newer versions of PHP built for Debian and Ubuntu. So we can use 32 bit Debian packages for ARM 7 (armhf) and install directly on top of Raspbian.

PHP 7 packages aren’t available for armel, so this won’t work on an original Raspberry Pi, or a Pi Zero/Zero W.

Add the PHP 7 repository

deb.sury.org includes newer PHP packages built for armhf, which we can use directly. Following the instructions here here we can set up the repository:

apt-get install apt-transport-https lsb-release ca-certificates
wget -O /etc/apt/trusted.gpg.d/php.gpg https://packages.sury.org/php/apt.gpg
echo "deb https://packages.sury.org/php/ $(lsb_release -sc) main" > /etc/apt/sources.list.d/php.list
apt-get update

Now we can install everything we need for php7 and apache2.4:

apt-get install apache2 php7.0 php7.0-curl php7.0-gd php7.0-json \
    php7.0-mcrypt php7.0-mysql php7.0-opcache libapache2-mod-php7.0
echo "<?=phpinfo()?>" >/var/www/html/info.php 

Wait a few moments and we have a webserver running PHP7 on our Pi3 in the cloud.

You’ll note we’ve included php7-opcache. This should accelerate our PHP performance by a factor of two or so.

Now for an application…

Try WordPress

WordPress needs a MySQL server & PHP library for accessing MySQL. We need to restart Apache to make PHP 7 pick up the additional library.

apt-get install php7.0-mysql  mysql-server
apache2ctl restart
mysql -u root -p

mysql> create database wordpress;
mysql> grant all privileges on wordpress.* to wordpress identified by 'password';

We strongly recommend you invent a better password.

cd /var/www/html
wget https://wordpress.org/latest.tar.gz
tar -zxvf latest.tar.gz
chown -R www-data:www-data wordpress

Then navigate to http://www.yourpiname.hostedpi.com/wordpress and finish the install through your browser.

Next steps

For information on how to host on your own domain name, and how to enable HTTPS see our previous blog post on hosting a website on a Raspberry Pi.

Attending a 5 year old’s birthday party

March 7th, 2017 by

Over the weekend Pete went to the Raspberry Pi Big Birthday Weekend. Rather larger than previous years, they booked out the whole of the Cambridge Junction allowing them to have far more guests.

A very small Eben reviewing the extraordinary growth of Raspberry Pi over the last 12 months.

Eben Upton started the day with a keynote about how Raspberry Pi is doing. In short, they’ve made a lot of computers. Raspberry Pi Trading (the company that makes and sells computers) has generated several million pounds of profit that’s been returned to the Raspberry Pi Foundation (the charity that educates people). This makes them a very unusual startup compared to the current Silicon Valley competition of trying to lose money as fast as possible. Pi Zero W, the newest member of the family has sold over 100,000 units in the four days since launch.

Picture of the Pi Logo done on a sense hat by a young girl in the Astro-Pi workshop.

Pete joined David Honess for his Astro-Pi workshop, programming the Sense HAT and writing code that could eventually end up running on the International Space Station. One girl ran away with one of the examples, turning the black and white X into a full colour Raspberry Pi logo using the Sense HAT.



We were particularly interested in the presentation by Gwiddle as it’s a project that we sponsor, and the Pi Party was first opportunity we had to meet the founders in person. Gwiddle provide free Web hosting for people in full time education and are having significant success supporting nearly 1,000 students. Yasmin and Joshua fielded their questions well. We’re massively proud of what Gwiddle have achieved so far and their plans for future expansion into a full multi-national with world domination look exciting.

Pete giving a talk on Pi in the Cloud, photo by Joshua Bayfield. View the slides

Pete gave a talk on Raspberry Pi in the data centre and how we’ve built a Raspberry Pi Cloud. We’ve put the slides up at the request of the Japanese Raspberry Pi Users Group who’d flown over to be at the Party. We’d love it if you translate the slides Japanese and put them on your website!

Sam Aaron and Sonic Pi

Sam Aaron demonstrated new features in Sonic Pi in a live performance. Pete was particularly impressed by the distortion guitar amp effect from the Pi, and also the details of the new Erlang back-end to reduce timing jitter around a millisecond. Somebody needs to take the library and implement an easy real-time mode for the Pi Zero W to ease building Internet of Things applications.

A fantastic, but incredibly tiring weekend. We’re already looking forward to the 6th birthday party!

Rearranging Raspberry Pi

February 27th, 2017 by


Replacing the production DB under a running system.

Two years ago we migrated Raspberry Pi from a single big server to a series of virtual machines (VMs) on an even bigger server. As time has gone on this architecture served us well; we’ve managed the Pi Zero launch and the busier Raspberry Pi 3 launch and even briefly ran the website on Raspberry Pis as a test.

However, we had a number of issues with the setup that we were looking to address:

  1. We were out of space at the back end and needed more capacity;
  2. We wanted more redundancy: the setup was dependent on a single dedicated VM host in a single data centre; and
  3. There’s an apparent hardware fault on the current VM host that causes it to very occasionally spontaneously reboot (or in one case, switch itself off altogether)

It was time for a bit of an upgrade.

Scalable WordPress

The biggest part of the Raspberry Pi configuration is the main WordPress site that serves the front page for www.raspberrypi.org. This consisted of three VMs: two web servers and one database server. WordPress doesn’t provide any built-in functionality for scaling to multiple servers and although the vast majority of pages are driven entirely by the database, some operations, such as installing plugins or uploading media, result in the creation of local files that need to be available to all web servers.

In order to support WordPress in a multi-server configuration, we arrange the two web servers as a primary and a secondary. The primary delivers half of the public requests and also the administration and content creation side. The relevant parts of the local filesystem are then regularly rsynced to the secondary server, which serves public requests and can maintain the support the full public usage of the site if necessary.

The website is fronted by our “CDN”. This is a cluster of Mac Mini servers that we use to offload much of the static content traffic, and to load balance across the two web servers.

Step 1 : Figure out what you’re trying to do and write a plan.

More capacity meant a new VM host, and more redundancy meant that it went in a different data centre (Sovereign House, or “SOV”) to the existing VM host (Harbour Exchange, or “HEX”).

Diagnosing the fault on the current VM host is tricky, firstly because it only occurs once or twice a year, and secondly because it was hosting quite a busy live website. So our plan for this was to migrate all VMs onto entirely new hardware in HEX so that we can prod the old box at our leisure. This also gave us a handy opportunity to do something we’ve been wanting to do for a while, which is an OS upgrade on the VM host itself.

To add redundancy to the main site, we can split the two web server VMs across the two sites, but this doesn’t help unless we also replicate the database at the new site. So overall, we wanted to move one web server VM to SOV, and add a database VM, and in HEX we wanted to move the database VM and remaining web server VM to the new hardware. This all needed to be achieved with minimal downtime; as a highly public site we’d rather not have two hours of downtime if we can avoid it.

Step 2 : Move the database

We brought up a second database VM at the second site (SOV). We set this up as a MySQL replica of the primary database VM, this requires only the briefest of interruptions to service to configure. We then simulated a failure of the primary database server and moved all database services to the alternative site – so the database is now in SOV. Again, this has only a very brief interruption to service (<5s).

We then moved the old database VM to the new VM host in the original site (HEX) and reconfigured it as a MySQL replica. We now have a primary/secondary setup for MySQL on the new VM hosts, the secondary in a different location (HEX) to the primary (SOV), and the HEX server is not longer on the faulty hardware.

It’s worth noting that in normal operation, both web servers use the same database server for all queries. In similar arrangements, it’s quite common to have one or more web server use the slave database server for read queries, and to only send write queries to the master, thereby reducing the load on the primary database server. Unfortunately, standard WordPress doesn’t support this, but there are plugins that do which we may look at in the future.

Step 3 : Move the web servers

We shut down the secondary web server (HEX), moved it to the replacement VM host in the same data centre and brought it back up. The CDN automatically redirected all web traffic to the primary web server until the secondary came backup. Once this was complete we took the primary web server offline disabling all administration functions for the main website. Fortunately we’d told everyone in the Raspberry Pi office to drink coffee while we did this so nobody complained. Again the CDN moved all production traffic to the secondary web server, we then moved the primary VM to our alternative data centre (SOV) to sit next to the primary database server (SOV).

Step 4 : Tell the CDN

Unlike many providers, we have independent routing at each of our sites. This gives us much greater resilience to network problems, but means that moving a VM between sites necessitates a change in IP address. We informed our cluster of Mac Minis in the CDN that the primary web server had moved, and the administration site sprung back into life and the traffic split evenly across the two sites.

Step 5 : Drink coffee

Over the course of about three hours, we’d migrated a high volume production website from a non-redundant, single site configuration to a geographically redundant configuration, moved the primary database and primary web server to a different location and provided a capacity upgrade. This was all done in the middle of the day with no user-facing downtime, and only a modest maintenance window for the administration portal.

With that done, we can start work on the next stage of the plan: migrating the remainder of the VMs away from the old VM host in HEX.