Tunneling across networks with Nebula.

Apr 12 2020

Longtime readers have no doubt observed that I plug a lot of weird shit into my exocortex - from bookmark managers to card catalogues to just about anything that has an API.  Sometimes this is fairly straightforward; if it's on the public Net I can get to it (processing that data is a separate issue, of course).  But what about the stuff I have around the lab?  I'm always messing with new toys that are network connected and occasionally useful.  The question is, how do I get them out of the lab and over to my exocortex?  Sometimes I write bots to do that for me, but that can be kind of clunky because a lot of this stuff doesn't really need user interaction.  I could always poke some holes in my firewall, lock them to a specific IP address, and set static addresses on my gadgets.  However, out of necessity I've got several layers of firewalls at home, and making chains of port forwards work is a huge pain in the ass.  I don't recommend it.  "So, why not a VPN?" you're probably asking.

I'd been considering VPNs as a solution.  For a while I considered setting up OpenVPN on a few of my devices-that-are-actually-computers and connecting them to my exocortex, which would act as a VPN concentrator.  However, I kept running into problems trying to make just a single network port available over an OpenVPN connection, and I never managed to figure it out.  Then part of me stumbled across a package called Nebula, originally developed by Slack for doing just what I wanted to do: make one port on a machine inside securely available to another server.  Plus, at the same time it networks together all of the servers it's running on.  Here's how I set it up.
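
To give you a taste of where this ends up, here's roughly the shape of the config file on a node inside the lab.  This is a minimal sketch rather than my production config - the 192.168.100.x overlay addresses, exocortex.example.com, the port number, and the certificate filenames are all placeholders - but the firewall stanza at the bottom is what does the "expose exactly one port" trick:

    # /etc/nebula/config.yml - a hypothetical node inside the lab
    pki:
      ca: /etc/nebula/ca.crt
      cert: /etc/nebula/gadget.crt
      key: /etc/nebula/gadget.key

    # the lighthouse is the one node with a public address that all of
    # the other nodes use to find each other
    static_host_map:
      "192.168.100.1": ["exocortex.example.com:4242"]

    lighthouse:
      am_lighthouse: false
      hosts:
        - "192.168.100.1"

    listen:
      host: 0.0.0.0
      port: 4242

    # NAT hole punching (older Nebula versions spell this "punchy: true")
    punchy:
      punch: true

    firewall:
      outbound:
        - port: any
          proto: any
          host: any
      inbound:
        # the whole point: expose exactly one service to the overlay
        - port: 8080
          proto: tcp
          host: any

Every node gets its own certificate, signed by a CA you generate once with nebula-cert; nodes that don't share a CA can't talk to each other, which is what makes the overlay safe to run across the open Net.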

Migrating to Restic for offsite backups.

Apr 11 2020

20200426: UPDATE: Fixed the "pruned oldest snapshots" command.

A couple of years back I did a how-to about using a data backup utility called Duplicity to make offsite backups of Leandra to Backblaze B2. (referrer link)  It worked just fine; it was stable, it was easy to script, you knew what it was doing.  But over time it started to show its warts, as everything does.  For starters, it was unusually slow compared to running the implementation of rsync that Duplicity uses by itself.  I spent some time digging into it, benchmarking as many functional modules as I could, and the slowdown wasn't coming from there.  The bottleneck also didn't seem to be my network link, as much as I may complain about DSL from AT&T; even upgrading Leandra's network interface didn't really fix the issue.  Encryption before upload is a hard requirement for me, but upon investigation that didn't seem to be bogging backup runs down either.  I even thought it might have been the somewhat lesser read performance of RAID-5 on Leandra's storage array adding up, which is one of the reasons I started using RAID-1 when I upgraded her to btrfs.  That didn't seem to make a difference, either.

Ultimately I decided that Duplicity was just too slow for my needs.  Initial full backups aside (because uploading everything to offsite storage always sucks), it really shouldn't take three hours to do an incremental backup of at most 500 megabytes (out of over 30 terabytes).  On top of that, Duplicity's ability to rotate out the oldest backups... just doesn't seem to work.  I wasn't able to clean anything up, automatically or manually.  Even after making a brand-new full backup (which I try to do yearly regardless of how much time it takes) I wasn't able to coax Duplicity into rotating out the oldest increments, and had to delete the B2 bucket manually (later, of course).  So I did some asking around on the Fediverse and listed my requirements.  Somebody (I don't remember whom, sorry) turned me on to Restic because they use it on their servers in production.  I did some research and decided to give it a try.
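
To give you an idea of what the replacement workflow looks like, here's the basic shape of backing up to B2 with Restic.  This is a sketch with placeholder credentials, bucket name, and paths, not my actual backup script:

    # credentials for Backblaze B2 and the repository's encryption
    # passphrase come from the environment
    export B2_ACCOUNT_ID="000xxxxxxxxxxxx"
    export B2_ACCOUNT_KEY="K000yyyyyyyyyyyyyyyy"
    export RESTIC_PASSWORD="something-long-and-random"

    # one-time setup: create the encrypted repository in the bucket
    restic -r b2:my-backup-bucket:leandra init

    # nightly: take an incremental, deduplicated snapshot
    restic -r b2:my-backup-bucket:leandra backup /home /etc

    # rotate out the oldest snapshots (the part Duplicity never managed),
    # keeping seven daily and four weekly, then reclaim the space
    restic -r b2:my-backup-bucket:leandra forget --keep-daily 7 \
        --keep-weekly 4 --prune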

Using Nginx to spoof HTTP Host headers.

Feb 02 2020

EDIT: s/alice.bob.com/alice.example.com/ to fix part of the backstory.

Let's say that you have a server (like Prosody) that has one or more subsystems (like BOSH and Websockets).  You want to stick them behind a web server like Nginx so that they can be accessed via HTTP - let's say that you want a browser to be able to communicate with those subsystems for some reason.  Or, more likely, you have a web application that needs to communicate with them in the same way (because Javascript).  Assuming that the above features are already enabled in Prosody, you would put something like this in one of your Nginx config files - for, let's say for the sake of argument, alice.example.com:

...
    location /http-bind {
        proxy_pass http://localhost:5280/http-bind;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_buffering off;
        tcp_nodelay on;
    }
    location /xmpp-websocket {
        proxy_pass http://localhost:5280/xmpp-websocket;
        proxy_http_version 1.1;
        proxy_set_header Connection "Upgrade";
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_read_timeout 900s;
    }
...

location is the part of the URL space Nginx knows it has resources for.  proxy_pass tells Nginx that, whenever something tries to access that part of the URL (https://alice.example.com/http-bind or https://alice.example.com/xmpp-websocket), it should transparently proxy the connection to the given URL (http://localhost:5280/http-bind or /xmpp-websocket, depending) and forward responses back to the client.

But what if you did something a bit less sensible, like put the client on a different host?
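
Here's where the title of this post comes in.  Suppose Prosody actually lives on a different machine on your internal network, and it only answers for the virtual host alice.example.com.  Instead of passing along whatever Host header the client sent ($host), you can hardcode one.  A hypothetical stanza (the backend IP address is made up, of course):

    location /http-bind {
        # Prosody lives on a different box on the internal network...
        proxy_pass;
        # ...and it expects requests for alice.example.com, so spoof the
        # Host header instead of passing through whatever the client sent
        proxy_set_header Host alice.example.com;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_buffering off;
        tcp_nodelay on;
    }

As far as Prosody can tell, every request that comes through this stanza was addressed to alice.example.com, no matter what hostname the client actually used to reach Nginx.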

Integrating Huginn with a Matrix server.

Jan 19 2020

Throughout this series I've shown you how to set up a Matrix server and client using Synapse and Riot, and how to make it much more robust as a service by integrating a database server and a mechanism for making VoIP more reliable.  Now we'll wrap it up by doing something neat: building a simple agent network in Huginn to post what I'm listening to into a Matrix room.  I have an account on libre.fm that my media players log to, which we'll be using as our data source.  Of course, this is only a demonstration of the basic technique; in theory you can plug whatever you want into a Matrix server, because the API was designed to be extensible.

We're going to assume that you've already set up a Matrix server and have an account on it, and that you have access to a working Huginn install.
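
The business end of the agent network is nothing more exotic than an HTTP request to the Matrix client-server API.  Here's a rough sketch of the options for a Huginn Post Agent that does it - the homeserver name, room ID, access token, and the {{ artist }} and {{ title }} fields from the incoming event are all hypothetical placeholders:

    {
      "post_url": "https://matrix.example.com/_matrix/client/r0/rooms/!abcdefghij:example.com/send/m.room.message?access_token=YOUR_ACCESS_TOKEN",
      "method": "post",
      "content_type": "json",
      "payload": {
        "msgtype": "m.text",
        "body": "Now playing: {{ artist }} - {{ title }}"
      },
      "expected_receive_period_in_days": "1"
    }

Strictly speaking the Matrix spec wants a PUT with a transaction ID on that endpoint, but in my experience Synapse also accepts a plain POST without one, which is much easier to do from a Post Agent.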

Making a Matrix server STUN-enabled.

Jan 18 2020

Previously in this series I showed you how to migrate a Matrix server to use Postgres, a database server designed for heavy workloads, such as those of a busy chat server.  This time around I'll demonstrate how to integrate Synapse with a STUN/TURN server to make the voice and video conferencing features of the Matrix network more reliable.  It's remarkably easy to do but it does take a little planning.  Here's why I recommend doing this:

If you are reading this, chances are you're behind a NATting firewall, which means that your device doesn't have a publicly routable IP address.  In addition to rewriting all of your network traffic so that it doesn't look like it's coming from a private network, the firewall is also doing port forwarding to pass inbound traffic (not least of which are replies from web servers) back to your device, again so it doesn't look like you're behind a firewall.  This works just ducky with TCP traffic, because TCP sets up bidirectional connections; every TCP packet is acknowledged, which has the additional effect of letting the firewall keep track of the connection.

VoIP traffic, on the other hand, tends to use UDP, which is not connection-oriented.  One way to look at UDP is as a fire-and-forget protocol: the packet gets launched toward its destination, and it may or may not arrive depending upon network core weather patterns, luck, the phase of the moon... packets may also not necessarily arrive in the correct order.  It's an inherently unreliable protocol.  This is also what makes it useful for streaming traffic like audio or video, because it's inherently low latency.  If you've ever been on a call and heard it break up or go into robot mode (or, for that matter, seen a television program glitch out), this is probably what happened.  The occasional glitchout is the price you pay for a relatively snappy data stream.
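
To preview where this is going: once you have a STUN/TURN server running (Coturn is a popular choice), pointing Synapse at it only takes a few lines in homeserver.yaml.  The hostname and shared secret below are placeholders:

    # homeserver.yaml - tell Synapse where the STUN/TURN server lives
    turn_uris:
      - "turn:turn.example.com:3478?transport=udp"
      - "turn:turn.example.com:3478?transport=tcp"
    # must match the static-auth-secret in Coturn's turnserver.conf
    turn_shared_secret: "some-long-random-string"
    # how long the TURN credentials Synapse hands out are good for,
    # in milliseconds (this is one day)
    turn_user_lifetime: 86400000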

Converting a Matrix server to use Postgres.

Jan 12 2020

In my last post about the Matrix network I covered how to set up a public Synapse server as well as a web-based client called Riot.  In so doing I left out a part of the process for the sake of clarity (it's a hefty procedure, and there's no reason not to break it down into logical modules): using a database back-end that's designed for workloads above and beyond what SQLite was meant for.  I'll be the first to tell you that I'm not a database professional; I don't know a whole lot about using or administering them aside from installing stuff that uses them, so there's probably a lot I'm leaving out.  I'd love to talk to some folks who are more knowledgeable than I am.

That said, the only other database server that Synapse supports is Postgres, which has a user interface very different from the one I use most often (MySQL), so the same procedures don't apply.  I'm going to do my best to break things down.  To keep things clear I will not be referring back or comparing to MySQL, because that would needlessly muddy already unclear waters.
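
In broad strokes, the migration looks like this: create a dedicated Postgres role and database for Synapse (it needs the C collation, which is why the database is built from template0), shut Synapse down, and run the conversion script that ships with it.  A sketch - the password is whatever you choose at the prompt, and the file locations below are typical of Debian-style packages, so yours may differ:

    # as the postgres user: create a role and a database for Synapse
    createuser --pwprompt synapse
    psql -c "CREATE DATABASE synapse ENCODING 'UTF8' LC_COLLATE='C' \
        LC_CTYPE='C' template=template0 OWNER synapse;"

    # with Synapse stopped, convert the existing SQLite database; the
    # script reads the Postgres credentials out of homeserver.yaml
    synapse_port_db --sqlite-database /var/lib/matrix-synapse/homeserver.db \
        --postgres-config /etc/matrix-synapse/homeserver.yaml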

Setting up a private Matrix server.

Jan 11 2020

EDIT - 20200804 - Updated the Nginx stanzas because newer versions of Certbot do all the work of setting up SSL/TLS support for you, including the most basic Nginx settings.  If you still have the old stanzas in place you'll run into trouble unless you delete them or comment them out.  Also, Certbot centralizes all of the appropriate SSL configuration and hardening settings into a single includable file (/etc/letsencrypt/options-ssl-nginx.conf) for ease of maintenance.
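
For reference, a Certbot-managed Nginx server block ends up looking something like this - the hostname is a placeholder, and the lines Certbot adds for you are the ones it marks with its own comments:

    server {
        server_name matrix.example.com;
        listen 443 ssl; # managed by Certbot
        ssl_certificate /etc/letsencrypt/live/matrix.example.com/fullchain.pem; # managed by Certbot
        ssl_certificate_key /etc/letsencrypt/live/matrix.example.com/privkey.pem; # managed by Certbot
        # all of the SSL hardening settings live in one includable file
        include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
        ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
    }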

A couple of years ago I spent some time trying to set up Matrix, a self-hosted instant messaging and chat system that works a little like Jabber, a little like IRC, a little like Discord and a little like Slack.  The idea is that anyone can set up their own server which can federate with other servers (in effect making a much larger network), and it can be used for group chat or one-on-one instant messaging.  Matrix also has voice and video conferencing capabilities so you could hold conference calls over the network if you wanted.  For example, one possible use case I have in mind is running games over the Matrix network.  You could even build more exotic forms of conferencing on top of Matrix if you wanted to.  Even more handy is that the Matrix protocol supports end-to-end encryption of message traffic between everyone in a channel as well as between private chats between pairs of people.  If you turn encryption on in a channel it can't be turned off; you'd have to delete the channel entirely (which would then cause the chat history to be purged).

Chat history is something that was a stumbling block in my threat model the last time I ran a Matrix server, somewhen in 2016.  Things have changed quite a bit since then.  For usability's sake Matrix servers store chat history in their databases, in part as a synchronization mechanism (channels can exist across multiple servers at the same time) and in part to provide a history that users can search through to find stuff, especially if they've just joined a channel.  For some applications, like collaboration inside a company, this can be a good thing (and, in fact, may be legally required).  For other applications (like a bunch of sysadmins venting in a back channel), not so much.

This is why Matrix has three mechanisms for maintaining privacy: end-to-end encryption of message traffic (of entire channels as well as private chats), peer-to-peer voice and video using WebRTC (meaning that there is no server that can record the traffic; it merely facilitates the initial connection), and deleting the oldest chat logs from the back-end database.  While it is true that there is no guarantee that other servers are also rotating out their message databases, end-to-end encryption helps ensure that only someone who was in the channel would have the keys to decrypt any of it.  It also seems feasible to set up Matrix channels such that all of the users are on a single server (such as an internal chat), which means that the discussion will not be federated to other servers.  Channels can also be made invite-only to limit who can join them.  Additionally, who can see a channel's history, and how much of it, can be set on a per-channel basis.

For the record, on the server I built for writing this article the minimum lifetime of conversation history is one calendar day, and the maximum lifetime is seven calendar days.  If I could I'd set it to Signal's default of "delete everything before the last 300 messages," but Synapse doesn't support that, so I tried to split the difference between usability and privacy (maybe I should file a pull request?).  A maintenance mole crawls through the database once every 24 hours and deletes the oldest stuff.  I could probably make it run more frequently than that, but I don't yet know what kind of performance impact that would have.
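
If you're curious, that policy boils down to a stanza in homeserver.yaml shaped roughly like this.  Message retention support was experimental when I set this up, so treat it as a sketch and check the sample config for your Synapse version:

    # homeserver.yaml - rotate out the oldest conversation history
    retention:
      enabled: true
      default_policy:
        min_lifetime: 1d
        max_lifetime: 7d
      purge_jobs:
        # the maintenance mole: sweep the database once a day
        - interval: 24h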

One of the things I'm going to do in this article is gloss over the common fiddly stuff.  I'm not going to explain how to create an account on a server because I'm going to assume that you know how to look up instructions for doing that.  Hell, I google it from time to time because I don't do it often.  I'm also going to break this process up into a couple of articles.  This one will give you a basic, working install of Synapse (a minimum viable server, if you like).  I also won't go over how to install Certbot (the Let's Encrypt client) to get SSL certificates even though it's a crucial part of the process.  I will explain how to migrate Synapse's database off of SQLite and over to Postgres for better performance in a subsequent article.  For what it's worth I have next to no experience with Postgres, so I'm figuring it out as I go along.  Seasoned Postgres admins will no doubt have words for me.  After that I'll talk about how to make Matrix's VoIP functionality work a little more reliably by installing a STUN server on the same machine.  Later, I'll go over a simple integration of Huginn with a Matrix server (because you just know it's not a technical article unless I bring Huginn into it).

A piece of advice: Don't try to go public with a Matrix server all at once.  The instructions are complex and problematic in places, so this article is written from my notes.  Take your time.  If you rush it you will screw it up, just like I did.  Get what you need working, then move on to the next bit in a day or so.  There's no rush.

Rigging up Raspbian Buster to run on a Pi-Top.

Jan 06 2020

It doesn't seem that long ago that I put together a Pi-Top and started tricking it out to use as a backup system.  It was problematic in some important ways (the keyboard's a bit wonky), but most of all the supported respin of Raspbian for use with the Pi-Top was really, really slow and a bit fragile.  While Windbringer was busy doing a full backup last week I took my Pi-Top for a spin while out and about, and to be blunt it was too bloody slow to use.  At first I figured that the microSD card I was using for the boot device was one of the lower-quality ones that bogs down once in a while, but that turned out not to be the case.  Out of desperation I started looking into possibly upgrading the RasPi in that particular shell to the latest and greatest version, which I happen to have received as a Yule gift last year.  Lo and behold, I was not the only person to think along these lines. (local mirror)  While the article in question talked at some length about the hardware challenges involved (mostly due to the different arrangement of connectors) the software part was the most valuable to me because it answered, concretely and concisely, how to get unmodified Raspbian working with a Pi-Top's unusual control hardware.  So that this information doesn't get lost in the ether I'm going to write up what I did.

Experimenting with btrfs in production.

Oct 19 2019

EDIT - 20200311 @ 1859 UTC-7 - Added how to replace a dead hard drive in a btrfs pool.

EDIT - 20191104 @ 2057 UTC-7 - Figured out how long it takes to scrub 40TB of disk space.  Also did a couple of experiments with rebalancing btrfs and monitored how long it took.

A couple of weeks ago, while working on Leandra, I started feeling more and more dissatisfied with how I had her storage array set up.  I had a bunch of 4TB hard drives inside her chassis glued together with Linux's mdadm subsystem into what amounts to a mother-huge hard drive (a RAID-5 array with a hotspare in case one blew out), with LVM on top of that, which let me pretend that I was partitioning that mother-huge hard drive so I could mount large-ish pieces of it in different places.  The thing is, while you can technically resize those virtual partitions (logical volumes) to reallocate space, it's not exactly easy.  There's a lot of fiddly stuff that you have to do (shrink the filesystem you're taking space from, shrink its logical volume to match, grow the logical volume that needs the space, then grow its filesystem, all while making sure you actually have enough free space) and it gets annoying in a crisis.  There was a second concern: figuring out which drive was the one that blew out when none of them were labelled, or even had indicators of any kind that showed which drive was doing something (like throwing errors because it had crashed).  This was a problem that required fairly major surgery to fix, on both the hardware and software fronts.
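
To make the fiddliness concrete, here's what moving a terabyte from one logical volume to another looks like under a setup like the old one.  The volume group name, mount points, and sizes are hypothetical, and this assumes ext4 (which can't be shrunk while mounted):

    # shrink the donor: filesystem first, then the logical volume.
    # getting these two sizes wrong relative to each other eats data,
    # which is why lvreduce grew a --resizefs flag to do both at once.
    umount /home
    e2fsck -f /dev/storage/home
    resize2fs /dev/storage/home 3T
    lvreduce -L 3T /dev/storage/home
    mount /home

    # grow the recipient: logical volume first, then the filesystem
    # (growing ext4 works fine while it's mounted; with no size given,
    # resize2fs grows the filesystem to fill the volume)
    lvextend -L +1T /dev/storage/opt
    resize2fs /dev/storage/opt

Now imagine doing that at 03:00 with a full /var and a pager going off, and you can see why I wanted something better.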

By the bye, the purpose of this post isn't to show off how clever I am or brag about Leandra.  This is one part the kind of tutorial I wish I'd had when I was first starting out, and I hope that it helps somebody wrap their mind around some of the more obscure aspects of system administration.  This post is also one part cheatsheet, both for me and for anyone out there in a similar situation who needs to get something fixed in a hurry, without a whole lot of trial and error.  If deep geek porn isn't your thing, feel free to close the tab; I don't mind (but keep it in mind if you know anyone who might need it later).