Setting up a private Matrix server.

Jan 11 2020

A couple of years ago I spent some time trying to set up Matrix, a self-hosted instant messaging and chat system that works a little like Jabber, a little like IRC, a little like Discord and a little like Slack.  The idea is that anyone can set up their own server which can federate with other servers (in effect making a much larger network), and it can be used for group chat or one-on-one instant messaging.  Matrix also has voice and video conferencing capabilities so you could hold conference calls over the network if you wanted.  For example, one possible use case I have in mind is running games over the Matrix network.  You could even build more exotic forms of conferencing on top of Matrix if you wanted to.  Even more handy is that the Matrix protocol supports end-to-end encryption of message traffic between everyone in a channel as well as between private chats between pairs of people.  If you turn encryption on in a channel it can't be turned off; you'd have delete the channel entirely (which would then cause the chat history to be purged).

Chat history is something that was a stumbling block in my threat model the last time I ran a Matrix server, somewhen in 2016.  Things have changed quite a bit since then.  For usability Matrix servers store chat history in their database, in part as a synchronization mechanism (channels can exist across multiple servers at the same time) and in part to provide a history that users can search through to find stuff, especially if they've just joined a channel.  For some applications, like collaboration inside a company this can be a good thing (and in fact, may be legally required).  For other applications (like a bunch of sysadmins venting in a back channel), not so much.  This is why Matrix has three mechanisms for maintaining privacy: End to end encryption of message traffic (of entire channels as well as private chats), peer-to-peer voice and video using WebRTC (meaning that there is no server that can record the traffic, it merely facilitates the initial connection), and deleting the oldest chat logs from the back-end database.  While it is true that there is no guarantee that other servers are also rotating out their message databases, end-to-end encryption helps ensure that only someone who was in the channel would have the keys to decrypt any of it.  It also seems feasible to set up Matrix channels such that all of the users are on a single server (such as an internal chat) which means that the discussion will not be federated to other servers.  Channels can also be made invite-only to limit who can join them.  Additionally, who can see a channel's history and how much of it can be set on a by-channel basis.

For the record, on the server I built for writing this article the minimum lifetime of conversation history is one calendar day, and the maximum lifetime of conversation history is seven calendar days.  If I could I'd set it to Signal's default of "delete everything before the last 300 messages" but Synapse doesn't support that so I tried to split the difference between usability and privacy (maybe I should file a pull request?)  A maintenance mole crawls through the database once every 24 hours and deletes the oldest stuff.  I could probably make it run more frequently than that but I don't yet know what kind of performance impact that would have.

One of the things I'm going to do in this article is gloss over the common fiddly stuff.  I'm not going to explain how to create an account on a server because I'm going to assume that you know how to look up instructions for doing that.  Hell, I google it from time to time because I don't do it often.  I'm also going to break this process up into a couple of articles.  This one will give you a basic, working install of Synapse (a minimum viable server, if you like).  I also won't go over how to install Certbot (the Let's Encrypt client) to get SSL certificates even though it's a crucial part of the process.  I will explain how to migrate Synapse's database off of SQLite and over to Postgres for better performance in a subsequent article.  For what it's worth I have next to no experience with Postgres, so I'm figuring it out as I go along.  Seasoned Postgres admins will no doubt have words for me.  After that I'll talk about how to make Matrix's VoIP functionality work a little more reliably by installing a STUN server on the same machine.  Later, I'll go over a simple integration of Huginn with a Matrix server (because you just know it's not a technical article unless I bring Huginn into it).

A piece of advice: Don't try to go public with a Matrix server all at once.  The instructions are complex and problematic in places, so this article is written from my notes.  Take your time.  If you rush it you will screw it up, just like I did.  Get what you need working, then move on to the next bit in a day or so.  There's no rush.

Generating passwords.

May 20 2018

A fact of life in the twenty-first century are data breaches - some site or other gets pwned and tends to hundreds of gigabytes of data get stolen.  If you're lucky just the usernames and passwords for the service have been taken; if you're not, credit card and banking information has been exfiltrated.  Good times.

You've probably wondered why stolen passwords are dangerous.  There are a few reasons for this: The first is that people tend to re-use passwords on multiple sites or services.  Coupled with the fact that many online services use e-mail addresses as usernames, this means that all someone has to do is try to log into... well, everything.. with those stolen credentials and see which ones work.  The second is that attackers now have lists of passwords that people actually use, and not huge dictionaries of potential passwords assembled for completeness.  This means that password cracking attacks can be much more precisely targeted and will probably take less time.

There is no shortage of helpful suggestions for generating passwords that are relatively strong and easy to remember.  The one that I find the most useful is the Diceware technique, which is fairly straightforward.

  • Get a handful of six sided dice.
  • Take a large dictionary of words where each word is numbered, and each number consists only of the digits 1 through 6, i.e., 41524
  • Roll the dice.  Find the word with the corresponding number in the dictionary.
  • Do this until you have a long passphrase.

It's a bit tedious, though.  Of course, people have written their own implementations of Diceware for various platforms and with varying states of usability.  I use plain old diceware on Windbringer, mostly because it's available through the AUR but it lacks a few features that I really find useful.  For one, to mix things up I like to sprinkle numbers over my generated passwords, like so: rerun-anteater-idly-00877-lining-paddling-8283

(No, I don't really use that passphrase anywhere.  Come on.)

So, I decided to write my own Diceware utility in Python.  I wrote it to be as self-contained as possible, which is to say as long as you have Python installed on a system it should run.  The wordlist is built into the utility (which accounts for most of its size) and it's as easy to use as I can make it.  I deliberately did not make some options I prefer defaults because I wanted it to be as helpful to people as possible.  Per GNU standard, running ./diceware.py --help will print the online help.  It's also open source so feel free to use it anywhere you like.  I've tested it on Arch Linux and Mac OSX, and I don't see any reason why it wouldn't work on, say, Ubuntu or Raspbian.

Share and enjoy!