Setting up a private Matrix server.

Jan 11 2020

A couple of years ago I spent some time trying to set up Matrix, a self-hosted instant messaging and chat system that works a little like Jabber, a little like IRC, a little like Discord and a little like Slack.  The idea is that anyone can set up their own server which can federate with other servers (in effect making a much larger network), and it can be used for group chat or one-on-one instant messaging.  Matrix also has voice and video conferencing capabilities so you could hold conference calls over the network if you wanted.  For example, one possible use case I have in mind is running games over the Matrix network.  You could even build more exotic forms of conferencing on top of Matrix if you wanted to.  Even more handy is that the Matrix protocol supports end-to-end encryption of message traffic between everyone in a channel as well as in private chats between pairs of people.  If you turn encryption on in a channel it can't be turned off; you'd have to delete the channel entirely (which would then cause the chat history to be purged).

Chat history is something that was a stumbling block in my threat model the last time I ran a Matrix server, somewhen in 2016.  Things have changed quite a bit since then.  For usability Matrix servers store chat history in their database, in part as a synchronization mechanism (channels can exist across multiple servers at the same time) and in part to provide a history that users can search through to find stuff, especially if they've just joined a channel.  For some applications, like collaboration inside a company, this can be a good thing (and in fact, may be legally required).  For other applications (like a bunch of sysadmins venting in a back channel), not so much.  This is why Matrix has three mechanisms for maintaining privacy: end-to-end encryption of message traffic (of entire channels as well as private chats), peer-to-peer voice and video using WebRTC (meaning that there is no server that can record the traffic, it merely facilitates the initial connection), and deleting the oldest chat logs from the back-end database.  While it is true that there is no guarantee that other servers are also rotating out their message databases, end-to-end encryption helps ensure that only someone who was in the channel would have the keys to decrypt any of it.  It also seems feasible to set up Matrix channels such that all of the users are on a single server (such as an internal chat) which means that the discussion will not be federated to other servers.  Channels can also be made invite-only to limit who can join them.  Additionally, who can see a channel's history, and how much of it, can be set on a per-channel basis.

For the record, on the server I built for writing this article the minimum lifetime of conversation history is one calendar day, and the maximum lifetime of conversation history is seven calendar days.  If I could I'd set it to Signal's default of "delete everything before the last 300 messages" but Synapse doesn't support that, so I tried to split the difference between usability and privacy (maybe I should file a pull request?).  A maintenance mole crawls through the database once every 24 hours and deletes the oldest stuff.  I could probably make it run more frequently than that but I don't yet know what kind of performance impact that would have.
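To make the retention idea concrete, here is a minimal Python sketch of what a once-a-day purge pass does with a seven-day maximum lifetime.  This is not Synapse's code or its configuration format; the function name, the (timestamp, body) tuple layout and the constants are all made up for illustration, and the one-day minimum lifetime isn't modeled at all.

```python
from datetime import datetime, timedelta, timezone

# Assumed values, taken from the numbers in the article (not from Synapse).
MAX_LIFETIME = timedelta(days=7)     # oldest history we are willing to keep
PURGE_INTERVAL = timedelta(hours=24)  # how often the "maintenance mole" runs


def purge_expired(events, now, max_lifetime=MAX_LIFETIME):
    """Return only the events that are no older than max_lifetime.

    `events` is a list of (timestamp, body) tuples, a hypothetical
    stand-in for rows in the server's message table.
    """
    cutoff = now - max_lifetime
    return [(ts, body) for ts, body in events if ts >= cutoff]


# Example run: one message is eight days old, two are newer.
now = datetime(2020, 1, 11, tzinfo=timezone.utc)
events = [
    (now - timedelta(days=8), "ancient venting"),
    (now - timedelta(days=6), "last week's venting"),
    (now - timedelta(hours=2), "fresh venting"),
]
kept = purge_expired(events, now)
```

Because the pass only runs once per PURGE_INTERVAL, a message can outlive the seven-day limit by up to a day before it's actually removed, which is part of why the run frequency matters.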

One of the things I'm going to do in this article is gloss over the common fiddly stuff.  I'm not going to explain how to create an account on a server because I'm going to assume that you know how to look up instructions for doing that.  Hell, I google it from time to time because I don't do it often.  I'm also going to break this process up into a couple of articles.  This one will give you a basic, working install of Synapse (a minimum viable server, if you like).  I also won't go over how to install Certbot (the Let's Encrypt client) to get SSL certificates even though it's a crucial part of the process.  I will explain how to migrate Synapse's database off of SQLite and over to Postgres for better performance in a subsequent article.  For what it's worth I have next to no experience with Postgres, so I'm figuring it out as I go along.  Seasoned Postgres admins will no doubt have words for me.  After that I'll talk about how to make Matrix's VoIP functionality work a little more reliably by installing a STUN server on the same machine.  Later, I'll go over a simple integration of Huginn with a Matrix server (because you just know it's not a technical article unless I bring Huginn into it).

A piece of advice: Don't try to go public with a Matrix server all at once.  The instructions are complex and problematic in places, so this article is written from my notes.  Take your time.  If you rush it you will screw it up, just like I did.  Get what you need working, then move on to the next bit in a day or so.  There's no rush.

Neologism: Faraday roundtable

Oct 22 2018

Faraday roundtable - noun phrase - A meeting conducted entirely offline.  All portable devices and computers are powered down, and ideally locked inside conductive and grounded containers to prevent radio transmissions from reaching or being emitted from same.  Similarly, no active computers are permitted at the meeting.  The proceedings of such a meeting are carried out under the Chatham House Rule.

Named for the Faraday cage.

Building your own Google Alerts with Huginn and Searx.

Sep 30 2017

A Google feature that doesn't ordinarily get a lot of attention is Google Alerts, which is a service that sends you links to things that match certain search terms on a periodic basis.  Some people use it for vanity searching because they have a personal brand to maintain, some people use it to keep on top of a rare thing they're interested in (anyone remember the show Probe?), some people use it for bargain hunting, some people use it for intel collection... however, this is all predicated on Google finding out what you're interested in, certainly interested enough to have it send you the latest search results on a periodic basis.  Not everybody's okay with that.

A while ago, I built my own version of Google Alerts using a couple of tools already integrated into my exocortex which I use to periodically run searches, gather information, and compile reports to read when I have a spare moment.  The advantage to this is that the only entities that know about what I'm interested in are other parts of me, and it's as flexible as I care to make it.  The disadvantage is that I have some infrastructure to maintain, but as I'll get to in a bit there are ways to mitigate the amount of effort required.  Here's how I did it...
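As a rough idea of what the moving parts look like, here's a minimal Python sketch of the "run a search, keep only what's new" loop.  It is not the Huginn agents I actually used; it just assumes a Searx instance with the JSON output format enabled, reachable at a made-up address (searx.example), and that the response carries a "results" list whose entries have "url" and "title" fields.  The function names are hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, time_range="day"):
    """Build a Searx search URL asking for JSON results from the last day."""
    params = {"q": query, "format": "json", "time_range": time_range}
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


def new_results(response, seen_urls):
    """Pick out results we haven't reported before.

    `response` is the already-decoded JSON body from Searx (a dict).
    `seen_urls` is a set that gets updated in place, so calling this again
    with the same response returns nothing.
    """
    fresh = []
    for item in response.get("results", []):
        url = item.get("url")
        if url and url not in seen_urls:
            seen_urls.add(url)
            fresh.append({"title": item.get("title", ""), "url": url})
    return fresh


# Example with a canned response instead of a live HTTP request.
seen = set()
canned = {"results": [{"title": "Probe (1988)", "url": "https://example.org/probe"}]}
first_pass = new_results(canned, seen)   # one new hit
second_pass = new_results(canned, seen)  # nothing new the second time
```

Fetching the URL on a schedule and mailing out whatever new_results() returns is the part Huginn handles in my setup.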

What the loss of the Internet Privacy Bill means to you and I.

Mar 30 2017

It's probably popped up on your television screen that the Senate and then the House of Representatives (the latter earlier this week, by 215 to 205) voted to repeal an Internet privacy bill passed last year.  In case you're curious, here's a full list of every Senator and Representative who voted to repeal the bill and how much they received specifically from the telecom lobby right before voting. (local mirror)  By the way, if you would like to contact those Senators (local mirror) or Representatives (local mirror) here's how you can do so... When the bill hits Trump's desk it's a foregone conclusion that he's going to sign it.  Some of the talking heads are expressing concern about this, while others are cheering that the removal of this regulation is an all-around win for the market, blah blah blah... but what does this actually mean for you?

First of all, if you're reading this, welcome to the Internet.  You're soaking in it.

Second of all, please read this blog post (local mirror) by the EFF.  Just a few years ago, a couple of very large ISPs (that you're probably a customer of) got caught doing things like monitoring your web searches and hijacking them with different results they were paid to insert and analyzing your net.traffic to figure out what advertisements to inject in realtime.  The bill that just got repealed put a stop to all of that.

I've spoken to a couple of people who expressed disbelief that such a thing was possible.  In point of fact, intercepting and meddling with communications traffic goes back a very long way.  In 1994 a bill called the Communications Assistance for Law Enforcement Act (CALEA) was passed and codified as 47 USC 1001-1010.  In a nutshell, what this law means is that just about every kind of network-side communications device, from the telephony switches that route your phone calls to the carrier class routers that make up the network core, has surveillance capability built in by its manufacturer.  In theory, only law enforcement agents with warrants are supposed to be able to use it.  In practice, it's used all the time by employees of the companies that own that equipment to silently troubleshoot problems before they get too out of hand, and yes, it gets abused all the time for petty shit.  As you may have guessed already, the moment that CALEA-compliant equipment was deployed back in the day hackers immediately figured out how to use it more effectively than even the telecom companies, and silently eavesdropping on people using that functionality was a common "This is how 1337 I am" stunt.  So, please keep in mind that this "monitor all the customers" infrastructure is going to be badly abused and constitutes one hell of a security risk.

CALEA is regularly updated as communications technology evolves, and now encompasses things like the backbone of the Net, Voice-over-IP telephony, cellular telephony and companies whose business happens to be running wireless hotspots.  As it so happens, much of this functionality is perfect for monitoring customers' traffic, analyzing it, and packaging it for sale as large bundles of anonymized information or as discrete dossiers, à la Cambridge Analytica.  Let me paint you a picture, based in part on how things worked before that bill was passed originally...