If Microsoft buys Github, there are alternatives.

Jun 03 2018

If you're plugged into the open source or business communities to any degree, you've probably heard buzz that Microsoft is considering buying Github, an online service that has a history of a toxic work environment due to pervasive sexual harassment but remains the de facto core of collaboration in the open source community: source code hosting, ticket tracking, archival, release management, documentation, project webpage hosting, and generally learning how to use the Git version control system.  At this point it's unclear whether they're considering merely investing in the company (currently valued in the neighborhood of US$5 billion) or buying it outright, the way they did LinkedIn.  Github is certainly an attractive property for Microsoft to consider: The service currently has something like 23 million user accounts and 1.5 million organizations.  I don't think anybody's tried to count the lines of code that Github stores and serves copies of.  It's been observed that Microsoft seems to be carrying out a strategy of controlling as many of the access points to the tech job market as possible.  Not only is Github a highly useful service for managing software projects, but if you're trying to get a job in a technical field, having a Github account and a couple of repositories is practically a prerequisite.

There's also the issue that at least some parts of Microsoft have no qualms about stealing things they think will be useful and filing the identifying features off (local mirror), and fuck the license.  By this, I refer to Learna.  But now I'm getting a little off-track.

As one might imagine, once word got around people began expressing their intention to bail on Github if the takeover went through.  Note that there are alternatives to Github which not only have many of the same features but are self-hosted, meaning that all you need to do is get an inexpensive virtual machine someplace, install the package, set up backups (you DO back your stuff up, right?), pull your stuff out of Github (easy to do because just about everything is a Git repository), and then push it all back up to your new server.  This is possible because when you clone a Git repository, you get the entire history of the repo - every change ever made, from the very first commit onward, gets copied to your workstation.  This means that if you then do a `git push` to a new repository (with `--all` and `--tags`, or `--mirror`, so that every branch and tag goes along and not just the one you have checked out), you're effectively making a backup copy of the entire thing to that new remote.  This also means that if there is even a single copy of a Git repository someplace, you can reconstitute the entire project.  This is how I maintain multiple copies of my projects' source code repos simultaneously.  Among these self-hosted alternatives to Github are Gitlab (which is a bit of a bear to maintain, I'm told), Gogs, Gitea, and even Keybase's Git support.
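
The pull-it-out-and-push-it-back-up step can be sketched in a few lines of Python.  This is only a rough illustration, not a migration tool: the function names, URLs, and scratch directory are made up for the example, it assumes the `git` command is installed, and it assumes the destination repository already exists and is empty (a mirror push overwrites whatever refs are there).  It also only moves what lives in Git, so anything a hosting service keeps outside the repository doesn't come along.

```python
import subprocess
from pathlib import Path


def mirror_commands(source_url, dest_url, workdir):
    """Build the two git commands that copy a whole repository.

    `git clone --mirror` fetches every ref (all branches and tags) into a
    bare repository, and `git push --mirror` sends all of those refs to the
    new remote.  The argument names here are hypothetical.
    """
    local = str(Path(workdir) / "mirror.git")
    return [
        ["git", "clone", "--mirror", source_url, local],
        ["git", "-C", local, "push", "--mirror", dest_url],
    ]


def mirror_repository(source_url, dest_url, workdir):
    """Run the commands in order, stopping at the first failure."""
    for command in mirror_commands(source_url, dest_url, workdir):
        subprocess.run(command, check=True)


# Example call (placeholder URLs, not real repositories):
# mirror_repository("https://github.com/someone/project.git",
#                   "https://git.example.org/someone/project.git",
#                   "/tmp/migration")
```

Splitting the command list from the part that runs it makes it easy to print the commands and eyeball them before letting anything touch a remote.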

There is, however, another option that I'd like to talk about a little, which I think would be a good alternative to Github.  It's called Fossil.

Generating passwords.

May 20 2018

Data breaches are a fact of life in the twenty-first century - some site or other gets pwned and hundreds of gigabytes of data tend to get stolen.  If you're lucky just the usernames and passwords for the service have been taken; if you're not, credit card and banking information has been exfiltrated.  Good times.

You've probably wondered why stolen passwords are dangerous.  There are a few reasons for this: The first is that people tend to re-use passwords on multiple sites or services.  Coupled with the fact that many online services use e-mail addresses as usernames, this means that all someone has to do is try to log into... well, everything... with those stolen credentials and see which ones work.  The second is that attackers now have lists of passwords that people actually use, rather than huge dictionaries of potential passwords assembled for completeness.  This means that password cracking attacks can be much more precisely targeted and will probably take less time.

There is no shortage of helpful suggestions for generating passwords that are relatively strong and easy to remember.  The one that I find the most useful is the Diceware technique, which is fairly straightforward.

  • Get a handful of six sided dice.
  • Take a large dictionary of words where each word is numbered, and each number consists only of the digits 1 through 6, e.g., 41524.
  • Roll the dice.  Find the word with the corresponding number in the dictionary.
  • Do this until you have a long passphrase.
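
For the curious, the steps above can be sketched in Python along these lines.  This is a minimal illustration rather than any particular Diceware implementation: the function names are made up, the 36-entry word list is a placeholder keyed by two dice so the example fits on a screen (a real list uses five dice per word), and the "dice" are the operating system's random number source by way of the standard `secrets` module.

```python
import secrets
from itertools import product


def parse_wordlist(lines):
    """Turn 'NUMBER WORD' lines into a dict, skipping anything else."""
    words = {}
    for line in lines:
        parts = line.split()
        # Keep only lines whose first field is made of the digits 1-6.
        if len(parts) == 2 and set(parts[0]) <= set("123456"):
            words[parts[0]] = parts[1]
    return words


def roll_dice(count):
    """Roll `count` six-sided dice and return the faces as a string."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(count))


def passphrase(words, length=6, separator="-"):
    """Look up one word per roll and join them into a passphrase."""
    dice_per_word = len(next(iter(words)))  # key length = dice per word
    return separator.join(words[roll_dice(dice_per_word)] for _ in range(length))


# Placeholder list: 36 made-up entries, one for every roll of two dice.
TOY_LINES = [f"{a}{b} word{a}{b}" for a, b in product("123456", repeat=2)]

if __name__ == "__main__":
    print(passphrase(parse_wordlist(TOY_LINES)))
```

To use a real word list, read its lines from a file and hand them to `parse_wordlist()`; the rest works the same way as long as every key has the same number of digits and every possible roll has an entry.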

It's a bit tedious, though.  Of course, people have written their own implementations of Diceware for various platforms and with varying states of usability.  I use plain old diceware on Windbringer, mostly because it's available through the AUR, but it lacks a few features that I really find useful.  For one, to mix things up I like to sprinkle numbers over my generated passwords, like so: rerun-anteater-idly-00877-lining-paddling-8283

(No, I don't really use that passphrase anywhere.  Come on.)
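
Sprinkling numbers in like that might look something like the following.  This is a sketch of the idea only: the function name, the default of two groups, and the four-to-five digit lengths are assumptions based on the example passphrase above.

```python
import secrets


def sprinkle_numbers(words, groups=2, min_digits=4, max_digits=5):
    """Return a copy of `words` with random digit groups dropped in.

    Each group is inserted at a random position, so the words keep their
    original order relative to one another.
    """
    result = list(words)
    for _ in range(groups):
        size = min_digits + secrets.randbelow(max_digits - min_digits + 1)
        digits = "".join(str(secrets.randbelow(10)) for _ in range(size))
        position = secrets.randbelow(len(result) + 1)
        result.insert(position, digits)
    return result


if __name__ == "__main__":
    chosen = ["rerun", "anteater", "idly", "lining", "paddling"]
    print("-".join(sprinkle_numbers(chosen)))
```

The digit groups are generated separately from the words, so this bolts onto the output of whatever word-picking code you already have.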

So, I decided to write my own Diceware utility in Python.  I wrote it to be as self-contained as possible, which is to say as long as you have Python installed on a system it should run.  The wordlist is built into the utility (which accounts for most of its size) and it's as easy to use as I can make it.  I deliberately did not make some of the options I prefer the defaults because I wanted it to be as helpful to people as possible.  Per GNU standard, running `./diceware.py --help` will print the online help.  It's also open source so feel free to use it anywhere you like.  I've tested it on Arch Linux and Mac OS X, and I don't see any reason why it wouldn't work on, say, Ubuntu or Raspbian.

Share and enjoy!

Neologism: Gitmnesia

Sep 02 2017

gitmnesia - noun - That feeling when you receive an update email about some ticket on Github from a project that you haven't looked at in so long that you don't recognize its name.  Generally a sign that you follow too many projects on Github.