Apr 11 2020
UPDATE: 20200426 - Fixed the "pruned oldest snapshots" command.
A couple of years back I did a how-to about using a data backup utility called Duplicity to make offsite backups of Leandra to Backblaze B2. (referrer link) It worked just fine; it was stable, it was easy to script, you knew what it was doing. But over time it started to show its warts, as everything does. For starters, it was unusually slow compared to running the implementation of rsync that Duplicity uses on its own. I spent some time digging into it and benchmarking as many functional modules as I could, and none of them turned out to be the cause. The bottleneck also didn't seem to be my network link, as much as I may complain about DSL from AT&T. Even upgrading Leandra's network interface didn't really fix the issue. Encryption before upload is a hard requirement for me, but upon investigation that didn't seem to be bogging backup runs down either. I even thought the somewhat lesser read performance of RAID-5 on Leandra's storage array might have been adding up, which is one of the reasons I started using RAID-1 when I upgraded her to btrfs. That didn't seem to make a difference, either.
Ultimately I decided that Duplicity was just too slow for my needs. Initial full backups aside (because uploading everything to offsite storage always sucks), it really shouldn't take three hours to do an incremental backup of at most 500 megabytes (out of over 30 terabytes). On top of that, Duplicity's ability to rotate out the oldest backups... just doesn't seem to work. I wasn't able to clean anything up automatically or manually. Even after making a brand-new full backup (which I try to do yearly regardless of how much time it takes) I wasn't able to coax Duplicity into rotating out the oldest increments and had to delete the B2 bucket manually (later, of course). So I did some asking around the Fediverse and listed my requirements. Somebody (I don't remember whom, sorry) turned me on to Restic because they use it on their servers in production. I did some research and decided to give it a try.
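To put those numbers in perspective, here is a rough back-of-the-envelope sketch of the average transfer rate they imply. The function name is hypothetical, and it assumes decimal megabytes and the worst-case figures quoted above (500 megabytes in three hours); real runs would vary.

```python
# Back-of-the-envelope: average throughput of one incremental backup run.
# Assumptions: decimal megabytes, and the worst-case figures from the post.
INCREMENT_BYTES = 500 * 10**6   # "at most 500 megabytes"
RUN_SECONDS = 3 * 60 * 60       # "three hours"


def effective_throughput(num_bytes: int, seconds: int) -> float:
    """Return the average transfer rate in bytes per second."""
    return num_bytes / seconds


rate = effective_throughput(INCREMENT_BYTES, RUN_SECONDS)
print(f"{rate / 1000:.1f} kB/s")
```

That works out to a few tens of kilobytes per second on average, which is far below what even a modest DSL uplink can usually manage and fits with the bottleneck being somewhere other than the network link.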
Jan 14 2018
UPDATE: 20191229 - Added how to rotate out the oldest backups.
As frequent readers may or may not remember, I rebuilt my primary server last year, and in the process set up a fairly hefty RAID-5 array (24 terabytes) to store data. As one might reasonably expect, backing all of that stuff up is fairly difficult. I'd need to buy enough external hard drives to fit a copy of everything on there, plus extra space to store incremental backups for some length of time. Another problem is that both Leandra and the backup drives would be in the same place at the same time, so if anything happened at the house I'd not only not have access to Leandra anymore, but there's an excellent chance that the backups would be wrecked, leaving me doubly screwed.
Here are the requirements I had for making offsite backups:
- Backups of Leandra had to be offsite, i.e., not in the same state, ideally not on the same coast.
- Reasonably low cost. I ran the numbers on a couple of providers and paying a couple of hundred dollars a month to back up one server was just too expensive.
- Linux friendly.
- My data gets encrypted with a key only I know before it gets sent to the backup provider.
- A number of different backup applications had to support the provider, in case one was no longer supported.
- Easy to restore data from backup.
After a week or two of research and experimentation, as well as pinging various people to get their informed opinions, I decided to go with Backblaze as my offsite backup provider, and Duplicity as my backup software. Here's how I went about it, as well as a few gotchas I ran into along the way.
Nov 13 2016
A couple of weeks ago my webhosting provider sent me a polite e-mail to inform me that I was using too much disk space. A cursory examination of their e-mail showed that they were getting upset about the daily backups of my site that I was stashing in a hidden directory, and they really prefer that all files in your home directory be accessible. I ran a quick check and, sure enough, about twenty gigabytes times two weeks of daily backups adds up to a fair amount of disk space. So, the question is, how do I keep backing up all my stuff and not bother the admins any more than I have to?
Thankfully, that's a fairly straightforward operation. Beneath the cut is how I did it.