GSCA - acronym, verb - Using grep, sed, cut, and awk on a Linux or UNIX box to chop up, mangle, or otherwise process data on the command line prior to doing anything serious with it. This is not to preclude the use of additional tools (such as sort).
As frequent readers may or may not remember, I rebuilt my primary server last year, and in the process set up a fairly hefty RAID-5 array (24 terabytes) to store data. As one might reasonably expect, backing all of that stuff up is fairly difficult. I'd need to buy enough external hard drives to fit a copy of everything on there, plus extra space to store incremental backups for some length of time. Another problem is that both Leandra and the backup drives would be in the same place at the same time, so if anything happened at the house I'd not only not have access to Leandra anymore, but there's an excellent chance that the backups would be wrecked, leaving me doubly screwed.
Here are the requirements I had for making offsite backups:
- Backups of Leandra had to be offsite, i.e., not in the same state, ideally not on the same coast.
- Reasonably low cost. I ran the numbers on a couple of providers and paying a couple of hundred dollars a month to back up one server was just too expensive.
- Linux friendly.
- My data gets encrypted with a key only I know before it gets sent to the backup provider.
- A number of different backup applications had to support the provider, in case one was no longer supported.
- Easy to restore data from backup.
After a week or two of research and experimentation, as well as pinging various people to get their informed opinions, I decided to go with Backblaze as my offsite backup provider, and Duplicity as my backup software. Here's how I went about it, as well as a few gotchas I ran into along the way.
Let's say there's a website that you want to make a local mirror of. This means that you can refer to it offline, and you can make offline backups of it for archival. Let's further state that you have access to some server someplace with enough disk space to hold the copy, and that you can start a task, disconnect, and let it run to completion some time later, with GNU Screen for example. Let's further state that you want the local copy of the site to not be broken when you load it in a browser; all the links should work, all the images should load, and so forth. One of the quickest and easiest ways to do this is with the wget utility.
A couple of weeks back, somebody I know asked me how I went about deploying SSL certificates from the Let's Encrypt project across all of my stuff. Without going into too much detail about what SSL and TLS are (but here's a good introduction to them), the Let's Encrypt project will issue SSL certificates to anyone who wants one, provided that they can prove somehow that they control what they're cutting a certificate for. You can't use Let's Encrypt to generate a certificate for google.com because they'd try to communicate with the server (there isn't any such thing but bear with me) google.com to verify the request, not be able to, and error out. The actual process is complex and kind of involved (it's crypto so this isn't surprising) but the nice thing is that there are a couple of software packages out there that automate practically everything so all you have to do is run a handful of commands (which you can then copy into a shell script to automate the process) and then turn it into a cron job. The software I use on my systems is called Acme Tiny, and here's what I did to set everything up...
Longtime readers have probably seen the odd post about my getting fed up with Firefox and migrating my workflow (and much of my online data archive) to Chromium, which has been significantly faster if nothing else than Firefox lately. Of course, due to Windbringer's screen resolution I immediately ran into problems with just about every font size being too small, including the text in the URL bar, the menus, and the add-ons that I use. On a lark I went back to my font sizes in Keybase article and give it a try. Lo and behold, when I used --force-device-scale-factor=1.5 it worked - I can see everything now. I could complain about the size of the text in the bookmarks bar, but I'm willing to deal with it because now I can read everything. For the record, here are the contents of my ~/Desktop/chromium.desktop file, so you can do it yourself:
[Desktop Entry] Name=Chromium GenericName=Web Browser Terminal=false Icon=chromium Type=Application Categories=GTK;Network;WebBrowser; MimeType=text/html;text/xml;application/xhtml+xml;text/mml;x-scheme-handler/http; x-scheme-handler/https; Exec=chromium --force-device-scale-factor=1.5 %U
Chrome isn't bad; I have to use it at work (it's the only browser we're allowed to have, enforced centrally). In point of fact, I'd have switched to it a long time ago if it wasn't for one thing. I make heavy use of a plugin for Firefox called Scrapbook Plus, which make it possible to take a full snapshot of a web page and store it locally so that it can be read offline, annotated, and full-text searched. I never count on having connectivity (I live in the United States, after all, and right now my home connection is running quite poorly and has been for several days due to an ongoing situation at my local CO) so I try to keep both essential documentation and reading material in general stored locally for those dry periods. However, there is no port of Scrapbook Plus for Chrome, nor is there a workable equivalent addon for same (I think I've tried them all). I'm not about to do without my traveling hoard of information (which at this time numbers around 10,000 unique web pages and 15 gigabytes of disk space). Out of desperation last night I did some research into how I might be able to speed up Firefox just a little and get more use out of it until I figure out what to do. Here's what I found: