Chrome isn't bad; I have to use it at work (it's the only browser we're allowed to have, enforced centrally). In point of fact, I'd have switched to it a long time ago if it wasn't for one thing. I make heavy use of a plugin for Firefox called Scrapbook Plus, which make it possible to take a full snapshot of a web page and store it locally so that it can be read offline, annotated, and full-text searched. I never count on having connectivity (I live in the United States, after all, and right now my home connection is running quite poorly and has been for several days due to an ongoing situation at my local CO) so I try to keep both essential documentation and reading material in general stored locally for those dry periods. However, there is no port of Scrapbook Plus for Chrome, nor is there a workable equivalent addon for same (I think I've tried them all). I'm not about to do without my traveling hoard of information (which at this time numbers around 10,000 unique web pages and 15 gigabytes of disk space). Out of desperation last night I did some research into how I might be able to speed up Firefox just a little and get more use out of it until I figure out what to do. Here's what I found:
Some time ago I wrote an article about what Keybase is and what it's good for. I also mentioned one of my pet peeves, which is that, by default the fonts used by the Keybase desktop client are way, way too small to see easily on Windbringer. A couple of days ago somebody finally figured out how to blow up the fonts on the desktop, so I can finally see what's going on without putting my nose on the display (and making the mouse cursor jump around because Windbringer has a touchscreen). While I wish that this would be a configuration option in the GUI (or, hell, even a config file) I'll take what I can get. First, some background so everything makes sense...
To keep the complexity of parts of my exocortex down I've opted to not separate everything into larger chunks using popular technologies these days, such as Linux containers (though I did Dockerize the XMPP bridge as an experiment) because there are already quite a few moving parts, and increasing complexity does not make for a more secure or stable system. However, this brings up a valid and important question, which is "How do you restart everything if you have to reboot a server for some reason?"
A valid question indeed. Servers need to be rebooted periodically to apply patches, upgrade kernels, and generally to blow the cruft out of the memory field. Traditionally, there are all sorts of hoops and gymnastics one can go through with traditional initscripts but for home-grown and third party stuff it's difficult to run things from initscripts in such a way that they don't have elevated privileges for security reasons. The hands-on way of doing it is to run a GNU Screen session when you log in and start everything up (or reconnect to one if it's already running). This process, also, can be automated to run when a system reboots. Here's how:
A couple of months ago for my Lesser Feast I decided to treat myself to a toy that I've had my eye on for a couple of months: A Pi-Top laptop kit. My fascination with the Raspberry Pi aside (which includes, to be honest, being able to run a rack full of servers in my office without needing to install a 40U rack and a new 220 power feed), it strikes me as being a very useful thing to have under one's desk as a backup deck or possibly a general purpose software development computer. Most laptops have one unique motherboard per model and if you want to upgrade (or need to replace it) you're pretty much limited to buying a brand-new laptop. To upgrade a Pi-Top you just need to buy a new RaspberryPi, slide a panel aside, and swap a few cables, a system design that I think could be useful indeed. It also has remarkably few components; the screws and fasteners aside, the PiTop is composed of only a few modules: A base with a battery, a keyboard and touchpad panel, a lid with display, a black lexan access panel, a hub circuit board that ties everything together, and a RasPi. You can get a couple of modules to go with it, such as a prototype board for electrical engineering experiments and modular speakers, all of which attach to a sliding rail and plug into a unique pinset on the hub. I'm not an electrical engineer by any means but I have built many a kit over the years, and from eyeballing it it looked like a fairly simple build. I didn't document the build with photographs or anything because I didn't think to do so at the time. Sorry.
We seem to have reached a unique point in history: Available to your average home user are gargantuan amounts of disk space (8 terabyte hard drives are a thing, and the prices are rapidly coming down to widespread affordability) and enough processing power is available for the palm of your hand that makes the computational power that put the human race on the moon compare in the same was that a grain of sand does to a beach. For most people, it's the latest phone upgrade or more space for your media box. For others, though, it poses an unusual challenge: How to make the best use of the hardware without wasting it needlessly. By this, I mean how one might build a server that doesn't result in wasted hard drive space, wasted SATA ports on the mainboard, or having enough room to put all of that lovely (and by "lovely" I really mean "utterly disorganized") data that accumulates without even trying. I mentioned last year that I rebuilt Leandra (specs in here) so I could work on some machine learning and search engine projects. What I didn't mention was that I had some design constraints that I had to follow so that I could get the most out of her.
To get the best use possible out of all of those hard drives I had to figure out how to structure the RAID, where to put the guts of the Arch Linux install, and most importantly figure out how to set everything up so that if Leandra did blow a hard drive the entire system wouldn't be hosed. If I partitioned all of the drives as described here and used one as the /boot and / partitions, and RAIDed the rest, if the first drive blew I'd be out an entire operating system. Also, gauging the size of the / partition can be tricky; I like to keep my system installs as small as possible and add only packages that I absolutely need (and ruthlessly purge the ones that I don't use anymore). 20 gigs is way too big (currently, Leandra's OS install is 2.9 gigabytes after nearly a year of experimenting with this and that) but it would leave room to grow.
So, what did I finally decide on?
Let's say that you want to mirror a website chock full of data before it gets 451'd - say it's epadatadump.com. You've got a boatload of disk space free on your Linux box (maybe a terabyte or so) and a relatively stable network connection. How do you do it?
wget. You use wget. Here's how you do it:
[user@guerilla-archival:(9) ~]$ wget --mirror --continue \ -e robots=off --wait 30 --random-wait http://epadatadump.com/
Let's break this down:
- wget - Self explanatory.
- --mirror - Mirror the site.
- --continue - If you have to re-run the command, pick up where you left off (including the exact location in a file).
- -e robots=off - Ignore robots.txt because it will be in your way otherwise. Many archive owners use this file to prevent web crawlers (and wget) from riffling through their data. Assuming this is sufficiently important, this is what you want to use.
- --wait 30 - Wait 30 seconds between downloads.
- --random-wait - Actually wait for 0.5 * (value of --wait) to 1.5 * (value of --wait) seconds in between requests to evade rate limiters.
- http://epadatadump.com/ - The URL of the website or archive you're copying.
If the archive you're copying requires a username and password to get in, you'll want to add the --user=<your username> and --password=<your password> to the above command line.
Happy mirroring. Make sure you have enough disk space.