A couple of months ago, for my Lesser Feast, I decided to treat myself to a toy that I'd had my eye on for a while: a Pi-Top laptop kit. My fascination with the Raspberry Pi aside (which includes, to be honest, being able to run a rack full of servers in my office without needing to install a 40U rack and a new 220 power feed), it strikes me as being a very useful thing to have under one's desk as a backup deck or possibly a general purpose software development computer. Most laptops have one unique motherboard per model, and if you want to upgrade it (or need to replace it) you're pretty much limited to buying a brand-new laptop. To upgrade a Pi-Top you just need to buy a new Raspberry Pi, slide a panel aside, and swap a few cables, a system design that I think could be very useful indeed. It also has remarkably few components; the screws and fasteners aside, the Pi-Top is composed of only a few modules: a base with a battery, a keyboard and touchpad panel, a lid with display, a black Lexan access panel, a hub circuit board that ties everything together, and a RasPi. You can get a couple of add-on modules to go with it, such as a prototype board for electrical engineering experiments and modular speakers, all of which attach to a sliding rail and plug into a unique pinset on the hub. I'm not an electrical engineer by any means but I have built many a kit over the years, and from eyeballing it, it looked like a fairly simple build. I didn't document the build with photographs or anything because I didn't think to do so at the time. Sorry.
We seem to have reached a unique point in history: Gargantuan amounts of disk space are available to your average home user (8 terabyte hard drives are a thing, and the prices are rapidly coming down to widespread affordability), and the processing power that fits in the palm of your hand compares to the computational power that put the human race on the moon in the same way that a beach compares to a grain of sand. For most people, that means the latest phone upgrade or more space for a media box. For others, though, it poses an unusual challenge: How to make the best use of the hardware without wasting it. By this, I mean how one might build a server without wasting hard drive space or SATA ports on the mainboard, while still having enough room for all of that lovely (and by "lovely" I really mean "utterly disorganized") data that accumulates without even trying. I mentioned last year that I rebuilt Leandra (specs in here) so I could work on some machine learning and search engine projects. What I didn't mention was that I had some design constraints that I had to follow so that I could get the most out of her.
To get the best use possible out of all of those hard drives I had to figure out how to structure the RAID, where to put the guts of the Arch Linux install, and most importantly how to set everything up so that if Leandra did blow a hard drive the entire system wouldn't be hosed. Suppose I partitioned all of the drives as described here, used one for the /boot and / partitions, and RAIDed the rest: if that first drive blew I'd be out an entire operating system. Also, gauging the size of the / partition can be tricky; I like to keep my system installs as small as possible and add only packages that I absolutely need (and ruthlessly purge the ones that I don't use anymore). 20 gigs is way too big (currently, Leandra's OS install is 2.9 gigabytes after nearly a year of experimenting with this and that) but it would leave room to grow.
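One rough way to put a number on the / partition question is to measure what an existing install actually uses and add some headroom. The sketch below is only an illustration: the function names and the 10 gigabyte default headroom are made up for the example, not a rule taken from the text above.

```shell
#!/bin/sh
# Print the space used on the filesystem holding a path, in whole gigabytes
# (rounded up). Defaults to / when no path is given.
used_gb() (
    path="${1:-/}"
    # df -P prints one POSIX-format line per filesystem; -k reports 1 KiB
    # blocks. Field 3 of the second line is the "used" column.
    used_kb=$(df -Pk "$path" | awk 'NR==2 {print $3}')
    echo $(( (used_kb + 1048575) / 1048576 ))
)

# Hypothetical sizing rule (an assumption for this example): current usage
# plus a fixed amount of headroom, 10 gigabytes unless told otherwise.
suggest_root_gb() (
    headroom_gb="${2:-10}"
    echo $(( $(used_gb "${1:-/}") + headroom_gb ))
)
```

Calling suggest_root_gb / 5 on an existing machine gives its current root usage plus five gigabytes, which is a starting point for a partition size rather than an answer.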
So, what did I finally decide on?
Let's say that you want to mirror a website chock full of data before it gets 451'd - say it's epadatadump.com. You've got a boatload of disk space free on your Linux box (maybe a terabyte or so) and a relatively stable network connection. How do you do it?
wget. You use wget. Here's how you do it:
[user@guerilla-archival:(9) ~]$ wget --mirror --continue \
    -e robots=off --wait 30 --random-wait http://epadatadump.com/
Let's break this down:
- wget - Self explanatory.
- --mirror - Mirror the site.
- --continue - If you have to re-run the command, pick up where you left off (including the exact location in a file).
- -e robots=off - Ignore robots.txt because it will be in your way otherwise. Many archive owners use this file to prevent web crawlers (and wget) from riffling through their data. Assuming this is sufficiently important, this is what you want to use.
- --wait 30 - Wait 30 seconds between downloads.
- --random-wait - Actually wait for 0.5 * (value of --wait) to 1.5 * (value of --wait) seconds in between requests to evade rate limiters.
- http://epadatadump.com/ - The URL of the website or archive you're copying.
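To get a feel for what --wait and --random-wait mean for a big mirror, here is a small sketch of the arithmetic. The function names and the figure of 1000 files are hypothetical; only the 0.5x to 1.5x range comes from the breakdown above.

```shell
#!/bin/sh
# Print the shortest and longest delay (in seconds) that --random-wait can
# pick for a given --wait value: 0.5 and 1.5 times that value.
wait_bounds() {
    awk -v w="$1" 'BEGIN { printf "%.1f %.1f\n", 0.5 * w, 1.5 * w }'
}

# Estimate the hours spent just waiting for a given number of files. The
# random delays average out to the --wait value itself; time spent actually
# downloading is not included.
wait_hours() {
    awk -v w="$1" -v n="$2" 'BEGIN { printf "%.1f\n", w * n / 3600 }'
}

wait_bounds 30       # prints "15.0 45.0"
wait_hours 30 1000   # prints "8.3"
```

In other words, a thousand files at --wait 30 costs roughly a working day of idle time before any bandwidth is counted, which is the price of being polite to the server.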
If the archive you're copying requires a username and password to get in, you'll want to add the --user=<your username> and --password=<your password> options to the above command line.
Happy mirroring. Make sure you have enough disk space.
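If you find yourself doing this more than once, the whole thing can be wrapped up in a little script. What follows is a sketch under a few assumptions: mirror_site, free_kb, and the MIRROR_USER, MIRROR_PASSWORD, and WGET variables are names invented for this example, and the free space check only looks at the current directory.

```shell
#!/bin/sh
# Print the free space, in KiB, on the filesystem holding a path.
free_kb() {
    df -Pk "${1:-.}" | awk 'NR==2 {print $4}'
}

# Mirror a URL into the current directory, but only if at least need_kb KiB
# are free. Credentials are added only when MIRROR_USER is set.
mirror_site() {
    url="$1"
    need_kb="$2"
    if [ "$(free_kb .)" -lt "$need_kb" ]; then
        echo "not enough free space for $url" >&2
        return 1
    fi
    # Same options as the command line above.
    set -- --mirror --continue -e robots=off --wait 30 --random-wait
    if [ -n "${MIRROR_USER:-}" ]; then
        set -- "$@" "--user=$MIRROR_USER" "--password=${MIRROR_PASSWORD:-}"
    fi
    # WGET can point at a stand-in command for a dry run.
    "${WGET:-wget}" "$@" "$url"
}
```

Running mirror_site http://epadatadump.com/ 1073741824 would insist on a terabyte (counted in KiB) of free space before starting. Bear in mind that a password passed on the command line is visible to other users in the process list while wget runs.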
Not too long ago, when the USB key I'd built a set-top media machine on died from overuse, I decided to rebuild it using Arch Linux with Kodi as the media player. The trick, I keep finding every time, lies in getting Kodi to start up whenever the machine starts up. I think I've re-figured that out six or seven times by now, and each time after it works I forget all about it. So, I guess I'd better write it down for once so that I've got a snapshot of what I did in case I need to do it again later.
The instructions in the Arch Linux wiki work, but you need to pick the right ones to follow. The short-and-sweet ones with the automagickal AUR package don't work. Forget it.
Install LightDM from the Arch package repository (sudo pacman -S lightdm). Then follow the instructions I linked to above to the letter. That means carrying out the following tasks:
Create the file /etc/X11/Xwrapper.config. The file should contain only the following line (without the double quotes): "needs_root_rights = yes"
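For the next time this gets forgotten, here is a sketch of that step as a shell function. The function name is made up for this example, and the target path is a parameter only so that it can be tried against a scratch file before touching /etc.

```shell
#!/bin/sh
# Write the one-line Xwrapper.config. With no argument the real path is used,
# which requires root.
write_xwrapper_config() {
    target="${1:-/etc/X11/Xwrapper.config}"
    # Overwrites the target so that it holds this line and nothing else.
    printf '%s\n' 'needs_root_rights = yes' > "$target"
}
```

The same result can be had in one line with echo 'needs_root_rights = yes' | sudo tee /etc/X11/Xwrapper.config.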
Follow the LightDM "Enabling autologin" and "Enabling interactive passwordless login" instructions. Create a user named "kodiuser" (you don't need to set a password) and give it access to the system groups necessary to access resources in the system. I used the following command to do this: sudo useradd -c "Kodi Service Account" -G dbus,network,video,audio,optical,storage,users -m kodiuser
Create two additional groups which LightDM needs to enable autologin:
- sudo groupadd -r autologin
- sudo groupadd -r nopasswdlogin
Add kodiuser to those groups:
- sudo gpasswd -a kodiuser autologin
- sudo gpasswd -a kodiuser nopasswdlogin
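Since a missing group membership is an easy thing to overlook, a quick check after the commands above can save some head-scratching. This is a sketch: in_groups is a helper invented for this example, and it relies only on id -nG listing a user's group names.

```shell
#!/bin/sh
# Succeed only if the user belongs to every group named after the user name.
in_groups() (
    user="$1"
    shift
    # id -nG prints the user's group names separated by spaces; pad both
    # ends so that each name can be matched as a whole word.
    groups=" $(id -nG "$user") " || exit 1
    for g in "$@"; do
        case "$groups" in
            *" $g "*) ;;
            *) echo "$user is not in $g" >&2; exit 1 ;;
        esac
    done
)
```

For example, in_groups kodiuser autologin nopasswdlogin video audio && echo "groups look right" prints its message only when all four memberships are in place.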
A couple of days ago I got it into my head to upgrade one of my Exocortex servers from Ubuntu Server 14.04 LTS to 16.04 LTS, the latest stable release. While Ubuntu long-term support releases are good for a couple of years (14.04 LTS would be supported until 2019) I had some concerns about the packages themselves being too stale to run the later releases of much of my software. To be more specific, I could continue to hope that the Ruby and Python interpreters I have installed could be upgraded as necessary but at some point the core system libraries would be too old and they'd no longer compile. Not good for long-term planning.
First off, whenever you're about to do a major upgrade of anything, read the release notes so you know what you're getting yourself into. You'll also usually find some notes about all the new goodies you'll be able to play with.
In the past I've had nothing but trouble using the documented Ubuntu release upgrade process, so much so that I've had clients sign "I told you so" documents when they pressured me to use it, because the procedure could reliably be expected to leave the system completely trashed, and a full rebuild was the only recourse. This time I set up a testbed in VirtualBox which consisted of a fully patched Ubuntu Server 14.04.5 LTS install. I ran through the documented upgrade process, and much to my surprise it went smoothly, leaving me with a functional virtual machine at the end of a 45 minute procedure (most of which was automatic; I only had to answer a few questions along the way). The process consisted of becoming the root user (sudo -s) and running the updater (do-release-upgrade).
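A small pre-flight check fits in front of that procedure. The sketch below assumes the release number can be read from VERSION_ID in /etc/os-release; the function names are invented for this example, and the upgrade commands themselves are left as comments so that nothing runs by accident.

```shell
#!/bin/sh
# Print VERSION_ID from an os-release style file (default /etc/os-release).
# Runs in a subshell so the sourced variables don't leak into the caller.
release_version() (
    . "${1:-/etc/os-release}"
    echo "$VERSION_ID"
)

# Succeed only if the machine is on the release we expect to upgrade from.
# The optional second argument names an alternate os-release file.
ready_for_upgrade() {
    [ "$(release_version "${2:-/etc/os-release}")" = "$1" ]
}

# On the real server, as root, after taking a backup or a VM snapshot:
#   apt-get update && apt-get dist-upgrade   # get fully patched first
#   ready_for_upgrade 14.04 && do-release-upgrade
```

The check is deliberately dumb: all it does is stop the updater from being launched on a box that isn't the release you thought it was.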
So, if it's so easy, why am I writing a blog post about it? Why worry?
Why worry, indeed. Read on.
First, you need someplace for the software to live. I'll say up front that you can happily run Huginn on your laptop, desktop workstation, or server so long as it's not running Windows. Huginn is developed under Linux; it might run under one of the BSDs but I've never tried. I don't know if it'll run as expected in Mac OS X because I don't have a Mac. If you want to give Huginn a try but you run Windows, I suggest installing VirtualBox and building a quick virtual machine. I recommend sticking with the officially supported distributions and using the latest stable version of Ubuntu Server. At the risk of sounding self-serving, I also suggest using one of my open source Ubuntu hardening sets to lock down the security on your new VM all in one go. If you're feeling adventurous you can get a VPS from a hosting provider like Amazon's AWS or Linode. I run some of my stuff at Digital Ocean and I'm very pleased with their service. If you'd like to give Digital Ocean a try here's my referral link which will give you $10 US of credit, and you are not obligated to continue using their service after it's used up. If I didn't like their service (both commercial and customer) that much I wouldn't bother passing it around.
As serious web apps go, Huginn's system requirements aren't very high so you can build a very functional instance without putting a lot of effort or money toward it. You can run Huginn with about one gigabyte of RAM and one CPU, and a relatively small amount of disk space (twenty gigabytes or so, a fairly small amount for servers these days). Digital Ocean's $10 US/month droplet (one CPU, one gigabyte of RAM, and 30 gigabytes of storage) is sufficient for experimentation and light use. To really get serious usage out of Huginn you'll need about two gigabytes of RAM to fit multiple worker daemons into memory. I personally use the following specs for all of my Huginn virtual machines: At least two CPUs, 60 gigabytes of disk space, and at least four gigabytes of RAM. Chances are, any physical machine you have on your desk exceeds these requirements so don't worry too much about it (but see these special instructions if you plan on using an ultra-mini machine like the Raspberry Pi). If you build your own virtual machine, take these requirements into account.
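If you'd rather check a machine than guess, the minimums above are easy to turn into a script. This is a sketch for Linux only; the function and variable names are made up for this example, and the RAM floor is set a little under 1024 megabytes on the assumption that a "one gigabyte" machine reports slightly less than that in /proc/meminfo.

```shell
#!/bin/sh
# Minimums for experimentation and light use, per the figures above.
MIN_RAM_MB=900     # "about one gigabyte", with some slack (an assumption)
MIN_CPUS=1
MIN_DISK_GB=20

# Succeed if the given RAM (MB), CPU count, and disk size (GB) meet the
# minimums.
meets_minimum() {
    [ "$1" -ge "$MIN_RAM_MB" ] && [ "$2" -ge "$MIN_CPUS" ] && [ "$3" -ge "$MIN_DISK_GB" ]
}

# Print this machine's RAM in MB, CPU count, and root filesystem size in GB.
machine_specs() {
    ram_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    cpus=$(nproc)
    disk_gb=$(df -Pk / | awk 'NR==2 {print int($2 / 1048576)}')
    echo "$ram_mb $cpus $disk_gb"
}
```

Running meets_minimum $(machine_specs) && echo "good enough for Huginn" checks the machine you're sitting at. Raising the three variables to 4000, 2, and 60 approximates the more comfortable specs mentioned above.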