Oct 19 2019
EDIT - 20191104 @ 2057 UTC-7 - Figured out how long it takes to scrub 40TB of disk space. Also did a couple of experiments with rebalancing btrfs and monitored how long it took.
A couple of weeks ago while working on Leandra I started feeling more and more dissatisfied with how I had her storage array set up. I had a bunch of 4TB hard drives inside her chassis glued together with Linux's mdadm subsystem into what amounts to a mother-huge hard drive (a RAID-5 array with a hotspare in case one blew out), and LVM on top of that which let me pretend that I was partitioning that mother-huge hard drive so I could mount large-ish pieces of it in different places. The thing is, while you can technically resize those virtual partitions (logical volumes) to reallocate space, it's not exactly easy. There's a lot of fiddly stuff that you have to do (resize the file system, resize the logical volume to match, grow the logical volume that needs space, grow the filesystem that needs space, make sure that you actually have enough space) and it gets annoying in a crisis. There was a second concern, which was figuring out which drive was the one that blew out when none of them were labelled or even had indicators of any kind that showed which drive was doing something (like throwing errors because it had crashed). This was a problem that required fairly major surgery to fix, on both hardware and software.
By the bye, the purpose of this post isn't to show off how clever I am or brag about Leandra. This is one part the kind of tutorial I wish I'd had when I was first starting out, and I hope that it helps somebody wrap their mind around some of the more obscure aspects of system administration. This post is also one part cheatsheet, both for me and for anyone out there in a similar situation who needs to get something fixed in a hurry, without a whole lot of trial and error. If deep geek porn isn't your thing, feel free to close the tab; I don't mind (but keep it in mind if you know anyone who might need it later).
Aug 03 2019
A common task that people using Huginn set up as their "Hello, world!" project is getting the daily weather report because it's practical, easy, and fairly well documented. However, the existing example is somewhat obsolete because it references the Weather Underground API that no longer exists, having been sunset at the end of 2018. Recently, the Weather Underground code in the Huginn Weather Agent was taken out because it's no longer usable. But, other options exist. The US National Weather Service has a free to use API that we can use with Huginn with a little extra work. Here's what we have to do:
- Get the GPS coordinates for the place we want weather reports for.
- Use the GPS coordinates to get data out of the NWS API.
- Build a weather report message.
- E-mail it.
As happens sometimes, the admins of the NWS API have imposed an additional constraint upon users accessing their data: They ask that the user agent string of whatever software you use be unique, and ideally include an e-mail address they can contact you through in case something goes amiss. This isn't a big deal.
This tutorial assumes that you've worked with Huginn a bit in the past, but if you haven't I strongly suggest that you read my earlier posts to familiarize yourself.
Okay. Let's get started.
Jul 28 2019
I spend a lot of time digging around in other people's data. If I'm not hunting for anything in particular then it's a bit of a crapshoot, to be honest, if only because you never know what you're in for. You can pretty much take it to the bank that if you didn't assemble it yourself, you can't count on it being complete, well formed, or anything approximating the output of a human being (it usually came out of a database, but I think you see what I'm getting at). Sometimes, if I'm really lucky I'll just get hold of a JSON dump of the database, which to be fair is better than nothing when there isn't even an API to use. From time to time I'll make an attempt at fitting the data into a database of some kind, sometimes MySQL, sometimes SQLite, or occasionally an API layer like Sandman2. This is all well and good, but it winds up being more of an adventure than I'm looking for. I'd much rather be Indiana Jones prowling around in the temple than Rambo going through a preparation montage because Indy was actually getting stuff done.
Wow, this article went a little off the rails. I was never good at writing intros to new code... anyway.
Jul 20 2019
For a couple of years now, I've had my eye on the community of people who've had RFID or NFC chips implanted somewhere in their bodies, usually in the back of the hand. If you've ever used a badge to unlock a door at work or tapped your phone on a point-of-sale terminal to buy something, you've used one of these two technologies in your everyday life to do something useful. What I've wanted to do for a while was use an implanted chip as a second authentication factor to my servers for better security. As for why I couldn't just use something like a key fob or a card or something.. there were a bunch of reasons, most of them having to do with only being able to find what I needed in bulk or the cost being too high because the equipment was aimed at corporate IT departments, where they have a need to crank out a couple of dozen ID badges an hour. There is also the fact that, while I've been curious about various forms of body modification over the years I never really got into into them for a couple of reasons that probably aren't terribly interesting so why not do something useful?
Due to the fact that not everybody is going to be okay with me talking about an elective medical procedure, I'm going to put the rest of this article after the fold.
Feb 02 2019
A couple of weeks back, I found myself in a discussion with a couple of friends about searching on the Internet and how easy it is to get caught up in a filter bubble and not realize it. To put not too fine a point on it, because the big search engines (Google, Bing, and so forth) profile users individually and tailor search results to analyses of their search histories (and other personal data they have access to), it's very easy to forget that there are other things out there that you don't know about for the simple reason that they don't show stuff outside of that profile they've built up. If you're a hardcore code hacker you might find it very difficult to find poetry or the name of a television show you saw once unless you take fairly drastic action. The up-side of this profiling is that, inside of your statistical profile search results are great. You can find what you need, when you need it. But outside of that? Good luck.
The point of the discussion was that there were ways that we could escape this filter bubble through application of self-hosted software and a little cooperation.
Ironically, searching through my conversation history I can't seem to find the thread in question so I'm relying entirely upon on-board storage (as it were). So, go ahead and laugh while I geek out. First, a little bit of Internet history.
Feb 02 2019
It should come as little surprise to anyone out there that I have a bit of a problem with hoarding data. Books, music, and of course files of all kinds that I download and read or use in a project for something. Legal briefs, research papers (arXiv is the bane of my existence), stuff people ask me to review, the odd Humble Bundle... So much so that a scant few years ago I rebuilt Leandra to better handle the volume of data in my library. However, it's taken me this long to both figure out and get around to making it easier to find anything in all that mess. If I can't find it, I can't do anything with it, or even figure out what I do or don't have. I also don't often have console access so it's not as if I can SSH in and grep for what I need. I use Nginx as a web server on Leandra so actually getting access to files when I need them is trivial.
Dec 28 2018
If you've been following the development activity of Systembot, the bot I wrote to monitor my machines (physical as well as virtual) you've probably noticed that I changed a number of things around pretty suddenly. This is because the version of Systembot in question had some pretty incorrect assumptions about how things should work. For starters, I thought I was being clever when I wrote the temperature monitoring code when I decided to use what the drivers thought were high or critical values for sending "something is wrong" alerts. No math (aside from a Centigrade-to-Fahrenheit conversion), just a couple of values helpfully supplied by the drivers by way of psutil (which is a fantastic module, by the way; I don't play with it enough). This was hunky-dory until Leandra started running a backup job and her CPU temperature spiked to 125 degrees Fahrenheit while encrypting the data. 125 degrees isn't terribly hot as servers go, but the lm_sensors drivers seem to disagree. Additionally, my assumptions of how often to send the "high temperature" alerts (after every four cycles through the "do stuff" loop) were... naive? Optimistic?
Let's go with optimistic.
What it boiled down to was that I was getting hammered with "temperature is too high!" warning messages roughly six times a second. Some experiments with changing the delay were equally optimistic and futile. I bit the bullet and made the delay-between-alerts configurable. What I have yet to do is make the frequency of different kinds of warning events configurable, because right now they all use the same delay (defined in time_between_alerts). Setting this value to 0 disables sending warnings entirely. This is less suboptimal at best but it's not waking me up every few seconds so I think it'll hold for a couple of days until I can break this logic out a little.
The second assumption that came back to bite me (hardcoding values until something like this happened aside) was that alerting on 80% of a disk being in use without any context isn't necessarily a good idea. My media server at home was also chirping several times a second because one of the hard drives is currently at 85% of capacity. This seems reasonable at first scratch but when you dig a little deeper it's not. 85% of capacity in this case means that there are "only" 411 gigabytes of space left on a 4 terabyte hard drive. Stuff doesn't get added to that drive very often, so that 400+ gigs will last me another couple of months, at least. There's no reason to alert on this, so making this value a parameter in the config file buys me some time before I have to buy another hard drive.
Oct 30 2018
Some time ago I began a search for a decent note-taking tool that I could carry around with me. For many years I was a devotee of the notes.txt file on my desktop, constantly open in a text editor so I could add and refer to it as necessary. When that ceased to scale I turned to software that replicated the legions of sticky notes on my desks at work and home, such as Tomboy. And that worked well enough for a while, but when I started relying upon my mobile more and more for things it too stopped being as useful as I wanted it to be. For about a year I turned to Simplenote, which is pretty much what it says on the tin: It's a note-taking system with a nice web interface, applications on all of the platforms that I use regularly, and even a command line utility which I used to back up my notes a couple of times a day. However, Simplenote is a centralized service and there is always a risk that it could go away at any time. At the very least, the switchover to the Simperium API could have caused problems in the near term for me, and I have enough on my plate these days that I didn't feel like fighting that particular war. So, the search for a replacement that relied more upon my own infrastructure than someone else's began.
Sep 18 2018
As the title of this post implies, I've been working on some stuff lately that's been taking up enough compute cycles that I haven't been around to post much. Some of this is due to work, because we're getting into the really busy time of year and when I haven't been at work I've been relaxing. Some of this is due to yet another run of dental work that, while it hasn't really been worth writing about has resulted in my going to bed and sleeping straight through until the next day. And some of it's due to my hacking on a new project that wound up being... not as hard as I'd imagined it would be, but there certainly has been a steep learning curve.
Aug 18 2018
It seems that there is another influx of refugees from a certain social network that's turned into a never ending flood of bile, vitriol, and cortisol into what we call the Fediverse, a network of a couple of thousand websites running a number of different applications that communicate with each other over a protocol called ActivityPub. Ultimately, the Fediverse is different from Twitter and Facebook in that it's not run as a for-profit entity. There are no analytics, no suggestions of "thought leaders" you might want to follow, no automated curation of the posts you can see versus the ones you really want to see. Socially speaking, you don't find people carefully polishing their brands or trying to game hashtag trends but instead everything from somebody kicking back after work with a cup of coffee to people carefully archiving the firmware of classic computer hardware to in-jokes about pineapples. Rather than fame, you get people.
But that's not what I want to talk about. I've been asked by a couple of people to post a brief tutorial of how I interfaced my Huginn instance with mastodon.social, the Mastodon instance that I spend most of my time hanging out on.