Feb 02 2019
A couple of weeks back, I found myself in a discussion with a couple of friends about searching on the Internet and how easy it is to get caught up in a filter bubble and not realize it. Not to put too fine a point on it, because the big search engines (Google, Bing, and so forth) profile users individually and tailor search results to analyses of their search histories (and other personal data they have access to), it's very easy to forget that there are other things out there that you don't know about for the simple reason that the search engines don't show stuff outside of that profile they've built up. If you're a hardcore code hacker you might find it very difficult to find poetry or the name of a television show you saw once unless you take fairly drastic action. The up-side of this profiling is that, inside of your statistical profile, search results are great. You can find what you need, when you need it. But outside of that? Good luck.
The point of the discussion was that there were ways that we could escape this filter bubble through application of self-hosted software and a little cooperation.
Ironically, searching through my conversation history I can't seem to find the thread in question so I'm relying entirely upon on-board storage (as it were). So, go ahead and laugh while I geek out. First, a little bit of Internet history.
Feb 02 2019
It should come as little surprise to anyone out there that I have a bit of a problem with hoarding data. Books, music, and of course files of all kinds that I download and read or use in a project for something. Legal briefs, research papers (arXiv is the bane of my existence), stuff people ask me to review, the odd Humble Bundle... So much so that a scant few years ago I rebuilt Leandra to better handle the volume of data in my library. However, it's taken me this long to both figure out and get around to making it easier to find anything in all that mess. If I can't find it, I can't do anything with it, or even figure out what I do or don't have. I also don't often have console access so it's not as if I can SSH in and grep for what I need. I use Nginx as a web server on Leandra so actually getting access to files when I need them is trivial.
Dec 28 2018
If you've been following the development activity of Systembot, the bot I wrote to monitor my machines (physical as well as virtual) you've probably noticed that I changed a number of things around pretty suddenly. This is because the version of Systembot in question had some pretty incorrect assumptions about how things should work. For starters, I thought I was being clever when I wrote the temperature monitoring code when I decided to use what the drivers thought were high or critical values for sending "something is wrong" alerts. No math (aside from a Centigrade-to-Fahrenheit conversion), just a couple of values helpfully supplied by the drivers by way of psutil (which is a fantastic module, by the way; I don't play with it enough). This was hunky-dory until Leandra started running a backup job and her CPU temperature spiked to 125 degrees Fahrenheit while encrypting the data. 125 degrees isn't terribly hot as servers go, but the lm_sensors drivers seem to disagree. Additionally, my assumptions of how often to send the "high temperature" alerts (after every four cycles through the "do stuff" loop) were... naive? Optimistic?
Let's go with optimistic.
What it boiled down to was that I was getting hammered with "temperature is too high!" warning messages roughly six times a second. Some experiments with changing the delay were equally optimistic and futile. I bit the bullet and made the delay-between-alerts configurable. What I have yet to do is make the frequency of different kinds of warning events configurable, because right now they all use the same delay (defined in time_between_alerts). Setting this value to 0 disables sending warnings entirely. This is suboptimal at best but it's not waking me up every few seconds so I think it'll hold for a couple of days until I can break this logic out a little.
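For the curious, here is a minimal sketch of the alerting behavior described above. It is not the actual Systembot code. The names Reading, c_to_f, too_hot, and AlertThrottle are made up for illustration; only time_between_alerts comes from the post. The Reading tuple borrows the field names (label, current, high, critical) that psutil.sensors_temperatures() uses for its readings, so the sketch runs without psutil installed.

```python
import time
from collections import namedtuple

# Hypothetical stand-in with the same field names as the readings
# returned by psutil.sensors_temperatures() (values in Centigrade).
Reading = namedtuple("Reading", ["label", "current", "high", "critical"])


def c_to_f(celsius):
    """Convert Centigrade to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def too_hot(reading):
    """True if the reading meets or exceeds the driver-supplied high value."""
    # Drivers do not always supply a high value; treat None as "no limit".
    return reading.high is not None and reading.current >= reading.high


class AlertThrottle:
    """Allow at most one alert every time_between_alerts seconds."""

    def __init__(self, time_between_alerts, clock=time.monotonic):
        self.time_between_alerts = time_between_alerts
        self.clock = clock  # injectable so the logic can be tested
        self.last_sent = None

    def should_send(self):
        # As in the post, a delay of 0 disables warnings entirely.
        if self.time_between_alerts == 0:
            return False
        now = self.clock()
        if self.last_sent is None or now - self.last_sent >= self.time_between_alerts:
            self.last_sent = now
            return True
        return False


if __name__ == "__main__":
    throttle = AlertThrottle(time_between_alerts=3600)
    reading = Reading("Core 0", 51.7, 50.0, 60.0)  # made-up numbers
    if too_hot(reading) and throttle.should_send():
        print("temperature is too high: %.1f F" % c_to_f(reading.current))
```

Giving each kind of warning its own AlertThrottle would be one way to break the shared delay out later.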
The second assumption that came back to bite me (hardcoding values until something like this happened aside) was that alerting on 80% of a disk being in use without any context isn't necessarily a good idea. My media server at home was also chirping several times a second because one of the hard drives is currently at 85% of capacity. This seems reasonable at first glance but when you dig a little deeper it's not. 85% of capacity in this case means that there are "only" 411 gigabytes of space left on a 4 terabyte hard drive. Stuff doesn't get added to that drive very often, so that 400+ gigs will last me another couple of months, at least. There's no reason to alert on this, so making this value a parameter in the config file buys me some time before I have to buy another hard drive.
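A rough sketch of what a configurable disk threshold could look like follows. The config layout, the disk_usage option name, and every function name here are assumptions made for illustration, not Systembot's real config format or code. It uses only configparser and shutil from the standard library.

```python
import configparser
import shutil

# Hypothetical config layout; the real config file may look different.
EXAMPLE_CONFIG = """
[DEFAULT]
disk_usage = 90
"""


def load_threshold(text, default=80.0):
    """Read the alert threshold (percent in use) from config text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Fall back to the old hardcoded 80% if the option is missing.
    return parser["DEFAULT"].getfloat("disk_usage", fallback=default)


def percent_used(used, total):
    """Percentage of the disk currently in use."""
    return 100.0 * used / total


def disk_needs_alert(used, total, threshold):
    """True if usage has reached the configured threshold."""
    return percent_used(used, total) >= threshold


def check_mount(path, threshold):
    """Check a mounted filesystem against the threshold."""
    usage = shutil.disk_usage(path)
    return disk_needs_alert(usage.used, usage.total, threshold)


if __name__ == "__main__":
    threshold = load_threshold(EXAMPLE_CONFIG)
    print("alert needed:", check_mount(".", threshold))
```

With the threshold raised to 90, a drive sitting at 85% stays quiet, while the fallback of 80 would still flag it.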
Oct 30 2018
Some time ago I began a search for a decent note-taking tool that I could carry around with me. For many years I was a devotee of the notes.txt file on my desktop, constantly open in a text editor so I could add to and refer to it as necessary. When that ceased to scale I turned to software that replicated the legions of sticky notes on my desks at work and home, such as Tomboy. And that worked well enough for a while, but when I started relying upon my mobile more and more for things it too stopped being as useful as I wanted it to be. For about a year I turned to Simplenote, which is pretty much what it says on the tin: It's a note-taking system with a nice web interface, applications on all of the platforms that I use regularly, and even a command line utility which I used to back up my notes a couple of times a day. However, Simplenote is a centralized service and there is always a risk that it could go away at any time. At the very least, the switchover to the Simperium API could have caused problems in the near term for me, and I have enough on my plate these days that I didn't feel like fighting that particular war. So, the search for a replacement that relied more upon my own infrastructure than someone else's began.
Sep 18 2018
As the title of this post implies, I've been working on some stuff lately that's been taking up enough compute cycles that I haven't been around to post much. Some of this is due to work, because we're getting into the really busy time of year, and when I haven't been at work I've been relaxing. Some of this is due to yet another run of dental work that, while it hasn't really been worth writing about, has resulted in my going to bed and sleeping straight through until the next day. And some of it's due to my hacking on a new project that wound up being... not as hard as I'd imagined it would be, but there certainly has been a steep learning curve.
Aug 18 2018
It seems that there is another influx of refugees from a certain social network that's turned into a never ending flood of bile, vitriol, and cortisol into what we call the Fediverse, a network of a couple of thousand websites running a number of different applications that communicate with each other over a protocol called ActivityPub. Ultimately, the Fediverse is different from Twitter and Facebook in that it's not run as a for-profit entity. There are no analytics, no suggestions of "thought leaders" you might want to follow, no automated curation of the posts you can see versus the ones you really want to see. Socially speaking, you don't find people carefully polishing their brands or trying to game hashtag trends but instead everything from somebody kicking back after work with a cup of coffee to people carefully archiving the firmware of classic computer hardware to in-jokes about pineapples. Rather than fame, you get people.
But that's not what I want to talk about. I've been asked by a couple of people to post a brief tutorial of how I interfaced my Huginn instance with mastodon.social, the Mastodon instance that I spend most of my time hanging out on.
Jul 08 2018
I've mentioned in the past that my exocortex incorporates a number of different kinds of bots that do a number of different things in a slightly different way than Huginn does. Which is to say, rather than running on their own and pinging me when something interesting happens, I can communicate with them directly and they parse what I say to figure out what I want them to do. Every bot is function-specific so this winds up being a somewhat simpler task than it might otherwise appear. One bot runs web searches, another downloads files, videos, and audio, another wakes up and looks at system stats every minute... but where does this all start? How does it all fit together?
It starts with Jabber, the humble XMPP protocol.
May 05 2018
If you have multiple systems (like I do), a problem you've undoubtedly run into is keeping your bookmarks in sync across every browser you use. Of course, there are services that'll happily do this job on your behalf, but they're free, and we all know what free means. If you're interested in being social with your link collection there are some social bookmarking services out there for consideration, including what's left of Delicious. For many years I was a Delicious user (because I liked the idea of maintaining a public bookmark collection that could be useful to people), but Delicious got worse and worse every time it was sold to a new holding company. I eventually gave up on Delicious, pulled my data out, and thought long and hard about how often anybody actually used my public link collection. The answer wound up being "In all probability, not at all," largely because I never received any feedback at all, on-site or off. Oh, well.
For a couple of years I used an application called Unmark to manage my link collection, and it did a decent enough job. It also had some annoying quirks that, over time, got farther and farther under my skin, and earlier this year I kicked Unmark in the head and started the search for a replacement. Quirks like, about half the time bookmarks would be saved without any of the descriptions or tags I gave them. No search API. The search function sucked so I couldn't plug my own search function in. Eventually, the Unmark hosted service started redirecting to the Github repository, and then even that redirect went away. Unmark hasn't been worked on in eight months, and Github tickets haven't been touched in about as long. In short, Unmark seems dead as a doornail.
So I migrated my link collection to a new application called Shaarli, and I'm quite pleased with it.
Jan 28 2018
A couple of days ago I gave a talk online to some members of the Zero State about my exocortex. It's a pretty informal talk done as a Hangout where I talk about some of the day to day stuff and where the project came from. I didn't have any notes and it was completely unscripted.
Embedding is disabled for some reason so I can't just put the video here. Here's a direct link to the recording.
Oct 28 2017
UPDATED: Added an Nginx configuration block to proxy YaCy.
If you've been squirreling away information for any length of time, chances are you tried to keep it all organized for a certain period of time and then gave up the effort when the volume reached a certain point. Everybody has their limit to how hard they'll struggle to keep things organized, and past that point there are really only two options: Give up, or bring in help. And by 'help' I mean a search engine of some kind that indexes all of your stuff and makes it searchable so you can find what you need. The idea is, let the software do the work while the user just runs queries against its database to find the documents on demand. Practically every search engine parses HTML to get at the content but there are others that can read PDF files, Microsoft Word documents, spreadsheets, plain text, and occasionally even RSS or ATOM feeds. Since I started offloading some file downloading duties to yet another bot my ability to rename files sanely has... let's be honest... it's been gone for years. Generally speaking, if I need something I have to search for it or it's just not getting done. So here's how I fill that particular niche in my software ecosystem.