One of these days I'll get around to doing a writeup of an indispensable part of my exocortex, Wallabag. I used it to replace my old paywall breaker program, largely because pumping random articles from the web into a copy of etherpad-lite was janky as hell and did not make for a good user experience. To put it another way, when you're looking for a particular thing in your archive it's a huge time sink to then go through and edit the saved document because it's a single huge line of text. At least Wallabag saves copies of …
Longtime readers are probably familiar with two things: horror stories about my dental work, and my endless quest to find search software that'll let me make sense of my data hoard (because I never delete anything). Thankfully, the former's been fairly good lately so I don't have any real complaints there. Things have improved on the latter front, remarkably.
I've experimented off and on with a personal search engine called Recoll, which was initially designed to work alongside Linux desktop environments and was later ported to Mac OS X and Windows. It is noteworthy in that it tries …
Longtime readers have probably read about some of the stuff I do with Searx, and I hope that some of you have given some of it a try on your own. If you have, you're probably wondering how I get the performance I do, because there are some limitations of Searx that have to be worked around. Most of those limitations have to do with the global interpreter lock that is part of the standard Python interpreter, and they haven't been completely solved yet. What this basically adds up to is that multithreading in Python doesn't actually make great use …
I promise I'll explain what Fess is in a later post. I want to get this information out there in preparation.
If you haven't used Searx before, it's a self-hosted meta-search engine which queries a wide array of search engines (some of which are also self-hosted), collates the search results, and returns them as a regular search result page, an RSS feed, or a JSON API.
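As a rough illustration of the JSON output mentioned above, here is a minimal Python sketch that builds a query URL and pulls titles and links out of a response body. The base URL is a placeholder for wherever your own instance lives, the helper names are made up for this example, and it assumes the instance has JSON output enabled and returns a top-level `results` list whose entries carry `title` and `url` keys.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, fmt="json"):
    """Build a Searx search URL that asks for a given output format.

    base_url is a placeholder for your own instance (an assumption).
    """
    params = urlencode({"q": query, "format": fmt})
    return f"{base_url.rstrip('/')}/search?{params}"


def extract_links(payload):
    """Return (title, url) pairs from a JSON response body.

    Assumes a top-level "results" list; missing keys become empty strings.
    """
    data = json.loads(payload)
    return [
        (result.get("title", ""), result.get("url", ""))
        for result in data.get("results", [])
    ]
```

Fetching the URL is left out on purpose so the sketch has no network dependency; any HTTP client can do that part and hand the response text to `extract_links()`.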
One of the lesser known features is that you can add your own search engines. You can either write your own (using an existing one as a template) or you can leverage one of …
Not too long ago I was noodling over a problem: I wanted to break up the scheduling queues in Huginn to make my fleets of agents a little more efficient when they execute. The best way I could think of was to make some of the schedules stochastic - periodically have an agent roll some dice and depending on what comes up decide whether or not to trigger the agents downstream. So, of course, I started looking for a random number generator that would basically roll 1d10. However, the Liquid templating language that Huginn uses internally doesn't have any function to …
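The dice-rolling idea is easy to sketch outside of Huginn. The Python below is only an illustration of the concept, not the agent setup itself: the function names and the threshold parameter are made up for this example, and it simply rolls 1d10 and says "go" when the roll is at or below a cutoff.

```python
import random


def roll_die(sides=10, rng=random):
    """Roll one die and return a value from 1 to sides, inclusive."""
    return rng.randint(1, sides)


def should_trigger(threshold, sides=10, rng=random):
    """Decide whether downstream agents should fire on this tick.

    Fires when the roll is at or below threshold, so a threshold of 3
    on 1d10 fires roughly 30% of the time. Both names are hypothetical.
    """
    return roll_die(sides, rng) <= threshold
```

Passing in a seeded `random.Random` instance as `rng` makes the rolls repeatable, which is handy when checking that a given threshold behaves the way you expect.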
A Google feature that doesn't ordinarily get a lot of attention is Google Alerts, which is a service that sends you links to things that match certain search terms on a periodic basis. Some people use it for vanity searching because they have a personal brand to maintain, some people use it to keep on top of a rare thing they're interested in (anyone remember the show Probe?), some people use it for bargain hunting, some people use it for intel collection... however, this is all predicated on Google finding out what you're interested in, certainly interested enough to have …