Jun 17, 2017
I've been promising myself that I'd do a series of articles about tools that I've incorporated into my exocortex over the years, and now's as good a time as any to start. Rather than jump right into the crunchy stuff I thought I'd start with something that's fairly simple to use, straightforward, and endlessly useful for many purposes - a wiki.
Usually, when somebody brings up the topic of wikis one immediately thinks of either Wikipedia or one of the godsawful corporate wikis that one might be forced to use on a daily basis. And you're not that far off the mark, because ultimately they're websites that let one or more people create, modify, and delete articles about just about anything they're inclined to, using only a web browser. Usually you need to set up or be given an account to log into them because wiki spam is to this day a horrendous problem to fight (I've had to do it as part of previous jobs, and I wouldn't wish it on my worst enemy). If you've been around a while, when you think of having a wiki you might think of setting up something like WikiWikiWeb or MediaWiki, which also means setting up a server, a database, web server software, the wiki software, configuring everything... and unless you have a big, important project that necessitates it, it's kind of overkill and you go right back to a text file on your desktop. And I don't blame you.
There are other options out there that require much less overhead and are also nicer than the ubiquitous notes.txt file. For the past couple of years (since 2012.ev at least) I've been using a personal wiki called TiddlyWiki for most of my projects. All it requires is a fairly modern web browser (if you're using Internet Explorer you need to be running IE 10 or later) and some room on your desktop for another file.
Jun 11, 2017
To keep the complexity of parts of my exocortex down I've opted to not separate everything into larger chunks using popular technologies these days, such as Linux containers (though I did Dockerize the XMPP bridge as an experiment) because there are already quite a few moving parts, and increasing complexity does not make for a more secure or stable system. However, this brings up a valid and important question, which is "How do you restart everything if you have to reboot a server for some reason?"
A valid question indeed. Servers need to be rebooted periodically to apply patches, upgrade kernels, and generally to blow the cruft out of the memory field. There are all sorts of hoops and gymnastics one can go through with traditional initscripts, but for home-grown and third party stuff it's difficult to run things from initscripts in such a way that they don't run with elevated privileges, which you want to avoid for security reasons. The hands-on way of doing it is to run a GNU Screen session when you log in and start everything up (or reconnect to one if it's already running). This process can also be automated to run when a system reboots. Here's how:
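The full walkthrough isn't reproduced in this excerpt, so what follows is only a minimal sketch of one way to do it, not necessarily the setup described in the post. It starts each service in its own detached GNU Screen session using screen's -d -m -S options. The service names and paths are hypothetical placeholders, and the script defaults to a dry run that only prints the commands it would execute.

```python
import shlex
import subprocess

# Hypothetical services: the names and commands are placeholders, not the
# actual bots or paths used in the exocortex.
SERVICES = {
    "xmpp_bridge": "python3 /home/user/exocortex/xmpp_bridge.py",
    "web_search_bot": "python3 /home/user/exocortex/web_search_bot.py",
}


def screen_command(name, command):
    """Build the argv that starts `command` in a detached screen session."""
    # -d -m starts the session detached; -S names it so you can reattach later.
    return ["screen", "-d", "-m", "-S", name] + shlex.split(command)


def start_all(services, dry_run=True):
    """Start every service; with dry_run=True only print the commands."""
    for name, command in services.items():
        argv = screen_command(name, command)
        if dry_run:
            print(" ".join(shlex.quote(arg) for arg in argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    start_all(SERVICES, dry_run=True)
```

On systems whose cron supports the @reboot schedule, a line in an unprivileged user's crontab that runs a script like this (with dry_run turned off) would bring everything back up after a reboot without root privileges. Reattaching is then a matter of something like screen -r xmpp_bridge.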
Feb 07, 2017
As I've mentioned a few times in the past, diverse parts of my exocortex monitor many different aspects of the world. One of them, called Ironmonger, constantly data mines the global stock markets looking for anomalies. Ordinarily, Ironmonger only triggers when stock trading events greater than three standard deviations hit the market. On Monday, 6 Feb at 14:50:38 hours UTC-0800 (PST), Ironmonger did an acrobatic pirouette off the fucking handle. Massive trades in three different tech companies (Intel, Apple, and Facebook) hit the US stock market within the same thirty second period. By "massive," I mean that 3,271,714,562 shares of Apple, 3,271,696,623 shares of Intel, and 2,030,897,857 shares of Facebook all hit the market at the same time. The time_t datestamps of the transactions were 1486421438 (Intel), 1486421431 (Apple), and 1486421442 (Facebook) (I use time.is to convert them back into organic-readable time/date specifiers). I grabbed some screenshots from the Exocortex console at the time - check them out:
Intel ; Apple ; Facebook
The tall blue slivers at the far right-hand edges of each graph represent the stock trades. I waited a couple of hours and took another set of screenshots (Intel, Apple, Facebook) because the graph had moved on a bit and the transaction spikes were much more visible. While my knowledge of the stock market is limited, I have to admit that I've never seen multi-billion share stock trades happen before. Out of curiosity, I took a look at the historical price per share of each of those stocks to see what those huge offers did to them. The answer, somewhat surprisingly, was "not much." Check out these extracts from Ironmonger's memory: Facebook, Intel, and Apple.
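If you'd rather convert the time_t datestamps locally than use a website, a few lines of Python will do it. This is just a sketch using the standard library. The fixed UTC-0800 offset is hard-coded to match the times quoted in this post, and the variable names are mine.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-0800 offset, matching the times quoted in the post.
PST = timezone(timedelta(hours=-8), "PST")

# time_t datestamps of the three trades, as given in the post.
TRADES = {"Apple": 1486421431, "Intel": 1486421438, "Facebook": 1486421442}


def to_local(timestamp, tz=PST):
    """Convert a Unix timestamp into a timezone-aware datetime."""
    return datetime.fromtimestamp(timestamp, tz)


for company, stamp in sorted(TRADES.items(), key=lambda item: item[1]):
    print(company, to_local(stamp).strftime("%Y-%m-%d %H:%M:%S %Z"))
# Apple 2017-02-06 14:50:31 PST
# Intel 2017-02-06 14:50:38 PST
# Facebook 2017-02-06 14:50:42 PST

spread = max(TRADES.values()) - min(TRADES.values())
print("spread in seconds:", spread)
# spread in seconds: 11
```

All three trades landed within eleven seconds of each other, comfortably inside the thirty second window.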
Because I am a paranoid and curious sort, I immediately wondered if there was a correlation with the large spike in the Bitcoin transaction fee earlier that day (at 13:19:16 UTC-0800, to be precise). The answer is... probably not. A transaction fee of 2.35288902 BTC (approximately $2,510.93 USD as of 22:32 hours UTC-0800 on 7 February 2017, as I write this article ahead of time) is a sizeable sum that would certainly guarantee that someone's transaction made it into a block at that very instant, but that does not mean that it was involved. There just isn't enough data, but it stands on its own as another anomaly that day. I wish I knew who put those huge blocks of stock up for sale all at once. The only thing they seem to have in common is that they're all listed on the Singularity Index, which is mildly noteworthy.
Anybody have any ideas?
Feb 04, 2017
A couple of weeks ago I ran into some of the functional limits of my web search bot, a bot that I wrote for my exocortex which accepts English-like commands ("Send me top 15 hits for HAL 9000 quotes.") and runs web searches in response using the Searx meta-search engine on the back end. This is to say that I gave my bot a broken command ("Send hits for HAL 9000 quotes.") and the parser got into a state where it couldn't cope, threw an exception, and crashed. To be fair, my command parser was very brittle and it was only a matter of time before I did something dumb and wrecked it. At the time I patched it with a bunch of if..then checks for truncated and incorrect commands, but if you look at all of the conditionals and ad-hoc error handling I probably made the situation worse, as well as much more difficult to maintain in the long run. Time for a rewrite.
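To make the failure mode concrete, here's a minimal sketch of a more forgiving way to handle those two commands. It is not the parser the bot actually uses. It's a single regular expression with optional parts, and the default of 10 hits is my assumption. A truncated command falls back to the default, and anything unrecognizable returns None instead of throwing an exception.

```python
import re

DEFAULT_COUNT = 10  # assumed default when the command doesn't give a number

# "send [me] [top N] hits for <search terms>[.]"
COMMAND = re.compile(
    r"""^\s*send\s+
        (?:me\s+)?                    # optional recipient
        (?:top\s+(?P<count>\d+)\s+)?  # optional "top N"
        hits\s+for\s+
        (?P<terms>.+?)                # the search terms themselves
        \s*\.?\s*$                    # optional trailing period
    """,
    re.IGNORECASE | re.VERBOSE,
)


def parse_command(text):
    """Return {'count': int, 'terms': str}, or None if the text isn't a command."""
    match = COMMAND.match(text)
    if match is None:
        return None
    count = int(match.group("count")) if match.group("count") else DEFAULT_COUNT
    return {"count": count, "terms": match.group("terms")}


print(parse_command("Send me top 15 hits for HAL 9000 quotes."))
# {'count': 15, 'terms': 'HAL 9000 quotes'}
print(parse_command("Send hits for HAL 9000 quotes."))
# {'count': 10, 'terms': 'HAL 9000 quotes'}
print(parse_command("Send me top hits for"))
# None
```

A regex like this gets unwieldy as soon as the command language grows, which is where a real grammar starts to earn its keep.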
Back to my long-term memory field. What to do?
I knew from comp.sci classes long ago that compilers use things called parsers and grammars to interpret code so that it can be converted into an executable. I also knew that the parser Infocom used in its interactive fiction was widely considered to be the best anyone had come up with in a long time, and it was efficient enough to run on humble microcomputers like the C-64 and the Apple II. For quite a few years I also ran and hacked on a MOO, which for the purposes of this post you can think of as a massive interactive fiction environment that the players can modify as well as play in; a MOO's command parser does pretty much the same thing as Infocom's IF parser but is responsive to the changes the users make to their environments. I also recalled something called a parse tree, which I sort-of-kind-of remembered from comp.sci, but because I'd never actually done anything with them I only recalled a low-res sketch. At least I had someplace to start from, so I turned my rebooted web search bot loose with a couple of search terms and went through the results after work. I also spent some time chatting with a colleague whose knowledge of the linguistics of programming languages is significantly greater than mine and bouncing some ideas off of him (thanks, TQ!).
But how do I do something with all this random stuff?
Jan 15, 2017
EDIT: 20170123 - My reviewers have suggested some edits to the article, many of which I've applied.
It's been a while since I wrote a Huginn tutorial, so let's start with a basic one to get you comfortable with the idea of building an agent network. This agent network will run every half hour, poll a REST API endpoint, and e-mail you what it gets. You'll have to have access to an already running Huginn instance that can send outbound e-mail. This post is going to be kind of lengthy, but that's because I'm laying out some fundamentals. Once you understand those you can skip past the explanations and move on to the good stuff.
First, a little background - what's a REST API? If you already know, just skip down past the cut and move on, but if you don't know what I'm talking about I'll try to explain. I'm going to assume that you've been able to install Huginn using my instructions or someone else's, or you've got access to a running instance. I'm also going to assume that you're not a hardcore coder but someone who's trying to apply a useful tool to your everyday life.
At its simplest, an API (Application Programming Interface) is a way to interact with a system or part of a system. It's (hopefully) designed to be regular, which means that once you understand the basics you can apply that knowledge to figure out the more complex parts with a little messing around because the basics continue to apply. Let's say that I've written a library called myLib, which implements a bunch of really annoying stuff (like opening and closing files and sorting elements of data) so you don't have to. My library has a bunch of functions that carry out those tasks (openStupidFile(), readAllOfFilesContents(), sortIntegers(), sortFloatingPointValues(), searchThisCrapForAString()) when you call them in your own code. Those functions are part of my library's API. In the documentation are instructions for calling each function, which include the arguments you need to pass to each function (e.g., openStupidFile() takes two arguments, a full path to a file and 'r' for read-only or 'rw' for read-write, and it returns a handle to the file that you can pass to another function, or NULL if it failed). The data type each function returns (the file handle or NULL value) is part of the API, as are the arguments each function takes (path to the file and 'r' or 'rw').
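To make that a little less abstract, here's what a slice of the imaginary myLib might look like in Python. This is purely illustrative: myLib doesn't exist, the function names are kept as written above, and Python's None stands in for NULL.

```python
def openStupidFile(path, mode):
    """Open the file at `path`.

    Arguments: `path` is the full path to the file; `mode` is 'r' for
    read-only or 'rw' for read-write.
    Returns: a file handle on success, or None (our stand-in for NULL).
    """
    # Translate the library's mode names into the ones Python's open() uses.
    python_modes = {"r": "r", "rw": "r+"}
    if mode not in python_modes:
        return None
    try:
        return open(path, python_modes[mode])
    except OSError:
        return None


def readAllOfFilesContents(handle):
    """Take a handle from openStupidFile() and return the file's full text."""
    return handle.read()


handle = openStupidFile("/no/such/directory/notes.txt", "r")
if handle is None:
    print("Couldn't open the file.")
else:
    print(readAllOfFilesContents(handle))
    handle.close()
```

The docstrings are the documentation, and the promises they make about arguments and return values are the API. The caller never needs to know how the modes are translated internally, only what to pass in and what comes back.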
The same principle has been applied to the Web in several different ways. What concerns us right now is something called a RESTful API (REST stands for REpresentational State Transfer), which basically means interacting with a web service using HTTP verbs (GET, PUT, POST, and so forth) and referencing URLs instead of functions in a library. Like HTTP itself, requests are stateless, which means that you make a request, the server responds, and there's no further context beyond that. You can think of RESTful APIs as fire-and-forget. The general idea is that there is a web server of some kind, which could be a traditional one like Apache or a specialized one running inside a web app built on a framework like web.py, which responds to those URLs in some way. If you make a GET request to a URL, it'll send you some data. If you make a PUT request you replace something on the server at that URL with something you send it. If you make a POST request you create a new something on the server. If you make a DELETE request that something on the server gets erased.
Of course, all of this depends on the HTTP verbs the server supports (not all REST APIs need to support all of them), your access permissions (not every account can do everything), whether or not you've authenticated to the server (it is sometimes the case that read-only access doesn't require an account but read-write access does require an account or an API token or something else along those lines), and who owns a particular resource (Alice's resources are read-only for every other account on the server, but read-write for her alone). REST makes life easier but it's not carte blanche to run hog wild. Additionally, many REST API services enforce rate limits - you get so many requests per minute, hour, or day and after that the service returns errors. For example, Twitter's API has returned an Error 420 (Enhance Your Calm) when you trip its rate limiter, though the standard status code for this is 429 (Too Many Requests).
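Here's a rough sketch of what those four verbs look like from the client side, using nothing but Python's standard library. The service at api.example.com and its /notes URLs are made up for illustration, so the code only builds the requests and prints them rather than sending them anywhere.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical service, not a real API


def build_request(verb, path, payload=None):
    """Return an unsent Request for the given HTTP verb and URL path."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        # PUT and POST carry a body; here it's JSON.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=verb
    )


examples = [
    build_request("GET", "/notes/1"),                         # fetch note 1
    build_request("PUT", "/notes/1", {"text": "rewritten"}),  # replace note 1
    build_request("POST", "/notes", {"text": "brand new"}),   # create a note
    build_request("DELETE", "/notes/1"),                      # erase note 1
]

for request in examples:
    print(request.get_method(), request.full_url)
# GET https://api.example.com/notes/1
# PUT https://api.example.com/notes/1
# POST https://api.example.com/notes
# DELETE https://api.example.com/notes/1
```

To actually send one of these you'd pass it to urllib.request.urlopen(). If the server refuses (no permission, unsupported verb, rate limit tripped) that call raises urllib.error.HTTPError, and the exception's code attribute holds the HTTP status the server sent back.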