Exocortex: Halo

Mar 26, 2016

In my last post on the topic of exocortices I discussed the Huginn project, how it works, what the code for the agents actually looks like, and some of the things I use Huginn's agent networks for in my everyday life. In short, I call it my exocortex - an extension of the information processing capabilities of my brain running in silico instead of in vivo. Now I'm going to talk about Exocortex Halo, a separate suite of bots which augment Huginn to carry out tasks that Huginn by itself isn't designed to do very easily, and thus extend my personal capabilities significantly.

Now, don't get me wrong, Huginn has a fantastic suite of agents built into it already and more are being added every day. However, good design requires one to recognize when an existing software architecture is suited to some things and not others, and to make allowances for that. To put it another way, it was highly unlikely that I would be able to shoehorn the additional functionality I wanted into Huginn and have a hope in hell of it working. What Huginn does have is a multitude of interfaces for getting events into and out of itself, and I could make use of those interfaces to plug my own bots in. The Website Agent is ideal for pinging REST APIs of my own design; Jabber Agent implements a simple XMPP client which can send events to an address on an XMPP server (assuming that it has its own login credentials); oversimplifying a bit, Webhook Agent sets up a custom REST API endpoint that external software can use to send events into Huginn for processing; and Data Output Agent sends events out of Huginn in the form of an RSS feed or a JSON document that can be consumed and parsed by other software.
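To make that concrete, here's a minimal sketch of what pushing an event into Huginn through a Webhook Agent might look like from the Halo side. The URL path, the secret, and the payload fields are placeholders invented for the example - substitute whatever your own Huginn instance and agent configuration actually expose.

```python
# Minimal sketch: pushing an event into Huginn via a Webhook Agent.
# The URL, secret, and payload fields below are hypothetical; use whatever
# your own Huginn instance and Webhook Agent configuration expose.
import requests

HUGINN_WEBHOOK_URL = "https://huginn.example.com/users/1/web_requests/42/not-a-real-secret"

event = {
    "source": "example_bot",
    "message": "Something interesting happened.",
}

response = requests.post(HUGINN_WEBHOOK_URL, json=event, timeout=10)
response.raise_for_status()
```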

To put it another way, only a little creative hacking would be necessary to make my custom software interact with Huginn; all the crazy stuff I could save for my custom bots. I decided to call these separate pieces of code the Halo, because they hover around the core of my exocortex without actually being part of the codebase. Because it's the first language I can actually say that I enjoy for its own sake, I decided to write everything in Python and post it to my GitHub account so that other people can use my code if they want to. I'll talk about the elements of the Halo as they exist at this time in alphabetical order, because that's how they show up in the GitHub repository. If you're reading this article some time after it was posted, I'll no doubt have added additional functionality.

I'll start off with beta_fork(), which I'm developing as a sort of universal chatbot-cum-activity monitor. It's built around a single HTTP server that implements two things: a RESTful API (get the feeling that I really like those?) and a central conversation engine that accepts text to learn from and generates responses to text submitted to it (not necessarily at the same time). Access to the API is guarded with API keys that restrict what an agent can do with the conversation engine (for example, submit text to train it, request responses from it, or both). The general idea is that the person running the server creates accounts with unique API keys (which can be just about any random junk you want so long as there isn't any whitespace; I use pwgen for this) and points whatever bots they want at the server, so long as those bots can speak the REST API (which is documented if you just open the server in a web browser). As a proof of concept I have a basic IRC bot in the clients/ subdirectory, based on the codebase of an earlier IRC bot I'd written, but there is nothing preventing someone from writing bots for Slack, XMPP, Skype, or any other chat system. You could probably even write bots to plug it into Twitter (What could possibly go wrong?) (but this could be useful...) or set up an SMS number to receive and send text messages. I'll eventually get around to adding a key feature for my particular use case: telling the bots what sorts of things to look for in their input (persisted in a database in case a bot needs to be restarted), having them page me if they detect those interesting things, and letting me use them as stripped-down chat clients to respond.
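For a rough idea of what a client of such a server looks like, here's a hypothetical sketch of a bot talking to the REST API with its key. The endpoint paths, header name, and JSON fields are assumptions made purely for illustration; the real API is the one the server documents when you open it in a web browser.

```python
# Hypothetical client-side sketch of talking to a beta_fork-style server.
# The endpoint paths, header name, and JSON fields are assumptions for
# illustration; consult the server's own documentation for the real API.
import requests

SERVER = "http://localhost:8050"
API_KEY = "replace-with-your-pwgen-output"
HEADERS = {"X-API-Key": API_KEY}

# Submit a line of text for the conversation engine to learn from.
requests.post(SERVER + "/learn", headers=HEADERS,
              json={"text": "The weather out here is lovely today."})

# Ask the conversation engine to generate a response to some text.
reply = requests.post(SERVER + "/response", headers=HEADERS,
                      json={"text": "How's the weather where you are?"})
print(reply.json())
```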

Second is the GPS Mapper, which does pretty much what it says on the tin: it accepts HTTP requests from an application running on my phone (source code) containing the GPS coordinates of my phone (and myself, because I'm always carrying it). If you plug the URL of the server into a web browser and log in with the username and password, it'll pop up a map with a pushpin representing those coordinates, and the pushpin updates every few seconds as the coordinates change. This is a pretty simple system, I have to admit, almost a toy project. It was one of the very first ones I wrote, and I was also teaching myself how to write web applications at the time, so on the advice of an old friend I decided to start with web.py (which is reputed to be one of the friendliest web frameworks for Python out there). However, I like to use as few external dependencies as I can get away with, so I'll probably rewrite it to use Python's built-in BaseHTTPServer module sooner or later. Anyway, I needed a fair amount of help getting the JavaScript for the map working because web design isn't my forte, and I had a hell of a time just getting the basics up and running.
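To show how little code a service like this needs, here's a rough web.py sketch of the server side: one endpoint that accepts coordinates from the phone and one that hands the latest fix back as JSON for the map page to poll. The URL path, field names, and the complete lack of authentication are simplifications for the example, not the actual GPS Mapper code.

```python
# Rough web.py sketch of a GPS-mapper-style service. The endpoint, field
# names, and missing authentication are simplifications for illustration
# only; they are not lifted from the actual GPS Mapper code.
import json
import web

urls = ("/coordinates", "Coordinates")
latest_fix = {"latitude": None, "longitude": None}

class Coordinates:
    def POST(self):
        # The phone app reports its current position as form fields.
        data = web.input(latitude=None, longitude=None)
        latest_fix["latitude"] = data.latitude
        latest_fix["longitude"] = data.longitude
        return "OK"

    def GET(self):
        # The map page polls this every few seconds to move the pushpin.
        web.header("Content-Type", "application/json")
        return json.dumps(latest_fix)

if __name__ == "__main__":
    web.application(urls, globals()).run()
```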

The third element of the Halo is a SIP client, a custom piece of voice-over-IP software that I use to place telephone calls in response to certain events. I started with the PJSIP library's Python bindings (which took some doing; how I did it is documented in that bot's README.me file) and grew it slowly, function by function. I had to figure out how to first register with and then deregister from my VoIP provider (did I mention that the documentation isn't very good?) and build up from there. Under the hood I use Festival to turn text into synthetic speech (there are probably better options out there by now) and then inject the speech file into the voice channel. One part of the bot takes events out of Huginn, extracts the message and turns it into speech, and then calls the second module of the bot, which actually places the telephone call. On the whole this was probably the most difficult part of the Halo to write so far, though not the most frustrating to debug.
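The telephony half is hard to show briefly, but the speech-synthesis step is simple enough to sketch. This assumes Festival's text2wave utility is on the PATH; the function and filenames are invented for the example and aren't taken from the actual bot.

```python
# Sketch of the text-to-speech half only: render a message to a .wav file
# with Festival's text2wave utility before handing it to the SIP side.
# The function and filenames are made up for this example.
import subprocess
import tempfile

def synthesize(message, wav_path="outgoing_message.wav"):
    """Render a text message to a wav file using Festival's text2wave."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as text_file:
        text_file.write(message)
        text_file.flush()
        subprocess.check_call(["text2wave", text_file.name, "-o", wav_path])
    return wav_path

if __name__ == "__main__":
    print(synthesize("This is your exocortex. Something needs your attention."))
```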

The fourth major component of the Halo is a generic XMPP bridge - it logs into an XMPP server with a username and password and listens for its registered owner to send Jabber messages to it. That's pretty much it. To go into a little more depth, the XMPP bridge listens for commands of the form "Bot, do something for me." Its configuration file contains a list of names of agents to set up message queues for. The XMPP bridge parses each message and looks for a message queue matching one of the bot names it's configured for (the "Bot, " part). If it finds a matching queue it pushes the rest of the message ("do something for me") onto the end of that queue. The message queues are exposed through... wait for it... a simple REST API that allows other pieces of software to periodically poll their message queues for events containing commands to carry out. Commands are handed out in first-in-first-out order, and each one is deleted from its queue as soon as a bot picks it up. This means that pretty much any piece of code capable of making an HTTP request can accept commands sent over XMPP, for example from my mobile phone using Chatsecure. It's handy to use speech recognition on my phone to tell my searchbots to find stuff for me and e-mail the results to my primary mobile address so that I can go through the links at my leisure (when I arrive at my destination, say, or after landing).
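The bot side of that contract is tiny: poll the queue registered under your name, act on whatever comes back. Here's a rough sketch of such a polling loop; the port, URL scheme, and JSON shape of the reply are assumptions for the example, not the bridge's actual interface.

```python
# Hypothetical polling loop for a bot that takes its commands from the
# XMPP bridge's REST API. The port, URL scheme ("/<bot name>"), and reply
# format are assumptions for illustration, not the bridge's real interface.
import time
import requests

BRIDGE_URL = "http://localhost:8003/Waldo"   # one queue per configured bot name

def handle(command):
    print("Received command: " + command)    # a real bot would do actual work here

while True:
    try:
        reply = requests.get(BRIDGE_URL, timeout=10).json()
        # Assume the bridge answers with a JSON document containing a
        # "command" field when something is waiting in the queue.
        command = reply.get("command")
        if command:
            handle(command)
    except requests.RequestException:
        pass                                  # the bridge might be down; just retry
    time.sleep(30)
```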

So far I only have one bot which makes any use of the generic XMPP bridge, a web search bot that takes commands from a message queue ("Waldo, top twenty hits for Kavinsky.") and turns them into requests to one or more search engines specified in its configuration file. The web search bot uses the BeautifulSoup HTML parsing module to extract URLs from whatever comes back from the search engines queried, for filtering later. Ultimately there are two functional pieces to this bot's configuration: the URLs of search engines that let the bot put the search request directly into the URL (like this: https://startpage.com/do/search?q=kavinsky), because it's cleaner that way, and the hyperlinks the bot must filter out of the links extracted from the HTML returned by the search engines so that only the search results remain. The results are then e-mailed wherever they need to go (defaulting to the address in the config file). As you'd expect of me, I set the defaults to three search engines that at least try to preserve your privacy: Startpage, Ixquick, and a public node of the open source YaCy search engine network. However, if you really wanted to you could make a copy of the web_search_bot.conf file, name it something else, and plug other search engines into it. I haven't added proxy support to web_search_bot.py yet, but when I do (or if somebody out there files a pull request), that would make it possible for the web search bot to reach the Tor network and use some of the search engines native to it. The Tor search bot would also need a different name, and the URLs to filter out of the HTML would probably need to be modified (I'll have to write up the trial-and-error process I used to come up with the default list). As long as the configuration files are different (at a minimum, the bots would need different names to know which message queue to watch), you could have any number of copies of web_search_bot.py running on a machine; you could even point a copy at your private search engine, whatever that happens to be.
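The extract-and-filter step is the heart of the bot and is easy to sketch with BeautifulSoup: pull every hyperlink out of a results page, then throw away anything that matches the filter list so only result URLs remain. The filter list below is a stand-in invented for the example; the real one lives in the bot's configuration file and was built by trial and error, as mentioned above.

```python
# Sketch of the extract-and-filter step: pull every hyperlink out of a
# search engine's results page, then discard the engine's own navigation
# links so only result URLs remain. The filter list here is a placeholder
# for the one in the bot's configuration file.
import requests
from bs4 import BeautifulSoup

FILTER_SUBSTRINGS = ["startpage.com", "ixquick.com", "javascript:"]

def search(query):
    html = requests.get("https://startpage.com/do/search",
                        params={"q": query}, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    links = [a["href"] for a in soup.find_all("a", href=True)]
    # Keep only links that don't match anything on the filter list.
    return [url for url in links
            if not any(fragment in url for fragment in FILTER_SUBSTRINGS)]

if __name__ == "__main__":
    for url in search("kavinsky"):
        print(url)
```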

And that's about all I've got for this particular post. When I get some more time I'll write about some of the implications Exocortex seems to have for identity and agency on a personal level.