Feb 04 2017
A couple of weeks ago I ran into some of the functional limits of my web search bot, a bot that I wrote for my exocortex which accepts English-like commands ("Send me top 15 hits for HAL 9000 quotes.") and runs web searches in response using the Searx meta-search engine on the back end. This is to say that I gave my bot a broken command ("Send hits for HAL 9000 quotes.") and the parser got into a state where it couldn't cope, threw an exception, and crashed. To be fair, my command parser was very brittle and it was only a matter of time before I did something dumb and wrecked it. At the time I patched it with a bunch of if..then checks for truncated and incorrect commands, but if you look at all of the conditionals and ad-hoc error handling I probably made the situation worse, as well as much more difficult to maintain in the long run. Time for a rewrite.
Back to my long-term memory field. What to do?
I knew from comp.sci classes long ago that compilers use things called parsers and grammars to interpret code so that it can be converted into an executable. I also knew that the parser Infocom used in its interactive fiction was widely considered to be the best anyone had come up with in a long time, and it was efficient enough to run on humble microcomputers like the C-64 and the Apple II. For quite a few years I also ran and hacked on a MOO, which for the purposes of this post you can think of as a massive interactive fiction environment that the players can modify as well as play in; a MOO's command parser does pretty much the same thing as Infocom's IF parser but is responsive to the changes the user's make to their environments. I also recalled something called a parse tree, which I sort-of-kind-of remembered from comp.sci but because I'd never actually done anything with them, I only recalled a low-res sketch. At least I had someplace to start from so I turned my rebooted web search bot loose with a couple of search terms and went through the results after work. I also spent some time chatting with a colleague whose knowledge of the linguistics of programming languages is significantly greater than mine and bouncing some ideas off of him (thanks, TQ!)
But how do I do something with all this random stuff?
Sep 11 2016
In my last post I said that I'd describe in greater detail how to set up the software that I use as the core of my exocortex, called Huginn.
First, you need someplace for the software to live. I'll say up front that you can happily run Huginn on your laptop, desktop workstation, or server so long as it's not running Windows. Huginn is developed under Linux; it might run under one of the BSDs but I've never tried. I don't know if it'll run as expected in MacOSX because I don't have a Mac. If you want to give Huginn a try but you run Windows, I suggest installing VirtualBox and build a quick virtual machine. I recommend sticking with the officially supported distributions and use the latest stable version of Ubuntu Server. At the risk of sounding self-serving, I also suggest using one of my open source Ubuntu hardening sets to lock down the security on your new VM all in one go. If you're feeling adventurous you can get a VPS from a hosting provider like Amazon's AWS or Linode. I run some of my stuff at Digital Ocean and I'm very pleased with their service. If you'd like to give Digital Ocean a try here's my referral link which will give you $10us of credit, and you are not obligated to continue using their service after it's used up. If I didn't like their service (both commercial and customer) that much I wouldn't bother passing it around.
As serious web apps go, Huginn's system requirements aren't very high so you can build a very functional instance without putting a lot of effort or money toward it. You can run Huginn in about one gigabyte of RAM and one CPU, with a relatively small amount of disk space (twenty gigabytes or so, a fairly small amount for servers these days). Digital Ocean's $10us/month droplet (one CPU, one gigabyte of RAM, and 30 gigabytes of storage) is sufficient for experimentation and light use. To really get serious usage out of Huginn you'll need about two gigabytes of RAM to fit multiple worker daemons into memory. I personally use the following specs for all of my Huginn virtual machines: At least two CPUs, 60 gigabytes of disk space, and at least four gigabytes of RAM. Chances are, any physical machine you have on your desk exceeds these requirements so don't worry too much about it (but see these special instructions if you plan on using an ultra-mini machine like the Raspberry Pi). If you build your own virtual machine, take into account these requirements.