Exocortex: Setting up Huginn
In my last post I said that I'd describe in greater detail how to set up the software that I use as the core of my exocortex, called Huginn.
First, you need someplace for the software to live. I'll say up front that you can happily run Huginn on your laptop, desktop workstation, or server so long as it's not running Windows. Huginn is developed under Linux; it might run under one of the BSDs but I've never tried. I don't know if it'll run as expected in Mac OS X because I don't have a Mac. If you want to give Huginn a try but you run Windows, I suggest installing VirtualBox and building a quick virtual machine. I recommend sticking with the officially supported distributions and using the latest stable version of Ubuntu Server. At the risk of sounding self-serving, I also suggest using one of my open source Ubuntu hardening sets to lock down the security on your new VM all in one go. If you're feeling adventurous you can get a VPS from a hosting provider like Amazon's AWS or Linode. I run some of my stuff at Digital Ocean and I'm very pleased with their service. If you'd like to give Digital Ocean a try here's my referral link which will give you $10us of credit, and you are not obligated to continue using their service after it's used up. If I didn't like their service (both commercial and customer) that much I wouldn't bother passing it around.
As serious web apps go, Huginn's system requirements aren't very high so you can build a very functional instance without putting a lot of effort or money toward it. You can run Huginn in about one gigabyte of RAM and one CPU, with a relatively small amount of disk space (twenty gigabytes or so, a fairly small amount for servers these days). Digital Ocean's $10us/month droplet (one CPU, one gigabyte of RAM, and 30 gigabytes of storage) is sufficient for experimentation and light use. To really get serious usage out of Huginn you'll need about two gigabytes of RAM to fit multiple worker daemons into memory. I personally use the following specs for all of my Huginn virtual machines: At least two CPUs, 60 gigabytes of disk space, and at least four gigabytes of RAM. Chances are, any physical machine you have on your desk exceeds these requirements so don't worry too much about it (but see these special instructions if you plan on using an ultra-mini machine like the Raspberry Pi). If you build your own virtual machine, take into account these requirements.
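If you'd like a quick sanity check of a Linux machine against the rough minimums above, something like the following sketch may help. The function name and the thresholds are my own assumptions for illustration (set a little below the nominal one gigabyte and twenty gigabytes, because the kernel and filesystem reserve some space); none of this comes from Huginn itself.

```shell
# Thresholds are assumptions based on the rough figures in the text, set a
# little below nominal because the kernel and filesystem reserve some space.
MIN_CPUS=1
MIN_RAM_MB=900
MIN_DISK_GB=18

# Succeeds if the given CPU count, RAM (MB), and disk size (GB) meet the minimums.
meets_minimum() {
    [ "$1" -ge "$MIN_CPUS" ] && [ "$2" -ge "$MIN_RAM_MB" ] && [ "$3" -ge "$MIN_DISK_GB" ]
}

# Measure the current Linux host: CPU count, total RAM, size of the root filesystem.
cpus=$(nproc)
ram_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
disk_gb=$(df -Pk / | awk 'NR == 2 { print int($2 / 1024 / 1024) }')

if meets_minimum "$cpus" "$ram_mb" "$disk_gb"; then
    echo "OK: $cpus CPU(s), $ram_mb MB RAM, $disk_gb GB disk"
else
    echo "Below the suggested minimum: $cpus CPU(s), $ram_mb MB RAM, $disk_gb GB disk"
fi
```

Raise the three thresholds if you want to check against the heavier specs (two CPUs, four gigabytes of RAM, 60 gigabytes of disk) instead.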
Now we're going to be following the official installation instructions, or at least mostly. For our purposes, there are some bits that are going to be problematic or confusing (I speak from experience) so I'll do my best to help you through them with the techniques I used. I haven't fixed the project's official installation docs because they were written for very experienced Rails developers who should not have any difficulty.
Run through the directions in section one of the official installation instructions. You can safely copy and paste the commands and they should run without trouble.
Huginn is written in Ruby but it requires v2.0 or later, and if you've ever worked with Ruby before you know that not every distribution (not even Ubuntu) provides an up-to-date version. So, here is my advice: Skip section two of the official installation docs. Don't bother installing the native Ruby packages if you're doing a from-scratch build, either. Download and install RVM and then use it to install the latest stable version of Ruby: rvm install 2.2
Log out and log back into your (virtual) machine so that the environment variables set up by RVM will take effect. One of the reasons we're installing Ruby this way is because it gives you an isolated Ruby environment unique to your user account and home directory, and it won't touch the operating system itself. This also means that if something goes pear-shaped you can tear the whole thing down (rvm implode) and not worry about messing the rest of the machine up. Now install the three basic Ruby gems necessary to build the rest of Huginn: gem install rake bundler foreman --no-ri --no-doc
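Once you've logged back in, it can be worth confirming that the Ruby on your path really is 2.0 or later before going further. Here is a minimal sketch; the function name is made up for illustration, and it only looks at the major version number, which is enough for a "2.0 or later" requirement.

```shell
# Succeeds if a version string such as "2.2.1" has a major version of 2 or higher.
ruby_is_new_enough() {
    major=${1%%.*}
    [ "$major" -ge 2 ]
}

# Only query Ruby if it is actually on the PATH.
if command -v ruby >/dev/null 2>&1; then
    version=$(ruby -e 'print RUBY_VERSION')
    if ruby_is_new_enough "$version"; then
        echo "Ruby $version is new enough for Huginn"
    else
        echo "Ruby $version is too old; Huginn needs 2.0 or later"
    fi
else
    echo "ruby not found; log out and back in so RVM's environment loads"
fi
```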
You can go through step three of the installation instructions (creating a separate account to run Huginn under) but you don't have to. If this is your first time setting it up, I would advise installing GNU Screen (sudo apt-get install -y screen) and running each part of Huginn inside of a separate shell. After doing so, edit your ~/.bashrc file with your favorite text editor (it doesn't matter which) and look for the line "if [ "$color_prompt" = yes ]; then". Inside of that if ... fi block is a pair of PS1 assignments that define what your shell's prompt looks like (e.g., user@host:~). Replace that whole block with the following:
if [ "$color_prompt" = yes ]; then
    PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w($WINDOW)\$ '
fi
What this does is add some extra code that tells you what GNU Screen shell you're currently paying attention to. Trust me, you need this.
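The reason this works is that GNU Screen exports a WINDOW environment variable, holding the window number, into every shell it starts. Outside of Screen the variable is unset, so the prompt above would just show empty parentheses. A small sketch of the same idea, using a function name I made up for illustration:

```shell
# Prints the Screen window number, or a note when not running inside Screen.
# GNU Screen exports WINDOW in every shell it starts.
screen_window_label() {
    if [ -n "${WINDOW:-}" ]; then
        echo "screen window $WINDOW"
    else
        echo "not inside screen"
    fi
}

screen_window_label
```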
Now go through section four of the installation instructions to install the MySQL database server. Yes, you can use Postgres but for the sake of getting everything up and running rapidly (plus, I'm not very good with Postgres) we're going to use MySQL. DO NOT skip the step of creating a separate user inside of MySQL for Huginn because the specific set of access privileges the procedure sets up is required for normal operation.
Check out the source code to Huginn into your home directory: git clone https://github.com/cantino/huginn
This is a good time to fire up Screen and set up a few shells because doing it earlier rather than later will simplify life immensely. Start Screen (screen -l) and you will see that you are now inside a different user shell because the prompt has changed (user@host:~(0)). Change into the source code directory you just created (cd huginn/) and copy the example configuration file (cp .env.example .env). Now go through the part of the installation instructions called "Configure it" up until you get to the Unicorn config (and when I get around to figuring out how to set up Unicorn, I'll update this howto).
Install all of the Ruby gems (the libraries and sub-programs that Huginn is built on top of): bundle install --deployment --without development test
Create (rake db:create RAILS_ENV=production), populate (rake db:migrate RAILS_ENV=production), and seed (rake db:seed RAILS_ENV=production SEED_USERNAME=[ your username here ] SEED_PASSWORD=[ a password you want to log into Huginn with ]) Huginn's database with the base configuration and some starter agents.
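If you'd rather run those three steps as one unit, so that a failure in one stops the rest, a wrapper along these lines is one way to do it. This is only a sketch: the run and setup_database helpers and the DRY_RUN switch are my own inventions, and the credentials are placeholders you'd replace. By default it only prints what it would do.

```shell
# Hypothetical wrapper around the three rake commands from the text.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}
SEED_USERNAME=${SEED_USERNAME:-your-username}   # placeholder, replace before real use
SEED_PASSWORD=${SEED_PASSWORD:-your-password}   # placeholder, replace before real use

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Create, migrate, then seed; && stops at the first step that fails.
setup_database() {
    run rake db:create RAILS_ENV=production &&
    run rake db:migrate RAILS_ENV=production &&
    run rake db:seed RAILS_ENV=production \
        "SEED_USERNAME=$SEED_USERNAME" "SEED_PASSWORD=$SEED_PASSWORD"
}

setup_database
```

Run it from inside the huginn/ directory, and set DRY_RUN=0 once the printed commands look right.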
Now compile the assets that will be served up by Rails. What this means is that Rails (the Ruby web application framework that Huginn was built on top of) has a bunch of HTML, CSS, and JavaScript files that aren't complete and need to be preprocessed before it'll start properly. Some files will be minified and others will be generated because they don't exist yet. Assuming that everything has gone according to plan (or at least, the way I get it to consistently work, not being a Ruby expert) run this command: rake assets:precompile RAILS_ENV=production
From here on out we're going to assume that you have a VPS running at a hosting provider and not a system that is relatively isolated (like your laptop or a virtual machine running inside of your desktop or laptop). If this is not the case, you can safely omit the references to 127.0.0.1.
Let's fire up Huginn one stage at a time. First we're going to start up the Rails server (which is going to be the front-end that you interact with): RAILS_ENV=production bundle exec rails server -b 127.0.0.1
If it worked you should see output like the following:
user@host:~/huginn(0)$ RAILS_ENV=production bundle exec rails server -b 127.0.0.1
> Booting WEBrick
> Rails 4.2.7.1 application starting in production on http://127.0.0.1:3000
> Run `rails server -h` for more startup options
> Ctrl-C to shutdown server
[2016-08-29 20:46:15] INFO WEBrick 1.3.1
[2016-08-29 20:46:15] INFO ruby 2.1.5 (2014-11-13) [x86_64-linux]
[2016-08-29 20:46:15] INFO WEBrick::HTTPServer#start: pid=3112 port=3000
...this means it worked. Now spawn a second user shell inside of Screen by typing the sequence control+a (as in, the actual "control" or "ctrl" key), then "c". Translated into English, this means "Screen, Create another shell for me." If you look closely, the number in the command prompt should have incremented by one (user@host:~/huginn(1)).
Now to start Huginn's scheduler, which is the process that scans through the database once a minute looking for agents to trigger. Do that with this command: RAILS_ENV=production bundle exec rails runner bin/threaded.rb
Again, this bit assumes that you're installing Huginn on a VPS or a server that is on the public Net. If you're installing Huginn on your laptop or a server running at home, you can skip the following.
Now for something non-obvious: Putting a real web server in front of Huginn to make it easier to use in the long run (plus, if you look closely, you will see that Huginn's Rails server is listening on localhost only, so you can't access it just yet). This will give you a couple of benefits: First, it'll put SSL in front of Huginn to protect your username and password. Second, it'll cache some of the HTML and CSS that Rails serves so it's more responsive. Third, you can put another username and password in front of it to keep people from trying to mess with it (which I can't recommend highly enough). The process I'm about to describe is a little involved, but in the long run it'll hold you in very good stead. Open a new user shell inside of Screen with the sequence I showed you earlier. Install Nginx and the Apache tools: sudo apt-get install -y nginx apache2-utils
When that's done, go through this process for getting a Let's Encrypt SSL certificate for your new server. Even though it's Digital Ocean documentation, it is not specific to their service and can be used on just about any Ubuntu Server instance out there. Go ahead; I'll wait.
When Nginx is back up and the SSL certificate has been installed you need to put a custom Nginx site (or virtual host) file in the right place so that it'll proxy Huginn for you. Huginn comes with an Nginx config file but it assumes that you're using the installation process as documented. I'm trying to make this as easy to do as possible so I'm going to give you the Nginx sitefile that I put together:
server {
    listen 443 default ssl;
    server_name your-huginn.box.example.com;
    client_max_body_size 4G;
    keepalive_timeout 5;
    root /home/[ your username here ]/huginn/public;
    try_files $uri/index.html $uri.html $uri @app;
    ssl on;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;
    ssl_certificate /etc/letsencrypt/live/your-huginn.box.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/your-huginn.box.example.com/privkey.pem;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    # Wrapped for clarity - ssl_ciphers should be one long line.
    ssl_ciphers ECDHE-RSA-AES256-SHA:DHE-RSA-AES256-SHA:DHE-DSS-AES256-SHA:
        DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA;
    ssl_prefer_server_ciphers on;
    ssl_dhparam /etc/ssl/dhparam.pem;
    location @app {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        add_header Strict-Transport-Security "max-age=31536000; includeSubdomains";
        add_header X-Frame-Options DENY;
        auth_basic "Please enter login credentials";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
    # Rails error pages
    error_page 500 502 503 504 /500.html;
    location = /500.html {
        root /home/[ your username here ]/huginn/public;
    }
}
Copy and paste the above configuration file into a new file /etc/nginx/sites-available/huginn by creating it with your favorite text editor. Note that where it says [ your username here ], substitute your username on your server or virtual machine. Once that's been done, link the file to activate it: sudo ln -s /etc/nginx/sites-available/huginn /etc/nginx/sites-enabled/huginn
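If you don't feel like hunting down every placeholder by hand, the substitution can be scripted with sed. The sketch below uses a helper name I made up (fill_sitefile); it reads the sitefile on standard input and fills in both the username placeholder and the example hostname. It assumes your username and hostname don't contain the characters | or &, which would confuse sed.

```shell
# Hypothetical helper: reads the sitefile on stdin and fills in the two
# placeholders used in the text (username and example hostname).
fill_sitefile() {
    user=$1
    host=$2
    sed -e "s|\[ your username here ]|$user|g" \
        -e "s|your-huginn\.box\.example\.com|$host|g"
}

# Demonstration on a single line of the config, with made-up values.
printf '%s\n' 'root /home/[ your username here ]/huginn/public;' |
    fill_sitefile alice huginn.example.org
# prints: root /home/alice/huginn/public;
```

In practice you'd feed it the whole file, for example by saving the pasted config somewhere in your home directory and piping the result through sudo tee /etc/nginx/sites-available/huginn.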
Remove the default Nginx site: sudo rm /etc/nginx/sites-enabled/default
Restart Nginx: sudo service nginx restart
Now that Huginn is installed and Nginx is proxying it, I highly recommend that you set up HTTP basic authentication in front of it to prevent anyone from messing around with it. If you look at the Nginx sitefile I gave you, you'll see the line "auth_basic_user_file /etc/nginx/.htpasswd", which is a directive that points to a file that will contain the username and hashed password you'll need to reach the frontpage of your Huginn install. I used the process documented here to do it, but in a nutshell what you'll do is execute the command sudo htpasswd -c /etc/nginx/.htpasswd [ your username here ], and when prompted enter a strong password, which you will use to access the frontpage of Huginn. That's it - short, sweet, and to the point. If you want to add additional users to the .htpasswd file, run the htpasswd command again but leave off the -c flag (which tells the htpasswd utility to create a new file if it doesn't exist, and to recreate the file if it does, which you will NOT want).
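Because an accidental -c wipes out every user already in the file, a small guard can take the decision out of your hands. This is just a sketch with a helper name I invented (htpasswd_command): it prints the htpasswd command line to run, including -c only when the password file doesn't exist yet. The username alice is a placeholder.

```shell
# Hypothetical helper: prints the htpasswd command to run for a given password
# file and user, adding -c only when the file does not exist yet.
htpasswd_command() {
    file=$1
    user=$2
    if [ -e "$file" ]; then
        echo "sudo htpasswd $file $user"
    else
        echo "sudo htpasswd -c $file $user"
    fi
}

htpasswd_command /etc/nginx/.htpasswd alice
```

Copy the printed line and run it yourself; htpasswd will then prompt for the password as usual.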
Now plug the URL of your Huginn server (https://your-huginn.box.example.com) into your web browser, and when prompted enter your username and the password you added to Nginx. You will see the Huginn login screen; enter the username and password you gave when you used rake to seed the database and you will be logged into Huginn.
Incidentally, you'll want to disconnect from Screen so you can eventually log out of your (virtual) machine but leave Huginn running in the background. You'll do this by keying the sequence control+a and then d (for "disconnect from this session"). Screen will keep running in the background as if you were still logged in. To reconnect to that session later, enter the command screen -r ("reconnect to the first disconnected session belonging to me that you find") at the user shell.
And there you have it. It's been a long journey but thankfully you only have to do this once. My next post in this series will probably be about updating Huginn so it tracks the latest commits to the source code repository, and then after that I'll walk you through setting up a basic agent network. In the meantime, I highly recommend that you spend some time browsing the Huginn wiki on GitHub to get a feel for how it fits together and what it can do.