Dec 07 2015
In my last post I went into the history of semi-autonomous software agents in a fair amount of detail, going as far back as the late 1970s and the beginning of formal research in the field in the early 1980s. Now I'm going to pop open the hood and go into some detail about how agents are architected in the context of how they work, some design issues and constraints, and some of the other technologies that they can use or bridge. I'm also going to talk a little about agents' communication protocols, both those used to communicate amongst themselves and those used to communicate with their users.
Software agents are meant to run autonomously once they're activated on their home system. They connect to whatever resources are set in their configuration files and then tend to settle into a poll-wait loop where they hit their configured resources about as fast as the operating system will let them. Each time they hit their resources they look for a change in state or a new event and examine every change detected to see if it fits their programmed criteria. The agent then fires an event if there is a match and goes back to its poll-wait loop.
Other types of agents use a scheduler design pattern instead of a poll-wait loop. In this design pattern, agents ping their data sources periodically but then go to sleep for a certain period of time, which can be anywhere from a minute to days or even months. This reduces CPU load (because poll-wait loops can hit a resource dozens or even hundreds of times a second, which causes the CPU to spend most of its time waiting for I/O to finish) and network utilization. Some agents may be designed to sleep by default but register themselves with an external scheduler process that wakes them up somehow, possibly by sending them a command over IPC or using an OS signal to touch them off.
At first glance, software agents seem pretty straightforward to design. All an agent has to do is start up, fork(2) itself into the background, and tickle some server someplace every once in a while. Right? This isn't actually the case. Conceptually speaking it's a decent high-level explanation, but from a technical perspective it's naive in the sense that it skips over a lot of design issues that an aspiring agent developer needs to be cognizant of.
The first issue, which I mentioned briefly above, is scheduling. How do you schedule your agents to run so that they don't step on one another and lock up? How would you prevent two or more of your agents from colliding when they both try to access a shared resource? How do you schedule them so they don't get throttled or banned outright from some service they're using on your behalf because you're violating the service's terms of service or accidentally DoSing them? Does the user actually have to be there to respond to the agent, or can the agent's message go without being acknowledged immediately (if at all)?
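To make the scheduler pattern a little more concrete, here is a minimal sketch in Python. It isn't taken from any particular agent framework; the class name ScheduledAgent, its parameters, and the rule that the first reading only sets a baseline are all assumptions made for illustration. The agent checks a resource, fires an event if the state changed since the last check, and then sleeps for a fixed interval, which is also the simplest way to avoid hammering a service.

```python
import time
from typing import Any, Callable

_UNSEEN = object()  # sentinel: no reading has been taken yet


class ScheduledAgent:
    """Hypothetical agent: check one resource, fire on change, then sleep."""

    def __init__(
        self,
        fetch: Callable[[], Any],          # reads the current state of the resource
        on_event: Callable[[Any], None],   # called with the new state on a change
        interval_seconds: float,           # how long to sleep between checks
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.fetch = fetch
        self.on_event = on_event
        self.interval_seconds = interval_seconds
        self.sleep = sleep
        self.last_state: Any = _UNSEEN

    def check_once(self) -> bool:
        """Read the resource once; fire an event if it differs from last time."""
        state = self.fetch()
        # The very first reading is only a baseline, so it never fires.
        changed = self.last_state is not _UNSEEN and state != self.last_state
        self.last_state = state
        if changed:
            self.on_event(state)
        return changed

    def run(self, cycles: int) -> None:
        """Run a fixed number of check-then-sleep cycles."""
        for _ in range(cycles):
            self.check_once()
            self.sleep(self.interval_seconds)
```

Passing the sleep function in as a parameter is just a convenience for testing. A poll-wait loop is the same thing with the interval set to zero, which is exactly why it eats CPU and network. Note that this sketch does nothing about two agents colliding on a shared resource; that would need some kind of locking or a central scheduler on top.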
Another thing to consider is prioritization of events sent by a given agent. Does the priority of an event differ if it's being sent to another agent, a service, or its user? Should it, and if so how can it be marked as such? What is a CRASH!/CRITICAL/HIGH/Medium/low alert in the context of an agent? Will an agent define alerts of all of those types, and what action, if any, should an agent take when it generates or receives an event matching one of those priority levels? When should the agent escalate to the next higher priority level?
What happens to the rest of an agent network if and when an agent goes offline? How would the agent network recover? How would the agents that are closest to the dead agent in the network handle it? In the event that the agent system is object oriented in nature, what happens if one of the agent prototypes has a bug? Will they all crash or will just one or two crash if they tickle that bug? What happens if the scheduler dies (which is a huge problem)? How would an agent network recover if multiple points of failure occur? Could the network recover unassisted? How long could the network operate in a degraded state without repairs before collapsing entirely? How would the user migrate part or all of an agent network to another processing substrate (a different provider, a different OS, or a different sort of installation, like a container or a jail)? How would you ensure that existing agent configurations don't break when the prototype agents are upgraded (i.e., bugwards compatibility)?
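One possible answer to the priority questions, sketched in Python. The five level names come from the list above, but everything else here is an assumption for illustration: the Event fields, the function names, and especially the escalation rule (bump an event up one level after it has gone unacknowledged three times). A real agent network might escalate on very different criteria.

```python
from dataclasses import dataclass, replace
from enum import IntEnum


class Priority(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4
    CRASH = 5


@dataclass(frozen=True)
class Event:
    """Hypothetical event record passed between agents or to the user."""
    source: str        # name of the agent that generated the event
    recipient: str     # another agent, a service, or the user
    message: str
    priority: Priority = Priority.LOW
    unacknowledged_sends: int = 0


def escalate(event: Event) -> Event:
    """Return a copy one priority level higher, capped at CRASH."""
    next_level = Priority(min(event.priority + 1, Priority.CRASH))
    return replace(event, priority=next_level, unacknowledged_sends=0)


def resend(event: Event, max_unacknowledged: int = 3) -> Event:
    """Count one more unacknowledged send; escalate once the limit is reached."""
    bumped = replace(event, unacknowledged_sends=event.unacknowledged_sends + 1)
    if bumped.unacknowledged_sends >= max_unacknowledged:
        return escalate(bumped)
    return bumped
```

Keeping the event immutable and returning a new copy on every change means an agent can log every version of an event it has sent, which helps when you're trying to figure out why something got escalated.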
That's a lot of questions that need to be answered; a daunting number, in fact. This is not to scare anybody away from looking more deeply into software agents or playing around with them, but to give you an idea of some of the things that go into designing an agent framework or network from scratch. I'll talk about this in a later post in this series, but suffice it to say that you should build on top of the work of others whenever you can, think about the problems you might run into ahead of time, and start small and work your way up.
A question that you are probably asking yourself by now is, what is an event? An event, informally speaking, could be said to be a message sent from an agent to another agent or the user in response to a change in the agent's environment. Think of it like an e-mail or text message. Formally speaking, an event is an action that happens in response to a change in a computing environment, which can, but need not, take the form of a function or method firing in response. That function or method could interact with the user in some way, or it could cause an interaction with another agent involving at the very least the exchange of information. Here is an example from one of my agents:
One of the nice things about agents is that ultimately they are pieces of software - files sitting on a disk someplace. This means that agents can be cloned by copying them to make deploying more of them faster. Assuming that they have a relatively straightforward means of configuration (basically, config files), each clone can be made unique with only minimal customization required (i.e., not having to rewrite the logic of the new agent in any substantial way, just changing a few things in the configuration file). The early generations of software agents suffered from the problem of hardcoded configuration values, which is fine (more or less) for research, but if you want to use them heavily the last thing you want to do is have to rewrite them each time you stand up a new one.
Later generation agent frameworks use domain-specific languages which make it easier to build and deploy agents. You don't need to know C, Ruby, or Python, you just need to know how to specify the problem you want to solve in the context of the agent framework, and the framework does the rest.
Additionally, agents have a concept of memory. This is to say that each agent has a database that contains most if not all of the history of what it's seen and interacted with, which allows an agent to discern changes in its environment by giving it something to compare against. This also makes it possible to carry out certain kinds of machine reasoning, from simple statistical analysis to trend detection to something like Bayesian analysis, assuming that the data is in a format which lends itself to such. Sometimes an agent's implementation will give you direct access to its memory, sometimes it will give you limited access to the memory field, and sometimes you're stuck with whatever the framework gives you unless you want to rewrite parts of it. You'll have to take into account whether or not you'll want to tinker with your agents' individual memories if you start developing agent networks of your own. And remember, if you really, really want to, you can always access the database directly and go messing around... at your own peril, of course.
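As a rough illustration of what agent memory can look like, here is a sketch in Python using SQLite from the standard library. The class name AgentMemory, the table layout, and the method names are all made up for this example; real frameworks store their memory in their own formats. The point is only that keeping a history gives the agent something to compare each new observation against.

```python
import sqlite3
import time
from typing import List, Optional


class AgentMemory:
    """Hypothetical per-agent memory: a history of observations in SQLite."""

    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps everything in RAM; pass a filename to persist it.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS observations ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "seen_at REAL NOT NULL, "
            "resource TEXT NOT NULL, "
            "value TEXT NOT NULL)"
        )

    def last_value(self, resource: str) -> Optional[str]:
        """Return the most recent value seen for a resource, if any."""
        row = self.db.execute(
            "SELECT value FROM observations WHERE resource = ? "
            "ORDER BY id DESC LIMIT 1",
            (resource,),
        ).fetchone()
        return row[0] if row else None

    def remember(self, resource: str, value: str) -> bool:
        """Store an observation; return True if it differs from the last one."""
        previous = self.last_value(resource)
        with self.db:  # commits the insert on success
            self.db.execute(
                "INSERT INTO observations (seen_at, resource, value) "
                "VALUES (?, ?, ?)",
                (time.time(), resource, value),
            )
        return previous is not None and previous != value

    def history(self, resource: str) -> List[str]:
        """Return every value seen for a resource, oldest first."""
        rows = self.db.execute(
            "SELECT value FROM observations WHERE resource = ? ORDER BY id",
            (resource,),
        )
        return [row[0] for row in rows]
```

Because the history lives in an ordinary SQLite file, this is also the kind of memory you could open up with the sqlite3 command line tool and poke at directly, with all the risk that implies.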
When you get down to brass tacks, you need to decide where and how you're going to run your software agents. Once upon a time if you were doing research in this field you needed some pretty hefty iron to run them due to their resource requirements. This usually meant scientific workstations of the time, which are now pretty much comparable to the cheapest personal computing hardware available today, like the Raspberry Pi. However, not everybody has the inclination or the bandwidth to run agent networks at home, so other options abound. There are enterprisey environments that could be leveraged like Google App Engine and Heroku, which basically let you upload your code and they'll run it for you while you do more interesting things. You can, of course, run the necessary software on your laptop or desktop computer if you can spare the computing power (and if you don't mind the occasional bit of downtime while moving from place to place). You can always buy or rent a physical server in a data center someplace and administer everything remotely, but that tends to be pretty costly these days. It's much more cost-effective to buy a VPS from a hosting provider and run your agent network from there. Excellent VPSes are very cheap these days and are functionally no different from having your own physical server from an administrative, software development, or utilization perspective.
Well, that about does it for this post about software agent guts. My next post will start going into detail about my own software agent project, which I call Exocortex. In addition to describing its history and how it works, I plan on opening some of my source code and describing how you can set up your own Exocortex for experimentation, development, or the sort of heavy daily use my agents carry out.