The Icarus Cognitive Architecture

In my last post, I talked mostly about Knowledge Representation (KR), an area of AI where the intelligence is logical inference on statements. In an example there, I showed how First-Order Logic provided the symbols needed for a computer to infer that a creature was tapped. KR is very good at making inferences about and understanding its environment, but perception is only half the battle for intelligence. You also need to act, and that’s where cognitive architectures come in.

Cognitive architectures attempt to provide an integrated system for all sorts of intelligent behavior. How this happens is fairly broad, but generally, you need some components for recognition and some for action. Some of the better known ones include ACT-R, Soar, and Prodigy. The one I know best is Icarus. I’ll give you a quick run-down of how it works, and then show it in action. You can read the manual if you want a more complete picture (it’s fairly well written and not too technical), but I’ll try to pick out the important parts.

Icarus is a goal-driven architecture. When it starts, you give an agent a certain goal, and it attempts to use various skills to work towards that goal until it recognizes that it has been completed. An agent will work at this goal for a given number of cycles (or ticks). On each of these cycles, it first looks to see which percepts are satisfied. For example, you might have a percept that recognizes a creature that either has haste or has been under your control since the beginning of your upkeep. On top of these percepts, Icarus can make inferences to come to beliefs about the world. We say that the agent has “concepts” describing what potential “beliefs” there are. An example of this might be (blocker ?creature), which specifies that a creature should be used as a blocker because it is ready, your opponent has a creature, and it has much higher toughness than power. As it is, (blocker ?creature) doesn’t mean a lot, but it might be leveraged later to determine an action.
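To make that a little more concrete, here’s roughly what such a concept could look like written out. This is a sketch in the style of the Icarus manual rather than the literal code from my agent, so treat the field names and the creature attributes as illustrative assumptions on my part:

; A sketch of a (blocker ?creature) concept. The :percepts, :relations,
; and :tests fields follow the style of the manual; the attributes I
; match on (power, toughness, side) are assumptions about how the
; environment describes creatures, not the exact code from my agent.
((blocker ?creature)
 :percepts  ((creature ?creature power ?p toughness ?t)
             (creature ?enemy side opponent))
 :relations ((ready ?creature))         ; another concept: untapped and able to act
 :tests     ((> ?t (+ ?p 2))))          ; toughness much higher than power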

So percepts and beliefs are our grounding in KR. We also have a goal, which is expressed as a belief so that the agent can recognize when it has been completed. The last part here is the set of skills that our agent has. These skills have start conditions and functions that are invoked to make things happen. These functions are the actions; they can actually change the environment (usually by manipulating the variables representing the environment). If an agent sees that its goal has not been satisfied because the goal is not among the percepts or beliefs for this cycle, it will invoke a skill that should cause this goal to be reached. This becomes even more powerful because Icarus has hierarchical skills as well. This means that skills can have subgoals which invoke other skills that must be satisfied first. For example, the (cast ?spell) skill might have subgoals of (mana ?cost) and (target ?creature).
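Here’s a sketch of what a hierarchical skill along those lines might look like. As before, the field names follow the manual’s style, and the Magic-specific percepts and subgoals are things I’m making up for illustration rather than code from my agent:

; A sketch of a hierarchical (cast ?spell) skill. A primitive skill
; would list :actions to execute instead of :subgoals; the field names
; are my best reading of the manual, and everything Magic-specific here
; is an illustrative assumption.
((cast ?spell)
 :percepts ((spell ?spell cost ?cost)
            (creature ?creature))
 :start    ((in-hand ?spell))           ; only consider spells we are holding
 :subgoals ((mana ?cost)                ; first make sure the mana is available
            (target ?creature)))        ; then pick a target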

So that’s a lot of junk. I have no idea if that makes any sense (actually, tell me if it doesn’t, because it means I need to refine this for the class), but I do have some examples of this at work. The code I’ve linked to is Icarus code. It’s definitely not like C because it’s all written in Lisp, a very popular language among AI researchers. I’d like to hope it makes sense on its own, but I might have lived in code for long enough that code looks like plain English to me (that’s a tragic thought). An important thing to note in all of the code is that in Icarus, anything with a question mark in front of it, like ?creature, is a variable, and most other stuff is a constant. And anything following a semicolon is a comment, written in plain English so I can try to explain what’s going on. So you can take a look at my very simple Icarus agent that determines whether a creature should attack, and how much it needs to pump before it’ll attack with it.

Magic Icarus Agent Code

So we can see the output from running this agent here. You can see its goals, percepts, beliefs, and what skills it’s trying to execute on each cycle. In this case, on cycle 1, it sees its goal is attack-set. It gets the skill for that. On cycle 2, it sees that it needs to satisfy one of its subgoals, so it executes attack. Note that it already knows that Craw Wurm is the strongest, so it can skip that subgoal. On cycle 3, it sees that it has satisfied all of the subgoals of attack-set, and on cycle 4, it declares attack-set completed. Fun.
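If it helps, here’s my rough mental model of what happens inside each of those cycles, written as ordinary Lisp rather than Icarus notation. This is just a summary of the manual’s description, not the architecture’s actual internals, and every helper function named here is hypothetical:

; A conceptual sketch of one decision cycle. All of these helpers
; (observe, infer-beliefs, satisfied-p, select-skill, execute-skill)
; are made-up names for illustration, not real Icarus functions.
(defun run-cycle (agent environment)
  (let* ((percepts (observe environment))                      ; what the agent can see
         (beliefs  (infer-beliefs (agent-concepts agent) percepts)))
    (if (satisfied-p (agent-goal agent) beliefs)
        :goal-completed
        ;; otherwise pick a skill whose head matches the goal and whose
        ;; start conditions hold, descending into subgoals as needed
        (execute-skill (select-skill (agent-skills agent)
                                     (agent-goal agent)
                                     beliefs)
                       environment))))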

But maybe that’s not really that impressive, so I ran it again. In this run, you can see that the first thing I do is add a new creature to the enemy side: a Kalonian Behemoth. The big difference here is that the agent first sees that our creature isn’t the strongest. To make this belief true, it has to work through both of its subgoals. Maybe it doesn’t sound so impressive, but I think it’s neat that we have a single agent that behaves differently based on its environment.

So hopefully that gives you a sense of how cognitive architectures work. Of course, I realize that my agent isn’t nearly powerful enough to actually play Magic. I do think, though, that something very impressive could be built out of it. Compared to our minimax AI, I think it’s very clear that this “understands” Magic far better. It recognizes parts of the environment and has a relatively clear line of deliberate actions and skills to help it reach the goal of winning.

But your likely skepticism is warranted. There are a few major drawbacks here. One, there’s a lot of code to do this. As I mentioned last time, it might take hundreds or thousands of predicates (or percepts or concepts) to encompass Magic, and although it might be more true to life, that’s a lot of code that has to be hand-written. And that’s one of the real problems for researchers interested in cognitive architectures: a lot of the knowledge has to be encoded by hand. And if it’s done by hand, you’re hard-pressed to argue that your agent is intelligent. Instead, the intelligence happens to all be that of the person who programmed the agent. Thus, a big part of research in this area is to provide generalizable knowledge that extends across domains. For example, an agent that could play Magic would be good, but if that agent could play Yu-Gi-Oh based entirely on its own reasoning without additional code, that would be tremendous.

Another drawback is that these agents are bad with numbers. It’s hard to imagine computers being bad with numbers, but you can kind of see it with this agent. For example, let’s say you wanted to figure out how much to pay for your firebreathing. You might test it with 3, then 4, then 5. That’s hard for this kind of agent because the architecture doesn’t support numbers as variables in the state. Or maybe you wanted your goal to be to minimize the damage you take this turn. It doesn’t deal with that too well, either. These, however, would be easy for a statistical algorithm. It would just run a for-loop over those values for firebreathing and see which had the highest resulting evaluation.
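Just to show how simple that numeric search is for a conventional program, here’s a sketch in plain Lisp. The evaluate-board-state function is a toy stand-in I made up; a real evaluator would score the whole game state, not just the pump amount:

; A brute-force search over how much to pump firebreathing. The
; evaluation function is purely illustrative: pumping past the point
; where the extra damage stops mattering just wastes mana.
(defun evaluate-board-state (pump)
  (- (min (+ 1 pump) 4)      ; damage that actually matters, starting from 1 power
     (* 0.5 pump)))          ; rough cost of the mana spent

(defun best-firebreathing-pump (max-mana)
  (loop with best-pump = 0
        with best-score = most-negative-fixnum
        for pump from 0 to max-mana
        for score = (evaluate-board-state pump)
        when (> score best-score)
          do (setf best-score score
                   best-pump pump)
        finally (return best-pump)))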

So that’s a look at using a cognitive architecture to play Magic. It’s likely more difficult than just programming the game straight up, but for cognitive architectures, Magic is just one of many domains they might try to tackle. Large, statistical algorithms are definitely more popular now, but maybe we just need a good presentation of a generally intelligent agent to prove that symbolic AI has some promise.

And by the way, big Zendikar spoilers coming out as I write:

http://mtgcast.com/?p=2365
