Archive for the ‘Game Theory’ Category

Standard Metagame and Iterative Deletion

Saturday, August 8th, 2009

I guess we just had nationals a couple weeks ago, and the results were very interesting. Of course, all the big name decks made appearances, and since I don't play professionally, I'm actually very much out of the loop on what's going on there. What I thought was interesting, though, was the metagame shift from one week to the next. While a bunch of fae and elf combo decks did well in the first round of nationals, fae got severely hated out of the US nats finals (32 copies of Great Sable Stag, 31 copies of Volcanic Fallout), while decks like jund seemed to do a little better. Great to know, but perhaps not so helpful right now.

I think this is actually my first shot at seeing whether some of what I know will be useful. In the game theory class from Yale that I've been following, we've talked a lot about iterative deletion. The idea basically goes like this: we have some game, and some of its strategies are dominated by others, so we can assume that no one will play them. In the smaller game that's left, more strategies become dominated, so we can eliminate those too; note that those ones only become dominated once we assume everyone else is rational and has already thrown out their dominated strategies. From there we can eliminate even more, and maybe watch the whole thing converge to something. This is seen very clearly in the "guess 2/3 of the average" game.
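Just to make that last example concrete, here's a tiny sketch of how the deletion plays out in the 2/3 game. All it tracks is the shrinking upper bound on guesses that could still be right; the cap of 100 is just the usual version of the game, and the code is only meant as an illustration.

```python
# A quick sketch of iterated deletion in the "guess 2/3 of the average" game.
# Each round: if everyone's surviving guesses are at most `upper`, the average
# is at most `upper`, so any guess above 2/3 * upper can never be right and
# gets deleted. Repeat, and the surviving range collapses toward 0.

def iterated_deletion(max_guess=100, rounds=10):
    upper = max_guess
    for r in range(1, rounds + 1):
        upper = (2 / 3) * upper
        print(f"after round {r}: surviving guesses are 0 to {upper:.2f}")

iterated_deletion()
```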

I skate over the subject because 1) I'll cover it more rigorously if I think this can go somewhere, 2) I'm feeling lazy right now, and 3) a very excellent article has already been written on a similar subject. John Beety wrote a great piece on how information about how other agents act will affect your own decision. So if you know that most of the field will play deck X, you should play deck Y because it has a good matchup against deck X. A bunch of people might know that, though, so you'll see the popularity of deck X be followed by a surge in deck Y. That's the metagame shift.

That’s kind of where we are right now. I mentioned we had a shift just recently, and I was thinking that I might be able to actually apply it to the current metagame. If someone’s done this rigorously already, please let me know so I can save myself the effort. If not, though, I’d be willing to give it a shot. There are a lot of numbers to crunch, but I think that’s where my programming background will pay off.

This is where I very much need your help. As I mentioned, while I know what the decks do, I actually know very little about matchups. To be honest, I've never actually seen most of these decks played (so on a side note, if anyone knows anywhere I can watch magic games, either videos or spectating online matches or something, please let me know; right now, I just have a couple youtube subscriptions for that), so I really don't know which deck beats which with what regularity. So that you can help me, I created an openly editable google spreadsheet that just tracks matchups. The first page is the observed probability that the deck in the left column will beat the deck in the first row, given the numbers from this article. Let me know if these numbers are somehow skewed by low sample sizes or if more data is out there. The second sheet is completely empty; it would be really helpful if you could just write a sentence or two about the qualitative perception of which deck should beat which, and why. I think I have the knowledge to do the analysis of the metagame; I just don't have the data.
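To be concrete about the number crunching I have in mind once the sheet is filled in: weight each matchup by how big a share of the field you expect that deck to be, and see which deck ends up with the best expected win rate against that field. Here's a rough sketch; the matchup numbers and field shares below are completely made up, just to show the shape of the calculation.

```python
# Sketch of the planned metagame crunching: given a matchup matrix
# (probability the row deck beats the column deck) and a guess at each deck's
# share of the field, compute each deck's expected win rate against the field.
# All numbers here are placeholders, not real data.

matchups = {
    "Faeries":   {"Faeries": 0.50, "Jund": 0.55, "Elf Combo": 0.60},
    "Jund":      {"Faeries": 0.45, "Jund": 0.50, "Elf Combo": 0.50},
    "Elf Combo": {"Faeries": 0.40, "Jund": 0.50, "Elf Combo": 0.50},
}

field = {"Faeries": 0.40, "Jund": 0.35, "Elf Combo": 0.25}  # guessed metagame shares

def expected_win_rate(deck):
    return sum(field[opp] * matchups[deck][opp] for opp in field)

for deck in matchups:
    print(deck, round(expected_win_rate(deck), 3))

# The deck with the highest number is the best response to that field; as
# people flock to it, the field shifts, and that's the metagame cycle.
```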

Hopefully, assuming it gets filled in well, I'll be able to come back in the next couple of days to a week with my findings. There's a good chance that the analysis will just reinforce what everyone already knows, but maybe there's a surprise out there.

You Make the Play: bolt edition take 2

Sunday, August 2nd, 2009

Hopefully everyone's gotten a chance to look at the setup for the problem here. I got 2 great comments, so it looks like I have something to work from.

So there are a couple different options here. Being Magic, there are a lot more options than what I'm presenting, but we'll narrow the focus down to what actually seems to matter. Your main choices are:

  1. Swing with Knight of Meadowgrain, leave mana open for harm’s way
  2. Play Honor of the Pure, swing with Knight of Meadowgrain
  3. Don’t swing, maybe do something post-combat

There are a few more combinations that throw the Figure of Destiny in somewhere (I should probably have switched the figure out for a 2-mana creature so that playing it plus holding Harm's Way isn't an option, but no worries), but as far as the outcomes I listed go, that really shouldn't change anything.

On this turn, only 1 red mana open and an unknown card means your opponent is representing Lightning Bolt. We can throw this together into a table to see what we get (too lazy to do the HTML on it, so here it is in plain text):

Bolt payoffs:

                                  Opponent has Bolt    No Bolt
  Swing, hold Harm's Way                  3               3
  Honor of the Pure, then swing           2               4
  Don't swing                             2               2

The numbers inside are our payoffs, normalized so I don't have to use decimals. See the original post for how those numbers correspond to the payoffs, and see Salivanth's comment for more explanation on all of them (the only one I disagree with is the swing, no-bolt situation: he might play maelstrom pulse, not Jund Charm, so that'd be a 50%er as well).

What do we have here? Well, we learned about dominant strategies in the last "You Make the Play", so we can try to apply that first. Remember, a strategy dominates another if it results in a better outcome regardless of what your opponent does (here's a little more: a strictly dominant strategy always does better; a weakly dominant strategy does at least as well in all situations). Looking at the table, we can see that not swinging is a dominated strategy. Swinging only results in payoffs of 3 regardless of the bolt, and doing nothing gets 2s regardless of the bolt, so you're always better off swinging. If you want to take a look at Salivanth's comment again, he shows that HotP is strictly dominated by no HotP, which I agree with except for his take on what happens in scenario 2.

Now, if dominance were the whole story, this would be where it ends: neither of the remaining options dominates the other. If your opponent does have a bolt, we shouldn't play HotP, but if they don't, we should. In the end, this should be pretty obvious, but it's a lead-in to how we determine what to do.

Reyemile made the excellent point that whatever we do is "pure guesswork" for the reasons listed. Simply, I haven't told you anything else about how the game has been going so far. How many bolts has your opponent played? How many cards are left in their library? Were there any other times where they clearly should've played it if they had it? The most important point here is that the best action is determined by what you think your opponent has in their hand.

Again, not surprising, but remember what happened in the Prisoner's Dilemma. There, it mattered what your opponent did in figuring out your payoff, but it didn't affect what you should do; you should accuse the other person (or in our Magic case, sit back and not attack). Here, there's no dominant choice between swinging and HotP, so what you're trying to do is figure out your best response (that's official terminology) to what you think your opponent is holding.

And this should be intuitive as well: what you do depends on how confident you are about what your opponent is going to do. Basically, we can multiply the probability that your opponent has a bolt by the payoffs for each action you could choose. That gives us equations for the expected payoff of each action at each level of confidence. Here's a graph of what that looks like. The x axis is your judged probability that they have a bolt, and the y axis is payoff. The blue line is swinging only, the orange line is HotP, and the teal line is doing nothing.

Graph of Best Response

(Major props to Apple, by the way, for having the Grapher application. That was really easy)

So what you see there isn't surprising. Doing nothing never yields the best reward, and HotP is only better when they don't have a bolt. In more Game Theory terms, doing nothing is strictly dominated by swinging and weakly dominated by HotP. Swinging only is better when you think there's a greater than 50% chance that your opponent has a bolt. This, of course, shows up on the graph: you can look for the intersection between the two curves and determine the ranges in which one is greater than the other. The result is a little trivial in this case because it happens at exactly 50%, but when I get this into a real slide, I'll pick more creative payoffs so that the intersection is at 57% or something.
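If you'd rather check the numbers than eyeball the graph, here's the whole computation as a quick sketch; it's just the expected values from the table above, with p standing for your judged probability of the bolt.

```python
# Expected payoffs from the table, as functions of p = P(opponent has Bolt).

def swing(p):        # 3 whether or not they have the bolt
    return 3 * p + 3 * (1 - p)

def hotp(p):         # 2 if they have it, 4 if they don't
    return 2 * p + 4 * (1 - p)

def do_nothing(p):   # 2 either way
    return 2 * p + 2 * (1 - p)

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, swing(p), hotp(p), do_nothing(p))

# hotp(p) = 4 - 2p crosses swing(p) = 3 exactly at p = 0.5, which is the
# intersection on the graph: HotP is the best response below 50% confidence,
# and swinging (with Harm's Way up) is best above it.
```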

So that’s about it with best response. As foreshadowing, best response is the basis for one of the more famous results in Game Theory named after a dude with a movie now, so maybe we’ll hit that if I can come up with a half decent example. And I’ll put responses to the comments from last time more fully in another comment there. Of course, feel free to comment again on this one. I feel like I’m almost hitting the point of BSing here to try to work this in, so if you’re disgusted by this, please don’t just run away. Let me know; criticism is great feedback.

You Make the Play: bolt edition

Saturday, August 1st, 2009

Alright, I’m going to try to set up another situation for you to judge and lead into another topic. Being a work in progress, let me know if the problem is poorly presented or doesn’t really apply because I’m going to need to use these eventually.

You’re playing something like Florian Michelac’s build of white weenies, and your opponent is playing something like Pierre Simoni’s Jund Cascade deck (both can be found here). Scenario:

Your opponent: 2 cards in hand (1 is definitely Jund Charm, unsure of the other), 1 red mana open (and all colors available to play jund charm next turn), no creatures in play

You: Knight of Meadowgrain in play, 2 white mana open, and Honor of the Pure, Harm’s Way, and Figure of Destiny in hand. Here’s what you know:

  • If you swing with Knight of Meadowgrain alone and it gets through, you have a 1/2 chance of winning
  • If you swing with Knight pumped by Honor of the Pure and it gets through, you have a 2/3 chance of winning
  • If you can’t hit your opponent this turn, you have a 1/3 chance of winning

It’s your precombat main phase. What do you do, and more importantly, why?

You Make the Play: Star Edition Take 2

Monday, July 20th, 2009

So I got one great response to the problem. Unfortunately, I guess I should've clarified one part of it. When I make these posts about game theory, I'm working under the assumption that all else is equal, which is better known as ceteris paribus if you want to sound smart. In this case, you should assume that anything not specified in the problem description stays constant: Ixidor and that Goblin Chieftain wouldn't be doing anything this turn, and the game state would only change by the life totals. Even so, Reynolds brought up some great points that I want to build off of.

Reynolds mentioned that you can't know for certain what Akroma will do after you. What we can do is figure out our best move for every move that Akroma could make. Let's assume Akroma attacks. Should we attack or not? If we don't attack, we'll be at 20 life. If we do attack, we'll be at 15 life. So if Akroma does attack, our better move is not to attack. Let's assume, then, that Akroma doesn't attack. If we don't attack, we'll be at 10 life. If we do attack, we'll be at 5 life. So if Akroma doesn't attack, we're also better off not attacking.

This seems like a reasonable way to think about this problem, at least in a very limited sense. There are some things clearly wrong with this reasoning in a Magic sense, beyond ceteris paribus. As Reynolds pointed out, there's more to the game than just this one turn. We have to consider the game in a larger scope, so we're actually repeating this situation over and over each time it's our turn. Repeated games are a little trickier, so I'll punt on that. Another point he brought up is that life isn't the only factor here; even if we assume that all else remains constant, your choice clearly influences the opinions around the table, so it's a little complicated. The biggest violation of the canonical setup, though, is that Akroma will know what you chose before she makes her own choice. Fortunately, for the idea I'm trying to get across right here, that really doesn't matter.

What this really is, is a Magic setup for the classic problem, the prisoner's dilemma. Here's the description from Wikipedia:

Two suspects are arrested by the police. The police have insufficient evidence for a conviction, and, having separated both prisoners, visit each of them to offer the same deal. If one testifies (defects from the other) for the prosecution against the other and the other remains silent (cooperates with the other), the betrayer goes free and the silent accomplice receives the full 10-year sentence. If both remain silent, both prisoners are sentenced to only six months in jail for a minor charge. If each betrays the other, each receives a five-year sentence. Each prisoner must choose to betray the other or to remain silent. Each one is assured that the other would not know about the betrayal before the end of the investigation. How should the prisoners act?

Isn’t that grim? This is pretty famous. If you’ve played Baldur’s Gate 2, a genie in the beginning poses it to you. I remember in Dilbert that they split up a bunch of characters into rooms to try to get them to rat each other out for murder. Dilbert very heroically and indignantly points out that since every one of them is on the same side, everyone will shut up and that it won’t work, only to have the blinds pulled up and see that everyone else blames the murder on him.

Dilbert, in this case, is the sucker. It seems like a very attractive option not to betray anyone else; if everyone does that, everyone gets a lighter sentence. The bad news is that even if you convince everyone else to shut up, it's still better for you to rat them out. Tattling (or not attacking Phage) is an example of a dominant strategy (real Game Theory term). A strategy dominates another strategy if it results in a better outcome for you regardless of how your opponents act. In the case of this Magic game, not attacking dominates attacking because whether Akroma attacks or not, your life total is higher when you don't attack than when you do.
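If you want the dominance check spelled out with the life totals from the setup, here's a small sketch; it just uses your life total after the turn cycle as a stand-in for your payoff.

```python
# Your life total after the turn cycle, standing in for your payoff.
# Keys are (your choice, Akroma's choice); values come from the setup post.

your_life = {
    ("attack", "attack"): 15,
    ("attack", "no attack"): 5,
    ("no attack", "attack"): 20,
    ("no attack", "no attack"): 10,
}

akroma_options = ("attack", "no attack")

def dominates(a, b):
    """True if choice `a` leaves you better off than `b` no matter what Akroma does."""
    return all(your_life[(a, akr)] > your_life[(b, akr)] for akr in akroma_options)

print(dominates("no attack", "attack"))  # True: not attacking strictly dominates
print(dominates("attack", "no attack"))  # False
```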

Obviously, there’s a lot more going on in this situation. Potential take 3 here if you guys come up with more observations about this situation.

You Make the Play: Star Edition

Saturday, July 18th, 2009

It seems pretty common among magic blogs to offer up situations and get responses on the proper way to play it. So I’m going to do the same thing.

Let’s say you’re playing a magic game in the star format. In this one, there are 5 players, and each player has to try to defeat the 2 people sitting across from them. The first person to do that wins. One of your opponents is Phage, and their opponents are you and Akroma. The board is pretty developed, and choices here matter a lot. Particularly, you and Akroma are both in good positions to attack someone. Your situations are actually pretty similar.

If you both attack Phage, you can probably beat Phage down pretty badly so that she can’t counter attack as strongly. When it comes around to both of your next turns, you’ll both be at 15 life after taking a little hurt from Phage. If you attack Phage and Akroma doesn’t, then Phage will get beat a little, but she’ll be able to counter-attack, and since you attacked her, she’ll hit you back. So you’ll be at 5 life, and Akroma will be at 20 life. Conversely, if you don’t attack and Akroma does, Akroma will be at 5, and you’ll be at 20. If neither of you attack, Phage will just attack both of you, and you’ll both be at 10 life.

It’s your turn, and Akroma’s after yours. Do you attack or not?

(Edit: I guess an even better question is, in addition to how you attack, what else would you do, and what factors caused you to choose to do what you did?)

Alpha-Beta Pruning

Sunday, July 12th, 2009

Two weeks ago, I wrote about search in a search tree. The idea was that a game (like Magic) can be broken up into a series of positions, and a good way to determine the best strategy is to evaluate all of these positions and find a path that gets you to a winning state, even when your opponent makes their best move as well. I left a few holes in that article, including how one turns the evaluation function into rules about what to do and what other shortcuts one can find in searching over this very large tree. So let’s finish that.

I briefly mentioned minimax at the end of the last article, and that's where I'm going to start. Minimax is a strategy for evaluating these trees that should be fairly intuitive. How do players usually make decisions? Based on which outcomes are most likely to let them win. When you can look at the entire depth of the tree from your current position (as in, you can see every possible way that you win or lose), you're trying to reach one of those winning states. To get there, you need to make the right choice at the second-to-last position. But you should also assume that your opponent, one position before that, will make the choice that does the most to keep you from winning. What this means is that you can guess how a game will go, and how good a position is, by propagating these alternating best-for-you and worst-for-you outcomes up the tree.

Let's make that more concrete. You are at 2 life and have 2 Ashenmoor Gougers and a mogg fanatic with summoning sickness in play before your combat step, with no cards in hand. Your opponent is tapped out with one foul imp untapped and one foul imp tapped, has no cards in hand, and is at 5 life*. What do you do?

It should be fairly obvious that you swing with both gougers and then pop the fanatic for the kill. So let’s see how you came to that.

Your current position is listed above and you're in your declare attackers step. Your choices are to attack with 0, 1, or 2 gougers, each of which represents a different branch from your current position. If you attack with 0 gougers, nothing happens. The next major choice (I cheat a lot since magic has tons of intermediary positions for phase changes and such) is your opponent's attack step. We assume your opponent is rational and makes his best move, which is to kill you. Technically, he could've done nothing during his attack step, but we won't count on that. So you lose. We can propagate losing up a level to say that not attacking results in you losing.

The same applies for attacking with 1 gouger. You can do that, so the next position to evaluate is how your opponent blocks. He can either block with his untapped imp, or not. Since he's rational, he looks ahead one move to where you pop your fanatic and determines that not blocking means he loses and blocking means he wins. So he should choose to block, and this position can be generalized as a loss for you.

But if you attack with both, all of his moves result in him losing. Thus, that can be propagated up to say that your current position is a winning one, because the best outcome out of all the possible next positions is a win.

So that's minimax in its complete form. As I mentioned in the previous post, though, it's often infeasible to enumerate the entire tree to the end of the game. Instead, one might look maybe 5 moves ahead and consider all of those possibilities. Since these don't have definite win/loss states, we can instead use the evaluation function to give them numerical values. We can use the same propagation method to assign values to the positions 4 moves ahead by considering the values of the 5-ahead states they're linked to. We can continue to work backwards until we've figured out the values for the next move, then pick the best one.
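For anyone who wants to see that propagation written out, here's a bare-bones sketch on a hand-built toy tree instead of real Magic positions. The tree just mirrors the Gouger example above: leaves are +1 for a win and -1 for a loss from your point of view.

```python
# Bare-bones minimax on a toy tree. Internal nodes are lists of child
# positions; leaves are numeric evaluations from your point of view.

def minimax(node, my_turn):
    if isinstance(node, (int, float)):     # leaf: already evaluated
        return node
    child_values = [minimax(child, not my_turn) for child in node]
    # On my turn, take the best child for me; on the opponent's, assume the worst.
    return max(child_values) if my_turn else min(child_values)

# I pick one of three attacks, then my opponent responds:
tree = [
    [-1, -1],   # attack with 0 Gougers: I lose whatever they do
    [-1, 1],    # attack with 1: they block (I lose) or don't (I win), so they block
    [1, 1],     # attack with both: I win whatever they do
]
print(minimax(tree, my_turn=True))   # 1: the current position is a win
```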

Even so, this is slow. So here's where alpha-beta pruning comes in. Alpha-beta pruning adds a shortcut to minimax so that we can avoid looking at entire subtrees. The basic idea is that we don't need to consider a subtree (all of the positions following a given position) if we already know it can't beat an option we've evaluated. From any position, you're basically looking a couple of positions ahead to see whether you end up in a better spot after your opponent makes their choice, and since we keep track of the maximums and minimums along the way, certain subtrees can be skipped because they're already guaranteed to be no better. That was ambiguous; let's take a trivial example.

Let's say your opponent has a prodigal pyromancer in play, but with summoning sickness. It's your turn, you have 1 life, and you know your opponent has a goblin piker on top of their library (you Excommunicated it), and they have no cards in hand. You have a lightning bolt and a goblin piker in hand with only enough mana for one of the spells, and your opponent is at 6 life. From this position, you have 4 choices: bolt the pyromancer, bolt your opponent, play your own piker, or do nothing. So we can consider each of these as a different path down the search tree.

The first is to bolt the pyromancer. If you do this, your opponent's board is empty. Next, it's your opponent's move. Their only options are to either play or not play their piker. If they play it, your evaluation function puts the position at -10, while if they don't play it, you're at -5. Since your opponent is trying to minimize your evaluation function, they will play it, so the action of bolting the pyromancer has a net value of -10.

The next is to bolt your opponent. If you do this, your opponent is now at 3. On their turn, they can poke you with the pyromancer or not, and play the piker or not. If they poke you, you die, resulting in an evaluation of -1000. Here's where alpha-beta pruning comes in. At this point, you know that the action of bolting the pyromancer has a worst case of -10. Bolting your opponent has a worst case of -1000 or lower. There's a chance that your opponent's other responses are even worse (probably not), but that doesn't matter: one of your opponent's responses is already worse than the worst case of another of your actions, so you don't need to bother considering the cases where they play the piker and so forth. Skipping the rest of this sub-tree is the pruning.

This example is a little trivial, but it's hopefully very clear. Alpha-beta pruning can be used at multiple levels. For example, the position we discussed above might be 3 levels deep in another tree rooted at your actual position, and the result of that will propagate into the larger situation.
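And here's the same toy-tree sketch with the pruning bolted on. The -10 and -1000 leaves echo the example above; the rest of the numbers are made up.

```python
# Minimax with alpha-beta pruning on a toy tree. `alpha` is the best value I
# can already guarantee, `beta` the best (lowest) my opponent can guarantee;
# once they cross, the remaining children of a node are skipped.

def alphabeta(node, my_turn, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):       # leaf: already evaluated
        return node
    if my_turn:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:                # opponent will never let us reach this
                break                        # prune the remaining children
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:                # we already have a better option elsewhere
                break                        # prune the remaining children
        return value

# Bolting the pyromancer bottoms out at -10, so as soon as the "bolt your
# opponent" branch shows a -1000 reply, its other replies are never examined.
tree = [
    [-10, -5],          # bolt the pyromancer: worst case -10
    [-1000, -8, -3],    # bolt your opponent: the -1000 reply prunes the rest
]
print(alphabeta(tree, my_turn=True))   # -10
```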

Hopefully, this all seems intuitive. Indeed, we as humans actually prune a lot more aggressively than this; in a real game, it would never even have crossed my mind not to bolt the pyromancer, but those are other heuristics. Maybe that'll come up in a future post. In the meantime, you can google around for alpha-beta pruning and get more rigorous explanations of how it works and why it's actually called alpha-beta pruning.

*You’ll note I cheated majorly and made sure that no one has any tricks because minimax, in its most naive form, assumes perfect information. Maybe I’ll deal with this later

Search in a Search Tree and “Decision-Point Theory”

Wednesday, June 24th, 2009

Just this morning, I figured I should pick up the starcitygames feed to make sure that I was seeing the more important articles. That was a great decision, because I already have an article to write about.

In my last post, I was talking about how a Magic game can be formalized into a series of positions, each with its own set of values representing the entirety of the game. We can take all these positions and put them into a search tree, where each position is linked to others depending on what actions lead from one to another. With this, we can now talk a little more formally about how searching through that tree works (if you're trying to follow along in the original dailymtg article, we're still in the first section, "Basic Concept").

So the search algorithm that a computer would use on a search tree like this is to find the best path of transitions between positions that leads from the current position to a winning position (there's some ambiguity in defining "best," which I'll address later). All that really means is that the computer's algorithm for winning is to determine each step it needs to make that gives it the best chance of winning from where it is now. That doesn't sound nearly as nerdy.

Because that's exactly what we as humans do, and Gavin Verhey just posted 2 articles* on starcitygames discussing that exact topic. Basically, he says that magic is a series of "decision points," and at each one, you have new choices and decisions to make, and you can make the right decision by looking ahead down each road and seeing what options are there. This shouldn't sound surprising; it's both intuitive and very similar to what Patrick Buckland had to say in the dailymtg article.

I guess the real trick here is determining which choice is the best and doing that relatively quickly. There are obviously tradeoffs between these two goals, but I’ll see if I can walk through the intuition for how this is done step by step.

So let’s temporarily assume that we have infinite time. You’re, say, trying to figure out whether you should chump block a shambling remains with a mogg fanatic now, or wait and hope that you don’t die before you can use the mogg fanatic more effectively when you can deal 1 more point of damage to the shambling remains. Since we have infinite time, we can just see exactly where both of these decisions would lead. Even though the trees below both are still huge, they’re finite, so it can be done in infinite time (maybe think 5 monkeys with typewriters and infinite time writing the entire works of shakespeare?). This way, you can determine which is better by, say, taking the ratio of wins versus losses in each sub-tree and picking the more favorable one. So if blocking results in 1 million wins to 500,000 losses and not blocking is 2 million wins to 1.9 million losses, let the fanatic die now.

So your first instinct here is probably that just taking the ratio isn't a good way to determine which path is better: certain paths are more likely to happen than others, and maybe not all outcomes are equally important. Which is true. In that case, we can weight each outcome by the probability that it would occur. But in the end, it seems like our best method is to somehow quantify (probably numerically, since this is going to be a computer algorithm) the value of various paths and pick the option that has the better summed-up value in its sub-tree.
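In miniature, that weighting is just a sum of probability times value over the outcomes in each sub-tree. The probabilities and outcomes below are made up; the shape of the sum is the whole point.

```python
# Weight each outcome in a sub-tree by how likely it is, instead of just
# counting wins and losses. (probability, value) pairs, with 1 = win, -1 = loss.

block = [(0.6, 1), (0.3, -1), (0.1, 1)]
no_block = [(0.5, 1), (0.5, -1)]

def weighted_value(outcomes):
    return sum(prob * value for prob, value in outcomes)

print(round(weighted_value(block), 2), round(weighted_value(no_block), 2))
# 0.4 vs 0.0: block now
```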

Unfortunately, we don't have infinite time. Because our finite time is probably significantly less than the size of the tree, we can only look a few decisions and positions ahead (perhaps even as few as 1). In this case, we take shortcuts, or heuristics. One of these is to evaluate how good any position is and compare those evaluations. We can do that with an evaluation function, which is just a fancy term for taking a bunch of important factors, weighting them, and adding them up into an end value. For example, our 2 choices may result in something like:

2 * 1 (for having the mogg fanatic) + .1 * 16 (being at 16 life) + .1 * 1 (for having just dealt 1 dmg to your opponent with the fanatic’s ability) + …other stuff = 5

2 * 0 (no mogg fanatic) + .1 * 20 (20 life) + .1 * 0 (didn’t pop the fanatic) + …other stuff = something less than 5

(Ignore the 2 most obvious faults that popping the fanatic is another decision and that there isn’t a linear value to life)
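As actual code, the evaluation function is nothing more than that weighted sum. Here's a sketch with the same made-up weights and features as above, leaving out the "…other stuff" terms.

```python
# A toy evaluation function: a weighted sum of a few features of a position.
# The features and weights are placeholders; a real one would have far more
# terms and much better tuning.

WEIGHTS = {"my_creatures": 2.0, "my_life": 0.1, "fanatic_damage_dealt": 0.1}

def evaluate(position):
    return sum(WEIGHTS[f] * position.get(f, 0) for f in WEIGHTS)

option_a = {"my_creatures": 1, "my_life": 16, "fanatic_damage_dealt": 1}
option_b = {"my_creatures": 0, "my_life": 20, "fanatic_damage_dealt": 0}

print(round(evaluate(option_a), 1))  # 3.7, the "= 5" line without the other stuff
print(round(evaluate(option_b), 1))  # 2.0, the "something less than 5" line
```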

So it seems like, in the position immediately resulting from blocking or not blocking, it's more valuable to block. This, however, is obviously short-sighted. Maybe your opponent can immediately do something else (like, say, play a figure of destiny that you now no longer have a mogg fanatic to kill before it pumps). So what you really want to do is look as far into the future as you can and evaluate that position, assuming that your opponent made intelligent decisions all along the way. What actually ends up happening is that you alternately look through states where your evaluation function is maximized (when you made the best choice) and states where it is minimized (when your opponent made a choice and tried to screw you). This is a rule in game theory called "minimax," where you try to maximize your own evaluation function in a game against an adversary who is trying to minimize it.

These are getting really long. Anyways, I’ll continue to talk more about this in another post. In that, hopefully I’ll finally get to talking about alpha-beta pruning and more exciting, AI-y sorts of things.

* I should make a couple comments about these articles. 1) I honestly thought they were kind of dumb, but clearly, I appreciated them enough to link to them. In some sense, he’s doing exactly what I’m doing, and taking a more complex idea and explaining it simply. So maybe I should like them more? 2) Decision-point theory, I’m pretty sure, is a term he came up with. I’ve never heard of it before, and it doesn’t really mean anything. It can either be part of game theory or ai, but it definitely isn’t legit on its own. 3) The best point I thought he made was that pros often play complex decks that require a lot of decisions, and good playskill often shows that they choose options that give them more options down the road. This actually deals with another part of search algorithms called “constraint-satisfaction problems”. In that, “forward checking” is something of a way of doing lookahead, and “least-constraining variable” is a heuristic for making choices. But that’s all another post, if it comes up again.