In my previous blog entry, the normal form game and Nash equilibrium were introduced. Attention was restricted to pure strategies. As will be illustrated below, not every finite, normal form game has a pure strategies Nash equilibrium. The notion of mixed strategies extends the notion of pure strategies, allowing players to assign probabilities to each pure strategy. This extension provides for the existence of a mixed strategies Nash equilibrium in every finite, normal form game. Additionally, zero sum games and prudent strategies will be discussed.
In this section, we introduce the notion of mixed-strategies. One important motivator for mixed-strategies is that not every game has a pure strategies Nash equilibrium. Consider the matching pennies game:
In any pure strategy profile, one player incurs utility and the other player incurs utility . The player incurring utility can unilaterally deviate by switching its choice to improve its utility. This inverts the payoffs- the first player incurs utility while the second player incurs utility . Iterating on the above argument, we see that no Nash equilibrium exists in pure strategies.
Mixed Strategies: Let be a normal form game. Let . A mixed strategy is a sequence and a probability distribution where player selects strategy with probability . Note that . The set of mixed strategies for player is denoted , where is the simplex in . That is, .
Note that pure strategies are a special case of mixed strategies. The mixed extension will now be defined, to formalize the notion of games with mixed strategies. A mixed strategies Nash equilibrium in a normal form game is equivalent to a pure strategies Nash equilibrium in a mixed extension.
Mixed Extension: Let be a normal form game. The mixed extension of is the three-tuple , where .
The notion of mixed strategies is rather unintuitive from a behavioral perspective, as a normal form game is played simultaneously. So how is a mixed strategies Nash equilibrium formulated? Recall that each player is a rational, utility maximizing agent that is aware of the structure of the game. Each player still seeks to mix its strategies in such a way to maximize its utility. In mixing strategies, a player runs the risk that another player can take advantage of a given mixing. Thus, in a Nash equilibrium, each player mixes strategies such that is indifferent to whichever pure strategy ends up being played. That is, ‘s expected utility for each of ‘s pure strategies in the mixing is the same. This is formalized as follows.
Theorem 2.1: Let be a normal form game. A mixed strategy profile is a mixed strategy Nash equilibrium if and only if, for each player , the following two conditions are satisfied:
- Every pure strategy which is given positive probability by yields the same expected payoff against ; that is, .
- Every pure strategy which is given probability zero by yields no more than the pure strategies that are assigned positive probability: .
Proof: Suppose first that the mixed-strategy profile satisfies conditions (1) and (2). Let . If unilaterally deviates by shifting positive probability to a strategy given zero probability in , then ‘s utility does not increase by condition (2). Let be the set of strategies given positive probability by . By condition (1), each pure strategy in results in the same expected utility . Thus, any mixing of strategies in results in expected utility:
Thus, player cannot unilaterally deviate and improve its outcome, so is a mixed strategies Nash equilibrium.
Conversely, suppose is a mixed-strategies Nash equilibrium. As no player can unilaterally deviate and improve its outcome, condition (2) follows immediately. Suppose to the contrary that condition (1) does not hold. Let and such that the . If , then player could assign more weight to in and improve its outcome. Similarly, if , then player could assign less weight to in and improve its outcome. Either occurrence contradicts the assumption that is a mixed-strategies Nash equilibrium. QED.
Example 1: Let’s now use Theorem 2.1 to find a mixed-strategies Nash equilibrium for the Matching Pennies game. Player mixes strategies such that Player is indifferent to and . Suppose Player plays with probability and with probability . Player ‘s payoff from playing is . Player ‘s payoff from playing is . In equilibrium, Player is indifferent between playing and . Setting . By symmetry, we have Player mixing between and with frequencies as well.
In addition to guaranteeing the existence of a Nash equilibrium, mixed strategies are also useful in selecting realistic Nash equilibria. Consider the following example.
Example 2: Consider a traffic routing game on the following network. The weight of each edge denotes the latency cost of traversing that edge. The variable denotes the number of players traversing the edge , and the variable denotes the number of players using the edge . So for example, if , then the latency cost of is for every each of the players. Each player starts at and ends at , seeking to minimize latency.
Suppose there are players in the game. Denote as the number of players choosing the path , as the number of players choosing the path , and as the number of players choosing . Consider first the pure strategies Nash equilibrium of . Both the edges and have players traversing them, and so have latency costs . Players of each type incur latency cost . If a player of type unilaterally deviates, he increases the latency cost of the edge to , resulting in a total latency cost of . By similar argument, players of type and cannot unilaterally deviate and decrease their costs as well.
While and is a pure strategies Nash equilibrium, it is unlikely the players will end up playing this strategy profile. However, this pure strategies equilibrium does provide the probabilities for a mixed strategies equilibrium. As the game is symmetric, there exists a Nash equilibrium in which each player selects the same strategy. Suppose each player selects the mixed strategy with probabilities . We apply Theorem 2.1 to verify this mixed strategy profile, denoted , is a mixed-strategies Nash equilibrium.
First, observe that . Consider each of the pure strategies .
- Suppose player plays the pure strategy . Under , , , and . So .
- Suppose player plays the pure strategy . Under , , and . So .
- Suppose player plays the pure strategy . Under , and . So .
Thus, is a mixed-strategies Nash equilibrium.
III. Zero-Sum Games
In this section, we examine zero-sum games. Intuitively, in a zero sum game, each player’s gain (or loss) is exactly balanced with those of the other players. For example, cutting a larger slice of cake for one person leaves less cake for the others. This notion is formalized as follows:
Zero-Sum Games: Let be a normal form game. is said to be a Zero-Sum Game if for every strategy profile .
Recall the Matching Pennies game from the previous section. In any strategy profile, one player earns utility while the other player earns utility . Thus, the Matching Pennies game is a zero-sum game. Another example of a zero-sum game is Rock-Paper-Scissors. The winner of the game earns utility while the loser earns utility . In the event of a tie, each player earns utility .
So how are zero-sum games solved? We can solve zero-sum games in the same manner as any other normal form game. Of particular interest, however, areprudent strategies. Intuitively, players who play prudently seek to minimize potential losses. We define prudent strategies as follows:
Prudent Strategies: Let be a finite, normal form game. A prudent strategy of player is a mixed strategy that satisfies .
Note that the definition of prudent strategies did not restrict to zero-sum games. We discuss them in the context of zero-sum games; however, as they are most useful here. In the case of a two-player game, saddle points are equivalent to prudent strategies. We define a saddle point as follows.
Saddle Point: Let be a two-player zero-sum game. A saddle point is a strategy profile satisfying for all , all , and each .
In order to test for saddle points in finite games, we convert the payoff matrix into a mathematical matrix from linear algebra. As we are considering zero sum games, define the matrix to be a real-valued matrix where , . That is, represents the amount Player 1 wins and Player 2 loses when the pure strategy profile is played. A saddle point is a minimum of row and a maximum of column . Let’s consider an example of finding a saddle point.
Example 3: Consider the two-player, zero-sum game given by the following payoff matrix.
We begin by converting the payoff matrix to an algebraic matrix:
The row minima are ; and the column maxima are . Observe that row two and column two have the same value: . So is a saddle point with payoff .
Example 4: Consider the two-player, zero-sum game given by the following payoff matrix.
We begin by converting the payoff matrix to an algebraic matrix:
The row minima are for row one, and for column one. The column maxima are for column one, and for column two. So there are no pure strategy saddle points for this game.
We solve this game by determining the mixed strategies Nash equilibria. Suppose Player 2 plays with probability and with probability . If Player 1 plays , his expected payoff is . If Player 1 plays , his expected payoff is . Setting . So Player 2 plays with probability and with probability in equilibrium.
We now solve for Player 1’s mixed equilibrium strategies. Suppose Player 1 plays with probability and with probability . Then Player 2’s expected payoff from playing is . Player 2’s expected payoff from playing is . Setting yields . So Player 1 plays with probability and with probability in equilibrium.
When considering mixed strategies or infinite games with compact (closed and bounded) strategy sets (such as mixed extensions of normal form games), saddle points are guaranteed to exist. This is due to a result in real analysis known as the Weierstrass Extreme Value Theorem, which states that a continuous function over a compact set achieves both a maximum and a minimum.
It is easy to verify that a pure-strategy saddle point is a Nash equilibrium in a finite zero-sum games. This verification is presented in the following theorem to build intuition. The logic is analogous in the case of infinite games (such as mixed extensions of finite zero-sum games).
Theorem 3.1: Let be a two-player zero-sum game and let be the associated matrix. Suppose is a saddle point. Then constitutes a Nash equilibrium of .
Proof: As is the maximum of column , a unilateral deviation from Player 1 will result in payoff for some . It follows that , so Player 1 cannot unilaterally deviate and improve its outcome. For Player 2, as is a zero-sum game and by construction of the matrix . So if Player 2 unilaterally deviates, the payoff will be since . And so Player 2 cannot unilaterally deviate and improve its outcome. Thus, is a Nash equilibrium. QED.
The Nash equilibria of zero-sum games can be characterized in terms of prudent strategies. Intuitively, each player seeks to minimize its opponent’s payoff. As a zero-sum game is being considered, this leaves more for the individual. This notion is formalized as follows:
Theorem 3.2: In any finite, two-person zero-sum game, the following conditions hold:
- If is a mixed strategies Nash equilibrium, then is a prudent strategy of player and: (2)
- If is prudent for each , then is a mixed-strategies Nash equilibrium.
Proof: Suppose first that is a mixed strategies Nash equilibrium. Now suppose to the contrary that for some , is not prudent. It follows that there exists a strategy guaranteeing a better result than . Consider the mixed strategy profile , where is a best response to . We thus have . As is a best response to , we have . As is a best response to and the game is zero-sum, it follows that . Chaining the inequalities together implies , a contradiction. It follows that is prudent for both players.
It will now be shown that (2) holds. Note that for any function and fixed , we have . Taking the max of both sides maintains this inequality. Thus:
Similarly, as is a saddle point, we have:
Thus, (2) holds.
Conversely, suppose is prudent for each . As is prudent, it solves . So player cannot unilaterally deviate and improve its payoff. Thus, is a mixed-strategies Nash equilibrium. QED.