I. Introduction
In my previous blog entry, the normal form game and Nash equilibrium were introduced. Attention was restricted to pure strategies. As will be illustrated below, not every finite, normal form game has a pure strategies Nash equilibrium. The notion of mixed strategies extends the notion of pure strategies, allowing players to assign probabilities to each pure strategy. This extension provides for the existence of a mixed strategies Nash equilibrium in every finite, normal form game. Additionally, zero sum games and prudent strategies will be discussed.
II. Mixed-Strategies
In this section, we introduce the notion of mixed-strategies. One important motivator for mixed-strategies is that not every game has a pure strategies Nash equilibrium. Consider the matching pennies game:
In any pure strategy profile, one player incurs utility and the other player incurs utility
. The player incurring utility
can unilaterally deviate by switching its choice to improve its utility. This inverts the payoffs- the first player incurs utility
while the second player incurs utility
. Iterating on the above argument, we see that no Nash equilibrium exists in pure strategies.
Mixed Strategies: Let be a normal form game. Let
. A mixed strategy is a sequence
and a probability distribution
where player
selects strategy
with probability
. Note that
. The set of mixed strategies for player
is denoted
, where
is the simplex in
. That is,
.
Note that pure strategies are a special case of mixed strategies. The mixed extension will now be defined, to formalize the notion of games with mixed strategies. A mixed strategies Nash equilibrium in a normal form game is equivalent to a pure strategies Nash equilibrium in a mixed extension.
Mixed Extension: Let be a normal form game. The mixed extension of
is the three-tuple
, where
.
The notion of mixed strategies is rather unintuitive from a behavioral perspective, as a normal form game is played simultaneously. So how is a mixed strategies Nash equilibrium formulated? Recall that each player is a rational, utility maximizing agent that is aware of the structure of the game. Each player still seeks to mix its strategies in such a way to maximize its utility. In mixing strategies, a player runs the risk that another player can take advantage of a given mixing. Thus, in a Nash equilibrium, each player mixes strategies such that
is indifferent to whichever pure strategy ends up being played. That is,
‘s expected utility for each of
‘s pure strategies in the mixing is the same. This is formalized as follows.
Theorem 2.1: Let be a normal form game. A mixed strategy profile
is a mixed strategy Nash equilibrium if and only if, for each player
, the following two conditions are satisfied:
- Every pure strategy
which is given positive probability by
yields the same expected payoff against
; that is,
.
- Every pure strategy
which is given probability zero by
yields no more than the pure strategies that are assigned positive probability:
.
Proof: Suppose first that the mixed-strategy profile satisfies conditions (1) and (2). Let
. If
unilaterally deviates by shifting positive probability to a strategy
given zero probability in
, then
‘s utility does not increase by condition (2). Let
be the set of strategies given positive probability by
. By condition (1), each pure strategy in
results in the same expected utility
. Thus, any mixing
of strategies in
results in expected utility:
Thus, player cannot unilaterally deviate and improve its outcome, so
is a mixed strategies Nash equilibrium.
Conversely, suppose is a mixed-strategies Nash equilibrium. As no player can unilaterally deviate and improve its outcome, condition (2) follows immediately. Suppose to the contrary that condition (1) does not hold. Let
and
such that the
. If
, then player
could assign more weight to
in
and improve its outcome. Similarly, if
, then player
could assign less weight to
in
and improve its outcome. Either occurrence contradicts the assumption that
is a mixed-strategies Nash equilibrium. QED.
Example 1: Let’s now use Theorem 2.1 to find a mixed-strategies Nash equilibrium for the Matching Pennies game. Player mixes strategies such that Player
is indifferent to
and
. Suppose Player
plays
with probability
and
with probability
. Player
‘s payoff from playing
is
. Player
‘s payoff from playing
is
. In equilibrium, Player
is indifferent between playing
and
. Setting
. By symmetry, we have Player
mixing between
and
with frequencies
as well.
In addition to guaranteeing the existence of a Nash equilibrium, mixed strategies are also useful in selecting realistic Nash equilibria. Consider the following example.
Example 2: Consider a traffic routing game on the following network. The weight of each edge denotes the latency cost of traversing that edge. The variable denotes the number of players traversing the edge
, and the variable
denotes the number of players using the edge
. So for example, if
, then the latency cost of
is
for every each of the
players. Each player starts at
and ends at
, seeking to minimize latency.
Suppose there are players in the game. Denote
as the number of players choosing the path
,
as the number of players choosing the path
, and
as the number of players choosing
. Consider first the pure strategies Nash equilibrium of
. Both the edges
and
have
players traversing them, and so have latency costs
. Players of each type incur latency cost
. If a player of type
unilaterally deviates, he increases the latency cost of the edge
to
, resulting in a total latency cost of
. By similar argument, players of type
and
cannot unilaterally deviate and decrease their costs as well.
While and
is a pure strategies Nash equilibrium, it is unlikely the
players will end up playing this strategy profile. However, this pure strategies equilibrium does provide the probabilities for a mixed strategies equilibrium. As the game is symmetric, there exists a Nash equilibrium in which each player selects the same strategy. Suppose each player selects the mixed strategy
with probabilities
. We apply Theorem 2.1 to verify this mixed strategy profile, denoted
, is a mixed-strategies Nash equilibrium.
First, observe that . Consider each of the pure strategies
.
- Suppose player
plays the pure strategy
. Under
,
,
, and
. So
.
- Suppose player
plays the pure strategy
. Under
,
, and
. So
.
- Suppose player
plays the pure strategy
. Under
,
and
. So
.
Thus, is a mixed-strategies Nash equilibrium.
III. Zero-Sum Games
In this section, we examine zero-sum games. Intuitively, in a zero sum game, each player’s gain (or loss) is exactly balanced with those of the other players. For example, cutting a larger slice of cake for one person leaves less cake for the others. This notion is formalized as follows:
Zero-Sum Games: Let be a normal form game.
is said to be a Zero-Sum Game if
for every strategy profile
.
Recall the Matching Pennies game from the previous section. In any strategy profile, one player earns utility while the other player earns utility
. Thus, the Matching Pennies game is a zero-sum game. Another example of a zero-sum game is Rock-Paper-Scissors. The winner of the game earns utility
while the loser earns utility
. In the event of a tie, each player earns utility
.
So how are zero-sum games solved? We can solve zero-sum games in the same manner as any other normal form game. Of particular interest, however, areprudent strategies. Intuitively, players who play prudently seek to minimize potential losses. We define prudent strategies as follows:
Prudent Strategies: Let be a finite, normal form game. A prudent strategy of player
is a mixed strategy
that satisfies
.
Note that the definition of prudent strategies did not restrict to zero-sum games. We discuss them in the context of zero-sum games; however, as they are most useful here. In the case of a two-player game, saddle points are equivalent to prudent strategies. We define a saddle point as follows.
Saddle Point: Let be a two-player zero-sum game. A saddle point is a strategy profile
satisfying
for all
, all
, and each
.
In order to test for saddle points in finite games, we convert the payoff matrix into a mathematical matrix from linear algebra. As we are considering zero sum games, define the matrix to be a real-valued
matrix where
,
. That is,
represents the amount Player 1 wins and Player 2 loses when the pure strategy profile
is played. A saddle point
is a minimum of row
and a maximum of column
. Let’s consider an example of finding a saddle point.
Example 3: Consider the two-player, zero-sum game given by the following payoff matrix.
We begin by converting the payoff matrix to an algebraic matrix:
The row minima are ; and the column maxima are
. Observe that row two and column two have the same value:
. So
is a saddle point with payoff
.
Example 4: Consider the two-player, zero-sum game given by the following payoff matrix.
We begin by converting the payoff matrix to an algebraic matrix:
The row minima are for row one, and
for column one. The column maxima are
for column one, and
for column two. So there are no pure strategy saddle points for this game.
We solve this game by determining the mixed strategies Nash equilibria. Suppose Player 2 plays with probability
and
with probability
. If Player 1 plays
, his expected payoff is
. If Player 1 plays
, his expected payoff is
. Setting
. So Player 2 plays
with probability
and
with probability
in equilibrium.
We now solve for Player 1’s mixed equilibrium strategies. Suppose Player 1 plays with probability
and
with probability
. Then Player 2’s expected payoff from playing
is
. Player 2’s expected payoff from playing
is
. Setting
yields
. So Player 1 plays
with probability
and
with probability
in equilibrium.
When considering mixed strategies or infinite games with compact (closed and bounded) strategy sets (such as mixed extensions of normal form games), saddle points are guaranteed to exist. This is due to a result in real analysis known as the Weierstrass Extreme Value Theorem, which states that a continuous function over a compact set achieves both a maximum and a minimum.
It is easy to verify that a pure-strategy saddle point is a Nash equilibrium in a finite zero-sum games. This verification is presented in the following theorem to build intuition. The logic is analogous in the case of infinite games (such as mixed extensions of finite zero-sum games).
Theorem 3.1: Let be a two-player zero-sum game and let
be the associated matrix. Suppose
is a saddle point. Then
constitutes a Nash equilibrium of
.
Proof: As is the maximum of column
, a unilateral deviation from Player 1 will result in payoff
for some
. It follows that
, so Player 1 cannot unilaterally deviate and improve its outcome. For Player 2,
as
is a zero-sum game and by construction of the matrix
. So if Player 2 unilaterally deviates, the payoff will be
since
. And so Player 2 cannot unilaterally deviate and improve its outcome. Thus,
is a Nash equilibrium. QED.
The Nash equilibria of zero-sum games can be characterized in terms of prudent strategies. Intuitively, each player seeks to minimize its opponent’s payoff. As a zero-sum game is being considered, this leaves more for the individual. This notion is formalized as follows:
Theorem 3.2: In any finite, two-person zero-sum game, the following conditions hold:
- If
is a mixed strategies Nash equilibrium, then
is a prudent strategy of player
and:
(2)
- If
is prudent for each
, then
is a mixed-strategies Nash equilibrium.
Proof: Suppose first that is a mixed strategies Nash equilibrium. Now suppose to the contrary that for some
,
is not prudent. It follows that there exists a strategy
guaranteeing a better result than
. Consider the mixed strategy profile
, where
is a best response to
. We thus have
. As
is a best response to
, we have
. As
is a best response to
and the game is zero-sum, it follows that
. Chaining the inequalities together implies
, a contradiction. It follows that
is prudent for both players.
It will now be shown that (2) holds. Note that for any function and fixed
, we have
. Taking the max of both sides maintains this inequality. Thus:
Similarly, as is a saddle point, we have:
And so:
Thus, (2) holds.
Conversely, suppose is prudent for each
. As
is prudent, it solves
. So player
cannot unilaterally deviate and improve its payoff. Thus,
is a mixed-strategies Nash equilibrium. QED.