Not technically a Prisoners' Dilemma!
Both players have a weakly-dominant strategy to Detonate
What is/are the Nash equilibrium/equilibria?
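A true prisoners' dilemma: \(a > b > c > d\)
Each player's preferences: defecting while the other cooperates (\(a\)) > both cooperating (\(b\)) > both defecting (\(c\)) > cooperating while the other defects (\(d\))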
Nash equilibrium: (Defect, Defect)
We'll stick with these specific payoffs for this lesson
How can we sustain cooperation in Prisoners' Dilemma?
Analysis of games can change when players encounter each other more than once
Repeated games: the same players play the same game multiple times
Players know the history of the game with each other
Two types:
Finitely-repeated game: has a known final round
Infinitely-repeated game: has no (or an unknown) final round
Suppose a prisoners' dilemma is played for 2 rounds
Apply backwards induction:
Round 2 is the final round, so nothing done in it can be rewarded or punished later; it is played like a one-shot game and both players Defect
Since round 2 ends in (Defect, Defect) no matter what happened before, cooperating in round 1 earns nothing in return
Both Defect in round 1 (and round 2)
No value in cooperation over time!
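The backward-induction argument can be sketched in a few lines of Python. This is only an illustration: the payoffs 4, 3 and 2 are the ones used later in this lesson, while the sucker payoff of 1 and all function names are assumptions made for the sketch.

```python
from itertools import product

# Stage-game payoffs as (row player, column player).
# 4, 3 and 2 match the numbers used later in this lesson;
# the sucker payoff of 1 is an ASSUMPTION (any value below 2 works).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (1, 4),
    ("D", "C"): (4, 1),
    ("D", "D"): (2, 2),
}
ACTIONS = ("C", "D")


def nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a 2x2 game."""
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        u1, u2 = payoffs[(a1, a2)]
        # Neither player can gain by switching actions unilaterally.
        best1 = all(u1 >= payoffs[(d, a2)][0] for d in ACTIONS)
        best2 = all(u2 >= payoffs[(a1, d)][1] for d in ACTIONS)
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


# Round 2 (the last round) is played like a one-shot game.
round2 = nash_equilibria(PAYOFFS)
cont1, cont2 = PAYOFFS[round2[0]]

# Round 1: round 2's outcome is already fixed, so its payoffs are
# simply added to every cell of the round-1 game.
round1_game = {k: (u1 + cont1, u2 + cont2) for k, (u1, u2) in PAYOFFS.items()}
round1 = nash_equilibria(round1_game)

print(round2, round1)
```

Adding the same constant to every cell does not change anyone's best responses, which is why round 1 has the same equilibrium as the one-shot game.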
Finitely-repeated games are interesting, but rare
Some predictions for finitely-repeated games don't hold up well in reality
We often play games or are in relationships that are indefinitely repeated (have no known end); we call these infinitely-repeated games
Since we are dealing with payoffs in the future, we have to consider players' time preferences
Easiest to consider with monetary payoffs and the time value of money that underlies finance
\[PV = \frac{FV}{(1+r)^t}\]
\[FV = PV(1+r)^t\]
\[PV = \frac{FV}{(1+r)^t} = \frac{1000}{(1+0.05)^1} = \frac{1000}{1.05} \approx \$952.38\]
\[FV = PV(1+r)^t = 1000(1+0.05)^1 = 1000(1.05) = \$1050\]
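The two formulas are easy to check numerically. A minimal sketch, with function names chosen here for illustration:

```python
def present_value(fv: float, r: float, t: int = 1) -> float:
    """Value today of `fv` received t periods from now at interest rate r."""
    return fv / (1 + r) ** t


def future_value(pv: float, r: float, t: int = 1) -> float:
    """Value t periods from now of `pv` held today at interest rate r."""
    return pv * (1 + r) ** t


print(round(present_value(1000, 0.05, 1), 2))  # 952.38
print(round(future_value(1000, 0.05, 1), 2))   # 1050.0
```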
Suppose a player values $1 now as being equivalent to some amount with interest, \(\$1 \times (1+r)\), one period later
The “discount factor” is \(\delta = \frac{1}{1+r}\), the ratio by which a future value must be multiplied to equal its present value
\[\$1 \text{ now} = \delta \times \$(1+r) \text{ later}\]
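If δ is low (r is high): future payoffs are worth much less than present payoffs (the player is impatient)
If δ is high (r is low): future payoffs are worth almost as much as present payoffs (the player is patient)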
Example: Suppose you are indifferent between having $1 today and $1.10 next period
\[\$1 \text{ today} = \delta \times \$1.10 \text{ next period}\]
\[\delta = \frac{\$1}{\$1.10} \approx 0.91\]
There is an implied interest rate of \(r = 0.10\)
$1 at 10% interest yields $1.10 next period
\[\delta = \frac{1}{1+r} = \frac{1}{1.10} \approx 0.91\]
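The link between the interest rate and the discount factor runs in both directions. A small sketch, with illustrative function names:

```python
def discount_factor(r: float) -> float:
    """delta = 1 / (1 + r)."""
    return 1 / (1 + r)


def implied_rate(delta: float) -> float:
    """Invert delta = 1 / (1 + r) to recover r."""
    return 1 / delta - 1


delta = 1 / 1.10                      # indifferent between $1 now and $1.10 later
print(round(delta, 2))                # 0.91
print(round(implied_rate(delta), 2))  # 0.1
print(round(discount_factor(0.10), 2))  # 0.91
```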
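A payoff of p in every future period has a present value of
\[p(\delta + \delta^2 + \delta^3 + \cdots) = \sum_{t=1}^{\infty} p\,\delta^t = \frac{p\,\delta}{1-\delta}\]
If the payoff is also received in the current period, the sum starts at \(t = 0\):
\[p + p\delta + p\delta^2 + \cdots = \sum_{t=0}^{\infty} p\,\delta^t = \frac{p}{1-\delta}\]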
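A quick numerical check of the geometric series can help keep the starting period straight. In this sketch (names and the values p = 3, δ = 0.9 are illustrative), a stream that starts next period sums to pδ/(1−δ), while a stream that also pays in the current period sums to p/(1−δ):

```python
def pv_stream(p, delta, periods, start=1):
    """Partial sum of p * delta**t for t = start, ..., start + periods - 1."""
    return sum(p * delta**t for t in range(start, start + periods))


p, delta = 3, 0.9

# Payoffs from next period onward (t = 1, 2, 3, ...)
future_only = pv_stream(p, delta, periods=1000, start=1)
# Payoffs from the current period onward (t = 0, 1, 2, ...)
from_today = pv_stream(p, delta, periods=1000, start=0)

print(future_only, p * delta / (1 - delta))
print(from_today, p / (1 - delta))
```

After 1000 periods the remaining terms are negligible, so each partial sum should agree with its closed form to many decimal places.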
Alternate interpretation: game continues with some (commonly known among the players) probability θ each round
Assume this probability is independent between rounds (i.e. one round continuing has no influence on the probability of the next round continuing, etc)
Then the probability the game is still being played T rounds from now is \(\theta^T\)
A payoff of p in every future round has an expected value of \(p(\theta + \theta^2 + \theta^3 + \cdots) = \frac{p\,\theta}{1-\theta}\)
This is similar to discounting of future payoffs; equivalent if θ=δ
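The continuation-probability interpretation can be illustrated with a small simulation. This is a sketch under assumed values (p = 3, θ = 0.8, a fixed random seed), not part of the original slides: after the current round, the game continues with probability θ, and each future round that is actually played pays p.

```python
import random


def simulate_future_payoff(p, theta, rng):
    """Total payoff from future rounds in one simulated game."""
    total = 0.0
    # Each further round happens with independent probability theta.
    while rng.random() < theta:
        total += p
    return total


def average_future_payoff(p, theta, trials=100_000, seed=0):
    rng = random.Random(seed)
    return sum(simulate_future_payoff(p, theta, rng) for _ in range(trials)) / trials


p, theta = 3, 0.8
estimate = average_future_payoff(p, theta)
closed_form = p * theta / (1 - theta)  # p * (theta + theta^2 + ...)

print(estimate, closed_form)
```

The simulated average should land close to the closed-form value, which is the same expression as the discounted sum with θ in place of δ.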
Recall, a strategy is a complete plan of action that describes how you will react under all possible circumstances (i.e. moves by other players)
For an infinitely-repeated game, an infinite number of possible strategies exist!
We will examine a specific set of contingent or trigger strategies
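Consider one (the most important) trigger strategy for an infinitely-repeated prisoners' dilemma, the “Grim Trigger” strategy:
In round 1: Cooperate
In every later round: Cooperate if both players have always cooperated so far; otherwise Defect forever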
“Grim” trigger strategy leaves no room for forgiveness: one deviation triggers infinite punishment, like the sword of Damocles
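One way to make the grim trigger concrete is to write it as a function of the history of play. The sketch below is illustrative only; the `play` helper and the deviating opponent are hypothetical names invented for this example.

```python
def grim_trigger(my_history, their_history):
    """Cooperate until anyone has ever defected, then defect forever."""
    if "D" in my_history or "D" in their_history:
        return "D"
    return "C"


def defect_once_in_round_3(my_history, their_history):
    """Hypothetical opponent: deviates in round 3, otherwise plays grim trigger."""
    if len(my_history) == 2:
        return "D"
    return grim_trigger(my_history, their_history)


def play(strategy1, strategy2, rounds):
    """Play the repeated game and return each player's sequence of actions."""
    h1, h2 = [], []
    for _ in range(rounds):
        a1 = strategy1(h1, h2)
        a2 = strategy2(h2, h1)
        h1.append(a1)
        h2.append(a2)
    return "".join(h1), "".join(h2)


print(play(grim_trigger, grim_trigger, 5))            # ('CCCCC', 'CCCCC')
print(play(grim_trigger, defect_once_in_round_3, 5))  # ('CCCDD', 'CCDDD')
```

In the second run, a single deviation in round 3 ends cooperation for the rest of the game, which is the "no forgiveness" property.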
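\[3 + 3\delta + 3\delta^2 + 3\delta^3 + \cdots = \frac{3}{1-\delta}\]
\[\text{Payoff to cooperation} > \text{Payoff to one-time defection}\]
\[\frac{3}{1-\delta} > 4 + \frac{2\delta}{1-\delta}\]
\[\delta > 0.5\]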
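The threshold can be checked numerically. In this sketch the parameter names a, b and c (temptation, mutual cooperation, mutual defection) are labels chosen for illustration, with defaults 4, 3 and 2 taken from the inequality above. Rearranging b/(1−δ) > a + cδ/(1−δ) gives δ > (a−b)/(a−c).

```python
def payoff_cooperate(delta, b=3):
    """Present value of cooperating forever: b + b*delta + b*delta^2 + ..."""
    return b / (1 - delta)


def payoff_defect_once(delta, a=4, c=2):
    """Defect now for a, then face mutual defection (c) in every later round."""
    return a + c * delta / (1 - delta)


def critical_delta(a=4, b=3, c=2):
    """Smallest discount factor at which cooperation is at least as good."""
    return (a - b) / (a - c)


print(critical_delta())  # 0.5
for delta in (0.4, 0.5, 0.6):
    print(delta, payoff_cooperate(delta), payoff_defect_once(delta))
```

Below the threshold the one-time defection pays more; above it, cooperating forever does; at exactly δ = 0.5 the two payoffs are equal.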