Game Theory
Reputation Effects
When tomorrow's punishment makes today's cooperation rational
In repeated interactions, defection today costs you tomorrow's payoff. The folk theorem says that any feasible, individually rational outcome can be supported in equilibrium if players value the future enough.
- Folk theorem: Aumann 1981; Fudenberg-Maskin 1986
- Cooperation threshold: δ ≥ (T−R)/(T−P) = 0.5 in the canonical PD
- Bayesian reputation: Kreps-Milgrom-Roberts-Wilson 1982
- Chain-store paradox: Selten 1978; resolved by KMRW
- Empirical channel: eBay ratings; brand investment; trade deals
- Forgiving variants: Tit-for-tat (Axelrod 1980-81)
Why repetition changes everything
Start with the one-shot prisoner's dilemma. Two players each choose Cooperate or Defect. Payoffs satisfy T > R > P > S (temptation, reward, punishment, sucker). The classic numbers: T = 5, R = 3, P = 1, S = 0. Each player has a dominant strategy: defect. The unique Nash equilibrium is mutual defection at payoffs (1, 1), even though mutual cooperation (3, 3) is Pareto-better.
Now play the same game indefinitely. Each player discounts future payoffs by a factor δ ∈ (0, 1) per round. Cooperating in every round yields the infinite stream R + δR + δ²R + ... = R/(1−δ). A one-time defection yields T today instead of R, but the punishment that follows lowers the defector's payoff in every later round from R to the punishment payoff.
The simplest cooperation-sustaining strategy is grim trigger: cooperate every round until the opponent defects once; then defect forever. Under grim trigger, the deviation gain is T − R today; the deviation cost is (R − P)·δ/(1−δ) from all future rounds. Cooperation is incentive-compatible if and only if
(T − R) ≤ δ(R − P)/(1 − δ), or equivalently δ ≥ (T − R) / (T − P).
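The equivalence between the two forms of the condition takes one line of algebra. The steps below use only the symbols already defined (T, R, P, δ) and rely on 1 − δ > 0 and T − P > 0, so the inequality direction never flips:

```latex
\begin{align*}
T - R &\le \frac{\delta\,(R - P)}{1 - \delta} \\
(T - R)(1 - \delta) &\le \delta\,(R - P) \\
T - R &\le \delta\,\bigl[(T - R) + (R - P)\bigr] = \delta\,(T - P) \\
\delta &\ge \frac{T - R}{T - P}
\end{align*}
```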
For the canonical payoffs (T, R, P, S) = (5, 3, 1, 0): δ ≥ (5−3)/(5−1) = 2/4 = 0.5. If the discount factor is at least one-half, mutual cooperation under grim trigger is a subgame-perfect equilibrium. Below that, players are too impatient: the one-time gain from defecting outweighs the discounted loss from permanent punishment, and grim trigger cannot sustain cooperation.
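The threshold is simple enough to check mechanically. The following is a minimal sketch, assuming integer payoffs so that exact fractions can be used; the function names `grim_trigger_threshold` and `cooperation_sustainable` are illustrative, not from any library.

```python
from fractions import Fraction

def grim_trigger_threshold(T, R, P):
    """Smallest discount factor at which grim trigger deters a one-shot deviation.

    Assumes integer payoffs with T > R > P (illustrative helper).
    """
    if not (T > R > P):
        raise ValueError("expected T > R > P")
    return Fraction(T - R, T - P)

def cooperation_sustainable(delta, T, R, P):
    """True when delta meets the threshold (T - R) / (T - P)."""
    return delta >= grim_trigger_threshold(T, R, P)

threshold = grim_trigger_threshold(5, 3, 1)
print(threshold)                               # 1/2
print(cooperation_sustainable(0.85, 5, 3, 1))  # True
print(cooperation_sustainable(0.3, 5, 3, 1))   # False
```

Note that S does not appear: under grim trigger the sucker payoff never enters the deviator's calculation.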
The folk theorem
Grim trigger is just one strategy. The folk theorem of repeated games (Aumann 1981, Fudenberg-Maskin 1986) is more sweeping. For any payoff vector that is (i) feasible — achievable by some sequence of actions — and (ii) individually rational — strictly better than each player's minmax payoff — there exists a discount factor δ* < 1 such that the vector is supported as a subgame-perfect equilibrium for all δ ∈ (δ*, 1).
In English: for patient enough players, essentially any reasonable outcome can be made stable. This is both an empowering result (cooperation possible) and an embarrassing one (theory makes no sharp predictions). The folk theorem is sometimes called the "anything goes" theorem; refinements such as renegotiation-proofness, evolutionary stability, and risk-dominance have been proposed to narrow predictions.
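To make the "individually rational" condition concrete, here is a small sketch that computes a player's minmax payoff in the stage game and tests a payoff vector against it. It is a simplification under stated assumptions: only pure actions are considered (in general games the mixed-strategy minmax can be lower), feasibility is not checked, and the names `pure_minmax` and `individually_rational` are hypothetical.

```python
# Row player's stage payoffs, indexed [own action][opponent action].
# Actions: 0 = Cooperate, 1 = Defect; canonical (T, R, P, S) = (5, 3, 1, 0).
ROW_PAYOFFS = [
    [3, 0],  # own C: R against C, S against D
    [5, 1],  # own D: T against C, P against D
]

def pure_minmax(payoffs):
    """Lowest payoff the opponent can hold a player to using pure actions.

    For each opponent action, take the player's best-reply payoff,
    then let the opponent pick the action that minimises it.
    """
    n_opponent_actions = len(payoffs[0])
    best_reply_values = [
        max(row[j] for row in payoffs) for j in range(n_opponent_actions)
    ]
    return min(best_reply_values)

def individually_rational(payoff_vector, minmax_values):
    """Strict version of the condition: every player beats their minmax."""
    return all(v > m for v, m in zip(payoff_vector, minmax_values))

m = pure_minmax(ROW_PAYOFFS)  # the game is symmetric, so both players share it
print(m)                                       # 1
print(individually_rational((3, 3), (m, m)))   # True
print(individually_rational((1, 1), (m, m)))   # False
```

In the prisoner's dilemma the minmax payoff is P: the opponent holds you to it simply by defecting. The strict inequality leaves (1, 1) outside the theorem's scope, although repeating the stage-game equilibrium is an equilibrium of the repeated game for separate reasons.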
Worked example: when does cooperation pay?
Two firms in a duopoly market consider whether to coordinate prices or undercut each other each quarter. Per-quarter payoffs:
- Both cooperate (collude at high price): each earns $3 million.
- Both defect (price war): each earns $1 million.
- One defects, other cooperates: defector earns $5 million, cooperator earns $0.
Discount factor δ = 0.85 per quarter (an implied discount rate of 1/0.85 − 1 ≈ 17.6% per quarter). Grim trigger threshold: δ ≥ (5−3)/(5−1) = 0.5. Since 0.85 > 0.5, cooperation is sustainable. The present value of the infinite cooperative payoff stream is $3M / (1 − 0.85) = $20M per firm.
A one-time defection earns $5M today but triggers the price war forever. Continuation value after defection: $1M / (1 − 0.85) = $6.67M. So total deviation payoff = $5M today + δ·$6.67M = $5M + $5.67M = $10.67M. Total cooperative payoff = $20M. Deviation worse by $9.33M. Cooperation holds.
Now imagine δ drops to 0.3 (extreme impatience). Threshold not satisfied. Cooperative payoff = $3M / 0.7 = $4.29M. Deviation payoff = $5M + 0.3·$1.43M = $5.43M. Defection beats cooperation by $1.14M. The cartel collapses. The shadow of the future is too short to discipline today's behaviour.
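The arithmetic in this worked example can be reproduced in a few lines. This is a sketch, with payoffs in millions of dollars and hypothetical function names `stream_value` and `compare`; it assumes grim-trigger punishment exactly as described above.

```python
def stream_value(per_period, delta):
    """Present value of a constant payoff received every period, starting today."""
    return per_period / (1 - delta)

def compare(delta, T=5.0, R=3.0, P=1.0):
    """Return (cooperate forever, defect once then price war forever)."""
    cooperate = stream_value(R, delta)
    deviate = T + delta * stream_value(P, delta)
    return cooperate, deviate

for delta in (0.85, 0.3):
    cooperate, deviate = compare(delta)
    print(
        f"delta={delta}: cooperate={cooperate:.2f}M, "
        f"deviate={deviate:.2f}M, gap={cooperate - deviate:.2f}M"
    )
```

At δ = 0.85 the gap is positive (cooperation wins by about $9.33M); at δ = 0.3 it is negative (defection wins by about $1.14M), matching the figures in the text.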
Reputation effects vs related concepts
| | Reputation Effects | Tit-for-Tat | Grim Trigger | Folk Theorem | Chain Store | Brand Capital |
|---|---|---|---|---|---|---|
| Game structure | Repeated, possibly finite | Repeated PD | Repeated PD | Infinitely repeated | Finite sequential entry | Repeated buyer-seller |
| Punishment | Selective, calibrated | Match opponent's last move | Defect forever once | Any IR punishment | Fight entry once | Stop buying / boycott |
| Forgives errors? | Often yes | Yes — eventually | No — terminal | Depends on strategy | — | Brand-dependent |
| Threshold δ | Varies | ≥ max{(T−R)/(R−S), (T−R)/(T−P)} | ≥ (T−R)/(T−P) | Some δ* < 1 | Solved by Bayesian uncertainty | High δ helps |
| Canonical paper | Kreps-Wilson 1982 | Axelrod 1984 | Friedman 1971 | Fudenberg-Maskin 1986 | Selten 1978 | Klein-Leffler 1981 |
| Real-world example | eBay seller ratings | Trade tariff retaliation | Cartel breakdown | Any cooperative norm | Predatory pricing | Coca-Cola loyalty |
Reputation effects are the umbrella concept. Tit-for-tat and grim trigger are specific strategies. The folk theorem is the existence result. The chain-store paradox is the canonical counter-example resolved by Bayesian reputation.
Bayesian reputation: the KMRW resolution
Selten's 1978 chain-store paradox showed that backward induction undoes reputation in finite games with complete information. The 1982 papers by Kreps and Wilson and by Milgrom and Roberts resolved the paradox: introduce a small probability that one player is a "commitment type" who always plays a fixed strategy regardless of incentives.
Suppose the chain store is "tough" with probability ε (always fights entry) and "rational" with probability 1 − ε. An entrant cannot tell. In early rounds, even a rational chain has incentive to mimic the tough type — fighting entry costs short-run profit but maintains the reputation of toughness, deterring future entrants. As the horizon shortens, the value of reputation falls; near the final round, even rational chains stop fighting. But the early-round fighting persists, vindicating the empirical observation that chain stores deter early entrants.
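The belief dynamics behind this story are plain Bayesian updating. The sketch below shows how an entrant's belief that the chain is tough rises after each observed fight. It is not the full KMRW equilibrium: the probability `rational_fight_prob` that a rational chain fights is treated as a fixed, hypothetical input, whereas in the model it is determined in equilibrium and varies by period.

```python
def posterior_tough(prior, rational_fight_prob):
    """Posterior probability of the tough type after one observed fight.

    The tough type fights with probability 1; the rational type fights
    with probability rational_fight_prob (an assumed input here).
    """
    evidence = prior + (1 - prior) * rational_fight_prob
    if evidence == 0:
        raise ValueError("a fight has probability zero under these beliefs")
    return prior / evidence

belief = 0.01  # epsilon: initial probability that the chain is tough
for fights_seen in range(1, 6):
    belief = posterior_tough(belief, rational_fight_prob=0.5)
    print(fights_seen, round(belief, 4))
```

If the rational type always fights, a fight carries no information and the belief stays at ε; if it never fights, a single fight reveals the tough type. Intermediate cases build reputation gradually.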
KMRW reputation effects unite incomplete-information game theory with the folk theorem intuition. The key point is that even a very small (but strictly positive) probability ε of the commitment type supports substantial cooperation if the horizon is long enough. This is one of the most cited results in modern game theory.
What the data say
- eBay seller ratings. Cabral and Hortaçsu (2010) found that when a seller first receives negative feedback, the seller's weekly sales growth rate drops from a positive 5% to a negative 8%. The reputation system creates a substantial monetary incentive to maintain a clean record.
- Klein-Leffler brands. Klein and Leffler (1981) showed that brand-name capital invested in advertising serves as a hostage: defecting on quality forfeits the investment. Major U.S. consumer-brand spend ($300B+ annually) is largely this reputation-sustaining cost.
- Diamond Dealers Club (Bernstein 1992). A closed reputation network with permanent exclusion sustains $25B+ annual diamond trades that would be impossible in anonymous markets.
- Trade-deal credibility. Maggi (1999) shows that bilateral retaliation under WTO dispute settlement is the reputation mechanism that holds the trading system together.
- Sovereign debt. Tomz (2007) measures reputation as a major driver of repayment incentives; countries pay back to preserve future market access.
- Online platforms. Anonymous Airbnb listings carry 25-50% price discounts relative to verified-host equivalents (Edelman-Luca-Svirsky 2017), measuring the cost of missing reputation infrastructure.
Variants and refinements
- Forgiveness. Tit-for-two-tats and generous tit-for-tat (forgive with probability p) outperform grim trigger in noisy environments — Axelrod 1984 and Nowak-Sigmund 1992.
- Calibrated punishment. Modify the punishment phase so it lasts long enough to deter deviation but not forever — Abreu (1988) optimal penal codes.
- Renegotiation-proof equilibria. Restrict to equilibria where, after a deviation, neither player would prefer to renegotiate to a different continuation. Farrell-Maskin (1989) defined the concept; rules out grim trigger.
- Public versus private monitoring. If players observe each other's actions imperfectly, the cooperation threshold tightens; Green-Porter (1984) on cartel breakdowns under demand shocks.
- Reputation in mechanism design. Tadelis (1999, 2002) studies the market for reputation: when reputation is for sale, opportunism emerges.
- Reputation under turnover. If players are replaced over time and new players cannot observe full history, reputation decays — Mailath-Samuelson (2001).
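The forgiveness point in the list above can be illustrated with a small simulation. This is a sketch under assumptions that are not in the original: noise is modelled as each realised move flipping with a fixed probability, generous tit-for-tat forgives an observed defection with probability 1/3, and each strategy plays against a copy of itself. The payoffs are the canonical (5, 3, 1, 0).

```python
import random

# Stage payoffs to the first player, keyed by (own move, opponent move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def flip(move):
    return "D" if move == "C" else "C"

def grim(opp_moves):
    # Defect forever once the opponent has been seen defecting.
    return "D" if "D" in opp_moves else "C"

def make_generous_tft(forgive_prob, rng):
    def strategy(opp_moves):
        if not opp_moves or opp_moves[-1] == "C":
            return "C"
        # After an observed defection, forgive with probability forgive_prob.
        return "C" if rng.random() < forgive_prob else "D"
    return strategy

def average_payoff(strategy_a, strategy_b, rounds, noise, rng):
    """Average per-round payoff to player A when moves are flipped by noise."""
    seen_by_a, seen_by_b = [], []  # realised opponent moves each side observed
    total = 0
    for _ in range(rounds):
        a, b = strategy_a(seen_by_a), strategy_b(seen_by_b)
        if rng.random() < noise:
            a = flip(a)
        if rng.random() < noise:
            b = flip(b)
        total += PAYOFF[(a, b)]
        seen_by_a.append(b)
        seen_by_b.append(a)
    return total / rounds

rng = random.Random(0)  # fixed seed so the run is repeatable
grim_avg = average_payoff(grim, grim, 5000, 0.05, rng)
gtft = make_generous_tft(1 / 3, rng)
gtft_avg = average_payoff(gtft, gtft, 5000, 0.05, rng)
print(f"grim vs grim: {grim_avg:.2f}, generous TFT vs itself: {gtft_avg:.2f}")
```

Under grim trigger the first accidental defection locks both players into punishment, so the average drifts toward the mutual-defection payoff; the forgiving pair recovers from errors and stays much closer to the cooperative payoff.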
A brief history
James Friedman (1971) wrote the first formal repeated-game folk theorem for the prisoner's dilemma. Aumann (1981) generalised to arbitrary stage games. Fudenberg and Maskin (1986) proved the modern subgame-perfect version. Reinhard Selten's chain-store paradox (1978) showed the finite-horizon trouble. Kreps, Milgrom, Roberts and Wilson (1982) — published as two companion papers in the Journal of Economic Theory — resolved the paradox with Bayesian reputation.
Aumann shared the 2005 Nobel Memorial Prize in Economics with Schelling, and Maskin shared the 2007 prize with Hurwicz and Myerson, in both cases for related work. The reputation framework is a cornerstone of modern industrial organisation (Tirole 1988), international trade theory (Maggi 1999), and political economy (Persson-Tabellini 2000). Axelrod's 1980-81 computer tournaments brought the iterated prisoner's dilemma to a wide audience.
Common pitfalls
- Confusing infinitely repeated with finitely repeated. Finite-horizon games unravel by backward induction unless type-uncertainty is added.
- Forgetting the discount factor matters. Cooperation requires δ above a specific threshold; "very patient" is not a mathematical statement.
- Treating reputation as automatic. Reputation requires observable history. In anonymous one-shot markets it fails entirely.
- Assuming grim trigger is optimal. In noisy environments, more forgiving strategies (tit-for-tat) outperform grim trigger.
- Conflating reputation with brand. Brand is a hostage — money spent that can be forfeited. Reputation is a belief held by counterparties. Brand investment maintains reputation; the concepts are distinct.
- Equilibrium multiplicity. Folk theorem implies many equilibria. Real-world coordination on a particular cooperative outcome depends on focal points, conventions, and history — not the theory alone.
- Underestimating the role of monitoring. The cooperation threshold rises sharply when monitoring is imperfect. Green-Porter cartels break down on bad-news realisations.
Frequently asked questions
What are reputation effects?
Reputation effects are the long-run behavioural consequences of being observed repeatedly by counterparties who can adjust their future behaviour based on your past actions. A firm that cheats one customer faces fewer future customers; a country that breaks one trade deal faces fewer future trade partners; an eBay seller who ships late gets one-star ratings that drive away buyers. The shadow of the future converts short-run incentives to defect into long-run incentives to cooperate. Formally, reputation effects sustain a cooperative outcome that would not be a Nash equilibrium of the one-shot game.
What is the folk theorem?
A result in repeated-game theory: for any feasible, individually rational payoff vector in the stage game, there exists a threshold δ* below 1 such that, for every discount factor above δ*, the vector can be supported as a subgame-perfect equilibrium of the infinitely repeated game. Informally: any achievable long-run outcome that gives each player more than their minmax payoff can be supported in equilibrium by sufficiently patient players. Aumann (1981) gave the early statement; Fudenberg and Maskin (1986) proved the modern version. The theorem's name comes from 'folklore': the result was known orally before being formally written down.
What is the discount factor δ?
The weight that a player places on next period's payoff relative to today's. If δ = 0.95 and the per-period payoff is $1, the total present value of an infinite stream is $1/(1−0.95) = $20. Higher δ means players are more patient (they value the future more), and reputation effects are correspondingly stronger. In the iterated prisoner's dilemma with payoffs (T, R, P, S) = (5, 3, 1, 0), the threshold for grim trigger to sustain mutual cooperation is δ ≥ (T−R)/(T−P) = 2/4 = 0.5. Below that threshold, grim trigger cannot sustain cooperation.
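The $20 figure is the limit of a geometric series. A quick sketch (function names are illustrative) compares finite truncations of the stream with the closed form:

```python
def truncated_value(payoff, delta, periods):
    """Present value of the first `periods` payments, the first one undiscounted."""
    return sum(payoff * delta**t for t in range(periods))

def infinite_value(payoff, delta):
    """Closed form payoff / (1 - delta), valid for 0 <= delta < 1."""
    if not 0 <= delta < 1:
        raise ValueError("delta must lie in [0, 1)")
    return payoff / (1 - delta)

for n in (10, 100, 500):
    print(n, round(truncated_value(1.0, 0.95, n), 4))
print("limit", round(infinite_value(1.0, 0.95), 4))
```

The first ten periods account for only about $8 of the $20 total, which is one way to see how much weight a patient player places on the distant future.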
What is the chain-store paradox?
A puzzle proposed by Reinhard Selten in 1978. A chain store faces a sequence of N potential entrants, each deciding whether to enter the store's market. The chain can either fight (low payoff for both) or accommodate (medium payoff for both). Backward induction predicts: in the final period, the chain accommodates (fighting is worse). Entrant N enters. By induction, the chain accommodates every period, and every entrant enters. But empirically chain stores fight early entrants to deter later ones. Kreps and Wilson (1982) and Milgrom-Roberts (1982) resolved the paradox by introducing incomplete information about chain-store type: even a small probability that the chain is 'tough' makes fighting credible.
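The unravelling argument can be traced step by step. The sketch below runs backward induction on the complete-information game. The stage payoffs are illustrative assumptions chosen to match the ordering described above (fighting is worse than accommodating for both sides, and the chain does best when the entrant stays out); the function name is hypothetical.

```python
# Illustrative stage payoffs as (entrant, chain); assumed for this sketch.
OUT = (1, 5)          # entrant stays out
ACCOMMODATE = (2, 2)  # entrant enters, chain accommodates
FIGHT = (0, 0)        # entrant enters, chain fights

def solve_chain_store(n_entrants):
    """Backward induction; returns (play path, chain's total payoff)."""
    chain_continuation = 0  # chain's payoff from all later periods
    path = []
    for period in range(n_entrants, 0, -1):
        # Later play is already pinned down by induction, so the continuation
        # is identical after either response: only the stage payoff matters.
        fight_total = FIGHT[1] + chain_continuation
        accommodate_total = ACCOMMODATE[1] + chain_continuation
        response = "fight" if fight_total > accommodate_total else "accommodate"

        # The entrant anticipates the chain's response before deciding.
        entry_payoff = FIGHT[0] if response == "fight" else ACCOMMODATE[0]
        enters = entry_payoff > OUT[0]

        if not enters:
            stage_chain = OUT[1]
        elif response == "fight":
            stage_chain = FIGHT[1]
        else:
            stage_chain = ACCOMMODATE[1]
        chain_continuation += stage_chain
        path.append((period, "enter" if enters else "stay out", response))
    path.reverse()
    return path, chain_continuation

path, chain_total = solve_chain_store(3)
for step in path:
    print(step)
print("chain total:", chain_total)
```

Because a fight today cannot change what later entrants do in this complete-information version, fighting never pays and every entrant enters. The incomplete-information resolution breaks exactly that link by letting today's response shift later entrants' beliefs.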
Why is reputation hard in finite games?
Backward induction: in the last round there is no future, so the stage-game equilibrium (defect) is the only credible move. Knowing this, the second-to-last round also has no shadow — defection unravels backward to the first round. The folk theorem requires either infinite horizon or uncertainty about the horizon. Kreps-Milgrom-Roberts-Wilson (1982) showed that introducing even a tiny probability that the opponent is an irrational 'cooperator' type restores cooperative equilibria in long-but-finite games. Reputation needs uncertainty to survive.
What is grim trigger?
A repeated-game strategy: cooperate every round until the opponent defects once; then defect forever. Grim trigger is the simplest cooperation-sustaining strategy and the easiest to analyse. Its weakness: a single misunderstanding triggers permanent breakdown, with no recovery path. Real-world relationships use forgiving variants: tit-for-tat (cooperate first, then mirror the opponent's last move) and tit-for-two-tats (defect only after two consecutive opponent defections). Axelrod's 1980-81 tournaments famously found that tit-for-tat earned the highest overall score against the field of submitted strategies.
What are real-world examples?
Brands burn money on advertising to make their long-term commitment credible — defecting on quality would forfeit the brand investment. eBay built an entire seller-rating system because anonymous one-shot transactions would have unraveled. Bank-loan officers extend credit on the basis of relationship history. Diamond merchants (the famous Diamond Dealers Club, Bernstein 1992) use a closed network with permanent exclusion as punishment. Reputation effects also underpin nuclear deterrence, NATO commitments, central-bank inflation-fighting credibility, and academic citation norms.