Game Theory
Tit-for-Tat Strategy
Cooperate first; then do whatever your opponent just did
Tit-for-tat is a four-line strategy that won both of Axelrod's Iterated Prisoner's Dilemma tournaments (1980-81): cooperate on move one, then mirror the opponent. Anatol Rapoport submitted it; it topped fields of 14 and 62 entries respectively, and seeded the modern literature on the evolution of cooperation.
- Submitted by: Anatol Rapoport, 1980
- Tournaments won: Axelrod 1980 (14 entries) & 1981 (62 entries)
- Code length: 4 lines of FORTRAN
- Four key properties: Nice · Retaliating · Forgiving · Clear
- Variants: GTFT, Pavlov/WSLS, Tit-for-Two-Tats
The underlying game: prisoner's dilemma
Two suspects are interrogated in separate rooms. Each can cooperate with the other (stay silent) or defect (testify). The payoff matrix:
| | B cooperates | B defects |
|---|---|---|
| A cooperates | R, R = 3, 3 (Reward) | S, T = 0, 5 (Sucker, Temptation) |
| A defects | T, S = 5, 0 | P, P = 1, 1 (Punishment) |
The numbers (R = 3, T = 5, P = 1, S = 0) satisfy the canonical prisoner's-dilemma inequalities T > R > P > S and 2R > T + S (so taking turns exploiting each other is worse than mutual cooperation). In a one-shot game, defection strictly dominates: whatever the opponent does, you score higher by defecting (5 instead of 3 against a cooperator, 1 instead of 0 against a defector). The unique Nash equilibrium is mutual defection, yielding (1, 1). Both players would have done better at (3, 3), but neither can trust the other.
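The inequalities and the dominance argument can be checked mechanically. The sketch below uses the payoff values from the matrix; the names `PAYOFF`, `is_prisoners_dilemma` and `defection_gain` are illustrative and not part of any library.

```python
# Payoffs from the matrix above: Temptation, Reward, Punishment, Sucker.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(my_move, their_move)] -> my score; "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def is_prisoners_dilemma(t, r, p, s):
    """True when both canonical inequalities hold."""
    return t > r > p > s and 2 * r > t + s


def defection_gain(their_move):
    """Extra points earned by defecting instead of cooperating."""
    return PAYOFF[("D", their_move)] - PAYOFF[("C", their_move)]


print(is_prisoners_dilemma(T, R, P, S))          # True
print(defection_gain("C"), defection_gain("D"))  # 2 1
```

Both gains are positive, which is what strict dominance means here: defecting is better by 2 points against a cooperator and by 1 point against a defector.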
The puzzle dissolves when the game is repeated. If the same two players meet again and again, and the number of rounds is uncertain, defection today provokes retaliation tomorrow. Cooperation can sustain itself if the shadow of the future is long enough. The folk theorem of repeated games says that practically any outcome giving each player more than the mutual-defection payoff can be supported in equilibrium when the players are sufficiently patient. Which strategies actually emerge?
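One way to make "long enough" concrete, under assumptions that go beyond the text: suppose both players weight next round's payoff by a discount factor δ between 0 and 1 (the symbol δ is introduced here for illustration), and suppose cooperation is backed by the harshest possible punishment, permanent defection after any defection. Cooperating forever is then at least as good as a one-time defection when:

```latex
\frac{R}{1-\delta} \;\ge\; T + \frac{\delta P}{1-\delta}
\quad\Longleftrightarrow\quad
\delta \;\ge\; \frac{T-R}{T-P} \;=\; \frac{5-3}{5-1} \;=\; \frac{1}{2}
```

With the payoffs above, this particular punishment scheme sustains cooperation whenever the future counts for at least half as much as the present. Other punishment schemes give different thresholds.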
The Axelrod tournaments (1980-81)
In 1980 Robert Axelrod, a political scientist at Michigan, invited game theorists, computer scientists and economists to submit FORTRAN programs to play a round-robin iterated prisoner's-dilemma (IPD) tournament. Each program played 200 rounds against every other program (and against itself and a random strategy). Axelrod received 14 entries; the winner — by average score, not head-to-head wins — was a four-line program from Anatol Rapoport. It was called TIT FOR TAT:
- Move 1: cooperate.
- Move n: replay the opponent's move on round n − 1.
That was the entire program. Tit-for-tat averaged 504 points per game, ahead of far more elaborate strategies (one entry was a 152-line "Tideman & Chieruzzi" Bayesian learner). Crucially, tit-for-tat never beat any single opponent in head-to-head play: the rule guarantees that you never score more than your opponent in any pairing. Yet it accumulated the highest average score because it elicited cooperation from cooperators while limiting losses to defectors.
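The two rules translate almost directly into code. This is a minimal Python sketch, not Rapoport's FORTRAN; the function signature (own history, opponent history) and the sample opponent moves are assumptions made for illustration.

```python
def tit_for_tat(my_history, their_history):
    """Cooperate on move 1, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]


# Hypothetical opponent moves, for illustration only.
opponent_moves = ["C", "D", "D", "C", "C"]

mine, theirs = [], []
for move in opponent_moves:
    mine.append(tit_for_tat(mine, theirs))  # decide before seeing this round's move
    theirs.append(move)

print(mine)  # ['C', 'C', 'D', 'D', 'C']
```

The reply sequence is simply the opponent's sequence shifted one round later, with a cooperative opening move in front.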
Axelrod's results provoked enough interest that he ran a second tournament in 1981. This time he received 62 entries from six countries, with strategies designed in full knowledge of tit-for-tat's earlier win. Many entrants tried to exploit tit-for-tat: by defecting near the end of the game, by simulating tit-for-tat themselves and then defecting once, or by making occasional probing defections to test the opponent. Anatol Rapoport submitted the same four-line tit-for-tat. It won again.
The four properties Axelrod identified, and two related lessons
| Property | What it means | Why it pays |
|---|---|---|
| Nice | Never defects first. | Avoids triggering retaliation spirals against other nice strategies. The eight top finishers in 1980 were all nice; none of the seven bottom finishers was. |
| Retaliating | Punishes defection immediately. | Defectors learn there's no free lunch and stop probing. Always-cooperate (TFT without retaliation) is exploited mercilessly. |
| Forgiving | Returns to cooperation as soon as the opponent does. | Avoids endless cycles of mutual defection. Grim Trigger (defect forever after one defection) earned far less. |
| Clear | Easy for the opponent to read and learn. | Predictability makes other strategies converge on cooperation. Complex strategies often confused both opponents and themselves. |
| Not envious | Doesn't try to score more than the opponent in any pairing. | The IPD is a non-zero-sum game. Maximising your own score does not require minimising theirs; in fact the opposite is usually true. |
| Symmetrical | Treats opponent as you'd accept being treated. | Reciprocity is evolutionarily stable in repeated interactions; pure exploitation is not. |
A worked walkthrough
Suppose two strict tit-for-tat players meet for 10 rounds, both cooperating throughout:
- Each round: (C, C) → (R, R) = (3, 3).
- 10 rounds × 3 = 30 points each. Joint payoff 60. Optimal mutual outcome.
Now suppose tit-for-tat meets always-defect:
- Round 1: TFT cooperates, AllD defects → (S, T) = (0, 5).
- Rounds 2-10: TFT mirrors defection → (P, P) = (1, 1) each.
- TFT total: 0 + 9 × 1 = 9. AllD total: 5 + 9 × 1 = 14.
TFT loses head-to-head, but only by 5 points over the whole match, and AllD scores poorly against other AllD players (10, 10) compared with TFT-vs-TFT pairs (30, 30). Across the round-robin, TFT's nice-and-retaliating mix outscored 13 rivals.
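The totals above can be reproduced with a small match simulator. This is a sketch under the walkthrough's assumptions (10 rounds, no noise, the payoff matrix from earlier); `play_match` and the strategy signatures are illustrative names.

```python
# (move_a, move_b) -> (points for A, points for B)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(mine, theirs):
    return "C" if not theirs else theirs[-1]


def always_defect(mine, theirs):
    return "D"


def play_match(strategy_a, strategy_b, rounds=10):
    """Play a fixed number of rounds and return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Both players choose simultaneously, seeing only past rounds.
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play_match(tit_for_tat, tit_for_tat))      # (30, 30)
print(play_match(tit_for_tat, always_defect))    # (9, 14)
print(play_match(always_defect, always_defect))  # (10, 10)
```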
The pathological case: two strict TFT players where one accidentally defects on round 4 (a noise-induced "trembling hand"):
- Rounds 1-3: (C, C), (C, C), (C, C). 9 each.
- Round 4: A trembles, defects accidentally. B cooperated. (T, S) for A, B → (5, 0).
- Round 5: A cooperates (mirrors B's R4 cooperation). B retaliates for A's R4 defection → (S, T) for A, B = (0, 5).
- Round 6: A retaliates for B's R5 defection → defect. B cooperates (mirrors A's R5 cooperation) → (T, S) = (5, 0).
- Pattern locks into alternating (C, D), (D, C): each player earns 5 + 0 = 5 per pair of rounds, vs. 3 + 3 = 6 under mutual cooperation. The death spiral.
This noise-fragility motivated all the variants below.
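The echo is easy to reproduce. The sketch below pits two strict TFT players against each other and forces player A to defect on one chosen round; the function name `play_with_slip` and its parameters are illustrative assumptions.

```python
def tit_for_tat(their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]


def play_with_slip(rounds=8, slip_round=4):
    """Two TFT players; A's intended move is overridden with D on slip_round."""
    hist_a, hist_b, moves = [], [], []
    for rnd in range(1, rounds + 1):
        move_a = tit_for_tat(hist_b)
        move_b = tit_for_tat(hist_a)
        if rnd == slip_round:
            move_a = "D"  # the accidental defection
        hist_a.append(move_a)
        hist_b.append(move_b)
        moves.append(move_a + move_b)  # A's move first, then B's
    return moves


print(play_with_slip())
# ['CC', 'CC', 'CC', 'DC', 'CD', 'DC', 'CD', 'DC']
```

After the slip, neither player ever sees a round in which the other cooperated while they themselves were not being punished, so the alternation continues until another error interrupts it.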
Variants of reciprocal strategy
| Strategy | Rule | Strength | Weakness |
|---|---|---|---|
| Tit-for-Tat (TFT) | Cooperate first; then copy opponent's last move. | Nice, simple, robust against most strategies in the original tournaments. | Death-spiral on a single noisy defection; cannot recover without external trigger. |
| Generous Tit-for-Tat (GTFT) | Same as TFT but cooperate after a defection with probability ~1/3. | Breaks death-spirals. Nowak & Sigmund (1992) showed it beats TFT in noisy environments. | Exploitable by always-defect that bets on the forgiveness probability. |
| Tit-for-Two-Tats | Defect only after two consecutive defections by opponent. | Even more forgiving; performs well in noise. Would have won Axelrod's 1980 tournament had it been entered. | Exploitable by a strategy that alternates D and C. |
| Grim Trigger | Cooperate until opponent defects once; defect forever after. | Maximum punishment; supports cooperation in equilibrium when players are patient enough (future payoffs only lightly discounted). | No forgiveness; one accident dooms the relationship. |
| Pavlov / Win-Stay Lose-Shift (WSLS) | Repeat last move if payoff was R or T; switch if payoff was P or S. | Recovers from an accident after a single round of mutual defection. Nowak & Sigmund (1993) showed it can outperform TFT in long noisy IPDs. | Exploited by always-defect (it returns to cooperation every other round); after an accidental defection it keeps exploiting an unconditional cooperator; more complex than TFT. |
| Contrite Tit-for-Tat | TFT, but if you defected by accident, cooperate next move regardless of opponent's reply. | Repairs death-spirals you caused. | Requires the player to know about its own error, which is hard to implement realistically. |
| Zero-determinant strategies | Condition each move probabilistically on the previous round so as to unilaterally enforce a linear relationship between the two players' payoffs (Press & Dyson 2012). | Extortionate variants can extract more than an equal share from any responsive opponent. | Lose to evolution: in a population, extortionate strategies are selected against. |
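Several of the rules in the table fit in a few lines each. The sketch below is one possible Python rendering, not a reference implementation: the shared signature (own history, opponent history), the cooperative opening move for Pavlov, and the `forgive` and `rng` parameters are assumptions made for illustration.

```python
import random


def tit_for_two_tats(mine, theirs):
    """Defect only after two consecutive opponent defections."""
    return "D" if theirs[-2:] == ["D", "D"] else "C"


def grim_trigger(mine, theirs):
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in theirs else "C"


def pavlov(mine, theirs):
    """Win-stay lose-shift: keep the last move after R or T, switch after P or S."""
    if not mine:
        return "C"  # assumed opening move
    if theirs[-1] == "C":  # opponent cooperated, so the payoff was R or T
        return mine[-1]
    return "C" if mine[-1] == "D" else "D"  # payoff was S or P: switch


def generous_tit_for_tat(mine, theirs, forgive=1 / 3, rng=random):
    """TFT, except that a defection is forgiven with probability `forgive`."""
    if theirs and theirs[-1] == "D" and rng.random() >= forgive:
        return "D"
    return "C"


# Pavlov after an accidental defection against another Pavlov player:
print(pavlov(["D"], ["C"]))  # D  (earned T, so stay)
print(pavlov(["C"], ["D"]))  # D  (earned S, so shift)
print(pavlov(["D"], ["D"]))  # C  (earned P, so shift back)
```

The three Pavlov calls trace the recovery path: the round after the error is mutual defection, and the round after that both players are back to cooperation.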
Real-world appearances of conditional reciprocity
- Trench warfare 1914-18. Sociologist Tony Ashworth's Trench Warfare 1914-1918: The Live and Let Live System documented how front-line British and German units evolved tacit truces (deliberately shooting wide, holding fire at meal times, restoring routine after sudden attacks) that resemble noisy tit-for-tat. High command broke the system by rotating units (shortening the shadow of the future) and ordering raids that forced retaliation.
- Vampire-bat blood-sharing. Gerald Wilkinson's 1984 fieldwork in Costa Rica showed Desmodus rotundus bats regurgitate blood meals to roost-mates who failed to feed, and selectively withhold from past non-sharers. The pattern fits TFT in a long-lived social network.
- Stickleback predator inspection. Manfred Milinski's 1987 experiments showed pairs of three-spine sticklebacks approaching a predator take turns, withdrawing in proportion to their partner's defection — a classic IPD.
- Trade-policy reciprocity. The reciprocity norm of GATT and WTO negotiations institutionalises tit-for-tat tariff bargaining: a tariff cut by one country is conditional on cuts by others, and retaliatory tariffs are explicitly authorised under the Dispute Settlement Understanding.
- Commercial price-matching guarantees. "We'll match any competitor's price" is a credible commitment that punishes any defector who cuts price by stripping their incremental customers — sustaining tacit oligopoly cooperation.
- OPEC quota enforcement. Saudi Arabia's flood-the-market response to quota violators (notably 1985-86 and 2014) is a tit-for-tat punishment that has periodically restored discipline among OPEC and OPEC+ members.
Evolution and population dynamics
Axelrod's third experiment — an "ecological" simulation — let strategies reproduce in proportion to their score and re-run the tournament with the new population. Tit-for-tat grew steadily; always-defect shrank as it ran out of cooperators to exploit; the tournament converged on a population dominated by nice reciprocators. Subsequent work by Martin Nowak, Karl Sigmund and others showed the picture is more cyclical: in noisy environments, a TFT population is invaded by always-cooperate (no penalty for being soft), then by always-defect (preys on cooperators), then re-invaded by TFT or Pavlov. Evolutionary stability depends on noise level, payoff parameters and discount factors.
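A toy version of the ecological idea can be sketched as follows. It is not Axelrod's simulation: it assumes only three strategies, noise-free 200-round matches, and a simple update in which each strategy's population share is multiplied by its average score relative to the population mean. All names are illustrative.

```python
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(mine, theirs):
    return "C" if not theirs else theirs[-1]


def always_cooperate(mine, theirs):
    return "C"


def always_defect(mine, theirs):
    return "D"


STRATEGIES = {"TFT": tit_for_tat, "AllC": always_cooperate, "AllD": always_defect}


def match_score(strategy_a, strategy_b, rounds=200):
    """Total points strategy_a earns against strategy_b."""
    hist_a, hist_b, total = [], [], 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        total += PAYOFF[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total


def evolve(shares, generations=50):
    """Grow each strategy's share in proportion to its average score."""
    names = list(shares)
    score = {(i, j): match_score(STRATEGIES[i], STRATEGIES[j])
             for i in names for j in names}
    for _ in range(generations):
        # Expected score of each strategy against the current population mix.
        fitness = {i: sum(shares[j] * score[(i, j)] for j in names) for i in names}
        mean = sum(shares[i] * fitness[i] for i in names)
        shares = {i: shares[i] * fitness[i] / mean for i in names}
    return shares


final = evolve({"TFT": 1 / 3, "AllC": 1 / 3, "AllD": 1 / 3})
for name, share in final.items():
    print(f"{name}: {share:.3f}")
```

In this toy setting always-defect gains ground for the first couple of generations while it has cooperators to feed on, then collapses as always-cooperate thins out. Without noise the later cycles described above do not appear, since nothing ever tests the difference between TFT and always-cooperate once defectors are gone.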
Common pitfalls and counterarguments
- "Tit-for-tat is the optimal strategy." No. TFT was the best average performer in two specific tournaments with specific entry pools. In a population made up entirely of defectors, always-defect does best. Against noisy opponents, GTFT or Pavlov can outperform it. There is no universal optimum.
- "Defecting last round is rational." If both players know the game ends on round 200, backward induction unravels cooperation: defect on 200, then on 199, then 198, all the way back. Axelrod's first tournament used a fixed 200 rounds; his second removed the end-game effect by making the round count uncertain (probabilistic continuation), which is also the realistic case in nature.
- "TFT punishes too quickly." In noisy environments (accidental defections, mis-perceived signals) a single error sends two strict TFT players into alternating retaliation, and a second error can lock them into mutual defection. This is why GTFT and Tit-for-Two-Tats exist; the right amount of forgiveness depends on the noise rate.
- "Zero-determinant strategies destroy TFT." Press and Dyson (2012) proved that a class of "extortionate" strategies can unilaterally enforce a linear relation between the two players' payoffs, so that a responsive opponent who maximises its own score ends up handing the extortioner the larger share. But subsequent work (Hilbe, Nowak, Sigmund 2013) showed extortionate strategies are not evolutionarily stable: in a population, they are out-competed by generous strategies that build mutual cooperation.
- "Cooperation requires kinship or reputation." Hamilton's kin-selection and Trivers's reciprocal-altruism arguments do not require sophisticated cognition. Even bacteria producing public-good metabolites display TFT-like conditional cooperation; reputation systems are a richer extension to large groups (see indirect reciprocity, Nowak & Sigmund 1998).
- "Real games aren't pure prisoner's dilemmas." Many real interactions are stag-hunts (coordination), chicken (escalation), or hawk-dove (asymmetric). Tit-for-tat's lessons about reciprocity, retaliation and forgiveness generalise, but the specific payoff structure shifts which strategy wins.
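To put a number on the probabilistic-continuation idea from the backward-induction pitfall above: assume, for illustration, that after every round the game continues with a fixed probability w (a symbol introduced here, not taken from the text). The expected number of rounds is then:

```latex
\mathbb{E}[\text{rounds}]
  \;=\; \sum_{k=1}^{\infty} k \, w^{k-1} (1-w)
  \;=\; \frac{1}{1-w}
```

For example, w = 0.995 gives an expected length of 200 rounds, yet no round is ever known to be the last, so the backward-induction argument has nowhere to start.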
Frequently asked questions
Who invented tit-for-tat?
Anatol Rapoport, a mathematical psychologist at the University of Toronto, submitted the four-line FORTRAN program to Robert Axelrod's 1980 round-robin tournament for the iterated prisoner's dilemma. The strategy itself is older — it had appeared in informal discussions of repeated games — but Rapoport's entry showed it could beat far more elaborate strategies under tournament conditions.
What are the rules of tit-for-tat?
Two rules: cooperate on the first move, then on every subsequent move do exactly what the opponent did on the previous move. If they cooperated, you cooperate. If they defected, you defect once, then return to cooperation as soon as they do. That's the entire algorithm.
Why did tit-for-tat win Axelrod's tournament?
Axelrod identified four properties: niceness (never defects first), retaliation (punishes defection immediately), forgiveness (returns to cooperation quickly), and clarity (other strategies can read its pattern and learn to cooperate with it). The combination earned the highest average score across pairings even though it never beat any single opponent head-to-head.
What is generous tit-for-tat?
Generous tit-for-tat (GTFT) cooperates after a defection with some small probability — roughly 1/3 in Nowak and Sigmund's analysis. It outperforms strict tit-for-tat in noisy environments where mistakes happen, because it breaks the death spiral of mutual retaliation that two strict TFT players fall into after a single accidental defection.
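The figure of roughly 1/3 can be recovered from the payoffs. The expression below is the generosity level usually attributed to Nowak and Sigmund's analysis, quoted here from memory as a sketch rather than a derivation; q denotes the probability of cooperating after an opponent's defection.

```latex
q \;=\; \min\!\left\{\, 1 - \frac{T-R}{R-S},\;\; \frac{R-P}{T-P} \,\right\}
  \;=\; \min\!\left\{\, 1 - \frac{2}{3},\;\; \frac{2}{4} \,\right\}
  \;=\; \frac{1}{3}
```

With different payoff values the recommended generosity changes, which is one reason the amount of forgiveness cannot be fixed independently of the game being played.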
What is Pavlov or win-stay lose-shift?
Pavlov, also called win-stay lose-shift (WSLS), keeps its previous move if it earned a "good" payoff (mutual cooperation reward R or temptation T) and switches if it earned a "bad" payoff (sucker S or mutual defection P). Nowak and Sigmund (1993) showed Pavlov can outperform tit-for-tat in long noisy tournaments because it exploits a naive cooperator and recovers from errors quickly.
Does tit-for-tat work in real life?
Versions of it appear in trench-warfare "live and let live" arrangements documented by Tony Ashworth, in vampire-bat blood-sharing studied by Gerald Wilkinson, in stickleback fish predator-inspection behaviour described by Manfred Milinski, in trade-policy reciprocity, and in commercial price-matching guarantees. Wherever players interact repeatedly with memory, conditional reciprocity tends to evolve.