
Mismatch Repair

The post-replication spell-checker — how MutS finds errors, how cells decide which strand to keep, and why losing MMR causes cancer

Mismatch repair (MMR) scans newly synthesized DNA for base-pair mismatches and small insertion-deletion loops, excises the daughter-strand error, and re-synthesizes. In E. coli the MutS-MutL-MutH pathway uses Dam-hemimethylation to mark the daughter strand; in humans MSH2/MSH6 (MutSα) or MSH2/MSH3 (MutSβ) sense errors, MLH1/PMS2 (MutLα) nicks the daughter, and EXO1 excises the patch. Loss raises mutation rate 100-1000×, drives microsatellite instability, and underlies Lynch syndrome — the most common hereditary colorectal cancer predisposition.

  • Discovered: Wagner & Meselson (1976), Modrich (1989); Nobel 2015
  • Sensor (E. coli / human): MutS / MSH2-MSH6 (MutSα)
  • Strand discrimination: Hemimethylated GATC (E. coli); PCNA orientation + nicks (humans)
  • Patch size excised: ~1 kb in humans, hundreds of bp in E. coli
  • Mutation rate impact: 100-1000× elevation when lost
  • Disease: Lynch syndrome (colorectal, endometrial, gastric)

Why mismatch repair matters

After polymerase proofreading, roughly one mismatch still slips through per 10⁷ bases. A human cell replicates ~6×10⁹ bases per division — so without a second proofreader the genome would acquire hundreds of mismatches every cycle. MMR catches ~99% of residual errors, dropping the mutation rate from ~10⁻⁷ to ~10⁻⁹ per base.
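The arithmetic in this paragraph can be checked with a short sketch. The constants are the approximate figures quoted above; the function and variable names are illustrative only.

```python
# Back-of-envelope check of the figures quoted in the text (all approximate).
GENOME_BASES_PER_DIVISION = 6e9        # bases copied per human cell division
ERROR_RATE_AFTER_PROOFREADING = 1e-7   # mismatches per base left by the polymerase
MMR_CAPTURE_FRACTION = 0.99            # share of residual mismatches MMR removes


def expected_mismatches(bases: float, error_rate: float) -> float:
    """Expected number of mismatches for a given per-base error rate."""
    return bases * error_rate


def rate_after_mmr(error_rate: float, capture_fraction: float) -> float:
    """Per-base error rate remaining after MMR has removed its share."""
    return error_rate * (1.0 - capture_fraction)


without_mmr = expected_mismatches(GENOME_BASES_PER_DIVISION, ERROR_RATE_AFTER_PROOFREADING)
residual_rate = rate_after_mmr(ERROR_RATE_AFTER_PROOFREADING, MMR_CAPTURE_FRACTION)
with_mmr = expected_mismatches(GENOME_BASES_PER_DIVISION, residual_rate)

print(f"mismatches per division without MMR: {without_mmr:.0f}")
print(f"per-base error rate with MMR:        {residual_rate:.1e}")
print(f"mismatches per division with MMR:    {with_mmr:.0f}")
```

With these inputs the sketch gives about 600 mismatches per division without MMR and about 6 with it, which is the 100-fold drop described above.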

Beyond the arithmetic, loss of MMR creates a clinically distinct tumor phenotype: microsatellite-unstable (MSI-high) cancer. Testing for it is now standard of care in colorectal and endometrial biopsies because MSI-high tumors respond unusually well to immune-checkpoint inhibitors. Functional MMR is also required for temozolomide chemotherapy to work, because MMR is what converts O6-methylguanine adducts into apoptotic signals.

The E. coli pathway: MutS → MutL → MutH → UvrD → Pol III

The prokaryotic pathway is the canonical textbook story — reconstituted in vitro by Paul Modrich's group in the late 1980s and recognized with the 2015 Nobel Prize in Chemistry.

  1. Recognition. A MutS homodimer scans the DNA. On encountering a mismatched base pair (G:T, A:C) or small insertion-deletion loop, the dimer kinks the DNA ~60° and undergoes an ATP-driven shift into a sliding clamp.
  2. Recruitment. The clamped MutS recruits MutL, another homodimer with ATPase activity. The MutS-MutL complex slides bidirectionally, searching for a strand-discrimination signal.
  3. Strand discrimination. Newly replicated DNA is hemimethylated for several minutes — the parent strand carries N6-methyladenine at GATC sites (Dam methylase), the daughter strand is bare. MutH binds the GATC site and is activated by MutS-MutL to nick the unmethylated daughter.
  4. Excision. UvrD helicase unwinds toward the mismatch. One of four exonucleases (RecJ, ExoI, ExoVII, ExoX, picked by nick orientation) digests the displaced strand from nick to mismatch.
  5. Resynthesis. DNA polymerase III holoenzyme fills the gap from the parent template; DNA ligase seals the final nick.

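The hemimethylation window lasts only a few minutes: once Dam methylates the daughter strand, the two strands can no longer be told apart, so repair has to act first. MutH-mediated discrimination is restricted to bacteria with Dam methylation; most other prokaryotes lack MutH and are thought to rely on strand discontinuities such as lagging-strand nicks instead.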
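The strand-discrimination rule in step 3 can be written as a small toy model. This is a sketch under simplifying assumptions: the function names are hypothetical, methylation is reduced to one boolean per strand, and the model does not cover what MutH does at a fully unmethylated site.

```python
import re
from typing import List, Optional


def find_gatc_sites(sequence: str) -> List[int]:
    """Return 0-based start positions of every GATC site.

    GATC is palindromic, so each hit marks the same site on both strands.
    """
    return [match.start() for match in re.finditer("GATC", sequence.upper())]


def nearest_site(sites: List[int], mismatch_position: int) -> Optional[int]:
    """Pick the GATC site closest to the mismatch, or None if there are no sites."""
    if not sites:
        return None
    return min(sites, key=lambda site: abs(site - mismatch_position))


def strand_to_nick(top_methylated: bool, bottom_methylated: bool) -> Optional[str]:
    """Toy MutH rule: nick the unmethylated strand of a hemimethylated site.

    Returns None when the site is not hemimethylated, because there is then
    no asymmetry for this model to read.
    """
    if top_methylated and not bottom_methylated:
        return "bottom"
    if bottom_methylated and not top_methylated:
        return "top"
    return None


sites = find_gatc_sites("AAGATCTTGATCAA")
print(sites, nearest_site(sites, 7), strand_to_nick(True, False))
```

Once both strands are methylated, `strand_to_nick` returns `None`, which mirrors why repair has to happen inside the hemimethylation window.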

The human pathway: MutSα/β, MutLα, EXO1, Pol δ

Eukaryotic MMR is mechanistically homologous but diverges in two ways: the sensor is a heterodimer, and strand discrimination doesn't use methylation.

  1. Recognition. MSH2 partners with MSH6 (MutSα, ~80% of repair, single mismatches and 1-2 nt IDLs) or MSH3 (MutSβ, 2-15 nt IDLs at microsatellites).
  2. Recruitment. MutLα (MLH1-PMS2) is recruited and bridges detection to excision. Minor MutL homologs MutLβ (MLH1-PMS1) and MutLγ (MLH1-MLH3) handle meiotic recombination more than canonical MMR.
  3. Strand discrimination. No Dam methylation in eukaryotes. PCNA — the sliding clamp loaded at the replication fork — has an inherent orientation, and MutLα reads it to nick the daughter strand. Lagging-strand 5' termini at Okazaki junctions also serve as entry points.
  4. Excision. EXO1, a 5'→3' exonuclease, chews from nick to mismatch. The excised patch is typically ~150-1000 bases. RPA coats the ssDNA gap.
  5. Resynthesis. DNA polymerase δ (with RFC loading PCNA) fills the gap; DNA ligase I seals the nick.

MutLα itself is an endonuclease — its nicking activity, identified by Modrich's group in 2006, was the missing piece that explained how eukaryotes pick a strand without MutH.

E. coli vs human MMR — the homology table

Function                              E. coli                       Human
Sensor: single bp + small IDL         MutS homodimer                MSH2-MSH6 (MutSα)
Sensor: large IDL (microsatellites)   MutS (same protein)           MSH2-MSH3 (MutSβ)
Coupling factor / endonuclease        MutL (no nuclease)            MLH1-PMS2 (MutLα, endonuclease)
Strand-discrimination signal          Hemimethylated GATC (Dam)     PCNA orientation + lagging nicks
Strand-incision endonuclease          MutH                          (MutLα itself)
Helicase                              UvrD                          (none; EXO1 is processive)
Excision exonuclease                  ExoI / RecJ / ExoVII / ExoX   EXO1
Single-strand binding                 SSB                           RPA
Resynthesis polymerase                Pol III holoenzyme            Pol δ + PCNA + RFC
Ligase                                DNA ligase                    DNA ligase I

Pathway diagram (human MMR)

5' ──G━━━━━━━━━━━━━━━━━━ parent
3' ──T━━━━━━━━━━━━━━━━━━ daughter (G:T mismatch)
       │
       ▼  MSH2-MSH6 (MutSα) clamps mismatch
       ▼  MLH1-PMS2 (MutLα) recruited
       ▼  PCNA orientation → MutLα nicks daughter
       ▼  EXO1 excises ~150-1000 nt patch (RPA coats ssDNA)
       ▼  Pol δ + PCNA resynthesize, DNA ligase I seals
       ▼
5' ──G━━━━━━━━━━━━━━━━━━ parent
3' ──C━━━━━━━━━━━━━━━━━━ daughter (corrected)

Lynch syndrome and MMR loss

Lynch syndrome (formerly HNPCC) is caused by germline mutation in MLH1, MSH2, MSH6, PMS2, or 3' deletion of EPCAM (which silences MSH2 by readthrough methylation). It affects ~1 in 280 people — the most common monogenic cancer predisposition syndrome. Carriers inherit one defective allele; somatic loss of the second copy in a colonic crypt produces an MMR-deficient cell that begins accumulating frameshift mutations in tumor suppressors.

Cumulative cancer risk by age 70 is ~50-80% for colorectal cancer (MLH1/MSH2 highest, PMS2 lowest), 30-50% for endometrial cancer, and elevated for ovarian, gastric, urothelial, and small-bowel cancers. The sentinel report, Aldred Warthin's 1913 "cancer family G", predates the identification of the causal genes by 80 years; Henry Lynch reconnected the dots in the 1960s, and germline mutations in MSH2 (1993) and MLH1 (1994) were then identified in Lynch families.

Sporadic colorectal cancers also lose MMR, most commonly by MLH1 promoter hypermethylation — a recurrent epigenetic event in 10-15% of colon cancers and the cause of ~80% of MSI-high colorectal tumors overall.

Microsatellite instability — the diagnostic signature

Microsatellites are tandem repeats of 1-6 nucleotides — (CA)n, (A)n, (CAG)n — covering ~3% of human DNA. DNA polymerase slips at these tracts ~once per 10⁵ replications, producing IDLs that MMR routinely repairs. Without MMR, those IDLs accumulate and repeat lengths drift, generating a tumor population with idiosyncratic microsatellite alleles.
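To make the definition concrete, here is a minimal sketch that finds tandem repeats with a 1-6 nt unit in a DNA string. The function name and the `min_repeats` threshold are assumptions for illustration, not a standard caller; real tools handle imperfect repeats and ambiguous bases, which this does not.

```python
import re
from typing import List, Tuple


def find_microsatellites(sequence: str, min_repeats: int = 5) -> List[Tuple[int, str, int]]:
    """Find perfect tandem repeats with a 1-6 nt unit.

    Returns (start, unit, repeat_count) tuples. The lazy {1,6}? makes the
    regex try the shortest unit first, so an (A)n tract is reported with
    unit "A" rather than "AA".
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


example = "TTG" + "CA" * 6 + "GG" + "A" * 8 + "TT"
for start, unit, count in find_microsatellites(example):
    print(f"({unit}){count} at position {start}")
```

In an MMR-deficient cell, the repeat counts reported here are the numbers that drift from one division to the next.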

The clinical assay uses five mononucleotide markers — BAT-25, BAT-26, NR-21, NR-24, MONO-27 — amplified from tumor and matched normal DNA. Two or more shifts = MSI-high; one = MSI-low; none = microsatellite-stable. The clinical payoff is dramatic: MSI-high tumors carry hundreds to thousands of frameshift neoantigens. Pembrolizumab was approved in 2017 for any MSI-high solid tumor regardless of organ — the FDA's first tissue-agnostic approval — based on a ~40% response rate across colorectal, endometrial, gastric, biliary, and pancreatic tumors.
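The calling rule described above is simple enough to express directly. The sketch below assumes each marker has already been scored as shifted or not shifted in tumor versus matched normal DNA; `classify_msi` and its input format are hypothetical, and real pipelines also handle failed or uninterpretable markers.

```python
from typing import Dict

# The five mononucleotide markers named in the text.
PANEL = ("BAT-25", "BAT-26", "NR-21", "NR-24", "MONO-27")


def classify_msi(shifted: Dict[str, bool]) -> str:
    """Apply the rule from the text: >=2 shifted markers = MSI-high,
    exactly 1 = MSI-low, 0 = microsatellite-stable (MSS)."""
    missing = [marker for marker in PANEL if marker not in shifted]
    if missing:
        raise ValueError(f"no result for markers: {missing}")
    unstable = sum(bool(shifted[marker]) for marker in PANEL)
    if unstable >= 2:
        return "MSI-high"
    if unstable == 1:
        return "MSI-low"
    return "MSS"


tumor = {"BAT-25": True, "BAT-26": True, "NR-21": False, "NR-24": True, "MONO-27": False}
print(classify_msi(tumor))
```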

Variants and related pathways

  • Constitutional MMR deficiency (CMMRD). Biallelic germline MMR mutations (~1 in 1M) cause childhood-onset hematologic and brain tumors with café-au-lait macules; survival beyond age 30 is uncommon without immunotherapy.
  • Meiotic recombination. MSH4-MSH5 and MLH1-MLH3 stabilize Holliday junctions and direct crossover formation — non-redundant with mismatch sensing.
  • Triplet repeat expansion. MutSβ paradoxically promotes repeat expansion in Huntington's disease and myotonic dystrophy (CAG/CTG) and in Friedreich's ataxia (GAA), binding slipped intermediates that get extended rather than excised. MSH3-targeting therapies are being explored for Huntington's disease.
  • Damage-response signaling. MutSα and MutLα recruit ATR-CHK1 after methylation damage, contributing to S-phase checkpoint and apoptosis decisions.

O6-methylguanine and the futile cycle

Counterintuitively, working MMR is required for the cytotoxicity of methylating chemotherapies like temozolomide (glioblastoma) and procarbazine (lymphoma). These drugs deposit a methyl group at the O6 position of guanine. After replication, O6-meG mispairs with T. MutSα triggers daughter-strand excision, but the template still has O6-meG, so the polymerase reinserts T. The cycle repeats, generating persistent single-strand gaps that collapse into double-strand breaks at the next replication fork, triggering apoptosis. MMR-deficient cells skip the cycle, tolerate O6-meG, and survive. This is why temozolomide-treated glioblastomas frequently evolve MMR deficiency under selection and relapse as hypermutated tumors.
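The futile cycle can be illustrated with a toy loop. Everything here is a deliberate simplification: the pairing rule, the sensor rule, and the `max_rounds` cap are assumptions chosen to show the logic, not a kinetic model.

```python
from typing import Tuple

O6_MEG = "O6-meG"


def copy_template(template_base: str) -> str:
    """Toy polymerase: inserts T opposite O6-meG and C opposite normal G."""
    if template_base == O6_MEG:
        return "T"
    if template_base == "G":
        return "C"
    raise ValueError(f"unsupported template base: {template_base}")


def sensor_flags(template_base: str, daughter_base: str) -> bool:
    """Toy MutSα: flags anything other than a clean G:C pair."""
    return not (template_base == "G" and daughter_base == "C")


def repair(template_base: str, daughter_base: str,
           mmr_proficient: bool, max_rounds: int = 10) -> Tuple[int, str]:
    """Run excision/resynthesis rounds until the pair is clean or the cap is hit.

    Only the daughter base is ever replaced; the template lesion stays put.
    """
    rounds = 0
    while mmr_proficient and sensor_flags(template_base, daughter_base) and rounds < max_rounds:
        daughter_base = copy_template(template_base)  # excise, then resynthesize
        rounds += 1
    return rounds, daughter_base


print(repair("G", "T", mmr_proficient=True))      # ordinary G:T mismatch
print(repair(O6_MEG, "T", mmr_proficient=True))   # lesion in the template
print(repair(O6_MEG, "T", mmr_proficient=False))  # MMR-deficient cell
```

An ordinary G:T mismatch resolves in one round. With O6-meG in the template the loop only stops at the cap, and an MMR-deficient cell never enters it, which corresponds to tolerance of the lesion.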

Common pitfalls and misconceptions

  • Confusing MMR with polymerase proofreading. Pol δ/ε have a 3'→5' exonuclease that removes mis-incorporated nucleotides during synthesis; MMR operates after the polymerase has moved on. Defects in POLE/POLD1 produce an ultra-hypermutated phenotype distinct from MSI-high.
  • Assuming all mismatches are repaired equally. G:T and A:C are repaired efficiently; C:C mismatches poorly. Repeat-tract IDLs are caught most reliably, which is why loss of MMR produces the MSI signature so cleanly.
  • Thinking MMR loss alone causes cancer. It's a mutator, not an oncogene. Tumors still need driver mutations in TP53/APC/KRAS — MMR loss just makes them more likely per division.
  • Calling Lynch syndrome on any MMR-deficient tumor. ~80% of MSI-high colon cancers are sporadic (BRAF V600E with MLH1 promoter methylation), not Lynch. Universal screening uses BRAF and methylation as reflex tests before germline testing.
  • Underestimating the immunotherapy implications. MSI-high status changes treatment more than tumor stage. Stage IV MSI-high colorectal cancer on pembrolizumab now has long-term survival rates unthinkable before 2017.

Frequently asked questions

How does MMR know which strand carries the error?

In E. coli, GATC sites in the parent strand are methylated by Dam methylase, but the newly synthesized daughter strand is transiently unmethylated for a few minutes after replication. MutH recognizes the hemimethylated GATC and nicks the unmethylated (daughter) strand, marking it as the one to be re-synthesized. Eukaryotes don't use Dam methylation; instead, PCNA, which is loaded at the replication fork in a fixed orientation, directs the MutLα endonuclease to nick the daughter strand. Pre-existing strand discontinuities, such as the 5' termini of Okazaki fragments on the lagging strand, also serve as entry points.

What's the difference between MutSα and MutSβ?

MutSα is the heterodimer MSH2-MSH6 and recognizes single base-pair mismatches and 1-2 nucleotide insertion-deletion loops (IDLs). MutSβ is MSH2-MSH3 and specializes in larger IDLs of 2-15 nucleotides, which are common at microsatellite repeats. Both share the MSH2 subunit, which is why MSH2 loss disables both branches and causes the most severe Lynch syndrome phenotype, while MSH6-only loss spares MutSβ activity and gives a milder, later-onset cancer risk.
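The shared-subunit logic can be sketched as a small lookup. The size ranges are the ones quoted above; the function names are hypothetical, and real substrate preferences overlap more gradually than these hard cutoffs suggest.

```python
from typing import List, Set

# Subunit composition of the two human mismatch sensors.
COMPLEX_SUBUNITS = {
    "MutSα": {"MSH2", "MSH6"},
    "MutSβ": {"MSH2", "MSH3"},
}


def functional_sensors(lost_genes: Set[str]) -> List[str]:
    """Sensors that still have both subunits after the given genes are lost."""
    return [name for name, subunits in COMPLEX_SUBUNITS.items()
            if not (subunits & lost_genes)]


def sensors_for_idl(loop_size_nt: int) -> List[str]:
    """Sensors expected to handle a lesion; 0 means a base-base mismatch."""
    if loop_size_nt < 0:
        raise ValueError("loop size cannot be negative")
    sensors = []
    if loop_size_nt <= 2:           # mismatches and 1-2 nt IDLs
        sensors.append("MutSα")
    if 2 <= loop_size_nt <= 15:     # 2-15 nt IDLs
        sensors.append("MutSβ")
    return sensors


print(functional_sensors({"MSH2"}))  # both branches lost
print(functional_sensors({"MSH6"}))  # MutSβ spared
print(sensors_for_idl(8))
```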

What is microsatellite instability and why does it matter clinically?

Microsatellites are tandem repeats of 1-6 nucleotides (e.g. (CA)n, (A)n) where DNA polymerase frequently slips, generating IDL mismatches. A working MMR system fixes these silently. When MMR is lost, microsatellite lengths drift, creating microsatellite instability (MSI) — detectable by PCR or NGS at marker loci like BAT-25, BAT-26, NR-21, NR-24, MONO-27. MSI-high tumors generate hundreds to thousands of frameshift neoantigens, making them exceptionally responsive to immune-checkpoint inhibitors like pembrolizumab. The FDA's first tissue-agnostic cancer approval in 2017 was for MSI-H solid tumors.

How does MMR cause cancer when it fails?

MMR loss is a mutator phenotype: every cell division accumulates ~100-1000× more mutations than normal. Many are neutral, but eventually a frameshift hits a tumor suppressor (TGFBR2, BAX, MSH3, MSH6) or activates an oncogene. Lynch syndrome patients inherit one defective MMR allele; somatic loss of the second copy in a colonic crypt initiates a tumor. Depending on the gene, roughly 50-80% of Lynch carriers develop colorectal cancer by age 70, with elevated risk also for endometrial, gastric, ovarian, and urothelial cancers.

How is MMR different from base or nucleotide excision repair?

BER handles small base lesions (oxidation, alkylation) via glycosylases. NER handles bulky helix-distorting lesions like UV thymine dimers via XPC and a 24-32 nt excision. MMR uniquely targets mismatches between undamaged bases and IDLs that arose during replication; its job starts only after the polymerase has passed and the proofreading exonuclease has missed an error.

Why does MMR also process O6-methylguanine adducts?

O6-meG mispairs with T after replication. MutSα recognizes the pair and triggers excision of the T from the daughter strand, but the template still has O6-meG, so the polymerase reinserts T and MMR cycles again. This futile cycle generates double-strand breaks and triggers apoptosis. That's why temozolomide requires functional MMR to kill tumor cells; MMR-deficient gliomas are intrinsically resistant.