Molecular Biology
DNA Polymerase
1000 nt/s replication enzyme with 3'→5' proofreading exonuclease
DNA polymerase is the enzyme that synthesizes DNA from a template, extending a primer 5' to 3' by adding nucleotides from deoxynucleoside triphosphates (dNTPs) and releasing pyrophosphate. The replicative forms achieve incorporation rates of ~1000 nt/s in E. coli (Pol III holoenzyme) and ~50 nt/s in mammalian cells (Pol δ and Pol ε). Catalysis uses the two-metal-ion mechanism described by Thomas Steitz: two divalent magnesium ions coordinate the incoming dNTP and lower the pKa of the primer's 3'-OH for nucleophilic attack on the α phosphate. The 3' to 5' exonuclease domain proofreads each addition, lowering the intrinsic ~10^-4 error rate to ~10^-7, and post-replication mismatch repair drops it further to 10^-9 to 10^-10.
- Speed (E. coli Pol III): ~1000 nt/s
- Speed (human Pol δ): ~50 nt/s
- Error rate: ~10^-7 with proofreading
- Catalysis: Two-metal-ion (Mg2+)
- Direction: 5' to 3' only
- Processivity (Pol III/clamp): >50,000 nt
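The layered fidelity figures above can be checked with simple arithmetic. The sketch below is illustrative only: the fold-improvement constants are assumptions picked from within the ranges quoted in the text (about 1000-fold for proofreading, about 100-fold for mismatch repair), not measured values.

```python
# Assumed illustrative values; the text quotes ranges rather than exact figures.
INTRINSIC_ERROR = 1e-4      # errors per nucleotide from base selection alone
PROOFREADING_FOLD = 1e3     # assumed gain from the 3'->5' exonuclease
MMR_FOLD = 1e2              # assumed gain from post-replication mismatch repair


def cumulative_error_rates(intrinsic, folds):
    """Return the error rate after each successive correction layer."""
    rates = [intrinsic]
    for fold in folds:
        rates.append(rates[-1] / fold)
    return rates


rates = cumulative_error_rates(INTRINSIC_ERROR, [PROOFREADING_FOLD, MMR_FOLD])
labels = ["base selection", "+ proofreading", "+ mismatch repair"]
for label, rate in zip(labels, rates):
    print(f"{label:>18}: {rate:.0e} errors per nucleotide")
```

Changing `MMR_FOLD` to `1e3` gives the lower end of the quoted final range (10^-10).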
Why DNA polymerase matters
- It is the cell's only general-purpose DNA copier. Every dividing cell, every viral replication cycle, every PCR reaction depends on a polymerase. The diploid human genome's 6.4 billion bp duplicates in ~8 hours of S-phase, requiring roughly 10^10 phosphodiester bonds formed across thousands of forks running in parallel.
- Its fidelity sets the mutation rate of life. Replicative polymerases plus mismatch repair give a per-base-pair mutation rate of ~10^-9 to 10^-10. Drift, selection, and human cancer all begin in the rare polymerase mistakes that escape proofreading and MMR.
- The two-metal-ion mechanism unifies enzymology. Thomas Steitz's 1993 model drew on crystal structures of Klenow fragment and HIV reverse transcriptase, and structures of other polymerases, including T7 RNA polymerase, converged on the same active-site geometry. The mechanism extends to RNA polymerases, group I/II introns, RNase H, and the catalytic core of the spliceosome.
- Pol III speed is one of biology's most studied enzyme rates. E. coli Pol III holoenzyme catalyzes ~1000 nt/s with k_cat ≈ 1000 s^-1 per active site, near the diffusion limit for dNTP binding. The reaction is fast enough that every base pair takes ~1 ms.
- It is the target of major antiviral and anticancer drugs. AZT, tenofovir, acyclovir, and remdesivir are all chain-terminating nucleoside analogs that block viral polymerases. Cisplatin produces lesions that stall replicative polymerases, killing dividing cancer cells. Cytarabine (Ara-C) is incorporated by Pol α and stalls leukemia cells.
- Taq polymerase enabled PCR. Kary Mullis's PCR concept (conceived in 1983, first published in 1985) needed a thermostable polymerase to become practical. Taq (from Thermus aquaticus, isolated from Yellowstone hot springs by Thomas Brock) survives 95 °C denaturation cycles; billion-fold amplifications became routine, and modern molecular biotechnology took off.
- Translesion polymerases let cells survive damage. Y-family enzymes (Pol η, ι, κ, Rev1) tolerate UV photoproducts, abasic sites, and bulky adducts at the cost of higher error rates (10^-2 to 10^-4). Loss of Pol η causes xeroderma pigmentosum variant, with UV-driven skin cancer at very young ages.
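The scale of the copying job in the first bullet can be worked out directly. This is a back-of-the-envelope sketch using the figures quoted in the text (6.4 billion bp, ~8 hours, ~50 nt/s per fork); it assumes each fork replicates both strands at the quoted rate and runs for the whole of S-phase, which real forks do not, so the fork count is a lower bound. The function names are hypothetical.

```python
import math

GENOME_BP = 6.4e9           # diploid human genome, from the text
S_PHASE_SECONDS = 8 * 3600  # ~8 hours of S-phase, from the text
FORK_RATE_BP_PER_S = 50     # Pol delta / Pol epsilon rate, from the text


def new_phosphodiester_bonds(genome_bp):
    """Each parental base pair templates two new nucleotides (one per daughter duplex)."""
    return 2 * genome_bp


def min_continuous_forks(genome_bp, fork_rate_bp_per_s, duration_s):
    """Fewest forks needed if every fork ran nonstop for the whole duration."""
    return math.ceil(genome_bp / (fork_rate_bp_per_s * duration_s))


print(f"new bonds per S-phase: {new_phosphodiester_bonds(GENOME_BP):.2e}")
print(f"minimum forks: {min_continuous_forks(GENOME_BP, FORK_RATE_BP_PER_S, S_PHASE_SECONDS)}")
```

The result, on the order of 10^10 bonds and a few thousand forks at minimum, shows why parallel origins are essential at eukaryotic fork speeds.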
Common misconceptions
- Polymerase reads DNA both directions. No. Synthesis is strictly 5' to 3' on the new strand, which means reading the template 3' to 5'. Every polymerase ever characterized obeys this — there is no known reverse-direction DNA polymerase.
- Pol I replicates the genome. Pol I, the first polymerase isolated (Arthur Kornberg, 1956), does primer removal and gap filling, not bulk replication. Pol III holoenzyme is the actual replicative enzyme in E. coli; Pol III was identified by Thomas Kornberg (Arthur's son) and Malcolm Gefter in the early 1970s.
- Eukaryotic polymerases are slower because they are weaker. They are slower per fork (~50 vs 1000 nt/s) but the eukaryotic genome is 1000x larger, so cells run thousands of forks in parallel. Total replication throughput per cell is comparable when normalized to genome size.
- Proofreading and mismatch repair are the same thing. Proofreading is a 3' to 5' exonuclease activity within the polymerase that catches errors during synthesis (within 1-2 bases). Mismatch repair (MMR) is a separate post-replicative system using MutS/MutL homologs that scans the daughter strand minutes to hours later and patches mismatches.
- All polymerases need Mg2+. The two-metal-ion mechanism prefers Mg2+ in vivo, but Mn2+ works (with reduced fidelity) and some translesion polymerases use either. Crystal structures often use higher-Z metals like Ca2+ or Zn2+ as inactive substitutes to trap pre-catalytic complexes.
- Polymerase is a single protein. Pol III holoenzyme is a 17-subunit complex (~900 kDa) including the catalytic α, the ε proofreader, the θ subunit, plus the β clamp and γ clamp loader. Eukaryotic Pol δ and Pol ε are 4-subunit assemblies, and the full replisome adds another dozen factors.
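The directionality point in the first misconception is easy to get backwards, so a small sketch may help. The function below is a toy model (the name `new_strand_from_template` is hypothetical): it reads a template 3' to 5' and writes the complementary strand 5' to 3', which is the same operation as taking the reverse complement.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def new_strand_from_template(template_5to3: str) -> str:
    """Return the newly synthesized strand, written 5'->3'.

    The template is supplied 5'->3' (the usual written convention), so it is
    reversed first: the polymerase reads it 3'->5' while writing 5'->3'.
    """
    read_order = reversed(template_5to3.upper())  # template read 3'->5'
    return "".join(COMPLEMENT[base] for base in read_order)


print(new_strand_from_template("ATGC"))  # template 5'-ATGC-3'
```

For the template 5'-ATGC-3' the new strand is 5'-GCAT-3'; the first base written pairs with the template's 3'-terminal C.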
How a single nucleotide is added
The cycle begins with the polymerase bound to a primer-template, with the 3' end of the primer in the catalytic site. A dNTP diffuses in from solution and base-pairs (or fails to base-pair) with the next template nucleotide. If correct, the polymerase fingers domain rotates inward by 30-40° — the open-to-closed transition described by Lorena Beese and Tom Steitz from Klenow fragment crystal structures. This conformational change brings the catalytic aspartates and the dNTP triphosphate into alignment with the primer's 3'-OH and the two Mg2+ ions. The 3'-OH attacks the α-phosphate, displacing pyrophosphate (PPi) and forming the new phosphodiester bond. The fingers domain reopens, the polymerase translocates one nucleotide along the template, and the cycle repeats. Each addition consumes ~1 ms in Pol III at saturating substrate.
If a mismatch slips past the geometric checkpoint, the mispaired 3' end fits poorly into the active site, kinks the primer-template, and increases the off-rate. The primer-template is shuttled ~30 angstroms to the 3' to 5' exonuclease site, where the mispaired terminal base is hydrolyzed off (one of the few hydrolytic, not synthetic, activities in the polymerase). The polymerase then attempts the addition again. Proofreading costs extra energy, because every excised nucleotide wastes a dNTP, and lowers error rates ~100- to 1000-fold below the geometric baseline. Post-replicative mismatch repair then removes another 99% of remaining errors: MutS-MutL homologs detect mismatches and identify the new strand through hemimethylated GATC sites (E. coli) or through nicks and ribonucleotide marks (eukaryotes). The combined system gives genomic fidelity that is essentially unmatched in any other engineered or evolved copying machinery.
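The insert-check-excise-retry loop described above can be sketched as a toy stochastic model. Everything here is an assumption for illustration: each position is an independent trial, a misinsertion happens with probability `misinsert_prob`, and the exonuclease removes a mispaired end with probability `excision_prob` before the polymerase retries. The probabilities in the demo are inflated far above real values so that errors show up in a short run.

```python
import random


def copy_with_proofreading(n_sites, misinsert_prob, excision_prob, seed=0):
    """Simulate copying n_sites positions; return the number of retained errors."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_sites):
        while True:
            wrong = rng.random() < misinsert_prob
            if wrong and rng.random() < excision_prob:
                continue  # mispaired 3' end excised; retry the same position
            errors += wrong  # extension proceeds, keeping whatever was inserted
            break
    return errors


def effective_error_rate(misinsert_prob, excision_prob):
    """Closed-form retained-error probability for the same toy model."""
    p, q = misinsert_prob, excision_prob
    return p * (1 - q) / (1 - p * q)


n = 100_000
print("no proofreading :", copy_with_proofreading(n, 0.01, 0.0))
print("99% excision    :", copy_with_proofreading(n, 0.01, 0.99))
print("predicted rate  :", effective_error_rate(0.01, 0.99))
```

In this model a 99% excision probability gives roughly a 100-fold drop in retained errors, the lower end of the range quoted above.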
Polymerase family comparison
| Polymerase | Organism / role | Speed | Error rate (per nt) | Proofreading? | Distinctive feature |
|---|---|---|---|---|---|
| Pol I | E. coli; primer removal + repair | ~20 nt/s | 10^-6 | Yes (3'-5') | Has 5'-3' exo for nick translation |
| Pol III | E. coli; replicative | ~1000 nt/s | 10^-7 | Yes (ε subunit) | 17-subunit holoenzyme, dual replisome |
| Pol α-primase | Eukaryotic; primer synthesis | ~20 nt/s | 10^-4 | No | Carries the primase that makes RNA primers |
| Pol δ | Eukaryotic; lagging strand + repair | ~50 nt/s | 10^-7 | Yes | Binds PCNA; strand displacement creates flaps for FEN1 |
| Pol ε | Eukaryotic; leading strand | ~50 nt/s | 10^-7 | Yes | Assigned to the leading strand by the Kunkel lab, 2007 |
| Pol γ | Mitochondrial replication | ~70 nt/s | 10^-6 | Yes | Replicative mtDNA polymerase; mutations cause Alpers syndrome, PEO |
| Pol β | Eukaryotic; base-excision repair gap filling | ~10 nt/s | 10^-3 to 10^-4 | No | Smallest eukaryotic polymerase, 39 kDa |
| Pol η | Eukaryotic; translesion synthesis across UV dimers | ~5 nt/s | 10^-2 to 10^-3 | No | Loss causes xeroderma pigmentosum variant |
| Taq polymerase | Thermus aquaticus; PCR | ~50 nt/s at 72 °C | 10^-5 | No | Stable at 95 °C; foundation of PCR |
| Phi29 polymerase | Bacteriophage phi29; isothermal amplification | ~50 nt/s | 10^-6 | Yes | ~70 kb processivity, used for whole-genome amplification |
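To make the speed and fidelity columns concrete, the sketch below estimates how long a few of these enzymes would take to copy a 1 kb stretch and how many errors to expect. It uses the table's approximate values (the upper end of the range for Pol η) and assumes uninterrupted synthesis; the `Polymerase` class and `copy_stats` function are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Polymerase:
    name: str
    speed_nt_per_s: float
    error_rate: float  # errors per nucleotide incorporated


# Approximate values taken from the comparison table.
POLYMERASES = [
    Polymerase("Pol III", 1000, 1e-7),
    Polymerase("Pol delta", 50, 1e-7),
    Polymerase("Taq", 50, 1e-5),
    Polymerase("Pol eta", 5, 1e-2),  # upper end of the table's range
]


def copy_stats(pol, length_nt):
    """Return (seconds needed, expected errors) for copying length_nt nucleotides."""
    return length_nt / pol.speed_nt_per_s, length_nt * pol.error_rate


for pol in POLYMERASES:
    seconds, errors = copy_stats(pol, 1000)
    print(f"{pol.name:>9}: {seconds:7.1f} s, ~{errors:.4g} expected errors per kb")
```

The contrast between Pol III (about a second, almost no errors) and Pol η (minutes, several errors) shows why translesion polymerases are restricted to short patches.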
Famous experiments
- Arthur Kornberg, 1956 (J Biol Chem). Purified DNA polymerase from E. coli (now called Pol I) and demonstrated template-directed DNA synthesis in vitro. Kornberg won the Nobel Prize in 1959, shared with Severo Ochoa for his work on enzymatic RNA synthesis.
- Thomas Kornberg & Malcolm Gefter, 1970-1972. Working with the Pol I-deficient polA1 mutant of E. coli (isolated by De Lucia and Cairns in 1969), which still grows, they identified Pol II and then Pol III, the actual replicative enzyme. This resolved the embarrassing mystery of why losing Kornberg's polymerase didn't kill the cell.
- Steitz lab, 1985-1995. Solved the crystal structures of Klenow fragment and HIV reverse transcriptase, defined the right-hand polymerase fold (palm, fingers, thumb), and proposed the two-metal-ion mechanism that turned out to apply across all nucleic acid polymerases.
- Kary Mullis, 1983-1985. Conceived PCR in 1983 while driving on Highway 128 in California; the method was first published in 1985. Replacing the heat-labile Klenow fragment with thermostable Taq polymerase (Saiki et al., 1988) made the technique practical. Mullis won the 1993 Nobel Prize in Chemistry.
- Tom Kunkel lab, 2007 (Science). Used a Pol ε active-site variant with a distinctive error signature, followed by an analogous Pol δ variant, to map which polymerase replicates which strand in S. cerevisiae. Concluded that Pol ε is primarily the leading-strand polymerase and Pol δ the lagging-strand polymerase.
Frequently asked questions
How fast is DNA polymerase?
Speed varies enormously by organism and polymerase. E. coli Pol III holoenzyme adds 700-1000 nucleotides per second per fork, so the entire 4.6 Mb genome is duplicated in ~40 minutes by two forks that start at the origin and converge at the terminus. Bacteriophage T7 polymerase exceeds 200 nt/s. Eukaryotic Pol delta and Pol epsilon manage only 30-50 nt/s, but eukaryotes compensate with thousands of replication origins firing in parallel: the human genome's ~6 billion bp is replicated in ~8 hours of S-phase by ~30,000-50,000 active forks. Repair polymerases like Pol beta and translesion polymerases (Pol eta, Pol kappa) are slower still, at roughly 10 nt/s or less, trading speed for the ability to fill repair gaps and bypass damaged bases.
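The E. coli figure follows from dividing genome size by the combined fork rate. The sketch below is a minimal estimate that assumes two forks running at constant speed with no pauses; `replication_minutes` is a hypothetical helper name.

```python
def replication_minutes(genome_bp, fork_rate_bp_per_s, n_forks):
    """Minutes to copy a genome if n_forks run nonstop at a constant rate."""
    return genome_bp / (fork_rate_bp_per_s * n_forks) / 60


E_COLI_GENOME_BP = 4.6e6  # from the text

# The text quotes 700-1000 nt/s per fork, so bracket the estimate.
for rate in (1000, 700):
    minutes = replication_minutes(E_COLI_GENOME_BP, rate, n_forks=2)
    print(f"{rate} nt/s per fork: {minutes:.1f} min")
```

At 1000 nt/s the estimate is about 38 minutes, and at 700 nt/s about 55 minutes, which brackets the ~40 minute figure.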
How accurate is DNA polymerase?
The intrinsic geometric selectivity of base-pairing gives DNA polymerase about one mistake per 10^4 to 10^5 incorporations. The 3' to 5' exonuclease proofreading domain catches most of these by removing mispaired 3' termini before extension: the mismatched base pair fits poorly into the polymerase active site, so the primer 3' end transfers to the exonuclease ~30 angstroms away, gets clipped, and synthesis resumes. Proofreading lowers the error rate to ~10^-7. Post-replicative mismatch repair (MutS/MutL in bacteria, MSH2/MSH6 plus MLH1/PMS2 in humans) catches roughly 99% of remaining errors, giving a final mutation rate of ~10^-9 to 10^-10 per base pair per generation. A single human cell therefore picks up only ~3-6 new mutations per division across its 6 billion replicated bases.
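The per-division mutation load follows from multiplying the per-base rate by the genome size. The sketch below is a rough estimate under the simplifying assumption that errors are independent and equally likely at every position (a Poisson model); the function names are hypothetical.

```python
import math

GENOME_BP = 6e9  # replicated bases per division, from the text


def expected_mutations(genome_bp, rate_per_bp):
    """Mean number of new mutations per division."""
    return genome_bp * rate_per_bp


def prob_error_free(mean_mutations):
    """Poisson probability that a division introduces no mutation at all."""
    return math.exp(-mean_mutations)


for rate in (1e-9, 1e-10):
    mean = expected_mutations(GENOME_BP, rate)
    print(f"rate {rate:.0e}: ~{mean:.1f} mutations, P(no mutation) = {prob_error_free(mean):.3f}")
```

Across the quoted range of rates the expected load runs from less than one to about six mutations per division, so the final figure is sensitive to which end of the range applies.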
What are the main eukaryotic DNA polymerase families?
Seven polymerases or polymerase groups do most of the work. Pol alpha-primase (~340 kDa, four subunits) lays the RNA primer plus 20-30 nt of low-fidelity DNA at every Okazaki fragment start. Pol delta takes over from Pol alpha and extends the lagging strand; it has the strongest 3' to 5' exonuclease and sits on the PCNA clamp. Pol epsilon is the dedicated leading-strand polymerase (the Kunkel lab showed this in 2007 by tracking the error signature of a mutator Pol epsilon variant). Pol gamma replicates mitochondrial DNA. The Y-family translesion polymerases (Pol eta, Pol iota, Pol kappa, Rev1) bypass DNA damage at low fidelity (10^-2 to 10^-4 error rate). Pol beta does base-excision repair gap filling, and Pol zeta extends mismatched primers in damage tolerance.
What is the two-metal-ion mechanism?
Thomas Steitz's 1993 model, derived from crystal structures of Klenow fragment and HIV reverse transcriptase. Two divalent metal ions, usually Mg2+, sit in the active site. Metal A activates the 3'-OH of the primer's terminal nucleotide by lowering its pKa, allowing it to act as a nucleophile attacking the alpha phosphate of the incoming dNTP. Metal B coordinates the leaving pyrophosphate (PPi), stabilizing the developing negative charge. Both metals are coordinated by conserved aspartate residues — typically two Asp side chains in the polymerase palm domain. The geometry produces an in-line SN2-like nucleophilic substitution with inversion of configuration at the alpha phosphorus. The same mechanism operates in RNA polymerases and many self-splicing ribozymes.
Why does DNA polymerase need a primer?
Because the active site is geometrically tuned to extend a 3'-OH that is already base-paired to the template, not to position two unstabilized nucleotides for the first phosphodiester bond. The geometric distortion required for de novo bond formation would lower fidelity by orders of magnitude — the cell pays a small cost (the primer) to keep replicative polymerases tightly templated. The exception is primase, which is built differently: a separate enzyme family (DnaG in bacteria; Pol alpha-primase in eukaryotes; archaeal PriS/PriL) that synthesizes ~10 nt RNA primers at low fidelity, accepting the trade-off because the products will be checked and removed downstream.
What gives DNA polymerase its processivity?
The sliding clamp. Free Pol III core or free Pol delta dissociates after 10-100 nt, too few to replicate even one Okazaki fragment. Loaded onto the beta clamp (the E. coli homodimeric ring around DNA) or PCNA (the eukaryotic homotrimeric ring), processivity jumps to >50,000 nt. The clamp is loaded by the gamma complex (bacteria) or RFC (eukaryotes), which use ATP to crack the ring open and seal it around DNA. PCNA is one of the most heavily used scaffolds in the cell: it docks ligase I, FEN1, MSH3/6, p21, and other CDK regulators, over 50 partner proteins in all, through a short PIP-box motif (Q-x-x-[L/I/M]-x-x-[F/Y]-[F/Y]).
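A motif like the PIP box can be searched for with a regular expression. The sketch below assumes the commonly cited degenerate form Q-x-x-[L/I/M]-x-x-[F/Y]-[F/Y]; real PIP boxes vary, and a sequence match alone does not show that a protein binds PCNA. The example peptide is an illustrative string, and `find_pip_boxes` is a hypothetical helper name.

```python
import re

# Assumed consensus: Q, any two residues, L/I/M, any two residues, then two aromatic F/Y.
PIP_BOX = re.compile(r"Q..[LIM]..[FY][FY]")


def find_pip_boxes(protein_seq: str):
    """Return (start_index, matched_motif) for each non-overlapping PIP-box-like match."""
    return [(m.start(), m.group()) for m in PIP_BOX.finditer(protein_seq.upper())]


# Illustrative peptide containing one motif-like stretch (not a verified database entry).
peptide = "GRKRRQTSMTDFYHSKRR"
print(find_pip_boxes(peptide))
```

Indices are zero-based, so the match in this peptide is reported at position 5.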