Polymerase Chain Reaction
Copying one DNA molecule into billions
The polymerase chain reaction (PCR) is a laboratory method that copies a chosen stretch of DNA from a vanishingly small starting amount into billions of identical copies, by cycling a tube between three temperatures: ~95°C to denature the double helix, ~50–65°C to anneal short primers, and ~72°C to let a heat-stable polymerase extend new strands. Because every copy becomes a template for the next round, the target doubles each cycle — so 30 cycles turn one molecule into roughly a billion. Invented by Kary Mullis in 1983, PCR is the workhorse behind DNA sequencing prep, forensics, cloning, and the diagnostic tests that detect viruses like SARS-CoV-2.
- Invented by: Kary Mullis, 1983 (Nobel 1993)
- Three steps: Denature 95°C · anneal 50–65°C · extend 72°C
- Amplification: ~2ⁿ (30 cycles ≈ 10⁹-fold)
- Key enzyme: Taq polymerase (~1 kb/min, withstands repeated 95°C steps)
- Primers: 2 oligos, 18–25 bases, Tm ~55–65°C
- Run time: 25–40 cycles in ~1–2 hours
The trick: a copy machine for one DNA sequence
Suppose you have a single drop of blood at a crime scene, a few cells from a tumor biopsy, or one virus particle in a nasal swab. Hidden inside is a specific DNA sequence you care about — but there is far too little of it to read, cut, or detect. The polymerase chain reaction solves this with brute-force exponential copying. You tell it which sequence to copy using two short primers, then let it run for an hour. One target molecule becomes two, then four, then eight, and after thirty rounds you are holding roughly a billion identical copies — enough to load on a gel, sequence, or clone.
The elegance is that PCR copies only the region you specify. Out of the three billion base pairs in a human genome, you can pluck and amplify a single 500-base-pair gene, leaving everything else untouched. That selectivity comes entirely from primer design, and the speed comes from a polymerase that survives near-boiling water.
The three-temperature cycle
A PCR reaction is a small tube — typically 10–50 microliters — containing the template DNA, the two primers, a thermostable DNA polymerase, the four deoxynucleotide building blocks (dNTPs: dATP, dTTP, dGTP, dCTP), magnesium ions (Mg²⁺, an essential cofactor for the polymerase), and a buffer. A machine called a thermal cycler drives the tube through a repeating temperature program. Each cycle has three phases.
1 · Denature (~95°C)
Heating to about 95°C breaks the hydrogen bonds between complementary bases — two per A–T pair, three per G–C pair — so the double helix unzips into two single strands. (This is why GC-rich templates are harder to melt and sometimes need a hotter or longer denaturation.) Ordinary enzymes would be cooked and destroyed at this temperature; PCR's secret is a polymerase that is not.
2 · Anneal (~50–65°C)
Cooling to roughly 50–65°C lets the two primers — short single-stranded DNA pieces, usually 18–25 bases — find and base-pair to their complementary sites on the now-separated strands. The forward primer binds one strand, the reverse primer the other, and they point toward each other, bracketing the target. The annealing temperature is tuned to the primers' melting temperature (Tm): too cold and primers stick to near-matches (non-specific products); too hot and they fall off entirely.
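The way two primers bracket a target can be sketched as a small "in-silico PCR" search. This is a minimal illustration, not a primer-design tool: the function names and the sequences are hypothetical, only exact matches are considered, and the template is assumed to be the top strand written 5′→3′.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_amplicon(template: str, forward: str, reverse: str):
    """Return the region bracketed by the two primers, or None if either is missing.

    The forward primer has the same sequence as the top strand (it anneals
    to the bottom strand). The reverse primer anneals to the top strand, so
    its reverse complement is what appears in `template`.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    # The reverse primer must sit downstream of the forward primer.
    site = template.find(reverse_site, start + len(forward))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


# Hypothetical toy template: flank + forward site + insert + reverse site + flank.
template = "TTGACC" + "ATGGCTAGCT" + "AAACCCGGGTTT" + "GATCCGTACG" + "AATT"
forward = "ATGGCTAGCT"
reverse = reverse_complement("GATCCGTACG")  # primers are written 5'->3'
amplicon = find_amplicon(template, forward, reverse)
print(amplicon, len(amplicon))
```

Only the stretch from the start of the forward primer to the end of the reverse primer's site is returned, which mirrors why the flanking sequence is left unamplified.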
3 · Extend (~72°C)
Warming to ~72°C, the optimum for Taq, lets the polymerase latch onto each primer's free 3′ end and add nucleotides one at a time, reading the template 3′→5′ while synthesizing the new strand 5′→3′. Under reaction conditions Taq extends on the order of tens of nucleotides per second, and the usual rule of thumb is about one minute of extension per kilobase of product, so a 500-base-pair target needs roughly 30 s. When the next cycle's denaturation step begins, the new strands separate from their templates and the whole thing repeats.
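The repeating program a thermal cycler runs can be written down as plain data. The sketch below is an illustration only: the `Step` type, the hold times, and the initial-denaturation and final-extension durations are assumed values, not a validated protocol, and ramp time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Assumed hold times for illustration; real protocols depend on enzyme and amplicon.
CYCLE = (
    Step("denature", 95.0, 30),
    Step("anneal", 58.0, 30),
    Step("extend", 72.0, 60),
)


def expand_program(cycle, n_cycles):
    """Flatten the repeating cycle into the ordered list of steps the machine runs."""
    return [step for _ in range(n_cycles) for step in cycle]


def total_hold_seconds(cycle, n_cycles, initial_denature_s=120, final_extend_s=300):
    """Sum of programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial_denature_s + n_cycles * per_cycle + final_extend_s


minutes = total_hold_seconds(CYCLE, n_cycles=30) / 60
print(f"{minutes:.0f} min of hold time before ramping is counted")
```

With these assumed values, 30 cycles come to a little over an hour of hold time, consistent with the 1–2 hour run times quoted for 25–40 cycles once ramping is added.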
Why the numbers explode
The pivotal idea is that every product strand becomes a template in the next cycle. Starting from one double-stranded target:
- After cycle 1: 2 copies
- After cycle 2: 4 copies
- After cycle 3: 8 copies
- After cycle n: 2ⁿ copies
So 20 cycles give 2²⁰ ≈ 1 million, 30 cycles give 2³⁰ ≈ 1.07 billion, and 40 cycles give over a trillion, at least in theory. Real reactions are not perfectly efficient, so the practical formula is N × (1 + E)ⁿ, where N is the number of starting molecules, n is the number of cycles, and E is the per-cycle efficiency (often 0.9–1.0 early on; E = 1 recovers perfect doubling). The reaction also cannot grow forever: once primers and dNTPs deplete and product strands start re-annealing to each other faster than primers can bind, the curve flattens into a plateau. This S-shaped (exponential then plateau) growth is exactly what real-time PCR exploits to measure how much template you started with.
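The growth formula is easy to check numerically. The sketch below is a minimal calculator for N × (1 + E)ⁿ; the function names are hypothetical, and the plateau is deliberately not modeled, so the numbers only describe the exponential phase.

```python
import math


def copies_after(n_cycles, start_copies=1.0, efficiency=1.0):
    """Copies after n cycles, assuming a constant per-cycle efficiency E in [0, 1]."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** n_cycles


def cycles_to_reach(target_copies, start_copies=1.0, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least target_copies."""
    if efficiency <= 0.0:
        raise ValueError("efficiency must be positive")
    if target_copies <= start_copies:
        return 0
    return math.ceil(math.log(target_copies / start_copies) / math.log(1.0 + efficiency))


print(f"{copies_after(30):.3g}")                   # perfect doubling
print(f"{copies_after(30, efficiency=0.9):.3g}")   # 90% efficiency
print(cycles_to_reach(1e9))                        # cycles needed for a billion copies
```

Dropping the efficiency from 1.0 to 0.9 cuts the 30-cycle yield several-fold, which shows how sensitive the final amount is to small per-cycle losses.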
There is a subtlety worth knowing: in cycle 1 the primers fix where each new strand starts, but the strands run off past the far primer site because nothing defines a stop. Copying one of those long strands in cycle 2 yields the first single strands of exact length, and by cycle 3 double-stranded products appear that are bounded by primers at both ends: the exact-length "amplicon." From then on, these fixed-length products dominate and accumulate exponentially, which is why the final band on a gel is a sharp size, not a smear.
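The takeover by exact-length products can be traced with a small bookkeeping model. It assumes one starting duplex and perfect efficiency, and it counts single strands in three classes (the names are illustrative): original template strands, long run-off strands copied from the originals, and exact-length strands copied from either long or exact-length strands.

```python
def strand_counts(n_cycles):
    """Return (cycle, original, long, exact) strand counts for each ideal cycle."""
    original, long_strands, exact = 2, 0, 0  # one duplex = two original strands
    history = []
    for cycle in range(1, n_cycles + 1):
        # Copies of original strands start at a primer but have no defined end.
        new_long = original
        # Copies of long or exact strands stop at the template's primer-defined 5' end.
        new_exact = long_strands + exact
        long_strands += new_long
        exact += new_exact
        history.append((cycle, original, long_strands, exact))
    return history


for cycle, original, long_strands, exact in strand_counts(5):
    print(cycle, original, long_strands, exact)
```

In this model the long strands grow only linearly (two more per cycle) while exact-length strands roughly double, so after a few dozen cycles the long products are a negligible fraction of the total.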
Taq: the enzyme from a hot spring
The earliest PCR experiments used the polymerase from E. coli, which is destroyed at 95°C — so fresh enzyme had to be pipetted in after every single denaturation step, making the method tedious and expensive. The breakthrough was Taq polymerase, isolated from Thermus aquaticus, a bacterium that thrives in ~70°C Yellowstone hot springs. Taq survives repeated 95°C cycling, so it is added just once at the start. This single substitution turned PCR from a laboratory curiosity into an automatable, world-changing technique.
Taq is robust and reasonably fast (about 1 kb per minute at 72°C), but it lacks a 3′→5′ proofreading exonuclease, so it cannot correct misincorporated bases. Its error rate is roughly 1 mistake per 10⁴–10⁵ bases, which is fine for diagnostics and detection but risky when the exact sequence matters (e.g., cloning a gene for expression). For high-fidelity work, proofreading enzymes such as Pfu (from Pyrococcus furiosus, ~1 error per 10⁶ bases) or engineered blends are used instead. Taq also has a quirk that became a feature: it adds a single untemplated adenosine to the 3′ end of products, enabling convenient "TA cloning."
Polymerases compared
| Enzyme | Source | Speed | Proofreading | Error rate (per base) | Typical use |
|---|---|---|---|---|---|
| Taq | Thermus aquaticus | ~1 kb/min | No | ~1 × 10⁻⁴–10⁻⁵ | Diagnostics, routine PCR, colony screening |
| Pfu | Pyrococcus furiosus | ~0.5 kb/min | Yes (3′→5′) | ~1 × 10⁻⁶ | Cloning, mutagenesis, high-fidelity work |
| Phusion (engineered) | Pyrococcus-like polymerase fused to a DNA-binding domain | Very fast | Yes | ~5 × 10⁻⁷ | Long, accurate amplicons |
| Reverse transcriptase | Retroviruses (e.g. MMLV) | Slow | No | High | RNA → cDNA before PCR (RT-PCR) |
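To see what these error rates mean for a finished reaction, the sketch below estimates the fraction of product strands that carry no mistakes. It is a simplified model with stated assumptions: perfect doubling every cycle, an error-free starting template, a constant per-base error rate, and no chance that a second error reverts the first. Under those assumptions each cycle keeps the old strands and adds copies that are error-free with probability q = (1 − rate)^length, so the error-free fraction is ((1 + q) / 2)ⁿ.

```python
def error_free_fraction(error_rate, amplicon_length, n_cycles):
    """Fraction of strands with zero errors after n ideal doubling cycles."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be a probability per base")
    # Probability that one copying event of the whole amplicon is error-free.
    q = (1.0 - error_rate) ** amplicon_length
    return ((1.0 + q) / 2.0) ** n_cycles


# Error rates taken from the table above; 1 kb amplicon and 30 cycles are assumed.
taq = error_free_fraction(1e-4, 1000, 30)
pfu = error_free_fraction(1e-6, 1000, 30)
print(f"Taq-like enzyme: {taq:.0%} of strands error-free")
print(f"Pfu-like enzyme: {pfu:.0%} of strands error-free")
```

The gap between the two results illustrates why proofreading enzymes are preferred when a product will be cloned, since a single cloned molecule carries whatever errors it happened to accumulate.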
PCR's family tree
The basic reaction has spawned dozens of variants. Three matter most:
- qPCR (quantitative / real-time PCR). A fluorescent reporter, either an intercalating dye like SYBR Green or a sequence-specific TaqMan probe, brightens as product accumulates. The machine reads fluorescence every cycle. The cycle at which signal crosses a threshold (the Ct or Cq value) falls linearly as the log of the starting amount rises, so qPCR quantifies DNA rather than merely detecting it. At ~100% efficiency, each whole Ct unit corresponds to a 2-fold difference in starting template, with a lower Ct meaning more template.
- RT-PCR (reverse-transcription PCR). RNA cannot be a PCR template directly, so reverse transcriptase first copies it into complementary DNA (cDNA). This is how we measure gene expression and detect RNA viruses. The familiar "PCR test" for COVID-19 is RT-qPCR: reverse transcription, then real-time quantification.
- Digital PCR. The sample is split into thousands of tiny partitions, most of which receive zero or one template molecule, and each is amplified independently. Counting the positive partitions, with a Poisson correction for partitions that received more than one molecule, gives an absolute molecule count without a standard curve. This is powerful for rare-mutation detection and liquid biopsy.
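The arithmetic behind two of these variants can be sketched briefly. For qPCR, a difference in Ct converts to a fold difference in starting template as (1 + E) raised to the Ct gap. For digital PCR, partitions can occasionally hold more than one molecule, so counts are usually corrected with Poisson statistics: if a fraction p of partitions is positive, the mean copies per partition is −ln(1 − p). The function names and example numbers below are hypothetical.

```python
import math


def fold_difference(ct_a, ct_b, efficiency=1.0):
    """How many times more starting template sample A has than sample B.

    A lower Ct means more template, so the exponent is ct_b - ct_a.
    """
    return (1.0 + efficiency) ** (ct_b - ct_a)


def dpcr_total_copies(positive_partitions, total_partitions):
    """Poisson-corrected estimate of template molecules across all partitions."""
    if not 0 <= positive_partitions < total_partitions:
        raise ValueError("need at least one negative partition")
    fraction_positive = positive_partitions / total_partitions
    mean_per_partition = -math.log(1.0 - fraction_positive)
    return mean_per_partition * total_partitions


print(fold_difference(22.0, 25.0))              # sample A crosses 3 cycles earlier
print(round(dpcr_total_copies(2000, 20000)))    # slightly more than 2000
```

The digital PCR estimate exceeds the raw count of positive partitions because some positives held two or more molecules; the correction grows as the positive fraction rises.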
Where PCR changed the world
PCR is arguably the single most important technique in molecular biology, and its applications are everywhere:
- Diagnostics. Detecting pathogens — SARS-CoV-2, HIV, tuberculosis, HPV — from minute samples, often before symptoms or culture would reveal them.
- Forensics. A few cells of touch DNA can be amplified at standardized short-tandem-repeat (STR) loci to generate a profile matching one person in billions.
- Sequencing and cloning. PCR enriches a target before sequencing or inserting it into a plasmid; it is the indispensable front end of recombinant DNA work.
- Ancient and trace DNA. PCR can amplify degraded DNA from museum specimens, fossils, and Neanderthal bone, reconstructing genomes from fragments.
- Genotyping and research. Detecting mutations, confirming gene edits, measuring expression, and screening transgenic organisms.
Why PCR is also dangerous to trust blindly
The very power that makes PCR useful, billion-fold amplification of any matching sequence, makes it exquisitely vulnerable to contamination. A single stray molecule from a previous reaction, an aerosol, or a neighboring sample can be amplified into a convincing false positive. The worst offender is "carryover": amplified product from an earlier run is already enriched a billion-fold, so even a microscopic droplet swamps a clean reaction. Standard defenses include a no-template control (water instead of DNA, to catch contamination), filtered pipette tips, dUTP/uracil-N-glycosylase (UNG) systems, which build uracil into every product so that carried-over amplicons can be enzymatically destroyed before the next run, and physically separating the room where reactions are set up from where products are analyzed.
Specificity failures are the other risk: if the annealing temperature is too low or primers match similar sequences, you amplify the wrong thing, or primers bind each other to form primer-dimers. Good primer design (matched Tm, ~40–60% GC, no self-complementarity) and a melt-curve or gel size check guard against this.
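The design rules above can be turned into a rough screening script. This is a sketch under stated assumptions, not a replacement for dedicated design software: the Tm estimate uses the simple Wallace rule (2°C per A/T, 4°C per G/C), which is only a coarse approximation for short oligos, whereas real tools use nearest-neighbor thermodynamics; the primer-dimer test is a crude check for 3′-end complementarity; and the function names, thresholds, and primer sequences are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(primer):
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2.0 * at + 4.0 * gc


def three_prime_overlap(a, b, k=4):
    """True if the last k bases of `a` could base-pair somewhere along `b`."""
    return reverse_complement(a[-k:]) in b.upper()


def primer_pair_warnings(forward, reverse, max_tm_gap=5.0):
    """Return a list of human-readable warnings; an empty list means no flags."""
    warnings = []
    for name, primer in (("forward", forward), ("reverse", reverse)):
        if not 18 <= len(primer) <= 25:
            warnings.append(f"{name}: length {len(primer)} outside 18-25 bases")
        if not 0.40 <= gc_fraction(primer) <= 0.60:
            warnings.append(f"{name}: GC fraction {gc_fraction(primer):.2f} outside 0.40-0.60")
    if abs(wallace_tm(forward) - wallace_tm(reverse)) > max_tm_gap:
        warnings.append("Tm values differ by more than the allowed gap")
    if three_prime_overlap(forward, reverse) or three_prime_overlap(reverse, forward):
        warnings.append("3' ends are complementary: possible primer-dimer")
    return warnings


# Hypothetical primer pair used only to exercise the checks.
print(primer_pair_warnings("ATGGCTAGCTAGGATCCAGT", "CTGACCTTGAGCAACGTCAT"))
```

A pair that passes these checks can still fail in the tube, so the melt-curve or gel size check remains the real confirmation.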
Frequently asked questions
How does PCR work?
PCR repeats three temperature steps. (1) Denaturation at ~95°C breaks the hydrogen bonds holding the double helix together, giving two single strands. (2) Annealing at ~50–65°C lets two short DNA primers bind the sequences flanking your target. (3) Extension at ~72°C lets a thermostable polymerase (Taq) build new complementary strands from free dNTPs, 5′→3′. Each new strand becomes a template, so the target roughly doubles every cycle. Repeat 25–40 cycles in a thermal cycler and one molecule becomes billions.
Why does PCR amplify exponentially?
Every copy made in one cycle serves as a template in the next. If the reaction were perfectly efficient, N starting molecules become N × 2ⁿ after n cycles. So 1 molecule → 2 → 4 → 8 …, reaching about 2³⁰ ≈ 1.07 billion copies after 30 cycles. Real efficiency is ~90–100% early on, so the practical fold-amplification is closer to 10⁶–10⁹ before primers, dNTPs, or polymerase run low and the curve flattens into a plateau.
What is Taq polymerase and why is it used?
Taq is a DNA polymerase from Thermus aquaticus, a bacterium living in ~70°C Yellowstone hot springs. Its key property is thermostability: it survives the repeated 95°C denaturation steps that would destroy ordinary polymerases, so you add it once instead of after every cycle. Under typical reaction conditions Taq copies roughly 1 kb per minute at 72°C. Its drawback is no 3′→5′ proofreading, giving an error rate near 1 in 10⁴–10⁵ bases; for high-fidelity work, proofreading enzymes like Pfu (~1 error in 10⁶) are used instead.
What are primers and how are they designed?
Primers are short single-stranded DNA oligonucleotides, typically 18–25 bases, that define the boundaries of the amplified region. The forward primer matches one strand, the reverse primer matches the other, so they point inward and only the sequence between them is copied. Good primers have a melting temperature (Tm) of ~55–65°C, ~40–60% GC content, matched Tms for both, and no self-complementarity (which causes hairpins or primer-dimers). Specificity comes from requiring two correct binding sites at once.
What is the difference between PCR, qPCR, and RT-PCR?
Standard PCR makes the product and you check the result at the end (e.g., on a gel). Quantitative PCR (qPCR, real-time PCR) measures fluorescence after every cycle, so you can quantify the starting amount from the cycle at which signal crosses a threshold (Ct). RT-PCR first uses reverse transcriptase to convert RNA into complementary DNA (cDNA), then amplifies that — essential for RNA viruses like SARS-CoV-2 and influenza. Diagnostic "PCR tests" are usually RT-qPCR: reverse transcription plus real-time quantification.
Why can PCR give false positives?
Because PCR amplifies any matching template a billion-fold, even one stray molecule of contaminating DNA can produce a strong signal. Common causes: carryover of amplified product from a previous run (the worst contaminant, since it is already enriched), aerosols, or non-specific primer binding to similar sequences. Controls fix this: a no-template control catches contamination, and dedicated pipettes, filtered tips, and physically separated pre- and post-PCR areas limit carryover. qPCR melt curves and gel sizing confirm the product is the intended one.