Molecular Biology

DNA Structure

Watson-Crick double helix — the molecular basis of inheritance

DNA is a double helix of two antiparallel strands held together by complementary base pairs. Watson and Crick deduced the structure in 1953, drawing on Rosalind Franklin's X-ray diffraction data. Adenine pairs with thymine via two hydrogen bonds; guanine pairs with cytosine via three. The sugar-phosphate backbone lies on the outside; the bases stack inside at 3.4 Å spacing, with one full turn every ~10.5 bp (roughly 34 to 36 Å). The human genome is ~3 billion base pairs, encoding ~20,000 protein-coding genes plus extensive regulatory and noncoding sequence. Mutations underlie genetic disease, cancer, and evolution.

  • Helix turn: ~10.5 bp per turn (~34 Å)
  • Base pair spacing: 3.4 Å
  • Human genome size: ~3 × 10⁹ bp
  • Coding genes: ~20,000
  • Pairing rules: A-T (2 H-bonds), G-C (3 H-bonds)
  • Discovery: Watson, Crick, Franklin, Wilkins (1953)
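
The helix figures above can be combined with simple arithmetic. The following is a minimal sketch, assuming ideal B-form geometry and using only the rise and turn values listed here; the 105 bp fragment and the function name are illustrative, not taken from any library. Multiplying 10.5 bp per turn by 3.4 Å per base pair gives a pitch near 35.7 Å, slightly more than the often-quoted 34 Å, which corresponds to the classic model of 10 bp per turn.

```python
# Helix geometry from the figures listed above (approximate, ideal B-form DNA).
RISE_PER_BP_ANGSTROM = 3.4   # spacing between stacked base pairs
BP_PER_TURN = 10.5           # base pairs per full helical turn


def helix_geometry(n_bp: int) -> dict:
    """Return the number of turns and the contour length for n_bp base pairs."""
    length_angstrom = n_bp * RISE_PER_BP_ANGSTROM
    return {
        "turns": n_bp / BP_PER_TURN,
        "length_angstrom": length_angstrom,
        "length_nm": length_angstrom / 10.0,  # 1 nm = 10 Å
    }


# Length of one full turn implied by the two constants above.
pitch_angstrom = BP_PER_TURN * RISE_PER_BP_ANGSTROM

# Hypothetical 105 bp fragment, chosen only because it gives a round number of turns.
fragment = helix_geometry(105)

print(f"pitch: {pitch_angstrom:.1f} Å")
print(f"105 bp: {fragment['turns']:.1f} turns, {fragment['length_nm']:.1f} nm")
```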

Why DNA structure matters

  • Genetic testing. BRCA1/2 mutations, cystic fibrosis, hemochromatosis screening.
  • Cancer profiling. Identifies driver mutations and matches targeted therapy (EGFR, BRAF, KRAS).
  • Pharmacogenomics. CYP2C9/VKORC1, CYP2C19, and TPMT polymorphisms guide warfarin, clopidogrel, and thiopurine dosing, respectively.
  • Forensic identification. Short tandem repeat profiling matches suspects to evidence.
  • Prenatal screening. Cell-free fetal DNA in maternal blood detects trisomy 13/18/21.
  • mRNA vaccines. Lipid nanoparticles deliver mRNA encoding spike protein for COVID-19.
  • Gene and CRISPR therapy. Zolgensma (AAV-delivered gene replacement for SMA) and Casgevy (CRISPR gene editing for sickle cell disease) marked the dawn of the gene therapy era.

Common misconceptions

  • Junk DNA is junk. "Non-coding" includes regulatory elements, lncRNAs, structural sequences.
  • Watson and Crick discovered DNA. They deduced its structure; DNA itself had been known since Miescher isolated it in 1869.
  • Genes are the whole story. Epigenetics (methylation, histone marks) and 3D chromatin structure also matter.
  • Mutations are always bad. Most are neutral; some confer benefit; cancer requires multiple driver mutations.
  • DNA replication is error-free. ~1 error per 10⁹ bases after proofreading and mismatch repair.
  • Mitochondrial DNA is irrelevant clinically. mtDNA mutations cause MELAS, Leber hereditary optic neuropathy, hearing loss, and diabetes.
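
To put the replication error rate from the list above in perspective, here is a back-of-the-envelope sketch. It assumes the ~1 error per 10⁹ bases figure applies uniformly across a ~3 × 10⁹ bp haploid genome; both numbers are the approximate values quoted on this page, and the function name is illustrative.

```python
# Expected replication errors per genome copy, using the approximate
# figures quoted on this page (not measured values).
ERROR_RATE_PER_BASE = 1e-9   # after proofreading and mismatch repair
HAPLOID_GENOME_BP = 3e9      # approximate human haploid genome size


def expected_errors(genome_bp: float, copies: int = 1) -> float:
    """Expected number of uncorrected errors when copying `copies` genomes."""
    return ERROR_RATE_PER_BASE * genome_bp * copies


per_haploid_copy = expected_errors(HAPLOID_GENOME_BP)
per_diploid_division = expected_errors(HAPLOID_GENOME_BP, copies=2)

print(f"about {per_haploid_copy:.0f} errors per haploid genome copy")
print(f"about {per_diploid_division:.0f} errors per diploid cell division")
```

Even at this fidelity, the expected count is a handful of new errors each time a cell copies its genome rather than zero.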

Frequently asked questions

What is a nucleotide?

A nucleotide has three parts: a nitrogenous base, a pentose sugar, and a phosphate group. The base is either a purine (adenine, guanine) or a pyrimidine (cytosine, plus thymine in DNA or uracil in RNA). The sugar is deoxyribose in DNA and ribose in RNA. Nucleotides link via phosphodiester bonds between the 3' OH of one sugar and the 5' phosphate of the next, building the directional backbone.

What does antiparallel mean?

The two DNA strands run in opposite directions: one reads 5' to 3' from left to right, the other 3' to 5'. Base-pair geometry requires this arrangement, because the bases can only hydrogen bond when oriented correctly. It has critical consequences for replication: DNA polymerase synthesizes only in the 5' to 3' direction, so the lagging strand is made discontinuously as Okazaki fragments.
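
Because the strands are antiparallel, the partner of a given strand, when written in the conventional 5' to 3' direction, is its reverse complement. A minimal sketch of that conversion follows; the example sequence is made up and the function name is illustrative.

```python
# Translation table for Watson-Crick pairing: A<->T, G<->C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3'.

    Complementing gives the partner read 3' to 5'; reversing it puts
    the result back into the conventional 5' to 3' orientation.
    """
    return strand.upper().translate(COMPLEMENT)[::-1]


top = "ATGGCACT"                  # hypothetical sequence, 5' to 3'
bottom = reverse_complement(top)  # partner strand, also 5' to 3'

print(top)     # ATGGCACT
print(bottom)  # AGTGCCAT
```

Applying the function twice returns the original strand, which is a quick sanity check on any implementation.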

How are the bases paired?

Adenine pairs with thymine (2 hydrogen bonds). Guanine pairs with cytosine (3 hydrogen bonds). A purine always pairs with a pyrimidine, keeping the helix a uniform width (~20 Å). G-C pairs are more stable than A-T pairs, so high G-C content raises the melting temperature. Chargaff's rule: %A = %T and %G = %C in any double-stranded DNA.
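
The pairing rules can be checked numerically. The sketch below counts bases across both strands of a made-up duplex to confirm Chargaff's rule, computes G-C content, and estimates a melting temperature with the Wallace rule, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). The Wallace rule is not mentioned in this article; it is a rough rule of thumb for short oligonucleotides, brought in here as an assumption to illustrate why G-C content raises melting temperature. All names and sequences are illustrative.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def duplex_base_counts(top: str) -> Counter:
    """Count bases over both strands of the duplex defined by `top`."""
    return Counter(top + top.translate(COMPLEMENT))


def gc_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(oligo: str) -> int:
    """Rule-of-thumb melting temperature in °C for a short oligo (assumption)."""
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


top = "AAGCGCGT"  # hypothetical single strand with unequal base counts
counts = duplex_base_counts(top)

# Chargaff's rule holds for the duplex even though it fails for one strand.
print(counts["A"] == counts["T"], counts["G"] == counts["C"])
print(f"GC fraction of top strand: {gc_fraction(top):.3f}")

# Same length, different composition: the G-C rich oligo melts higher.
print(wallace_tm("AT" * 7), wallace_tm("GC" * 7))
```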

What forces hold the helix together?

Hydrogen bonds between paired bases are relatively weak individually but strong en masse. Base stacking (hydrophobic and van der Waals interactions between adjacent base pairs) contributes at least as much stabilizing energy as the hydrogen bonds. The phosphate backbone is negatively charged; counter-ions (Mg²⁺, K⁺) and, in eukaryotes, histones neutralize the repulsion and allow tight packing.

How is DNA packaged?

About 2 meters of DNA per cell fits into a nucleus roughly 5 μm across through hierarchical packing. DNA wraps around histone octamers (147 bp per nucleosome), forming the 11 nm "beads on a string" fiber. This coils into a 30 nm fiber, then loops, then topologically associating domains (TADs), and finally metaphase chromosomes (~10,000-fold compaction). Chromatin can be euchromatin (active) or heterochromatin (silenced).
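
The ~2 meter figure follows directly from the genome size and the base-pair spacing. The arithmetic below is a rough sketch: the haploid genome size, rise per base pair, and nuclear diameter are the approximate values used in this article, while the ~200 bp nucleosome repeat length (147 bp core plus linker DNA) is an added assumption, since the article gives only the core size.

```python
# Approximate values; see the lead-in for which are assumptions.
RISE_PER_BP_M = 3.4e-10      # 3.4 Å per base pair, in meters
HAPLOID_GENOME_BP = 3e9      # approximate human haploid genome
NUCLEUS_DIAMETER_M = 5e-6    # ~5 μm nucleus
NUCLEOSOME_REPEAT_BP = 200   # assumption: 147 bp core plus linker DNA

# A typical human cell is diploid, so it carries two genome copies.
diploid_bp = 2 * HAPLOID_GENOME_BP

# Total end-to-end length of the DNA if fully stretched out.
contour_length_m = diploid_bp * RISE_PER_BP_M

# How many nuclear diameters that length would span.
length_to_nucleus_ratio = contour_length_m / NUCLEUS_DIAMETER_M

# Rough number of nucleosomes needed to package it.
nucleosome_count = diploid_bp / NUCLEOSOME_REPEAT_BP

print(f"contour length: {contour_length_m:.2f} m")
print(f"length / nuclear diameter: {length_to_nucleus_ratio:,.0f}")
print(f"nucleosomes: {nucleosome_count:.1e}")
```

Under these assumptions the stretched-out DNA is several hundred thousand times longer than the nucleus is wide, which is why multiple levels of packing are needed.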

What types of DNA damage exist?

Spontaneous: depurination and deamination (C→U). UV: thymine dimers. Reactive oxygen species: oxidized bases (8-oxoG). Alkylating agents: alkylated bases. Ionizing radiation: double-strand breaks (the most lethal lesion). Cells use multiple repair pathways: base excision repair (BER), nucleotide excision repair (NER), mismatch repair (MMR), homologous recombination, and non-homologous end joining. Defects in these pathways cause disease: xeroderma pigmentosum (NER), Lynch syndrome (MMR), BRCA1/2-associated cancers (homologous recombination), and ataxia-telangiectasia (double-strand break signaling).

How does sequencing work?

Sanger (dideoxy chain termination): chain extension stops when a fluorescently labeled ddNTP is incorporated; capillary electrophoresis then separates the fragments by length and reads the sequence. Next-generation sequencing (Illumina): millions of clusters are sequenced in parallel by reversible-terminator chemistry. Long-read platforms (PacBio, Oxford Nanopore) read tens of kilobases. Cost: ~$3 billion for the first human genome (completed 2003) → ~$200 today. Cheap sequencing drives clinical genomics, cancer profiling, and infection diagnostics.
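
The Sanger read-out logic can be illustrated with a toy simulation. This is a simplified sketch, not a model of real chemistry: it assumes every molecule terminates with the same probability at each position, and the sequence, molecule count, termination probability, and function names are all made up for illustration. Each simulated molecule extends until a labeled ddNTP is incorporated; sorting the resulting fragments by length, as electrophoresis does, and reading the terminal label of each recovers the sequence.

```python
import random


def sanger_fragments(synthesized: str, n_molecules: int = 2000,
                     p_terminate: float = 0.2, seed: int = 0) -> set:
    """Simulate chain termination on many copies of one template.

    Each molecule extends base by base and stops when a labeled ddNTP is
    incorporated (probability p_terminate per position). Returns the set
    of (fragment_length, terminal_base) pairs that were observed.
    Molecules that never terminate are simply not recorded.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for length, base in enumerate(synthesized, start=1):
            if rng.random() < p_terminate:
                fragments.add((length, base))
                break
    return fragments


def read_sequence(fragments: set) -> str:
    """Order fragments by length and read off each terminal label."""
    return "".join(base for _, base in sorted(fragments))


synthesized = "ATGGCACTTGAC"  # hypothetical newly synthesized strand
called = read_sequence(sanger_fragments(synthesized))
print(called == synthesized)
```

With enough molecules, every position is represented by at least one terminated fragment; with too few, gaps appear in the read, which mirrors why signal quality falls off toward the end of real Sanger reads.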