Molecular Biology

Genetic Code

Mapping nucleotides to amino acids — three letters at a time, nearly universal

The genetic code is the set of rules by which the nucleotide sequence of an mRNA is translated into the amino acid sequence of a protein. Three nucleotides (one codon) specify one amino acid. The 64 possible codons (4³) encode 20 amino acids plus stop signals: 61 sense codons and 3 stop codons. Redundancy (multiple codons per amino acid) makes the code robust to mutation. The code is shared by nearly all organisms, with rare exceptions such as mitochondria. Start codon: AUG (also codes for methionine). Stop codons: UAA, UAG, UGA. Decoding is done by tRNA (transfer RNA): each tRNA carries one amino acid, and its anticodon pairs with the codon. Cracked 1961-1966 (Nirenberg and others; Nobel Prize 1968).

  • Codon: 3 nucleotides = 1 amino acid
  • Total codons: 64 (4³)
  • Amino acids encoded: 20 standard
  • Start codon: AUG (Methionine)
  • Stop codons: UAA, UAG, UGA
  • Universality: Nearly all organisms; rare exceptions
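The counts in the list above can be checked with a short sketch. It builds the standard codon table from a 64-letter string of one-letter amino acid codes laid out in the conventional U, C, A, G order. The names `BASES`, `AMINO_ACIDS`, and `CODON_TABLE` are illustrative choices, not part of any library.

```python
from itertools import product

BASES = "UCAG"  # conventional ordering used in printed codon tables
# One-letter amino acid codes for the 64 codons in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() varies the last base fastest: UUU, UUC, UUA, UUG, UCU, ...
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE))    # 64
print(stop_codons)         # ['UAA', 'UAG', 'UGA']
print(len(sense_codons))   # 61
print(len(amino_acids))    # 20
print(CODON_TABLE["AUG"])  # M (methionine, also the start codon)
```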

Why genetic code matters

  • Genetics. Foundation of inheritance.
  • Biotechnology. Cross-species protein expression.
  • Medicine. Mutations cause genetic diseases.
  • Evolution. Universal code = common ancestor.
  • Drug design. Many antibiotics target bacterial translation; codon usage affects how well therapeutic proteins are expressed.
  • Vaccines. mRNA vaccines exploit translation.
  • Synthetic biology. Designed proteins, expanded code.

Common misconceptions

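  • "One codon per amino acid." Most amino acids have several codons (degeneracy).
  • "The code is purely random." It is not: similar codons tend to encode chemically similar amino acids, which reduces the effect of point mutations.
  • "All life uses an identical code." Slight variations exist.
  • "DNA codons are read directly." Ribosomes read mRNA, which is transcribed from DNA.
  • "Protein = sequence only." Function also depends on how the chain folds.
  • "The code was set arbitrarily." Its error-reducing structure suggests it was shaped by selection early in evolution and has been largely conserved since.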

Frequently asked questions

How was the genetic code cracked?

Marshall Nirenberg and Heinrich Matthaei (1961) made synthetic poly-U RNA, which was translated into a poly-Phe chain, showing that UUU = Phe. Subsequent work cracked all codons by 1966. Tools: synthesizing RNAs of known sequence and observing which amino acids were incorporated. Nobel Prize 1968 (Nirenberg, Holley, Khorana). Cracking the code was a foundational achievement that enabled modern molecular biology.

Why is the code redundant?

61 sense codons / 20 amino acids ≈ 3 codons per amino acid on average. Benefits of redundancy: (1) Mutations are less harmful (changing the third position often gives the same amino acid). (2) More sequence changes are silent, so genomes tolerate more mutations. (3) Organisms can favor particular synonymous codons (codon usage bias). Most redundancy is in the third base (the "wobble" position). E.g., glycine: GGU, GGC, GGA, GGG (all Gly). Methionine and tryptophan have only one codon each.
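A minimal sketch of the redundancy described above: it counts how many codons encode each amino acid and measures how often a change at the third position leaves the amino acid unchanged. The table construction and the helper name `third_position_synonymous_fraction` are illustrative assumptions, not a standard API.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid (stop codons excluded).
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def third_position_synonymous_fraction(table):
    """Fraction of third-base substitutions in sense codons that keep the amino acid."""
    same = total = 0
    for codon, aa in table.items():
        if aa == "*":
            continue
        for base in BASES:
            if base == codon[2]:
                continue
            total += 1
            if table[codon[:2] + base] == aa:
                same += 1
    return same / total


print(degeneracy["G"])  # 4 (glycine: GGU, GGC, GGA, GGG)
print(degeneracy["M"])  # 1 (methionine: AUG only)
print(degeneracy["W"])  # 1 (tryptophan: UGG only)
print(round(third_position_synonymous_fraction(CODON_TABLE), 2))
```

With the standard table, roughly two thirds of third-position substitutions are synonymous.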

What's the wobble hypothesis?

Crick (1966): the third codon position can form non-Watson-Crick pairs with the first base of the anticodon. This lets fewer tRNAs read more codons: one tRNA's wobble base can read several synonymous codons. Roughly 30-40 tRNAs read the 61 sense codons, which saves cellular resources. It also explains why the code's redundancy is concentrated at the third position.
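The pairing logic can be sketched as follows, using a simplified version of Crick's wobble rules (I stands for inosine). Real tRNAs carry many other modified bases that change these rules, so this is an illustration, not a full model. The names `WOBBLE`, `COMPLEMENT`, and `codons_read` are hypothetical.

```python
# Standard Watson-Crick pairing for the first two codon positions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Simplified wobble rules: first anticodon base (5' end) -> allowed third codon bases.
WOBBLE = {
    "C": "G",
    "A": "U",
    "U": "AG",
    "G": "CU",
    "I": "UCA",  # inosine
}


def codons_read(anticodon):
    """Return the codons (5'->3') that an anticodon (5'->3') can read.

    Pairing is antiparallel: the anticodon's first base pairs with the
    codon's third base, and the anticodon's last base with the codon's first.
    """
    first, middle, last = anticodon
    prefix = COMPLEMENT[last] + COMPLEMENT[middle]
    return sorted(prefix + third for third in WOBBLE[first])


print(codons_read("GCC"))  # ['GGC', 'GGU'], both glycine
print(codons_read("ICC"))  # ['GGA', 'GGC', 'GGU'], all glycine
print(codons_read("CAU"))  # ['AUG'], methionine
```

Under these rules a single tRNA with anticodon ICC covers three of the four glycine codons, which is how fewer than 61 tRNAs can serve all sense codons.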

Is the code truly universal?

Nearly. Mitochondrial codes have minor variations (e.g., in vertebrate mitochondria, UGA = Trp instead of stop and AGA = stop instead of Arg). Some single-celled organisms have reassigned a few codons. But the great majority of organisms use the same standard code, which is strong evidence for a common origin of all life. Universality enables gene transfer between species and cross-species protein expression in biotechnology.
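One way to picture a variant code is as the standard table plus a few overrides. The sketch below applies only the two reassignments named above (UGA = Trp, AGA = stop), so `mito_like` is an illustrative table rather than a complete vertebrate mitochondrial code, which has further changes. All names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: only the two reassignments mentioned in the text are applied.
MITO_OVERRIDES = {"UGA": "W", "AGA": "*"}
mito_like = {**STANDARD, **MITO_OVERRIDES}


def translate_codons(rna, table):
    """Translate from position 0, codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = table[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


rna = "AUGUGAGGCAGAUUU"  # AUG UGA GGC AGA UUU
print(translate_codons(rna, STANDARD))   # M   (UGA read as stop)
print(translate_codons(rna, mito_like))  # MWG (UGA read as Trp, AGA as stop)
```

The same message yields different proteins under the two tables, which is why a gene moved between systems with different codes may need recoding.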

How does translation work?

The ribosome reads mRNA codon by codon. Each codon pairs with the anticodon of a charged tRNA (a tRNA carrying its specific amino acid). The ribosome catalyzes the peptide bond between adjacent amino acids. The process continues until a stop codon, where a release factor binds and the protein is released. Three tRNA sites in the ribosome: A (aminoacyl), P (peptidyl), E (exit).
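The read-until-stop loop described above can be sketched in a few lines. This is a deliberately simplified model: it starts at the first AUG it finds and ignores how real ribosomes select a start site, tRNA charging, and the A/P/E site mechanics. The function name `translate` is an illustrative choice.

```python
from itertools import product

BASES = "UCAG"
# Standard code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, table=CODON_TABLE):
    """Translate from the first AUG to the first in-frame stop codon.

    Returns one-letter amino acid codes; returns "" if there is no AUG.
    If no stop codon is reached, returns what was translated so far.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":  # stop codon: release the chain
            break
        protein.append(aa)
    return "".join(protein)


print(translate("GGAUGUUUGGCUAAGG"))  # MFG  (AUG UUU GGC, then UAA stops)
print(translate("UUUUUUUUU"))         # empty string: no start codon
```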

What's a reading frame?

The way a nucleotide sequence is grouped into codons. There are three possible reading frames per strand. Inserting or deleting a single nucleotide shifts the reading frame, so all subsequent codons change (a frameshift mutation); this usually affects the protein drastically. The reading frame is established by the start codon, and translation ends at a stop codon.
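A short sketch makes the frame idea concrete: the same sequence is split into codons at offsets 0, 1, and 2, and a single-base deletion is shown to change every downstream codon. The example sequence and the helper names `split_codons` and `to_protein` are made up for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(seq, frame=0):
    """Group seq into complete codons starting at offset frame (0, 1, or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def to_protein(codons):
    """Map each codon to its one-letter amino acid code ("*" for stop)."""
    return "".join(CODON_TABLE[c] for c in codons)


original = "AUGGCUUUUGGAUAA"
for frame in range(3):
    print(frame, split_codons(original, frame))

# Delete the fourth nucleotide (index 3) to create a frameshift.
mutant = original[:3] + original[4:]
print(to_protein(split_codons(original)))  # MAFG*
print(to_protein(split_codons(mutant)))    # MLLD (every codon after AUG changed, stop lost)
```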

What about non-standard amino acids?

20 standard amino acids. Some organisms add a 21st (selenocysteine, encoded by UGA in specific contexts) and a 22nd (pyrrolysine, encoded by UAG in some methanogens). Many proteins also contain post-translationally modified residues (phosphorylation, methylation, etc.). Synthetic biology: artificial amino acids can be added via an expanded genetic code (Schultz lab, others).