Transposon
Jumping genes, the DDE active site, and why ~45% of your genome is repurposed mobile DNA
Transposons are mobile DNA segments that move within a genome by cut-and-paste (DNA-only Class II) or copy-and-paste (retrotransposon Class I) mechanisms. Discovered by Barbara McClintock in maize (1948), they make up ~45% of the human genome — ~17% LINE-1, ~11% Alu, ~8% LTR retroelements, ~3% DNA transposons. Class II transposases share a conserved DDE catalytic triad that excises and inserts the element. Class I retrotransposons transcribe an mRNA, reverse-transcribe it, and integrate the cDNA into a new locus. Transposons drive genome size variation, antibiotic resistance spread, exon shuffling, and disease.
- Discovered: Barbara McClintock, maize (1948); Nobel 1983
- % of human genome: ~45% transposon-derived
- Most numerous: Alu (SINE), ~1.1×10⁶ copies
- Class I (retrotransposons): Copy-and-paste via RNA intermediate
- Class II (DNA transposons): Cut-and-paste via DDE transposase
- Clinical impact: Antibiotic resistance, hemophilia A, NF1, BRCA insertions
What transposons are, and why they're everywhere
A transposon is a stretch of DNA — typically a few hundred to a few thousand base pairs — that can change its position within a genome. The simplest carry only the machinery they need to move (transposase or reverse transcriptase plus structural ends like inverted repeats or LTRs); larger ones haul cargo, including antibiotic resistance genes and regulatory elements.
The numbers in humans are striking. Of 3.1 billion bp, roughly 1.4 billion (45%) trace to transposable elements: ~17% LINE-1, ~11% Alu (the dominant SINE), ~8% LTR retroelements (endogenous retroviruses), and ~3% DNA transposons (all fossilized in humans), with the remainder coming from older or less abundant families. By contrast, only ~1.5% of the genome encodes proteins. Maize is ~85% transposon, wheat ~80%, and some salamander genomes exceed 30 Gb. Eukaryotic genome size tracks accumulated transposon load, which depends on how effectively elements are silenced and deleted, far more closely than it tracks organismal complexity.
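The percentages above are easy to sanity-check with a few lines of arithmetic. The sketch below only re-uses figures quoted in this article (3.1 Gb genome, the four itemized shares, ~1.1 million Alu copies of ~300 bp each); all of them are rounded estimates, and the variable names are illustrative.

```python
# Figures quoted in the article; all are rounded estimates, not measurements.
GENOME_BP = 3.1e9
TE_SHARE = 0.45
CODING_SHARE = 0.015
ITEMIZED_SHARES = {
    "LINE-1": 0.17,
    "Alu (SINE)": 0.11,
    "LTR retroelements": 0.08,
    "DNA transposons": 0.03,
}


def share_to_bp(share, genome_bp=GENOME_BP):
    """Convert a genome fraction into base pairs."""
    return share * genome_bp


te_bp = share_to_bp(TE_SHARE)              # total transposon-derived DNA
itemized = sum(ITEMIZED_SHARES.values())   # the four named groups
unitemized = TE_SHARE - itemized           # share the text does not break out
te_to_coding = TE_SHARE / CODING_SHARE     # mobile DNA relative to coding DNA

# Cross-check the Alu share from copy number and an assumed ~300 bp length.
alu_share = (1.1e6 * 300) / GENOME_BP

print(f"Transposon-derived DNA: {te_bp / 1e9:.1f} Gb")
print(f"Itemized groups: {itemized:.0%}, not itemized: {unitemized:.0%}")
print(f"Transposon-to-coding ratio: {te_to_coding:.0f}x")
print(f"Alu share from copy number: {alu_share:.1%}")
```

The Alu cross-check lands close to the quoted ~11%, which suggests the copy-number and genome-share figures are at least mutually consistent.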
Class I vs Class II — the master division
Transposable elements are split into two classes based on whether their movement passes through an RNA intermediate.
| Property | Class I (Retrotransposons) | Class II (DNA transposons) |
|---|---|---|
| Mechanism | Copy-and-paste (via RNA → cDNA) | Cut-and-paste (DNA → DNA) |
| Key enzyme | Reverse transcriptase + integrase/endonuclease | Transposase (DDE active site) |
| Donor site after move | Intact (template not removed) | Empty (often repaired by NHEJ) |
| Subgroups | LTR retrotransposons; non-LTR (LINEs, SINEs) | TIR (terminal inverted repeat); helitrons; mavericks |
| Famous examples | L1 / Alu / HERV (humans); copia, gypsy (Drosophila); Ty (yeast) | P element (Drosophila); Tn3, Tn5, Tn10 (bacteria); Sleeping Beauty (fish) |
| Genome share (humans) | ~42% | ~3% (all fossilized) |
| Insertion footprint | Target-site duplication (TSD), variable length | Target-site duplication (TSD), length set by the transposase family |
| Replication coupling | Independent of host replication | Some are replicative (Tn3, Mu); most are conservative |
The asymmetry is consequential. Class I elements copy themselves, so every successful retrotransposition adds a copy and the family amplifies. Class II is conservative: copy number stays roughly constant unless transposition is coupled to replication or the donor is restored by repair. That is why mammalian genomes are dominated by retrotransposons: a few active L1 lineages have copied themselves >500,000 times since the rodent-primate split.
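The copy-number asymmetry can be made concrete with a deliberately simple bookkeeping model. This is a sketch, not a population-genetic simulation: `copies_after` and its `net_gain_per_event` parameter are hypothetical names introduced here, and the model counts successful events only, ignoring deletion, selection, and the fact that new copies can themselves become sources.

```python
def copies_after(events, start_copies=1, net_gain_per_event=1.0):
    """Expected copy number after a number of successful transposition events.

    net_gain_per_event is an illustrative parameter (not from the article):
      1.0 -> copy-and-paste: the donor copy is retained
      0.0 -> conservative cut-and-paste: the donor site is vacated
      in between -> cut-and-paste where the donor is sometimes restored
    """
    if events < 0 or not 0.0 <= net_gain_per_event <= 1.0:
        raise ValueError("events must be >= 0 and net gain within [0, 1]")
    return start_copies + events * net_gain_per_event


retro = copies_after(500_000, net_gain_per_event=1.0)  # Class I style
dna = copies_after(500_000, net_gain_per_event=0.0)    # Class II style
print(f"copy-and-paste: {retro:,.0f} copies; cut-and-paste: {dna:,.0f} copy")
```

Even in this toy form, the same number of events leaves a Class I family half a million copies larger while a strictly conservative Class II element stays at one.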
Class II mechanism: the DDE transposase and cut-and-paste
The canonical Class II transposase acts as a homodimer with a catalytic core built around three acidic residues (two aspartates and one glutamate) that coordinate two Mg²⁺ ions. The same DDE architecture, built on an RNase H-like fold, is shared by HIV integrase. The transposase recognizes terminal inverted repeats (TIRs) at the ends of its element and catalyzes:
- Synapsis. Two transposase monomers each bind a TIR at opposite ends; the dimer loops out the element body.
- First-strand cleavage. A water activated by the DDE-coordinated metal attacks the TIR-flanking phosphate, releasing the 3' OH of the element.
- Second-strand cleavage. Tn5 and Tn10 use a hairpin intermediate; mariner cleaves the second strand directly. Either way, the element exits as a discrete molecule. (Mu, by contrast, transposes replicatively and never fully excises.)
- Strand transfer. The 3' OHs attack staggered phosphates on a new target (TA or TTAA, family-dependent), inserting the element. Host fill-in generates the target-site duplication (TSD) flanking every insertion.
The empty donor site is repaired by NHEJ, often leaving a small footprint. If repair uses the sister chromatid as template (SDSA), the donor sequence is restored — making cut-and-paste effectively replicative.
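The excision and strand-transfer steps above can be sketched as string operations. This is a toy model under stated assumptions: sequences are single-stranded uppercase strings, the target is a TA dinucleotide, the donor repair footprint is ignored, and the function names and example sequences are invented for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_tirs(element, tir_len):
    """True if the element's two ends are inverted repeats of each other."""
    return reverse_complement(element[:tir_len]) == element[-tir_len:]


def cut_and_paste(donor, element, target, site="TA"):
    """Excise `element` from `donor` and insert it at the first `site` in `target`.

    Returns (donor_after, target_after). The target site is written on both
    sides of the insertion to mimic the TSD left by staggered cuts and fill-in.
    """
    start = donor.find(element)
    if start < 0:
        raise ValueError("element not found in donor")
    pos = target.find(site)
    if pos < 0:
        raise ValueError("no target site available")
    # Donor flanks are simply rejoined; real NHEJ repair often leaves a footprint.
    donor_after = donor[:start] + donor[start + len(element):]
    target_after = target[:pos] + site + element + site + target[pos + len(site):]
    return donor_after, target_after


# Toy element: 5 bp TIRs flanking a 6 bp body (hypothetical sequence).
ELEMENT = "CAGTG" + "AAACCC" + "CACTG"
donor_after, new_site = cut_and_paste("GGCTA" + ELEMENT + "TAGCC", ELEMENT, "CCGGTACCGG")
print(has_tirs(ELEMENT, 5), donor_after, new_site)
```

The element is gone from the donor and reappears at the new site flanked by a duplicated TA, which is the signature used to recognize past insertions in real genomes.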
Class I mechanism: LTR vs non-LTR retrotransposons
LTR retrotransposons (Ty1 in yeast, copia/gypsy in Drosophila, HERV in humans) carry long terminal repeats (250 bp - several kb) and follow the retroviral lifecycle minus the envelope step: Pol II transcription → translation of Gag/Pol → cytoplasmic virus-like particles → reverse transcription with strand-transfer that regenerates full LTRs → DDE-integrase inserts cDNA into a new chromosomal site with a TSD.
Non-LTR retrotransposons (LINEs, SINEs) use target-primed reverse transcription (TPRT):
- L1 mRNA → ORF1p (RNA chaperone) + ORF2p (endonuclease + reverse transcriptase) ribonucleoprotein.
- ORF2p endonuclease nicks target DNA at a TTAAAA-like motif, exposing a 3' OH.
- The 3' OH primes reverse transcription directly off the L1 mRNA at the integration site — synthesis happens in place, not in the cytoplasm.
- Second-strand synthesis and ligation finish the insertion. Insertions are often 5'-truncated because reverse transcription stalls.
SINEs (Alu, B1, B2) encode no enzymes. They parasitize L1's machinery: Alu RNA hijacks L1 proteins (chiefly ORF2p) in trans and gets reverse-transcribed at L1-cleaved targets. Alu and L1 activity therefore track each other across primate evolution.
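Target-primed reverse transcription can be modeled the same way as a string operation, which makes the 5'-truncation and the untouched donor easy to see. This is a minimal sketch: `tprt_insert`, `rt_extent`, `tsd_len`, and the toy sequences are assumptions made for illustration, and strand orientation and the exact nick position are simplified away.

```python
def tprt_insert(element, target, motif="TTAAAA", tsd_len=6, rt_extent=None):
    """Insert a (possibly 5'-truncated) copy of `element` at the first `motif`.

    rt_extent is how many bases reverse transcription copies before stalling,
    counted from the element's 3' end; None means a full-length copy.
    Returns (target_after, copied, tsd). The donor element is never modified.
    """
    pos = target.find(motif)
    if pos < 0:
        raise ValueError("no endonuclease motif in target")
    if rt_extent is None:
        rt_extent = len(element)
    if not 0 < rt_extent <= len(element):
        raise ValueError("rt_extent must be between 1 and len(element)")
    if pos + tsd_len > len(target):
        raise ValueError("target too short for requested TSD")
    # Synthesis starts at the 3' end, so a stall loses the 5' portion.
    copied = element[len(element) - rt_extent:]
    tsd = target[pos:pos + tsd_len]
    # Repeating target[pos:pos + tsd_len] on both sides yields the flanking TSD.
    target_after = target[:pos + tsd_len] + copied + target[pos:]
    return target_after, copied, tsd


L1_TOY = "GGCCATGCGTAAAA"  # hypothetical stand-in for an L1 with a poly-A tail
new_site, copied, tsd = tprt_insert(L1_TOY, "CCGTTAAAAGGCTCC", rt_extent=6)
print(copied, tsd, new_site)
```

Because only the last six bases are copied in this example, the new insertion is a 5'-truncated fragment that could not retrotranspose again, which mirrors why most genomic L1 copies are dead on arrival.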
Mechanism diagram
CLASS II — cut-and-paste (e.g. Sleeping Beauty)
──TIR──[transposase + cargo]──TIR── donor
▼ transposase excises
──[empty, NHEJ-repaired]── + [TIR-elt-TIR]
▼ strand transfer at TA target
──TSD──TIR──[elt]──TIR──TSD── new site
CLASS I — copy-and-paste (e.g. L1)
──[L1: ORF1, ORF2]── donor (stays)
│ Pol II transcription
▼ L1 mRNA → ORF1p+ORF2p → RNP → nucleus
▼ ORF2p nicks target at TTAAAA
▼ TPRT: 3' OH primes reverse transcription off mRNA
▼ second-strand synthesis, ligation
──TSD──[new L1 copy, often 5'-truncated]──TSD──
+ ──[original L1]── (donor unchanged)
Real-world impact
Evolution and exaptation. The RAG1/RAG2 recombinase, which assembles immunoglobulin and T-cell receptor genes by V(D)J recombination, derives from a domesticated transposase. Mammalian syncytin placental fusion proteins derive from captured retroviral env genes, acquired independently in at least three lineages. Thousands of transposon-derived sequences now serve as enhancers, promoters, or insulators.
Disease. L1-mediated retrotransposition is estimated to account for roughly 1 in 250 disease-causing mutations. Documented hits: hemophilia A from L1 insertion into F8; NF1 disruption by Alu; Duchenne muscular dystrophy from L1 in DMD; several retinitis pigmentosa cases. Roughly half of colorectal cancers carry somatic L1 insertions, which occasionally inactivate a tumor suppressor.
Antibiotic resistance. The global AMR crisis is largely a transposon story. Tn3 carries β-lactamases; the Tn21 family carries aminoglycoside resistance and integrons; Tn1546 carries vanA (vancomycin resistance). These elements jump between chromosome and conjugative plasmid; plasmids ferry them between species. A single resistance gene can become globally distributed within years: plasmid-borne mcr-1 (colistin resistance), first reported in 2015, was soon found across E. coli, Salmonella, and Klebsiella and is the textbook recent case.
Biotechnology. Sleeping Beauty (resurrected from fish DNA fossils) and piggyBac drive gene therapy delivery, transgenic animals, and CAR-T engineering. P-element revolutionized Drosophila genetics in the 1980s. Tn5 transposase is the molecular core of ATAC-seq.
How cells silence transposons
- DNA methylation. CpG methylation silences transposon promoters in somatic cells. Loss (e.g. ICF syndrome) reactivates them.
- piRNA pathway. Germline 24-32 nt piRNAs + PIWI Argonautes degrade transposon transcripts. Clusters like flamenco act as transposon vaccination archives.
- KRAB-ZFP / TRIM28 / SETDB1. KRAB zinc-fingers recognize transposon families and recruit TRIM28, depositing H3K9me3 heterochromatin via SETDB1.
- APOBEC3 deaminases. Hypermutate retroelement cDNA pre-integration. The same enzymes restrict HIV.
- SAMHD1. Depletes dNTPs in non-dividing cells, starving reverse transcriptase. Loss causes Aicardi-Goutières syndrome.
Variants and notable families
- Tn5 / Tn10. Bacterial cut-and-paste; Tn5 transposase is the core of ATAC-seq tagmentation.
- P element. Drosophila DNA transposon that swept wild populations in the 20th century; basis of modern fly genetics.
- Sleeping Beauty / piggyBac. Engineered DNA transposons for human cell-line and gene-therapy use.
- L1 (LINE-1). The only autonomous retrotransposon active in humans; ~80-100 active copies per individual.
- Alu. Non-autonomous SINE; primate-specific; ~1.1M copies — the most successful mobile element in our genome.
- HERV. LTR retroelements of retroviral ancestry; mostly fossilized, but HERV-K LTRs drive transcription in tumors and embryonic stem cells.
- Helitrons / Mavericks. Rolling-circle and giant DNA transposons (15-40 kb) that capture host fragments; possible ancestors of some DNA viruses.
Common pitfalls and misconceptions
- Calling all transposons "junk DNA." Most copies are degraded, but a substantial fraction has been exapted into regulatory and coding functions.
- Confusing retrotransposons with retroviruses. Retrotransposons (especially non-LTR) lack env and don't form infectious particles. ERVs are germline-integrated retroviruses, most of which have lost a functional env.
- Assuming insertions are random. L1 prefers TTAAAA; Alu mirrors that bias; Tc1/mariner targets TA; HIV integrase prefers active genes. Insertion bias matters for biotech and disease risk.
- Treating transposase activity as permanent. In the human lineage, DNA transposons are all extinct as autonomous elements: the last hAT copies died ~37 Mya. Only retrotransposons remain mobile in humans, although some other mammals, such as certain bats, still harbor active DNA transposons.
- Underestimating clinical relevance. Transposon-mediated antibiotic resistance, somatic L1 activation in cancer, and de novo retroelement insertions causing Mendelian disease are clinically active today.
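The insertion-bias point in the list above lends itself to a quick scan: given a sequence, where could each family plausibly land? The sketch below uses only the two motifs named there (TA for Tc1/mariner, TTAAAA for L1). It is a simplification: real preferences are looser consensus sites shaped by chromatin context, only the given strand is scanned, and the function names and example sequence are illustrative.

```python
import re

# Preferred target motifs named in the article; real sites are looser consensus.
TARGET_MOTIFS = {"Tc1/mariner": "TA", "L1": "TTAAAA"}


def candidate_sites(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    # A zero-width lookahead lets overlapping occurrences each register a match.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() for m in pattern.finditer(seq.upper())]


def site_density(seq, motif):
    """Candidate sites per base of sequence."""
    return len(candidate_sites(seq, motif)) / len(seq) if seq else 0.0


SEQ = "GGTATTAAAACCTAGG"  # hypothetical 16 bp example
ta_sites = candidate_sites(SEQ, TARGET_MOTIFS["Tc1/mariner"])
l1_sites = candidate_sites(SEQ, TARGET_MOTIFS["L1"])
print(ta_sites, l1_sites, site_density(SEQ, "TA"))
```

A short, common motif such as TA offers far more candidate sites than a six-base motif, which is one reason integration profiles differ so much between vector systems.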
Frequently asked questions
What's the difference between Class I and Class II transposons?
Class II (DNA-only) transposons move by cut-and-paste: a transposase enzyme excises the element from one site and inserts it into another. The donor site is left empty (often repaired by NHEJ, sometimes restoring a copy via gap repair). Class I (retrotransposons) move by copy-and-paste: the element is transcribed into RNA, reverse-transcribed into cDNA, and integrated into a new genomic site. The donor stays intact, so element copy number grows. Class I dominates mammalian genomes; Class II elements are most prominent in bacteria, though they also persist in many plants and animals.
What's the DDE motif and why does it matter?
DDE is a conserved aspartate-aspartate-glutamate triad in the catalytic domain of most Class II transposases (Tn5, Mu, Tc1, IS3 family; mariner carries a DDD variant) and also of retroviral and LTR-retrotransposon integrases. The three carboxylates coordinate two divalent metal ions (Mg²⁺ or Mn²⁺) that catalyze both DNA cleavage and strand transfer. HIV integrase uses the same chemistry, which is why integrase inhibitors like raltegravir target the DDE site. The motif is among the most widely conserved catalytic signatures in mobile-element biology.
Why do Alu elements exist in such huge numbers?
Alu is a ~300 bp SINE derived from the 7SL signal-recognition-particle RNA gene. There are ~1.1 million Alu copies in the human genome — ~11% of total DNA. Alu elements are non-autonomous: they don't encode a reverse transcriptase, but hijack the L1 (LINE-1) machinery to retrotranspose. Their compact size, internal RNA polymerase III promoter, and parasitism of L1 made them spectacularly successful — the most numerous mobile DNA in the human genome.
How do transposons spread antibiotic resistance?
Bacterial transposons like Tn3, Tn21, and Tn1546 carry resistance genes (β-lactamase, aminoglycoside acetyltransferase, vancomycin resistance vanA) flanked by inverted repeats and a transposase. They jump readily from chromosome to plasmid and back; conjugative plasmids then ferry the transposon between species. Composite transposons (two IS elements bracketing any cargo gene) can pick up new resistance genes and disseminate them across genera. The global rise of VRE and ESBL-producing Enterobacteriaceae is driven largely by transposon- and plasmid-mediated horizontal transfer; MRSA spread through a related mobile element, the SCCmec cassette.
Are transposons junk DNA or functional?
Both. Most copies are inactive remnants — accumulated mutations have abolished transposase or reverse transcriptase activity. But functional roles have emerged: V(D)J recombination uses RAG1/2 (domesticated transposase) to assemble immunoglobulin genes; placental syncytin genes derive from retroviral env; thousands of transposon-derived sequences serve as enhancers, promoters, or boundary elements. The textbook label 'junk DNA' has retreated as ENCODE and comparative genomics revealed regulatory roles.
How do cells defend against transposons?
Multiple layers. DNA methylation silences transposon promoters in differentiated cells. The piRNA pathway in germ cells uses ~24-32 nt small RNAs and PIWI Argonaute proteins to recognize and degrade transposon transcripts. APOBEC3 cytidine deaminases hypermutate retroviral and L1 cDNA before integration. KRAB zinc-finger proteins recruit TRIM28 and SETDB1 to deposit H3K9me3 heterochromatin on transposons. Loss of any layer can reactivate transposition — implicated in aging, cancer, and neurological disease.
Who discovered transposons?
Barbara McClintock, working on maize at Cold Spring Harbor in the 1940s. She noticed that color patches on kernels followed non-Mendelian inheritance and proposed mobile 'controlling elements' (Ac/Ds) that broke chromosomes and altered gene expression. The community largely ignored her work for two decades — mobile genes contradicted the static chromosome dogma. Bacterial transposons were discovered in the 1960s, retrotransposons in the 1970s, and McClintock finally received the Nobel Prize in 1983.