Retrovirus
RNA → DNA → integration — viruses that run the central dogma backwards
A retrovirus is an RNA virus that copies its genome into DNA via reverse transcriptase, then integrates the DNA into the host chromosome — running the central dogma backwards. The integrated form, called a provirus, is replicated indefinitely by the host's own machinery. HIV is the most studied example: its 9.7 kb genome encodes nine genes including gag, pol (RT, integrase, protease), and env. Reverse transcriptase was discovered in 1970 by Temin and Baltimore (Nobel 1975), overturning the dogma that information flowed only DNA → RNA → protein, and underwriting 30+ approved antiretroviral drugs that have turned HIV from a fatal disease into a manageable chronic condition.
- Genome: Diploid +ssRNA, 7–12 kb
- Key enzyme: Reverse transcriptase (RNA → DNA)
- Discovery: Temin & Baltimore, 1970 (Nobel 1975)
- Integration: Permanent, into host chromosome
- HIV mutation rate: ~1 in 10,000 bp / cycle
- Approved drug classes: NRTI, NNRTI, PI, INSTI, fusion inhibitor, CCR5 antagonist
The retroviral life cycle, step by step
A retroviral particle finds a target cell, fuses, deposits its core, and writes itself into the genome. Several of the eight stages below are targets of approved drugs:
- Attachment. HIV gp120 binds CD4 on T helper cells, macrophages, and dendritic cells; the resulting conformational change exposes the binding site for a co-receptor, CCR5 (typical early in infection) or CXCR4 (often later).
- Fusion. gp41 inserts its fusion peptide and folds into a six-helix bundle, dragging the membranes together. Enfuvirtide blocks this.
- Uncoating. The capsid disassembles in a regulated way as the reverse-transcription complex moves to the nuclear pore.
- Reverse transcription. RT primes from host tRNA-Lys3, synthesizes (–)DNA, template-switches via the R region, then makes (+)DNA. Result: linear dsDNA flanked by long terminal repeats.
- Nuclear entry. Lentiviruses cross the intact nuclear envelope of non-dividing cells; gammaretroviruses need mitosis.
- Integration. Integrase performs 3' processing and strand transfer, splicing cDNA into a chromosome. HIV favors the bodies of active genes; MLV favors promoters and enhancers near transcription start sites, which matters for vector safety.
- Transcription & translation. The 5' LTR is a promoter for Pol II; Tat boosts elongation, Rev exports unspliced RNA.
- Assembly & maturation. Gag recruits two RNA copies, buds as an immature particle, and only becomes infectious after viral protease cleaves Gag and Gag-Pol. Protease inhibitors block this last step.
Pathway diagram — HIV replication
HIV virion ──(gp120 ↔ CD4 + CCR5)──▶ attachment
│
▼ gp41 fusion → cytoplasmic core
│
▼ RT (RNA → cDNA) ◀── NRTI / NNRTI inhibit
│
▼ pre-integration complex → nuclear pore
│
▼ integrase → provirus ◀── INSTI inhibit
│
▼ Pol II transcription → genomic + mRNA
│
▼ Gag, Gag-Pol, Env synthesis → assembly at membrane
│
▼ budding (immature)
│
▼ protease maturation ◀── PI inhibit
│
▼ infectious virion
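The pathway above can also be written as an ordered data structure, which makes it easy to ask which step a given drug class interrupts first. This is a minimal sketch: the stage names and inhibitor labels are transcribed from the diagram and the step list, while `Stage`, `HIV_PATHWAY`, and `first_blocked_stage` are hypothetical names introduced only for illustration.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class Stage(NamedTuple):
    name: str
    inhibitor_classes: Tuple[str, ...]  # empty where the diagram lists none


# Order and inhibitor labels follow the pathway diagram; the fusion
# inhibitor entry comes from the step-by-step list (enfuvirtide).
HIV_PATHWAY = [
    Stage("attachment", ()),
    Stage("fusion", ("fusion inhibitor",)),
    Stage("reverse transcription", ("NRTI", "NNRTI")),
    Stage("nuclear import", ()),
    Stage("integration", ("INSTI",)),
    Stage("transcription", ()),
    Stage("translation and assembly", ()),
    Stage("budding", ()),
    Stage("protease maturation", ("PI",)),
]


def first_blocked_stage(drug_classes: Iterable[str]) -> Optional[str]:
    """Return the earliest stage inhibited by any of the given classes."""
    wanted = set(drug_classes)
    for stage in HIV_PATHWAY:
        if wanted.intersection(stage.inhibitor_classes):
            return stage.name
    return None


print(first_blocked_stage(["PI", "INSTI"]))
```

A regimen that combines an INSTI with a PI is reported as acting first at integration, because the list is scanned in life-cycle order.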
Retroviruses vs DNA viruses vs other RNA viruses
| | Retrovirus (HIV) | DNA virus (Herpes) | (+)ssRNA virus (Polio) | (–)ssRNA virus (Influenza) |
|---|---|---|---|---|
| Genome type | Diploid +ssRNA | dsDNA | +ssRNA (mRNA-like) | (–)ssRNA, segmented |
| Replication site | Cytoplasm + nucleus | Nucleus | Cytoplasm | Nucleus |
| Polymerase carried in virion | Reverse transcriptase | None (host Pol II transcribes the genome) | None (translated from genome) | RNA-dependent RNA polymerase |
| Integrates into host genome | Yes (integrase) | Latency as episome (mostly) | No | No |
| Latency mechanism | Silent provirus in memory T cells | Episomal, neuron LATs (HSV) | None — acute | None — acute |
| Mutation rate per cycle | ~10⁻⁴ (no proofreading) | ~10⁻⁸ (proofreading polymerase) | ~10⁻⁴ | ~10⁻⁵, plus reassortment |
| Curable by current drugs | Suppressible, not curable | Suppressible (HSV, HBV) | Self-clearing or vaccine | Antivirals shorten course |
Integration is the defining difference. A herpesvirus episome can be cleared if the cell dies; an integrated provirus persists for the life of the daughter lineage. This is why HIV cure is so much harder than herpes suppression.
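The per-site mutation rates in the table become easier to compare once they are scaled by genome size. The sketch below multiplies each rate by an approximate genome length to give expected mutations per genome per replication cycle. The rates are the table's; the genome lengths for HSV-1, poliovirus, and influenza A are rounded textbook values added here as assumptions, and the names are illustrative.

```python
# Per-site rates are taken from the comparison table above.
# Genome lengths other than HIV-1's 9.7 kb are approximate values
# added as assumptions for this illustration.
VIRUSES = {
    "HIV-1": {"rate": 1e-4, "genome_nt": 9_700},
    "HSV-1": {"rate": 1e-8, "genome_nt": 152_000},
    "Poliovirus": {"rate": 1e-4, "genome_nt": 7_500},
    "Influenza A": {"rate": 1e-5, "genome_nt": 13_500},
}


def mutations_per_genome(rate: float, genome_nt: int) -> float:
    """Expected number of mutations introduced per genome copy."""
    return rate * genome_nt


for name, params in VIRUSES.items():
    print(f"{name:12s} {mutations_per_genome(**params):.4f}")
```

Under these assumptions HIV-1 picks up roughly one mutation per genome per cycle, while a proofreading herpesvirus polymerase introduces one only about once in several hundred genome copies.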
HIV's three drug-target enzymes
- Reverse transcriptase (RT). A p66/p51 heterodimer with polymerase and RNase H activities. NRTIs (zidovudine, tenofovir, abacavir, lamivudine) are chain-terminating nucleoside or nucleotide analogs; NNRTIs (efavirenz, rilpivirine, doravirine) bind a hydrophobic pocket and lock RT in a non-functional conformation.
- Integrase (IN). A 32 kDa protein whose catalytic DDE motif (Asp64, Asp116, Glu152) coordinates two Mg²⁺ ions. INSTIs chelate these metals and block strand transfer. The chemistry has no human equivalent, hence high selectivity.
- Protease (PR). A 99-residue aspartyl protease homodimer, crystallized in 1989. Inhibitors (saquinavir 1995, darunavir, atazanavir) bind the active site and leave the virion permanently immature.
Real-world impact
- HIV/AIDS pandemic. ~85 million infections and ~40 million deaths since 1981. Combination ART (HAART, 1996) restored near-normal life expectancy. Modern single-tablet regimens (Biktarvy, Dovato) replace 20-pill daily regimens.
- Pre-exposure prophylaxis (PrEP). Daily oral tenofovir/emtricitabine (Truvada, 2012) and long-acting cabotegravir (Apretude, 2021) reduce HIV acquisition by about 99% in adherent users. Lenacapavir, injected twice yearly, recorded zero HIV infections among the cisgender women who received it in PURPOSE-1 (2024).
- HTLV-1. The first retrovirus found to infect humans (Poiesz & Gallo, 1980); it causes adult T-cell leukemia. Endemic in Japan, the Caribbean, and parts of Africa.
- Lentiviral gene therapy. Engineered HIV vectors, with gag-pol and env supplied on separate packaging plasmids, deliver therapeutic genes: Zynteglo (β-thalassemia), Skysona (ALD), tisagenlecleucel (CAR-T).
- Oncogene discovery. Rous sarcoma virus v-src and the cellular c-src (Bishop & Varmus, Nobel 1989) launched cancer molecular biology. Many of the first named oncogenes, including src, myc, and ras, were identified through retroviral capture of cellular genes.
Variants and viral families within Retroviridae
- Lentivirus. Slow-progressing, infect non-dividing cells. HIV-1, HIV-2, SIV, FIV, visna. Workhorse of gene therapy vectors.
- Gammaretrovirus. MLV, feline leukemia virus. Require mitosis for nuclear entry. First-generation gene therapy.
- Alpharetrovirus. Avian leukosis, Rous sarcoma virus — source of v-src, the founding oncogene.
- Betaretrovirus. Mouse mammary tumor virus, Jaagsiekte sheep retrovirus.
- Deltaretrovirus. HTLV-1, HTLV-2, BLV.
- Spumavirus. Foamy viruses: integrate with less preference for genes and promoters than HIV or MLV, cause no known disease, and are proposed as safer vectors.
- Endogenous retroviruses. ~8% of the human genome; syncytins co-opted for placenta.
Common pitfalls and clinical traps
- M184V resistance. A single methionine-to-valine substitution in RT confers high-level resistance to lamivudine and emtricitabine but slightly reduces viral fitness, so lamivudine or emtricitabine is sometimes kept in a regimen to maintain the mutation.
- K65R cross-resistance. Tenofovir-selected K65R reduces susceptibility to most NRTIs except zidovudine. Genotypic resistance testing must precede any regimen change.
- NNRTI binding pocket fragility. A single K103N mutation knocks out efavirenz and nevirapine. Newer NNRTIs (rilpivirine, doravirine) tolerate K103N but have their own weak points: E138K for rilpivirine, and mutations such as V106A and Y188L for doravirine.
- Off-target integration. Early gammaretroviral gene therapy for X-linked SCID (1999–2002) cured 17 children but caused leukemia in 5, via MLV LTR activation of the LMO2 oncogene. Modern self-inactivating LTRs and lentiviral vectors reduced this dramatically.
- Drug-drug interactions via CYP3A4. Ritonavir-boosted PIs and cobicistat-boosted INSTIs strongly inhibit CYP3A4. Simvastatin, lovastatin, oral midazolam, and ergot alkaloids are contraindicated, and calcium channel blockers need dose adjustment and monitoring.
- Reservoir misunderstanding. Undetectable viral load is not cure. Stopping ART after years of suppression leads to rebound within 2–8 weeks.
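The resistance associations in the list above amount to a lookup from mutation to affected drugs. The sketch below encodes a subset of them and checks a regimen against a genotype. It is illustrative only, not a clinical tool: `RESISTANCE` and `compromised_drugs` are hypothetical names, and K65R is left out because the text describes its effect only as "most NRTIs" without listing them.

```python
# Mutation-to-drug associations transcribed from the list above.
# Illustrative only: real interpretation relies on a curated,
# regularly updated resistance database and clinical judgment.
RESISTANCE = {
    "M184V": {"lamivudine", "emtricitabine"},
    "K103N": {"efavirenz", "nevirapine"},
    "E138K": {"rilpivirine"},
}


def compromised_drugs(genotype, regimen):
    """Return the regimen drugs hit by a listed mutation in the genotype."""
    regimen_set = set(regimen)
    hit = set()
    for mutation in genotype:
        hit |= RESISTANCE.get(mutation, set()) & regimen_set
    return sorted(hit)


regimen = ["tenofovir", "emtricitabine", "efavirenz"]
print(compromised_drugs(["M184V", "K103N"], regimen))
```

With both M184V and K103N present, two of the three drugs in this example regimen are flagged, which is the situation genotypic testing before a regimen change is meant to catch.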
Frequently asked questions
Why is it called a retrovirus?
Retro- means "backward." Classical biology held information flow as DNA → RNA → protein (Crick's 1958 central dogma). Retroviruses go RNA → DNA → RNA → protein, reversing the first step. The name was coined after Temin and Baltimore showed in 1970 that an RNA tumor virus carries an enzyme — reverse transcriptase — that synthesizes DNA from an RNA template. Retroviruses also integrate their cDNA into host chromosomes, a permanent invasion no other RNA virus performs.
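The "backward" step can be shown as a toy string operation: reverse transcription builds a DNA strand complementary to the RNA genome, and transcribing that DNA strand recovers the original RNA. This is a minimal sketch that ignores priming, template switching, and second-strand synthesis; the function names and the example sequence are arbitrary choices for illustration.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the minus-strand cDNA (5'->3') complementary to a 5'->3' RNA."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


def transcribe(template_dna: str) -> str:
    """Return the RNA (5'->3') copied from a 5'->3' DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in reversed(template_dna))


genome = "AUGGCUAGCUUA"              # arbitrary example, not a viral sequence
cdna = reverse_transcribe(genome)    # RNA -> DNA, the "retro" step
print(cdna)
print(transcribe(cdna) == genome)    # DNA -> RNA restores the genome
```

Both strands are written 5'→3', which is why each function reverses the sequence as well as complementing it.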
What does the HIV virion contain?
Each ~100 nm particle has a lipid envelope studded with ~14 gp120/gp41 trimers and a conical p24 capsid. Inside: two copies of (+)ssRNA (~9.7 kb), bound by nucleocapsid p7, plus packaged RT, integrase, and protease. The diploid RNA enables template switching between the two copies during reverse transcription, increasing recombination. Packaging two genome copies is a distinctive feature of retroviruses.
How does reverse transcriptase make so many errors?
HIV-1 RT lacks 3'→5' proofreading. Error rate is ~10⁻⁴ per nucleotide per cycle. Combined with ~10¹⁰ virions per day in untreated patients, every single-nucleotide variant of the 9.7 kb genome is generated daily. Any single resistance mutation already exists in the quasispecies, which is why combination therapy, classically three drugs from two classes, is used: the chance of a triply resistant variant pre-existing is vanishingly small.
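The argument for combination therapy is an arithmetic one, and a rough calculation makes it concrete. The sketch below uses the error rate and daily virion count quoted above and adds three simplifying assumptions that are not from the text: each error is equally likely to produce any of the three alternative bases, mutations at different sites arise independently, and every virion results from a fresh round of reverse transcription.

```python
ERROR_RATE = 1e-4        # errors per nucleotide per cycle (from the text)
VIRIONS_PER_DAY = 1e10   # daily virion production, untreated (from the text)

# Assumption: an error yields each of the three alternative bases with
# equal probability, so one specific substitution occurs at rate / 3.
P_SPECIFIC = ERROR_RATE / 3


def expected_daily_variants(n_mutations: int) -> float:
    """Expected new virions per day carrying n specific substitutions.

    Assumes the substitutions arise independently in a single cycle.
    """
    return VIRIONS_PER_DAY * P_SPECIFIC ** n_mutations


for n in (1, 2, 3):
    print(n, f"{expected_daily_variants(n):.3g}")
```

Under these assumptions a given single mutant arises several hundred thousand times a day and a given double mutant about ten times a day, while a given triple mutant is expected less than once in several years. Real dynamics are more complicated (fitness costs, archived variants, sequential selection), so the numbers indicate scale only.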
What does integrase do, and why is it a drug target?
Integrase catalyzes two reactions: 3'-end processing trims 2 nucleotides from each end of the linear cDNA, then strand transfer covalently joins those ends to staggered cuts in host DNA, producing a permanent provirus. The chemistry has no human equivalent, so INSTIs are highly selective. Raltegravir (2007), elvitegravir, dolutegravir, bictegravir, and cabotegravir are the major INSTIs; dolutegravir is now first-line in WHO and US guidelines.
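The two integrase reactions leave a recognizable footprint, which a toy string model can reproduce: the provirus is two base pairs shorter at each end than the linear cDNA, and the staggered cuts in the host, once repaired, flank it with a short duplication of host sequence. This is a sketch of the outcome on one strand only, not of the chemistry. The 5 bp stagger is the value generally reported for HIV-1 and is an assumption added here; the function name and sequences are illustrative.

```python
def integrate(host: str, cdna: str, pos: int, trim: int = 2, stagger: int = 5) -> str:
    """Return the host top strand after integration at position pos.

    trim:    base pairs lost from each cDNA end through 3' processing
    stagger: offset between the two host cuts (assumed 5 bp for HIV-1);
             gap repair duplicates this stretch on both sides.
    """
    if pos < 0 or pos + stagger > len(host):
        raise ValueError("integration site does not fit in the host sequence")
    provirus = cdna[trim:len(cdna) - trim]
    duplicated = host[pos:pos + stagger]
    return host[:pos] + duplicated + provirus + duplicated + host[pos + stagger:]


host = "GATTACAGATTACA"      # arbitrary host sequence
cdna = "ACTGGAAGCTAGCAGT"    # toy cDNA ending in ...CAGT, as HIV-1 DNA does
print(integrate(host, cdna, pos=4))
```

In the output the host bases at positions 4 to 8 appear on both sides of the insert, and the provirus begins with TG and ends with CA.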
What is an endogenous retrovirus?
When a retrovirus integrates into a germline cell — sperm, egg, or their precursors — the provirus is inherited like any chromosomal locus. Endogenous retroviruses (ERVs) make up roughly 8% of the human genome, most disabled by mutation. A few have been domesticated: syncytin-1 and syncytin-2, derived from the env genes of ancient retroviral insertions, drive the cell-cell fusion that forms the placental syncytiotrophoblast. Without these co-opted retroviral proteins, the placenta as we know it would not exist. Mammals literally rely on retrovirus genes to gestate.
Why can't we cure HIV?
ART suppresses replication but cannot eliminate the latent reservoir: long-lived CD4+ memory T cells with silent integrated proviruses that produce no viral protein. The reservoir decays with a half-life of ~44 months, so waiting for it to decay on its own would take 70 years or more. Shock-and-kill, block-and-lock, CRISPR excision, and CCR5-Δ32 stem-cell transplants (the Berlin and London patients) are active strategies. Only a handful of people have been reported cured, all after stem-cell transplants.
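The decades-long wait follows directly from the half-life. The sketch below computes how long pure exponential decay would take to bring the reservoir below a single cell. The 44-month half-life is from the text; the starting size of about a million latently infected cells is an order-of-magnitude assumption, and the function name is illustrative.

```python
import math

HALF_LIFE_MONTHS = 44    # reservoir half-life on ART (from the text)
RESERVOIR_CELLS = 1e6    # assumption: order-of-magnitude reservoir size


def years_to_clear(initial_cells: float, half_life_months: float = HALF_LIFE_MONTHS) -> float:
    """Years of exponential decay until fewer than one cell remains."""
    halvings = math.log2(initial_cells)
    return halvings * half_life_months / 12


print(f"{years_to_clear(RESERVOIR_CELLS):.0f} years")
print(f"{years_to_clear(1e7):.0f} years")
```

A starting pool of a million cells needs about 20 halvings, or roughly 73 years; a tenfold larger pool adds about 12 more years. Either way the timescale exceeds a lifetime of therapy, and the model ignores clonal expansion of infected cells, which slows decay further.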