CRISPR-Cas Bacterial Immunity

Bacteria file ~30-bp snippets of the viruses that attack them, then use crRNA-guided Cas nucleases to shred any that come back: the natural origin of gene editing

CRISPR-Cas is the adaptive immune system bacteria and archaea use to fight viruses. When a bacteriophage injects its DNA, the Cas1-Cas2 complex captures a ~30-base-pair fragment and files it as a new "spacer" at the front of a CRISPR array — a chronological genetic memory of past infections. The array is transcribed into short crRNA guides, each carrying one spacer, and these guides load into a Cas nuclease such as Cas9. If the virus returns, the crRNA base-pairs with the matching sequence, the nuclease checks for a short PAM signal next to it, and then makes a double-strand cut that destroys the invader. First seen in 1987 and decoded between 2005 and 2012, this system was reprogrammed into the CRISPR gene-editing tools that won the 2020 Nobel Prize in Chemistry.

  • What it is: Bacterial/archaeal adaptive immunity
  • Memory unit: ~30-bp spacer in CRISPR array
  • Guide: crRNA (one spacer each)
  • Targeting signal: PAM (e.g. 5'-NGG-3' for Cas9)
  • Found in: ~40% bacteria, ~85% archaea
  • Decoded: 1987→2012; Nobel 2020 (Doudna & Charpentier)

What CRISPR-Cas actually is

Strip away the gene-editing hype and CRISPR-Cas is a defense system, the only known adaptive immune system in the prokaryotic world. Bacteria and archaea are under relentless attack by bacteriophages (viruses that infect bacteria), the most abundant biological entities on Earth at an estimated 10^31 particles, which kill roughly 20–40% of the ocean's bacteria every single day. A microbe that can remember a virus and pre-empt it has an enormous survival edge.

The acronym stands for Clustered Regularly Interspaced Short Palindromic Repeats — a literal description of what the locus looks like in the genome: short identical repeat sequences (typically 23–47 bp) separated by unique "spacer" sequences of similar length. Sitting next to this array is a cluster of cas (CRISPR-associated) genes encoding the protein machinery. The spacers are the memory; the Cas proteins are the hands that write and use it. The whole thing satisfies the textbook definition of adaptive immunity: it is acquired from experience, sequence-specific to a particular invader, and heritable — passed to every daughter cell because it lives in the chromosome.
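The repeat-spacer layout can be made concrete with a short sketch. It assumes, purely for illustration, that the array is available as a DNA string with exact, identical repeats; the repeat, leader, and spacer sequences are made up, and the function name `extract_spacers` is hypothetical. Real arrays have slightly degenerate repeats and need dedicated detection tools.

```python
def extract_spacers(array_dna: str, repeat: str) -> list[str]:
    """Return the unique sequences found between consecutive copies of `repeat`."""
    pieces = array_dna.split(repeat)
    # pieces[0] is whatever precedes the first repeat (the leader side) and
    # pieces[-1] is whatever follows the last repeat; neither is a spacer.
    return [p for p in pieces[1:-1] if p]


# Made-up toy sequences, not taken from any real genome.
REPEAT = "GTTGCACCGGCTAGCCGGTGCAAC"  # partially palindromic, like many real repeats
SPACERS = [
    "ATGCGTACCTGAAGTCCATTGACGGATCTA",
    "TTGACCGTAAGCTTCAGGATACCGTTAGCA",
    "CAGTTAGGCCATACGATTCGGAATCTGACT",
]
LEADER = "ATATTTAAATTATA"  # AT-rich stand-in for the leader

# Build repeat-spacer-repeat-spacer-...-repeat, then recover the spacers.
array_dna = LEADER + "".join(REPEAT + s for s in SPACERS) + REPEAT + "CCGTA"
recovered = extract_spacers(array_dna, REPEAT)
print(recovered == SPACERS)
```

The spacers come back in genome order, so the first element is the one closest to the leader.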

How the three stages work

CRISPR immunity runs as a three-act process. The first act writes new memory; the second prepares the memory for use; the third uses it to kill.

1. Adaptation (spacer acquisition). When phage or plasmid DNA enters the cell, the Cas1-Cas2 integrase complex (a 4:2 hexamer of two Cas1 dimers bridged by a Cas2 dimer, conserved across nearly all CRISPR types) captures a short fragment of the foreign DNA, the protospacer (~30 bp). Cas1-Cas2 then catalyzes a site-specific integration, much like a retroviral integrase, inserting the new spacer at the leader-proximal end of the array and duplicating the adjacent repeat. Because new spacers always go in at the same end, the array is a chronological log: the spacer nearest the leader is the most recent infection. Acquisition is biased toward foreign DNA because protospacers are preferentially harvested from the fragments RecBCD produces as it chews back free DNA ends and breaks at stalled replication forks, which are more common on invading DNA than on the chromosome.

2. Expression (crRNA biogenesis). The entire array is transcribed from the leader-end promoter into one long precursor, the pre-crRNA. This is then chopped at each repeat into mature CRISPR RNAs (crRNAs), each carrying a single spacer flanked by a piece of repeat. How it is cut depends on the system: Type I and III use a dedicated endoribonuclease (Cas6) that recognizes hairpins in the repeats; Type II (Cas9) uses a separate small RNA called the tracrRNA that base-pairs with each repeat, recruiting the host enzyme RNase III to cut the duplex.

3. Interference. A mature crRNA loads into a Cas effector — a single multidomain protein like Cas9 (Type II) or Cas12a (Type V), or a multi-subunit complex like Cascade (Type I) or the Csm/Cmr complex (Type III). The loaded effector patrols the cell. When it encounters DNA, it first checks for a short protospacer-adjacent motif (PAM) — for S. pyogenes Cas9 this is 5'-NGG-3'. Only if a PAM is present does the effector locally unwind the double helix and test whether the crRNA can base-pair with the exposed strand. A match across the seed region (the ~10–12 nucleotides next to the PAM) triggers full R-loop formation and activates the nuclease domains — the HNH domain cuts the target strand and the RuvC domain cuts the non-target strand, producing a blunt double-strand break ~3 bp upstream of the PAM. The shredded invader DNA cannot replicate, and the infection is defeated.
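The order of checks in the interference stage (PAM first, then seed, then full match, then a cut 3 bp upstream of the PAM) can be sketched in a few lines. This is a simplified model, not a simulation of Cas9: it scans one strand only, compares the spacer to the PAM-side strand as plain DNA letters, and demands an exact match. The function name, the seed length of 10, and the sequences are assumptions chosen for illustration.

```python
from typing import Optional

SEED_LEN = 10    # PAM-proximal nucleotides checked first (text gives ~10-12)
CUT_OFFSET = 3   # blunt cut ~3 bp upstream of the PAM


def find_cut_site(dna: str, spacer: str) -> Optional[int]:
    """Scan one strand 5'->3' for `spacer` followed by an NGG PAM.

    Returns the index of the first base after the cut (the break falls
    between index - 1 and index), or None if nothing is licensed for cutting.
    """
    n = len(spacer)
    for pam_start in range(n, len(dna) - 2):
        # Step 1: PAM check comes first. N is any base, then GG.
        if dna[pam_start + 1:pam_start + 3] != "GG":
            continue
        protospacer = dna[pam_start - n:pam_start]
        # Step 2: the seed (PAM-proximal end) must pair before anything else.
        if protospacer[-SEED_LEN:] != spacer[-SEED_LEN:]:
            continue
        # Step 3: a full match completes the R-loop and licenses cleavage.
        if protospacer == spacer:
            return pam_start - CUT_OFFSET
    return None


spacer = "GACGTTACCGGATTCAGTCA"                # made-up 20-nt spacer
phage = "TTAC" + spacer + "TGG" + "ACCT"       # protospacer next to a PAM
no_pam = "TTAC" + spacer + "TAA" + "ACCT"      # same protospacer, PAM destroyed

print(find_cut_site(phage, spacer))   # index just past the cut
print(find_cut_site(no_pam, spacer))  # None: perfect match but no PAM
```

Checking the two-letter PAM before the 20-letter comparison mirrors the biological logic: most positions are rejected cheaply, and only PAM-flanked sites pay the cost of unwinding.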

The molecular players and conditions

  • The CRISPR array. Repeats (23–47 bp, often partially palindromic so they form RNA hairpins) interspersed with unique spacers (typically 26–72 bp). Arrays range from a couple of spacers to hundreds; the myxobacterium Haliangium ochraceum carries one of the largest known arrays, with nearly 600 spacers. The leader sequence (~100–500 bp, AT-rich) upstream of the first repeat holds the promoter and the integration site.
  • Cas1 and Cas2. The universal "memory-writing" enzymes. Cas1 is the metal-dependent integrase; Cas2 is mostly a structural scaffold. Present in nearly every CRISPR type, which is why they are used to classify systems phylogenetically.
  • The effector nuclease. The defining protein of each type. Cas9 (1,368 amino acids in S. pyogenes) cuts DNA and makes blunt ends; Cas12a makes staggered ends with 5' overhangs and uses a single RuvC domain to cut both strands in turn; Cas13 (Type VI) targets RNA, not DNA.
  • The PAM (or PFS for RNA targeters). The 2–6 bp self/non-self discriminator next to the target. Different effectors read different PAMs: NGG for SpCas9, TTTV for Cas12a, an RNA "protospacer-flanking site" for Cas13.
  • tracrRNA. Unique to Type II — the trans-activating crRNA that both directs pre-crRNA processing and stays bound to Cas9 as a scaffold. The 2012 fusion of crRNA + tracrRNA into one single-guide RNA is what made programmable editing practical.
  • Conditions. Interference needs an active, expressed system and a target with both a matching protospacer and an intact PAM. Type III systems are unusual in needing active transcription of the target to trigger cutting, which lets them tolerate dormant, transcriptionally silent prophages and attack only phages that are actively expressing their genes.

CRISPR system types compared

| Property | Type I (Cascade) | Type II (Cas9) | Type V (Cas12a) | Type VI (Cas13) |
|---|---|---|---|---|
| Effector | Multi-subunit Cascade + Cas3 | Single Cas9 | Single Cas12a | Single Cas13 |
| Target | DNA | DNA | DNA | RNA |
| Cut type | Processive degradation by Cas3 helicase-nuclease | Blunt double-strand break | Staggered, 5' overhangs | Single-strand RNA, then collateral cleavage |
| Targeting signal | 5'-PAM (e.g. AAG) | 3'-PAM (NGG) | 5'-PAM (TTTV) | Protospacer-flanking site (PFS) |
| Needs tracrRNA? | No | Yes | No | No |
| crRNA processing | Cas6 endonuclease | RNase III + tracrRNA | Self-processed by Cas12a | Self-processed by Cas13 |
| Relative abundance in nature | Most common (~50% of loci) | ~10%, mostly in pathogens | Less common | Rare |
| Editing tool legacy | Large-deletion tools | The original CRISPR editor | Multiplex editing, diagnostics | RNA editing, SHERLOCK detection |

The numbers

  • Spacer length. ~26–72 bp, most commonly ~30–40 bp. The protospacer captured from the invader matches this length.
  • Guide-target match. Cas9 uses a 20-nucleotide spacer to recognize its target; the ~10–12 nt nearest the PAM (the seed) are the least tolerant of mismatches, and a seed mismatch usually aborts cutting.
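  • PAM. 2–6 bp; SpCas9 reads 5'-NGG-3'. Counting both strands, an NGG occurs on average about every 8 bp in random DNA with equal base frequencies (GG has a 1/16 chance at any position on each strand), so almost any sequence has a targetable site nearby.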
  • Cut position. Cas9 cuts ~3 bp upstream of the PAM, producing blunt ends.
  • Prevalence. CRISPR-Cas loci are found in roughly 40% of sequenced bacteria and about 85% of archaea.
  • Array size. From 2–3 spacers up to ~600; a typical array holds a few dozen.
  • Acquisition rate. Spacer acquisition is rare per cell per generation but, scaled across a population of billions during a phage outbreak, lets resistant clones emerge within hours.
  • Phage abundance. ~10^31 phage particles on Earth; phages turn over ~20–40% of ocean bacteria daily. This is the selective pressure that built CRISPR.
  • Cas9 size. The S. pyogenes Cas9 protein is 1,368 amino acids (~158 kDa); the synthetic sgRNA used in editing is ~100 nucleotides.
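The "one NGG every ~8 bp" figure is easy to check numerically. The sketch below assumes uniformly random DNA with equal base frequencies (real genomes deviate from this) and counts PAMs on both strands; the helper name `ngg_sites` and the fixed random seed are illustrative choices.

```python
import random


def ngg_sites(seq: str) -> int:
    """Count NGG PAMs on both strands of `seq`.

    A PAM on the reverse strand reads as CCN on the forward strand.
    """
    forward = sum(1 for i in range(len(seq) - 2) if seq[i + 1:i + 3] == "GG")
    reverse = sum(1 for i in range(len(seq) - 2) if seq[i:i + 2] == "CC")
    return forward + reverse


rng = random.Random(0)  # fixed seed so the run is reproducible
genome = "".join(rng.choice("ACGT") for _ in range(200_000))
mean_spacing = len(genome) / ngg_sites(genome)
print(f"one NGG per {mean_spacing:.1f} bp on average")
```

The expected value comes from simple arithmetic: GG has probability 1/4 × 1/4 = 1/16 at any position on one strand, so two strands give about 1/8, or one site per 8 bp. Counting a single strand would give roughly one per 16 bp.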

Where it shows up — organisms, dairy, and the clinic

  • The yogurt that proved it. The first direct demonstration of CRISPR immunity came not from a hospital but from a dairy company. In 2007, Rodolphe Barrangou and Philippe Horvath at Danisco showed that challenging Streptococcus thermophilus — a workhorse of yogurt and cheese fermentation — with a phage made the survivors acquire new spacers matching that phage and become resistant; deleting the spacer made them susceptible again. Industrial dairy starter cultures are still bred for phage resistance using natural CRISPR adaptation.
  • The arms race in the wild. CRISPR is a battleground. Phages mutate their protospacers and PAMs to escape stored spacers, and many carry anti-CRISPR (Acr) proteins that directly disable Cas effectors. AcrIIA4, for instance, is a small protein that mimics DNA and occupies the PAM-binding site of Cas9, so the enzyme can no longer engage its target. Some jumbo phages even build a proteinaceous "phage nucleus" that physically shields their genome from DNA-targeting Cas enzymes. Bacteria respond with primed acquisition, rapidly grabbing new spacers from an escaped phage.
  • Genotyping and forensics. Because the spacer array is a chronological record, microbiologists were already using it for spoligotyping (strain-typing Mycobacterium tuberculosis and Salmonella by their spacer patterns) long before CRISPR became an editing tool.
  • The editing revolution. In 2012, Emmanuelle Charpentier and Jennifer Doudna reprogrammed the natural S. pyogenes Type II system into a universal DNA-cutting tool, work for which they shared the 2020 Nobel Prize in Chemistry. The first CRISPR therapy, Casgevy (exagamglogene autotemcel), was approved in late 2023 for sickle-cell disease and β-thalassemia; it edits the BCL11A enhancer to reactivate fetal hemoglobin. The molecular machine doing the cutting is the very same bacterial immune nuclease, just handed a human-chosen guide.
  • Diagnostics. The collateral RNA-cleaving activity of Cas13 and the collateral single-stranded-DNA-cleaving activity of Cas12a power point-of-care tests (SHERLOCK and DETECTR, respectively) that detect viral genomes and were used during the COVID-19 pandemic, turning the bacterial defense into a readout.

CRISPR vs restriction-modification vs vertebrate immunity

| Property | CRISPR-Cas | Restriction-modification | Vertebrate adaptive immunity |
|---|---|---|---|
| Adaptive (learns from exposure)? | Yes | No (innate) | Yes |
| Sequence-specific memory? | Yes: each spacer is one pathogen | No: cuts any unmethylated site | Yes: antigen receptors |
| Memory storage | DNA spacers in the chromosome | None | Memory B and T cells |
| Heritable to offspring? | Yes (chromosomal, vertical) | Yes (the genes), but not learned | No (somatic, dies with the host) |
| Recognition molecule | crRNA guide (RNA) | Protein domain reads a fixed site | Antibody / TCR protein |
| Self/non-self discrimination | PAM absent in own array | Self-DNA methylated and protected | Thymic negative selection |
| Speed of new response | Hours to days (population-level) | Immediate but fixed | Days to weeks (clonal expansion) |
| Found in | Bacteria, archaea | Bacteria, archaea | Jawed vertebrates |

Common misconceptions and pitfalls

  • "CRISPR was invented as a gene-editing tool." No — CRISPR is a 2.5-billion-year-old bacterial immune system. Humans only borrowed it. The 2012 work reprogrammed an existing natural machine; it did not build a new one.
  • "Cas9 is CRISPR." Cas9 is just one effector, from one type (Type II), of a sprawling family. Cas3, Cascade, Cas12a, and Cas13 are all CRISPR effectors with different targets (some cut RNA) and different cut chemistry. Cas9's fame is an accident of being the simplest single-protein DNA cutter.
  • "The guide RNA finds the target by free 3D search." The effector first scans for PAMs and only unwinds DNA next to a PAM to test base pairing. Without that PAM-first strategy, scanning a multi-megabase genome would be impossibly slow, and the cell would also risk cutting its own spacer array.
  • "The spacer is stored as RNA." The memory is stored as DNA in the chromosome (the array). It is only transcribed into crRNA when needed. This is what makes it heritable.
  • "A perfect 20-bp match is enough to cut." Not without a PAM. And conversely, the seed region (nearest the PAM) is far less tolerant of mismatches than the PAM-distal end — off-target risk in editing comes mostly from seed-matching at non-target PAM-flanked sites.
  • "CRISPR is foolproof immunity." Phages routinely escape by mutating the protospacer or PAM, and many encode anti-CRISPR proteins that shut Cas down entirely. CRISPR is one move in a fast, ongoing coevolutionary arms race, not a permanent shield.
  • "All bacteria have CRISPR." Only about 40% of bacteria do (though ~85% of archaea). Many bacteria rely entirely on innate defenses like restriction-modification, abortive infection, or the dozens of newly discovered anti-phage systems.

Frequently asked questions

What are the three stages of CRISPR-Cas immunity?

CRISPR-Cas immunity runs in three stages. First, adaptation (also called spacer acquisition): when a bacteriophage or plasmid injects DNA, the Cas1-Cas2 complex grabs a roughly 30-base-pair fragment of that DNA, called a protospacer, and inserts it as a new spacer at the leader-proximal end of the CRISPR array, flanked by a fresh repeat. Second, expression (crRNA biogenesis): the whole array is transcribed into one long pre-crRNA, which is cut at each repeat into mature CRISPR RNAs (crRNAs), each bearing a single spacer. Third, interference: a crRNA loads into a Cas effector nuclease and guides it by base pairing to any returning DNA that matches the spacer; the nuclease verifies a short protospacer-adjacent motif (PAM) and then cuts the invading DNA, neutralizing it. The first stage writes the memory; the third stage uses it.
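The first two stages can be pictured as operations on a small data structure. The toy model below is a sketch under stated assumptions: the class name `CrisprArray`, its methods, and all sequences are invented for illustration, and crRNA processing is reduced to transcribing each spacer (the repeat-derived flanks of a real crRNA are left out).

```python
from dataclasses import dataclass, field


@dataclass
class CrisprArray:
    """Toy model of a CRISPR array; index 0 is the leader-proximal spacer."""
    repeat: str
    spacers: list[str] = field(default_factory=list)

    def acquire(self, protospacer: str) -> None:
        """Stage 1, adaptation: the newest spacer enters at the leader end."""
        self.spacers.insert(0, protospacer)

    def locus(self) -> str:
        """DNA of the array: repeat-spacer units plus a closing repeat.

        n spacers sit between n + 1 repeats, which is why each acquisition
        also duplicates a repeat.
        """
        return "".join(self.repeat + s for s in self.spacers) + self.repeat

    def crrnas(self) -> list[str]:
        """Stage 2, expression: one guide RNA per spacer (T -> U).

        Repeat-derived flanks on the mature crRNA are omitted for brevity.
        """
        return [s.replace("T", "U") for s in self.spacers]


array = CrisprArray(repeat="GTTGCACCGGCTAGCCGGTGCAAC")  # made-up repeat
array.acquire("ATGCGTACCTGAAGTCCATTGACGGATCTA")  # older infection
array.acquire("TTGACCGTAAGCTTCAGGATACCGTTAGCA")  # most recent infection

print(array.spacers[0])   # the newest spacer sits next to the leader
print(array.crrnas())
```

Reading `spacers` from index 0 onward walks backward in time through the infections this lineage has survived, which is the chronological-log property described above.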

What is a PAM and why does it matter?

A PAM, or protospacer-adjacent motif, is a short DNA sequence (2-6 base pairs) immediately next to the target site in the invader's genome — for the classic Streptococcus pyogenes Cas9 it is 5'-NGG-3', sitting just downstream of the 20-bp protospacer. The PAM is not part of the spacer and is not stored in the CRISPR array. It solves the self versus non-self problem: the bacterium's own CRISPR array contains the spacer sequence but is flanked by CRISPR repeats, not by a PAM, so Cas9 ignores the array and never cuts the cell's own memory. PAM recognition also comes first mechanistically — Cas9 scans for PAMs and only then unwinds the adjacent DNA to test for base pairing, which is what makes target search across a whole genome fast. No PAM, no cut, even with a perfect 20-bp match.

How do bacteria avoid cutting their own genome?

Two safeguards keep CRISPR from being an autoimmune disaster. The first is PAM dependence: the spacer is stored in the array next to a CRISPR repeat, but a real invader presents the matching protospacer next to a PAM. Cas9 requires the PAM to cut, so the array — which lacks the PAM — is safe. The second is target search logic: the crRNA must base-pair with the protospacer over the full seed region (the ~10-12 nucleotides nearest the PAM) before the nuclease commits to cutting; a mismatch in the seed aborts the reaction. When these safeguards fail — for example if a bacterium accidentally acquires a spacer from its own chromosome — the result is self-targeting and cell death, which is exactly why acquisition is biased toward foreign DNA at active replication forks and free DNA ends.
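The PAM safeguard is simple enough to demonstrate directly. In this sketch the same spacer sequence appears twice: once inside the cell's own array, followed by a repeat, and once in a phage genome, followed by a PAM. The repeat and spacer are made up (the repeat is only assumed to start with bases that do not spell NGG), and the function names are hypothetical.

```python
def has_ngg_pam(dna: str, protospacer_end: int) -> bool:
    """True if the three bases right after the protospacer read N-G-G."""
    pam = dna[protospacer_end:protospacer_end + 3]
    return len(pam) == 3 and pam[1:] == "GG"


def is_licensed_target(dna: str, spacer: str) -> bool:
    """True if the first copy of `spacer` in `dna` is followed by a PAM."""
    start = dna.find(spacer)
    return start != -1 and has_ngg_pam(dna, start + len(spacer))


REPEAT = "GTTGCACCGGCTAGCCGGTGCAAC"   # made-up repeat; begins GTT, not NGG
spacer = "GACGTTACCGGATTCAGTCA"       # made-up 20-nt spacer

own_array = REPEAT + spacer + REPEAT         # the cell's stored memory
phage_dna = "CATT" + spacer + "AGG" + "TCA"  # the invader, with a PAM

print(is_licensed_target(own_array, spacer))  # array copy is not a target
print(is_licensed_target(phage_dna, spacer))  # phage copy is a target
```

Both DNAs contain a perfect match to the guide; only the three bases that follow it differ, and that is what separates self from invader.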

Is CRISPR really an adaptive immune system?

Yes, and that is its defining feature. Adaptive immunity means the defense is acquired from experience, is sequence-specific to a particular pathogen, and is heritable. CRISPR satisfies all three. Each spacer is a record of a specific past infection; the array grows chronologically, so the spacers nearest the leader are the most recent encounters, a literal timeline of which viruses attacked. Because the array sits in the chromosome, the memory is passed to daughter cells, so an entire lineage inherits resistance to a phage one ancestor survived. This is fundamentally different from innate defenses like restriction-modification enzymes, which cut any DNA that carries their fixed recognition site in unmethylated form, whatever its source, and carry no memory of past invaders. Barrangou and colleagues' 2007 Streptococcus thermophilus experiments proved it directly: challenging the bacteria with a phage made the survivors acquire spacers matching that phage and become resistant, and deleting the spacer restored susceptibility.

How is the natural CRISPR system different from CRISPR gene editing?

In nature, the guide RNA is made by the bacterium itself, drawn from the spacer memory of past infections, and Type II systems use two RNAs — the crRNA carrying the spacer and a separate tracrRNA that base-pairs to the repeat and recruits the host RNase III and Cas9. The 2012 breakthrough by Jinek, Charpentier and Doudna was to fuse the crRNA and tracrRNA into a single synthetic single-guide RNA (sgRNA) and show that you can program Cas9 to cut any sequence you choose just by changing the 20-nucleotide guide, as long as a PAM is present. So the molecular machine is identical; the difference is who writes the guide. Editing also exploits the host cell's own repair: a Cas9 double-strand break is mended by error-prone non-homologous end joining (knockouts) or precise homology-directed repair (knock-ins). For the engineered tool side, see our CRISPR Mechanism page.
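"Programming" Cas9 amounts to picking a 20-nt stretch that sits directly 5' of an NGG on either strand. The sketch below lists such candidates for a target sequence; it is an illustration only, with hypothetical function names, and it ignores everything a real guide-design tool would score (off-targets, GC content, secondary structure).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_guides(target: str, guide_len: int = 20) -> list[tuple[str, str]]:
    """List (strand, guide) pairs for every NGG-flanked site in `target`.

    Guides are returned as DNA letters; the RNA guide has U in place of T.
    """
    hits = []
    for strand, seq in (("+", target), ("-", reverse_complement(target))):
        for pam_start in range(guide_len, len(seq) - 2):
            if seq[pam_start + 1:pam_start + 3] == "GG":
                hits.append((strand, seq[pam_start - guide_len:pam_start]))
    return hits


# Made-up targets: one with a PAM on the top strand, one on the bottom strand.
top = "ACGTACGTACGTACGTACGT" + "TGG" + "TTTT"
bottom = "CCA" + "ACGTACGTACGTACGTACGT"

print(candidate_guides(top))
print(candidate_guides(bottom))
```

Changing the target string is all it takes to get a different guide, which is the sense in which the same nuclease can be pointed at any PAM-flanked sequence.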

How do phages fight back against CRISPR?

Bacteria and phages are locked in a molecular arms race. Phages escape CRISPR by point mutations or deletions in the protospacer or the PAM, so the stored spacer no longer matches; a single mutation in the seed region or the PAM is often enough. More dramatically, many phages encode anti-CRISPR proteins (Acrs), first described by Bondy-Denomy and Davidson in 2013. These small proteins bind and inhibit Cas effectors directly: some block DNA binding by Cas9, others block cleavage, and at least one (AcrIIA4) mimics DNA to occupy the PAM-binding site. Some giant phages even build a nucleus-like protein shell that hides their genome from Cas nucleases. Bacteria counter by primed acquisition, rapidly grabbing new spacers from an escaped phage, and by carrying multiple CRISPR systems at once. The outcome is a fast coevolutionary cycle that drives the diversity of CRISPR loci seen across the microbial world.
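Why a single point mutation can be enough to escape is easier to see with a toy cutting rule. The rule below is an assumption made for illustration, not a measured property of any Cas9: the PAM must read NGG, the seed (here the 10 PAM-proximal positions) must match perfectly, and up to two mismatches are tolerated in the PAM-distal region. The function name and thresholds are hypothetical.

```python
def still_cut(spacer: str, protospacer: str, pam: str,
              seed_len: int = 10, max_distal_mismatches: int = 2) -> bool:
    """Toy rule: intact NGG PAM, perfect seed, few PAM-distal mismatches."""
    if len(protospacer) != len(spacer) or len(pam) != 3 or pam[1:] != "GG":
        return False
    mismatches = [a != b for a, b in zip(spacer, protospacer)]
    seed = mismatches[-seed_len:]      # positions nearest the PAM
    distal = mismatches[:-seed_len]    # positions far from the PAM
    return not any(seed) and sum(distal) <= max_distal_mismatches


spacer = "GACGTTACCGGATTCAGTCA"           # made-up stored spacer

wild_type = (spacer, "TGG")
pam_escape = (spacer, "TGA")              # one base changed in the PAM
seed_escape = (spacer[:-1] + "T", "TGG")  # one base changed in the seed
distal_mutant = ("A" + spacer[1:], "TGG") # one base changed far from the PAM

for name, (proto, pam) in [("wild type", wild_type),
                           ("PAM mutant", pam_escape),
                           ("seed mutant", seed_escape),
                           ("distal mutant", distal_mutant)]:
    print(name, still_cut(spacer, proto, pam))
```

Under this rule the PAM and seed mutants escape while the distal mutant is still cut, matching the pattern described above: mutations near the PAM are the cheapest route to escape.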