Cochlea & Hearing

A 35 mm spiral that sorts sound by frequency — and turns a few nanometres of motion into nerve spikes

The cochlea is a fluid-filled spiral, about 35 mm long uncoiled with ~3,500 inner and ~12,000 outer hair cells, that physically sorts sound by frequency along its basilar membrane — high pitches (up to 20 kHz) peak at the stiff base, low pitches (down to 20 Hz) at the floppy apex. Each hair cell converts a few nanometres of stereocilia deflection into nerve signals by opening mechanically-gated TMC1 channels, letting K+ flood in from the +80 mV endolymph and triggering glutamate release onto the auditory nerve in under a millisecond.

  • Uncoiled length: ~35 mm, ~2.75 turns
  • Frequency range: 20 Hz – 20 kHz (human)
  • Hair cells: ~3,500 inner + ~12,000 outer
  • Threshold deflection: ~0.3 nm (sub-atomic eardrum motion)
  • Endocochlear potential: +80 mV (K+-rich endolymph)
  • Amplifier gain: up to ~1,000× (prestin, outer hair cells)

What the cochlea actually is

Hold a tiny snail shell to your ear and you have the right picture: the cochlea (Latin cochlea, "snail") is a fluid-filled tube about 35 mm long, coiled into roughly 2.5–2.75 turns around a central bony pillar (the modiolus) and packed into a space the size of a chickpea in the temporal bone. It does the single most important job in hearing: it takes the messy pressure wave that air delivers to your eardrum and breaks it apart into its constituent frequencies — a biological Fourier analyser made of jelly and protein.

In cross-section the tube has three parallel chambers stacked on top of each other. The top chamber (scala vestibuli) and bottom chamber (scala tympani) are filled with ordinary perilymph and join at the very tip through a hole called the helicotrema. Sandwiched between them is the middle chamber (scala media), filled with the strange potassium-rich endolymph. The floor of the middle chamber is the basilar membrane, and riding on top of it is the organ of Corti — the sensory engine, carrying the hair cells. Above them hangs the gelatinous tectorial membrane. When sound enters, every layer in this stack moves, and that motion is what the hair cells read.

How sound becomes a nerve signal, step by step

The journey from air pressure to brain has a clean chain of conversions, each amplifying or focusing the signal:

  1. Air to bone. Sound vibrates the eardrum, which drives the three middle-ear ossicles (malleus, incus, stapes). The stapes hammers the oval window. Because the eardrum's area is ~17× that of the oval window and the ossicles act as a lever, this stage boosts pressure roughly 22-fold (~27 dB) so that airborne sound can move dense cochlear fluid without bouncing off.
  2. Fluid travelling wave. The stapes piston pushes perilymph in scala vestibuli, setting up a wave that ripples along the basilar membrane. Because the membrane's stiffness changes 100-fold from base to apex, the wave grows, peaks at the position matching the sound's frequency, then collapses — Georg von Békésy's travelling wave.
  3. Bundle deflection. Where the membrane peaks, the organ of Corti shears against the overlying tectorial membrane. This bends each hair cell's stereocilia bundle toward its tallest row by a few nanometres.
  4. Tip-link gating. The deflection stretches the tip-links — ~150–180 nm filaments of cadherin-23 and protocadherin-15 — that connect each shorter stereocilium to the side of its taller neighbour. Tension pulls open the mechanically-gated transduction channel (built on TMC1) in as little as tens of microseconds.
  5. Potassium influx. The endolymph bathing the bundle is high in K+ and sits at +80 mV; the cell interior is near -45 mV (closer to -70 mV in some hair cells). That ~125–150 mV driving force shoves K+ into the cell, depolarizing it. (Unusually for a sensory cell, the receptor current is carried by K+, not Na+.)
  6. Glutamate release. Depolarization opens CaV1.3 voltage-gated Ca2+ channels at the cell base. Ca2+ entry triggers vesicles at the ribbon synapse to dump glutamate onto the dendrites of spiral ganglion neurons.
  7. Spike train. Glutamate fires action potentials in the auditory nerve (cranial nerve VIII), which carries the frequency map and the precise timing of the sound to the cochlear nucleus and on up to the auditory cortex. The whole relay, eardrum to first spike, takes under ~1 ms.

The bundle re-closes its channels via adaptation: tiny myosin motors (myosin-1c) climb and slip along the actin core to reset tip-link tension, so the cell can respond to the next cycle even at 20,000 cycles per second.
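The pressure boost quoted in step 1 can be checked with a short calculation. This is a minimal sketch that treats the middle ear as an ideal transformer: the ~17× area ratio comes from the text, while the lever ratio of 1.3 is an assumed textbook-style value that the text does not state.

```python
import math

AREA_RATIO = 17.0   # eardrum area : oval-window area (from the text)
LEVER_RATIO = 1.3   # ASSUMPTION: ossicular lever ratio, not given in the text


def pressure_gain(area_ratio: float, lever_ratio: float) -> float:
    """Pressure gain of an idealised area-plus-lever transformer."""
    return area_ratio * lever_ratio


def to_decibels(pressure_ratio: float) -> float:
    """Convert a pressure (amplitude) ratio to decibels."""
    return 20.0 * math.log10(pressure_ratio)


gain = pressure_gain(AREA_RATIO, LEVER_RATIO)
print(f"pressure gain ~{gain:.1f}x, ~{to_decibels(gain):.1f} dB")
```

Under these assumptions the result lands close to the "roughly 22-fold (~27 dB)" figure above. A real middle ear is frequency-dependent and less efficient than this ideal model.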

Tonotopy — why pitch is a place

The cochlea's defining trick is that frequency is encoded as position. The basilar membrane is not uniform: at the base it is narrow (~100 µm) and taut, like the short high strings of a piano; at the apex it is wide (~500 µm) and ~100 times more compliant, like the long bass strings. A 20 kHz tone peaks within the first millimetre of the base; a 20 Hz tone travels almost the whole length to peak near the apex. Averaged over the whole length, the place of peak vibration moves about 3.5 mm per octave, because the human cochlea spreads its ~10 octaves across the 35 mm length.
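As a rough illustration of place coding, the sketch below spreads the 20 Hz to 20 kHz range evenly, octave by octave, along a 35 mm membrane. The uniform logarithmic map is a simplifying assumption made here, not the measured human map: published place-frequency fits such as Greenwood's function deviate from it, especially toward the apex. The function names are illustrative.

```python
import math

COCHLEA_LENGTH_MM = 35.0   # uncoiled length (from the text)
F_HIGH_HZ = 20_000.0       # peaks at the base
F_LOW_HZ = 20.0            # peaks at the apex


def octaves_spanned() -> float:
    """Number of octaves between the lowest and highest audible frequency."""
    return math.log2(F_HIGH_HZ / F_LOW_HZ)


def mm_per_octave() -> float:
    """Average membrane length devoted to one octave."""
    return COCHLEA_LENGTH_MM / octaves_spanned()


def place_from_base_mm(freq_hz: float) -> float:
    """Distance from the base of the peak for freq_hz (uniform log map ASSUMED)."""
    if not F_LOW_HZ <= freq_hz <= F_HIGH_HZ:
        raise ValueError("frequency outside the modelled 20 Hz - 20 kHz range")
    return mm_per_octave() * math.log2(F_HIGH_HZ / freq_hz)


print(f"{octaves_spanned():.2f} octaves, {mm_per_octave():.2f} mm per octave")
for f in (20_000, 4_000, 1_000, 250, 20):
    print(f"{f:>6} Hz peaks ~{place_from_base_mm(f):.1f} mm from the base")
```

In this toy map every doubling of frequency shifts the peak by the same distance toward the base, which is the essence of tonotopy even though the real spacing is not perfectly uniform.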

This "place coding" (place theory, from Hermann von Helmholtz, confirmed mechanically by von Békésy) is preserved all the way up the auditory pathway — the cochlear nucleus, inferior colliculus, and auditory cortex all keep neat frequency maps. It is also what makes cochlear implants work: the implant's electrode array runs along the spiral and stimulates base electrodes for high pitches and apical electrodes for low ones, directly exploiting tonotopy to bypass dead hair cells.
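The implant idea can be sketched as a simple band-to-electrode assignment. Everything numeric here is hypothetical: the 12-electrode count and the 200 Hz to 8 kHz analysis range are illustrative assumptions, not the specification of any real device, and actual implants use manufacturer-specific frequency maps.

```python
def band_edges(f_low: float, f_high: float, n_bands: int) -> list[float]:
    """Logarithmically spaced band edges from f_low to f_high."""
    ratio = (f_high / f_low) ** (1.0 / n_bands)
    return [f_low * ratio**i for i in range(n_bands + 1)]


def electrode_bands(f_low: float, f_high: float, n_electrodes: int):
    """Assign one band per electrode; index 0 is the most basal electrode.

    Basal electrodes receive the highest bands and apical electrodes the
    lowest, mirroring the tonotopic layout described in the text.
    """
    edges = band_edges(f_low, f_high, n_electrodes)
    low_to_high = list(zip(edges[:-1], edges[1:]))
    return low_to_high[::-1]


# HYPOTHETICAL array: 12 electrodes covering 200 Hz - 8 kHz.
bands = electrode_bands(200.0, 8000.0, 12)
for index, (lo, hi) in enumerate(bands):
    print(f"electrode {index:2d} (0 = most basal): {lo:7.0f} - {hi:7.0f} Hz")
```

Logarithmic spacing is chosen here because the cochlea itself devotes roughly equal length to each octave; it is a design assumption of the sketch rather than a claim about clinical practice.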

The players: cells, fluids, and proteins

  • Inner hair cells (~3,500, single row). The true sensors. About 95% of afferent (type I) spiral ganglion fibres synapse on them, so they report essentially everything you consciously hear.
  • Outer hair cells (~12,000, three rows). Mostly motors. Their membrane is packed with millions of copies of prestin (SLC26A5), a voltage-driven motor protein that makes the cell shorten and lengthen by a few percent on every sound cycle. This pumps energy back into the travelling wave — the cochlear amplifier — boosting weak sounds up to ~1,000-fold and sharpening frequency tuning.
  • Stria vascularis. A capillary-rich epithelium on the wall of scala media that pumps K+ to generate the +80 mV endocochlear potential — the battery that powers transduction.
  • Spiral ganglion neurons. The bipolar neurons whose central axons form the auditory nerve; type I (95%) wire inner hair cells, type II (5%) wire outer hair cells.
  • Tip-link proteins. Cadherin-23 (upper) and protocadherin-15 (lower) form the gating spring; mutations cause Usher syndrome (deaf-blindness).
  • TMC1. The pore-forming subunit of the mechanotransduction channel; identified in 2011–2018 work and a leading deafness gene (DFNA36/DFNB7).

Inner vs outer hair cells

Property | Inner hair cells | Outer hair cells
Number per ear | ~3,500 (one row) | ~12,000 (three rows)
Primary role | Sensory: report sound to brain | Motor: amplify the travelling wave
Afferent nerve supply | ~95% of afferent fibres (type I, ~10–20 per cell) | ~5% of afferent fibres (type II)
Efferent control | Sparse | Heavy (medial olivocochlear feedback)
Key motor protein | None (passive sensor) | Prestin (SLC26A5), ~10⁶ copies/cell
Coupling to tectorial membrane | Tips free-floating (fluid-driven) | Tallest stereocilia embedded in it
Otoacoustic emissions | Do not produce | Produce (basis of newborn screening)
Loss consequence | Profound sensorineural deafness | Loss of faint-sound gain (~40–50 dB)

The numbers that make hearing extraordinary

  • Sub-atomic sensitivity. At the threshold of hearing the eardrum moves ~10 pm — less than the radius of a hydrogen atom (~53 pm) — and the stereocilia tip deflects by only ~0.3 nm, an angle near 0.003°. The transduction channel gates at only ~0.5 pN per channel.
  • Dynamic range. The ear spans from 0 dB SPL (~20 µPa) to ~120 dB at the pain threshold, a 10¹²-fold range in intensity (a million-million), compressed by the cochlear amplifier into a manageable neural signal.
  • Speed. Transduction channels open in tens of microseconds and outer hair cells flex at up to 20 kHz — far faster than any conventional ion channel or molecular motor elsewhere in the body.
  • Frequency resolution. A trained ear distinguishes tones differing by ~0.2–0.3% near 1 kHz (a few hertz), thanks to the amplifier sharpening each travelling-wave peak.
  • Fixed inventory. ~15,000 hair cells per ear at birth, none replaced. Compare that to ~120 million rods in one retina — hearing runs on a tiny, irreplaceable cell count.
  • Endocochlear battery. The +80 mV endolymph plus the ~-45 mV cell interior gives a ~125 mV driving force across the apical membrane (~150 mV for cells resting nearer -70 mV), among the largest standing potential differences across any cell membrane in the body.
  • Range across animals. Humans hear 20 Hz–20 kHz; dogs to ~45 kHz; mice ~80 kHz; bats and dolphins echolocate up to ~150–160 kHz; elephants and whales sense infrasound below 20 Hz. Higher-frequency hearers have stiffer, shorter basilar membranes.
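The decibel figures in this list follow from the standard definitions, and a few lines of arithmetic make the conversions explicit. This is a minimal sketch using the 20 µPa reference pressure quoted above; the helper names are illustrative.

```python
import math

P_REF_PA = 20e-6  # 0 dB SPL reference pressure, ~20 µPa (from the text)


def intensity_ratio(level_db: float) -> float:
    """Intensity (power) ratio corresponding to a level difference in dB."""
    return 10.0 ** (level_db / 10.0)


def pressure_ratio(level_db: float) -> float:
    """Pressure (amplitude) ratio corresponding to a level difference in dB."""
    return 10.0 ** (level_db / 20.0)


def amplitude_gain_db(gain: float) -> float:
    """Express an amplitude gain (e.g. 1,000-fold) in dB."""
    return 20.0 * math.log10(gain)


PAIN_THRESHOLD_DB = 120.0
print(f"intensity range: {intensity_ratio(PAIN_THRESHOLD_DB):.0e}-fold")
print(f"pressure range:  {pressure_ratio(PAIN_THRESHOLD_DB):.0e}-fold")
print(f"pressure at 120 dB SPL: {P_REF_PA * pressure_ratio(PAIN_THRESHOLD_DB):.0f} Pa")
print(f"1,000-fold amplifier gain = {amplitude_gain_db(1000.0):.0f} dB")
```

The same 120 dB span is a million-million-fold in intensity but only a million-fold in pressure, because intensity scales with pressure squared.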

Where it shows up: disease, drugs, and devices

  • Noise-induced hearing loss. Loud sound mechanically snaps tip-links and overdrives outer hair cells, killing them with Ca2+ overload and reactive oxygen species. A single 120 dB blast can destroy outer hair cells; because high-frequency cells at the base see every travelling wave, the 4 kHz "noise notch" is the classic early sign.
  • Presbycusis (age-related loss). Cumulative hair-cell and stria death, hitting high frequencies first — which is why consonants (rich in 2–8 kHz energy) blur before vowels and "everyone mumbles."
  • Ototoxic drugs. Aminoglycoside antibiotics (gentamicin) and the chemotherapy agent cisplatin enter hair cells through the transduction channel itself and poison them — basal, high-frequency cells first.
  • Cochlear implants. For dead hair cells, an electrode array threaded along the scala tympani directly stimulates spiral ganglion neurons at tonotopically appropriate positions — restoring useful hearing to over a million recipients worldwide.
  • Otoacoustic emissions screening. Healthy outer hair cells leak sound back out of the ear; a microphone in the ear canal detects it. This is the painless test run on newborns within days of birth.
  • Genetic deafness. Mutations in connexin 26 (GJB2, the most common inherited cause), TMC1, the tip-link cadherins (Usher syndrome), prestin (SLC26A5), and stria K+ channels (Jervell and Lange-Nielsen) all converge on the cochlea.

Common misconceptions and pitfalls

  • "Hair cells have actual hairs." No — the "hairs" are stereocilia: rigid, actin-filled rods arranged in a precise height-graded staircase. They don't bend like hairs; they pivot stiffly at their base, and tip-links between them do the gating. (The single true cilium, the kinocilium, is present only during development and is lost in mature mammalian cochlear cells.)
  • "The receptor current is carried by sodium." Hearing is the great exception. Because the bundle sits in K+-rich endolymph, the depolarizing transduction current is carried by potassium flooding in — the opposite of most excitable cells, where Na+ depolarizes.
  • "The cochlea is a passive microphone." It actively amplifies. Outer hair cells inject mechanical energy via prestin on every cycle; the cochlea even emits sound. A purely passive cochlea would be ~40–60 dB less sensitive and far less sharply tuned.
  • "Loudness is coded by bigger nerve signals; pitch by signal size." Pitch is coded by place (which hair cells fire) plus, for low frequencies, phase-locked spike timing. Loudness is coded by firing rate and by how many fibres are recruited — not by amplitude of individual spikes, which are all-or-nothing.
  • "Hair cells grow back like skin." Birds, fish, and amphibians regenerate hair cells via Atoh1; mammals do not. Mammalian hearing loss from hair-cell death is permanent — the central reason hearing damage is so consequential.
  • "The eardrum does the frequency analysis." The eardrum and ossicles only conduct and match impedance. The actual frequency decomposition happens mechanically on the basilar membrane, sharpened by outer hair cells — von Békésy's Nobel-winning insight.

Frequently asked questions

How does the cochlea separate high and low pitches?

The cochlea sorts pitch by place. Its basilar membrane is a graded mechanical filter: at the base (near the oval window) it is narrow — about 100 micrometres wide — and stiff, so it resonates best with high frequencies up to ~20 kHz. As you travel toward the apex it widens to roughly 500 micrometres and becomes about 100 times less stiff, so it resonates with progressively lower frequencies, down to ~20 Hz at the helicotrema. A pure tone sets up a travelling wave that grows as it moves and peaks sharply at one location, then dies away. The brain reads which hair cells are firing as a map of frequency. Georg von Békésy directly observed these travelling waves in cadaver cochleae and won the 1961 Nobel Prize for showing that the membrane itself, not the nerves, performs the first frequency analysis. This place coding is why a cochlear implant works by stimulating different electrode positions for different pitches.

What exactly is a hair cell and how does it turn vibration into a signal?

A hair cell is a sensory receptor topped by a staircase bundle of 30–300 stiff actin-filled rods called stereocilia (not true hairs). When sound vibrates the basilar membrane, the bundle shears and the stereocilia pivot toward the tallest row. Tip-links (fine ~150–180 nm protein filaments of cadherin-23 and protocadherin-15 connecting the tip of each shorter stereocilium to the side of its taller neighbour) are pulled taut and yank open mechanically-gated transduction channels built around the TMC1 protein. The channels gate in tens of microseconds, among the fastest known in biology. Because the fluid bathing the bundle (endolymph) is uniquely high in potassium and held at +80 mV, opening the channels lets K+ rush in and depolarize the cell. That opens voltage-gated Ca2+ channels at the base, triggering glutamate release onto the auditory nerve. The whole electrical-to-chemical relay completes in under a millisecond.
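To see why gating in tens of microseconds matters, compare it with the length of one sound cycle. The sketch below uses 20 µs as an assumed stand-in for "tens of microseconds"; that single value is an illustration, not a measured figure from the text.

```python
GATING_TIME_US = 20.0  # ASSUMPTION: illustrative value for "tens of microseconds"


def period_us(freq_hz: float) -> float:
    """Duration of one cycle of a tone, in microseconds."""
    return 1e6 / freq_hz


def gating_fraction(freq_hz: float, gating_time_us: float = GATING_TIME_US) -> float:
    """Fraction of one cycle taken up by channel gating."""
    return gating_time_us / period_us(freq_hz)


for f in (100.0, 1_000.0, 20_000.0):
    print(f"{f:>8.0f} Hz: period {period_us(f):8.1f} us, "
          f"gating takes {gating_fraction(f):.1%} of a cycle")
```

At 20 kHz one cycle lasts only 50 µs, so a channel that needed a millisecond to open could not follow the stimulus at all.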

How sensitive is hearing — how small a movement can a hair cell detect?

Astonishingly small. At the threshold of hearing the eardrum moves less than the diameter of a hydrogen atom (about 10 picometres) and the stereocilia bundle deflects by only a fraction of a nanometre, roughly 0.3 nm, an angle of about 0.003 degrees. A deflection equivalent to moving the top of the Eiffel Tower by a thumb's width is enough to be audible. The transduction channel needs only ~0.5 pN of force per channel to gate. Outer hair cells make this possible by actively amplifying the travelling wave: their motor protein prestin changes the cell length in step with the sound, pumping mechanical energy back into the membrane and boosting faint sounds by about 40–60 dB (up to ~1,000-fold in amplitude). Without this cochlear amplifier, for example after noise damages outer hair cells, the quietest 40–50 dB of hearing is simply lost.
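The quoted angle can be reproduced from the threshold deflection if a bundle height is assumed. The 5 µm height used below is an assumption chosen for illustration (the text does not give one), so treat the result as an order-of-magnitude check.

```python
import math

THRESHOLD_DEFLECTION_M = 0.3e-9  # ~0.3 nm tip deflection (from the text)
BUNDLE_HEIGHT_M = 5e-6           # ASSUMPTION: ~5 µm bundle height, not in the text


def deflection_angle_deg(tip_displacement_m: float, bundle_height_m: float) -> float:
    """Pivot angle of a rigid rod whose tip moves sideways by tip_displacement_m."""
    return math.degrees(math.atan2(tip_displacement_m, bundle_height_m))


angle = deflection_angle_deg(THRESHOLD_DEFLECTION_M, BUNDLE_HEIGHT_M)
print(f"threshold pivot angle ~{angle:.4f} degrees")
```

With these inputs the angle comes out at a few thousandths of a degree, consistent with the ~0.003 degree figure above; a taller or shorter bundle would shift it proportionally.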

Why is hearing loss so often permanent?

Mammalian cochlear hair cells do not regenerate. You are born with your full set — roughly 15,000 per ear — and once they die from loud noise, ageing, ototoxic drugs (aminoglycoside antibiotics, the chemotherapy agent cisplatin), or genetic defects, they are gone for life. Birds, fish, and amphibians regrow hair cells from supporting cells via the transcription factor Atoh1, but in mammals the supporting cells stay quiescent. Loud noise also snaps the tip-links and floods cells with damaging Ca2+ and reactive oxygen species; a single 120 dB blast can kill outer hair cells outright. Because high-frequency hair cells sit at the vulnerable base where every travelling wave passes, age-related hearing loss (presbycusis) and noise damage both attack high pitches first. Active research aims to restart Atoh1 or convert supporting cells to restore lost cells.

What is the difference between inner and outer hair cells?

They look similar but do opposite jobs. The ~3,500 inner hair cells form a single row and are the true sensors: about 95 percent of the auditory nerve fibres (spiral ganglion type I neurons) carry their signals to the brain, so they report what you hear. The ~12,000 outer hair cells form three rows and are mostly motors, not reporters: only ~5 percent of nerve fibres leave them, while many efferent fibres from the brainstem control them. Their job is the cochlear amplifier — driven by the membrane motor protein prestin, they shorten and lengthen by a few percent on every sound cycle (up to ~20,000 times per second), sharpening the travelling wave's peak and amplifying weak sounds. This active motion leaks measurable sound back out of the ear, called otoacoustic emissions, which is exactly what newborn hearing screens detect.

What is endolymph and why is the +80 mV endocochlear potential important?

Endolymph is the fluid filling the scala media, the middle chamber that bathes the tops of the hair cells. It is unique among body fluids: high in K+ (~150 mM) and low in Na+, the reverse of normal extracellular fluid, and it is held at a positive voltage of about +80 mV, the endocochlear potential, generated by the stria vascularis pumping K+ into the chamber. Because the inside of a hair cell sits near -45 to -70 mV, the driving force across the apical membrane is enormous, about 125–150 mV. When transduction channels open, this driving force shoves K+ into the cell extremely fast, giving hearing its speed and sensitivity. Knock out the stria's KCNQ1/KCNE1 or KCNJ10 potassium channels and the endocochlear potential collapses, causing deafness (KCNQ1/KCNE1 mutations underlie Jervell and Lange-Nielsen syndrome). It is, in effect, a standing battery that powers the ear.
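The size of the driving force can be estimated with the Nernst equation. In this sketch the endolymph K+ concentration and the voltages come from the text, while the intracellular K+ concentration of 140 mM and body temperature of 37 °C are assumptions made for the calculation.

```python
import math

GAS_CONSTANT = 8.314462618   # J / (mol K)
FARADAY = 96485.33212        # C / mol
BODY_TEMP_K = 310.15         # ASSUMPTION: 37 degrees C

K_ENDOLYMPH_MM = 150.0       # from the text
K_INTRACELLULAR_MM = 140.0   # ASSUMPTION: typical cytoplasmic K+, not in the text


def nernst_mv(conc_out_mm: float, conc_in_mm: float, valence: int = 1) -> float:
    """Equilibrium potential (inside relative to outside) in millivolts."""
    volts = (GAS_CONSTANT * BODY_TEMP_K) / (valence * FARADAY)
    return 1000.0 * volts * math.log(conc_out_mm / conc_in_mm)


def inward_k_driving_force_mv(endocochlear_mv: float, intracellular_mv: float) -> float:
    """Magnitude of the inward K+ driving force across the apical membrane."""
    e_k = nernst_mv(K_ENDOLYMPH_MM, K_INTRACELLULAR_MM)
    return (endocochlear_mv - intracellular_mv) + e_k


for v_in in (-45.0, -70.0):
    force = inward_k_driving_force_mv(80.0, v_in)
    print(f"cell at {v_in:.0f} mV: inward K+ driving force ~{force:.0f} mV")
```

Because K+ is similarly concentrated on both sides of the apical membrane under these assumptions, the Nernst term contributes only a couple of millivolts; almost all of the driving force is the voltage difference set up by the endocochlear potential.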