Phonetics
Aspiration
The puff of air after a stop — invisible to English ears, lexical in Hindi
Aspiration is the audible puff of air [ʰ] released after a voiceless stop before the following vowel begins. Phonetically it is a long voice onset time — the gap between consonant release and vowel voicing. English speakers produce it automatically in pin [pʰɪn] but never in spin [spɪn], without ever noticing. Hindi speakers, by contrast, hear it as a phoneme: pal "moment" and phal "fruit" differ only in that puff of air. Korean stretches the contrast further into a three-way system. Aspiration is one of the cleanest demonstrations that the same physical sound can have entirely different cognitive status depending on the language.
- IPA: superscript [ʰ] after the stop: [pʰ tʰ kʰ]
- Phonetic measure: VOT > ~50 ms (positive)
- Phonemic in: Hindi, Mandarin, Thai, Korean, Greek (anc.)
- Allophonic in: English, German, Danish
- Absent from: Spanish, French, Russian, Japanese
- Hindi minimal pair: पल /pal/ "moment" vs फल /pʰal/ "fruit"
What aspiration actually is
When you produce a voiceless stop like /p/, three things happen in sequence: the lips close (creating oral pressure), the closure is released (the burst), and at some later moment the vocal folds start vibrating for the following vowel. The interval between release and voice onset is voice onset time, or VOT. Aspiration is what you hear when that interval is long enough that air rushes through an open glottis before voicing begins — an audible [h]-like puff.
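The definition above reduces to a single subtraction. The sketch below assumes you already have two hand-annotated time points for a token (the function name and the example times are hypothetical, not measurements) and returns VOT in milliseconds, with the sign convention used throughout this page.

```python
def voice_onset_time_ms(release_s: float, voicing_onset_s: float) -> float:
    """Return VOT in milliseconds: voicing onset minus stop release.

    Positive values mean voicing lags the release; negative values mean
    voicing starts before the release (prevoicing).
    """
    return (voicing_onset_s - release_s) * 1000.0


# Hypothetical annotation times in seconds, chosen only for illustration.
print(round(voice_onset_time_ms(0.250, 0.320)))  # 70  (voicing lags: aspirated range)
print(round(voice_onset_time_ms(0.250, 0.170)))  # -80 (voicing leads: prevoiced)
```

In practice the two time points would come from a waveform or spectrogram annotation; the arithmetic stays the same.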
Hold your hand in front of your mouth and say pin. You'll feel a clear puff. Now say spin. The puff is gone, even though both words contain "p". English allophonically aspirates voiceless stops at the start of stressed syllables but suppresses aspiration after /s/. Speakers internalise this as a single phoneme /p/ realised as either [pʰ] or [p] depending on context — and they cannot easily tell the two apart by ear.
Voice onset time as a continuum
VOT lets us put voiced, unaspirated, and aspirated stops on a single ruler. Negative VOT means voicing precedes release ("prevoicing"); positive VOT means voicing lags. Languages carve the continuum differently:
| Stop type | Approx VOT | Example language | Sample word | IPA | Gloss |
|---|---|---|---|---|---|
| Voiced (prevoiced) | -100 to -50 ms | Spanish, French, Russian | Spanish baño | [ˈbaɲo] | "bath" |
| Voiceless unaspirated | 0 to 30 ms | Spanish, French, Mandarin /p/ | Spanish paño | [ˈpaɲo] | "cloth" |
| Voiceless unaspirated | 0 to 30 ms | English (after /s/) | English spin | [spɪn] | "spin" |
| Voiceless aspirated | 60 to 100 ms | English (stressed onset) | English pin | [pʰɪn] | "pin" |
| Voiceless aspirated | 70 to 110 ms | Hindi, Mandarin, Thai | Hindi phal | [pʰal] | "fruit" |
| Tense (Korean) | 0 to 20 ms, glottalised | Korean | ppal | [p͈al] | "sucking" |
| Breathy-voiced (Hindi) | negative + breathy | Hindi-Urdu | bhal | [bʱal] | "forehead" |
An English /p/ in pin and a Spanish /p/ in pino sit on opposite sides of the unaspirated/aspirated boundary — about 70 ms apart in VOT. To a Spanish speaker, English pin sounds like it begins with a slightly breathy or "harsh" consonant; to an English speaker, Spanish pino sounds like it might begin with /b/. The boundary is in the listener, not the signal.
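One way to make "the boundary is in the listener" concrete is to run the same VOT value through two different category boundaries. This is a minimal sketch: the boundary values are assumptions picked to fall between the approximate ranges in the table, and the idea that English listeners label short-lag stops as /b/ is inferred from the paragraph above rather than measured.

```python
# Illustrative /b/-/p/ boundaries in ms. These are assumptions placed between
# the approximate VOT ranges in the table, not experimental values.
BOUNDARY_MS = {
    "spanish": 0.0,   # prevoiced /b/ vs short-lag /p/
    "english": 40.0,  # short-lag /b/ vs long-lag /p/
}


def perceived_stop(vot_ms: float, listener: str) -> str:
    """Label a bilabial stop as /b/ or /p/ using one listener's boundary."""
    return "/p/" if vot_ms >= BOUNDARY_MS[listener] else "/b/"


# Hypothetical VOT values for three tokens.
for word, vot in [("Spanish pino", 10.0), ("English pin", 80.0), ("Spanish baño", -80.0)]:
    labels = {who: perceived_stop(vot, who) for who in BOUNDARY_MS}
    print(word, vot, labels)
```

The 10 ms token comes out as /p/ for the Spanish boundary and /b/ for the English one, while the 80 ms and -80 ms tokens get the same label from both.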
English: aspiration is allophonic
English aspiration follows a simple distributional rule. Voiceless stops /p t k/ are aspirated when they are the onset of a stressed syllable, and unaspirated elsewhere — most notably after /s/, where they always lose aspiration:
- Aspirated: pin [pʰɪn], tin [tʰɪn], kin [kʰɪn], appear [əˈpʰɪɹ], repeat [ɹəˈpʰit].
- Unaspirated after /s/: spin [spɪn], stem [stɛm], skin [skɪn]. The /s/ "steals" the aspiration window.
- Unaspirated in unstressed onsets: happy [ˈhæpi], capital [ˈkʰæpətəl] — first /k/ aspirated, second /p/ not.
- Reduced or absent in coda: top, cat, back — the closing stop may be unreleased entirely [tʰɑp̚].
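The distributional rule in this list can be sketched as a small function. It assumes the input is already syllabified and stress-marked (the data layout and the syllable divisions are illustrative choices), and it implements only the simplified rule stated here: a voiceless stop is aspirated when it is the first segment of a stressed syllable. Coda reduction and unreleased stops are not modelled.

```python
VOICELESS_STOPS = {"p", "t", "k"}


def aspirate(syllables):
    """Apply the simplified English rule to a syllabified word.

    `syllables` is a list of (segments, stressed) pairs, where `segments`
    is a list of phoneme strings. A voiceless stop is aspirated only when
    it is the very first segment of a stressed syllable, so a stop after
    /s/ (second in the onset) stays plain. Returns a flat list of phones.
    """
    phones = []
    for segments, stressed in syllables:
        for i, seg in enumerate(segments):
            if stressed and i == 0 and seg in VOICELESS_STOPS:
                phones.append(seg + "ʰ")
            else:
                phones.append(seg)
    return phones


print("".join(aspirate([(["p", "ɪ", "n"], True)])))       # pʰɪn
print("".join(aspirate([(["s", "p", "ɪ", "n"], True)])))  # spɪn
print("".join(aspirate([(["k", "æ"], True), (["p", "ə"], False), (["t", "ə", "l"], False)])))  # kʰæpətəl
```

Because the output is fully determined by stress and position, the function needs no lexical information, which is another way of saying the alternation is allophonic.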
Because no English word changes meaning when you swap [pʰ] for [p], English speakers do not have separate mental categories for the two. The same speaker who fluently produces both will, in a same-different listening test, often fail to discriminate them above chance — a textbook case of categorical perception.
Hindi: a four-way stop contrast
Hindi-Urdu, like Sanskrit before it, treats aspiration and voicing as fully independent phonemic dimensions. Each place of articulation hosts a four-way contrast — voiceless unaspirated, voiceless aspirated, voiced unaspirated, voiced aspirated (breathy):
| Place | VL unasp | VL aspirated | Vd unasp | Vd aspirated (breathy) |
|---|---|---|---|---|
| Bilabial | /p/ pal "moment" | /pʰ/ phal "fruit" | /b/ bal "strength" | /bʱ/ bhal "forehead" |
| Dental | /t̪/ tal "rhythm" | /t̪ʰ/ thal "plate" | /d̪/ dal "lentil" | /d̪ʱ/ dhal "shield" |
| Retroflex | /ʈ/ ʈal "delay" | /ʈʰ/ ʈhal "stand" | /ɖ/ ɖal "branch" | /ɖʱ/ ɖhal "shape" |
| Palatal | /c/ cal "walk" | /cʰ/ chal "deceit" | /ɟ/ jal "water" | /ɟʱ/ jhal "spice" |
| Velar | /k/ kal "yesterday" | /kʰ/ khal "skin" | /ɡ/ gal "cheek" | /ɡʱ/ ghal "destruction" |
Five places × four series = 20 stop phonemes (the palatal series is phonetically affricated), each written with its own Devanagari letter. English-speaking learners struggle especially with the breathy-voiced series: producing voicing and an open glottis simultaneously requires precise coordination of laryngeal gestures that English never trains.
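Since voicing and aspiration are independent dimensions, the whole table can be generated as a cross product. The sketch below takes the plain symbols from the table and derives the aspirated ones; the dictionary layout and function name are illustrative.

```python
from itertools import product

# Plain (voiceless, voiced) symbols per place, taken from the table above.
PLACES = {
    "bilabial": ("p", "b"),
    "dental": ("t̪", "d̪"),
    "retroflex": ("ʈ", "ɖ"),
    "palatal": ("c", "ɟ"),
    "velar": ("k", "ɡ"),
}


def hindi_stop_inventory():
    """Cross place x voicing x aspiration into a dict of IPA symbols."""
    inventory = {}
    for place, voiced, aspirated in product(PLACES, (False, True), (False, True)):
        symbol = PLACES[place][int(voiced)]
        if aspirated:
            # Voiceless aspirates take ʰ; voiced (breathy) aspirates take ʱ.
            symbol += "ʱ" if voiced else "ʰ"
        inventory[(place, voiced, aspirated)] = symbol
    return inventory


inventory = hindi_stop_inventory()
print(len(inventory))                       # 20
print(inventory[("bilabial", True, True)])  # bʱ
```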
Worked example: Korean's three-way contrast
Korean has a contrast found in almost no other language: three voiceless stop series, lenis (plain), aspirated, and tense (fortis). All three are voiceless in word-initial position, and they distinguish hundreds of minimal triplets:
| Korean | IPA | Gloss | Series | VOT | Glottal state |
|---|---|---|---|---|---|
| pal | [pal] | "foot" | lenis | ~30 ms | moderately tense |
| phal | [pʰal] | "arm" | aspirated | ~80 ms | spread (open glottis) |
| ppal | [p͈al] | "sucking" | tense | ~10 ms | constricted glottis |
The lenis series is voiceless word-initially but voiced between vowels — a regular allophonic alternation. The tense series, marked by doubled consonant letters in Hangul (ㅃ ㄸ ㄲ ㅆ ㅉ), has a glottalised, "stiff" quality and the shortest VOT. Younger Seoul speakers (since the 1990s) are progressively losing the VOT difference between lenis and aspirated, replacing it with a pitch contrast on the following vowel: low pitch after lenis, high pitch after aspirated. Tone, in other words, is being recycled out of an aspiration distinction — a sound change documented in real time.
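The shift from a VOT cue to a pitch cue can be illustrated with two toy classifiers. Everything numeric here is an assumption: the 20 ms and 55 ms cut-offs are simply midpoints between the approximate VOT values quoted above, and comparing f0 against a speaker's mean pitch is a simplification of how the pitch cue might be used.

```python
def classify_by_vot(vot_ms: float) -> str:
    """Older pattern: all three series separated by VOT alone."""
    if vot_ms < 20.0:
        return "tense"
    if vot_ms < 55.0:
        return "lenis"
    return "aspirated"


def classify_by_vot_and_pitch(vot_ms: float, f0_hz: float, speaker_mean_f0_hz: float) -> str:
    """Newer pattern: short VOT still marks tense stops, but lenis vs
    aspirated is decided by pitch on the following vowel."""
    if vot_ms < 20.0:
        return "tense"
    return "aspirated" if f0_hz > speaker_mean_f0_hz else "lenis"


# Hypothetical token: long VOT, but low pitch on the following vowel.
token = {"vot_ms": 60.0, "f0_hz": 170.0}
print(classify_by_vot(token["vot_ms"]))                                   # aspirated
print(classify_by_vot_and_pitch(token["vot_ms"], token["f0_hz"], 200.0))  # lenis
```

The same token is labelled differently by the two functions, which is the kind of mismatch expected between listeners on either side of a sound change in progress.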
Aspiration in historical change
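Ancient Greek's three-way stop system (voiced / voiceless / voiceless aspirated) was restructured by two spirantisation changes. The voiced stops became voiced fricatives (/b d ɡ/ → /v ð ɣ/) and the aspirated stops became voiceless fricatives, leaving Modern Greek with plain voiceless stops set against fricatives. The voiceless and aspirated series developed as follows: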
| Letter | Classical Greek | Modern Greek | Latin transcription | English borrowings |
|---|---|---|---|---|
| π | /p/ | /p/ | p | pneumatic, pterodactyl |
| φ | /pʰ/ | /f/ | ph | philosophy, phoenix, phonetic |
| τ | /t/ | /t/ | t | tonic, atom |
| θ | /tʰ/ | /θ/ | th | theatre, thesis, ethic |
| κ | /k/ | /k/ | c, k | kinetic, comic |
| χ | /kʰ/ | /x/ | ch | chronic, chorus, character |
That is why English orthography keeps "ph", "th", "ch" in Greek borrowings — the digraphs preserve a fossilised memory of aspirated stops that Greek itself stopped pronouncing roughly 1500 years ago. Latin loans from Greek borrowed both the spelling and (briefly) the pronunciation; the spelling outlived the sound.
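The aspirated rows of the table amount to a regular segment-by-segment mapping, which a lookup table captures. This sketch covers only the three aspirated stops listed above; the names are illustrative, and a real historical model would need conditioning environments and dates that are not given here.

```python
# Classical aspirated stop -> Modern Greek fricative, from the table above.
SPIRANTISATION = {"pʰ": "f", "tʰ": "θ", "kʰ": "x"}


def to_modern(segments):
    """Replace each aspirated stop with its fricative reflex; leave the rest."""
    return [SPIRANTISATION.get(seg, seg) for seg in segments]


print(to_modern(["p", "pʰ", "t", "tʰ", "k", "kʰ"]))
# ['p', 'f', 't', 'θ', 'k', 'x']
```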
Adjacent phenomena
- Pre-aspiration. Icelandic, Faroese, and some Scottish Gaelic dialects place the [h] before the stop closure rather than after release: Icelandic kappi [ˈkʰahpɪ] "champion". Cross-linguistically rare.
- Breathy voice (murmur). The Hindi bʱ series is voiced and aspirated simultaneously — the vocal folds vibrate while the glottis stays partially open. Marathi, Bengali, and Gujarati share this.
- Glottal stops as aspiration counterparts. In some varieties (e.g. Cockney English), word-final /t/ is realised as [ʔ] rather than aspirated [tʰ]: the laryngeal gesture survives but the oral place is lost.
- Tense/fortis. Korean's tense stops, sometimes analysed as glottalised or geminated; not aspiration but a sister category along the laryngeal dimension.
- Aspirated fricatives and affricates. Burmese, Korean, and several Bantu languages contrast plain /s/ with aspirated /sʰ/ — a typological rarity.
Why aspiration matters
- Categorical perception. The cleanest psychophysical demonstration that listeners impose phonological categories on a continuous acoustic signal.
- Second-language acquisition. Spanish learners of English fail to aspirate stressed onsets and sound foreign; Hindi learners over-aspirate everywhere.
- Forensic phonetics. VOT distributions are speaker-distinctive enough to contribute to voice identification.
- Historical reconstruction. Aspiration is one of the most easily lost laryngeal features; tracking its disappearance reconstructs proto-language stop systems.
- Speech synthesis. TTS engines must inject appropriate VOT or output sounds robotic; English neural systems learn the allophony implicitly from data.
- Tonogenesis. Korean's ongoing transfer of aspiration cues to pitch shows how laryngeal contrasts can become tonal contrasts within a few generations.
Common pitfalls
- Confusing aspiration with voicelessness. Aspiration in the narrow sense attaches to voiceless stops (Hindi's breathy-voiced series is its voiced counterpart), but many voiceless stops are not aspirated. Spanish /p/ is voiceless and unaspirated; English /p/ in spin likewise.
- Hearing aspiration as a separate consonant. The [ʰ] is not a "separate /h/" — it is part of the stop's release phase. Hindi /pʰ/ is one phoneme, not /p/ + /h/.
- Treating English [pʰ] and [p] as different sounds. They are predictable variants of /p/; English speakers can produce both but rarely perceive the difference without training.
- Assuming all "h"-letters mark aspiration. English hat begins with a fricative /h/, not an aspirated stop. The Greek-derived "ph" "th" "ch" digraphs are historical, not phonetic, in Modern English.
- Confusing Korean tense stops with geminates. Tense stops have a very short VOT and a glottalised quality; they are not simply a plain stop doubled in duration. The doubled Hangul letter is an orthographic convention.
- Reading Devanagari "ph" as English /f/. Hindi फ is /pʰ/, not /f/. Some Persian and Arabic loans introduced /f/ with a separate character (फ़), but native फ is unambiguously aspirated /pʰ/.
Frequently asked questions
Why don't English speakers notice aspiration?
Because English aspiration is allophonic — it never distinguishes one word from another. /p/ is aspirated [pʰ] in pin and unaspirated [p] in spin, but no English minimal pair turns on that difference. Native phonological systems filter predictable variation below conscious access. A Hindi speaker, for whom phal "fruit" and pal "moment" are different words, has the opposite problem — they hear an aspirated/unaspirated contrast everywhere, including where English doesn't intend one.
How is aspiration measured?
Voice onset time (VOT): the gap, in milliseconds, between the release of the stop closure and the onset of vocal-fold vibration in the following vowel. Voiced stops have negative VOT (voicing leads release, e.g. -100 ms in Spanish /b/). Unaspirated voiceless stops cluster around 0-30 ms (Spanish /p/, English /p/ after /s/). Aspirated voiceless stops range from roughly 60 to 110 ms (English /p/ in pin, Hindi /pʰ/, Korean aspirated /pʰ/). On a VOT histogram the three categories show up as three largely separate clusters.
What's the difference between Hindi /pʰ/ and English /pʰ/?
Articulatorily, very little — both are bilabial voiceless stops with long VOT. Phonologically, everything. In Hindi, /pʰ/ is a phoneme that contrasts with /p/, /b/, and /bʱ/ (breathy-voiced) in a four-way system: pal "moment" / phal "fruit" / bal "strength" / bhal "forehead". In English, [pʰ] and [p] are predictable variants of one phoneme /p/, distributed by syllable position. Same surface sound, different cognitive status.
What is Korean's three-way contrast?
Korean stops come in three series — lenis (plain), aspirated, and tense (fortis). All three are voiceless in word-initial position. /p/ pal "foot" has moderate VOT; /pʰ/ phal "arm" has long VOT and a strong release burst; /p͈/ ppal "sucking" has near-zero VOT but a tense, glottalised closure. Younger Seoul speakers are increasingly distinguishing lenis from aspirated by f0 (pitch) on the following vowel rather than VOT — a sound change in progress.
Did Ancient Greek have aspirated stops?
Yes. Classical Greek had a three-way stop contrast: voiced (β γ δ), voiceless unaspirated (π κ τ), and voiceless aspirated (φ χ θ). The aspirated series later spirantised: /pʰ/ → /f/, /tʰ/ → /θ/, /kʰ/ → /x/, which is why modern Greek pi (π) is /p/ but phi (φ) is /f/. Latin transcribed these as "ph", "th", "ch" — preserving the orthographic memory in English borrowings like philosophy, theatre, chronograph long after Greek itself stopped pronouncing them as stops.
Are aspirated stops universal in any sense?
No language is reported to have only aspirated stops with no unaspirated counterpart, but the phonological status varies wildly. Roughly a third of the world's languages treat aspiration as phonemic, including Mandarin, Cantonese, Thai, Hindi-Urdu, Bengali, Korean, Burmese, Quechua, and Armenian. Most Romance and Slavic languages don't aspirate at all. Germanic languages (English, German, Danish) aspirate predictably in stressed onsets — a regional habit dating to early Germanic.