Question 1

How does the source-filter model work?

Accepted Answer

Speech production has two components. The source is usually voiced vocal-fold vibration, which produces a buzz with harmonics at integer multiples of the fundamental frequency (F0). The filter is the vocal tract (pharynx, mouth, lips), which shapes that source through resonance. The filter's resonant frequencies are called formants. Fant (1960) formalized the model mathematically. It predicts vowel formant frequencies from vocal-tract length and shape, and it is foundational to acoustic phonetics.
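As a rough numeric sketch of the idea: the source contributes harmonics at integer multiples of F0, and the filter scales each harmonic by a gain that peaks at the formants. The function names, the simple peak-shaped gain curve, and the example values (F0 of 100 Hz, one formant at 500 Hz with an 80 Hz bandwidth) are illustrative assumptions, not Fant's formulation.

```python
import math

def harmonics(f0, fmax):
    """Source harmonics: integer multiples of f0 up to fmax (Hz)."""
    return [f0 * k for k in range(1, int(fmax // f0) + 1)]

def resonance_gain(f, fc, bw):
    """Toy resonance peak centred at fc with bandwidth bw (assumed shape)."""
    return 1.0 / math.sqrt(1.0 + ((f - fc) / (bw / 2.0)) ** 2)

def filtered_spectrum(f0, formants, fmax=4000.0):
    """Pair each harmonic with the gain of its strongest resonance.

    Taking the maximum over resonances is a simplification chosen for
    readability; it is not how a real vocal-tract transfer function combines.
    """
    out = []
    for f in harmonics(f0, fmax):
        gain = max(resonance_gain(f, fc, bw) for fc, bw in formants)
        out.append((f, gain))
    return out

# One assumed formant at 500 Hz; the harmonic nearest to it is boosted most.
spectrum = filtered_spectrum(100.0, [(500.0, 80.0)])
peak_freq, peak_gain = max(spectrum, key=lambda pair: pair[1])
```

Changing F0 moves the harmonics but leaves the gain curve where it is, which is the sense in which source and filter are treated as independent.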

Question 2

Why does F1 track vowel height?

Accepted Answer

F1 reflects the size of the back cavity (pharynx) of the vocal tract. Lowering the tongue (as in [a]) constricts the pharynx and enlarges the front cavity, raising F1 to ~700-900 Hz. Raising the tongue (as in [i] or [u]) widens the pharynx and lowers F1 to ~250-350 Hz. The inverse correlation with height is consistent across speakers and languages, after normalization.

Question 3

What does F2 tell us?

Accepted Answer

F2 reflects the location of the tongue's main constriction along the front-back axis. Front vowels [i, e] push the tongue forward, shortening the front cavity and raising F2 toward 2200-2500 Hz. Back vowels [u, o] retract the tongue, lengthening the front cavity and dropping F2 to 700-1000 Hz. F1 vs. F2 plots are the standard way phoneticians display vowel spaces.

Question 4

How do F1-F2 plots represent vowels?

Accepted Answer

Phoneticians plot F2 on the horizontal axis (decreasing left to right) and F1 on the vertical axis (increasing top to bottom). With both axes reversed in this way, the result roughly mirrors the IPA vowel chart: a quadrilateral with [i] top-left, [u] top-right, [a] bottom-center. Each speaker's plot differs in absolute frequencies but shows the same topology after normalization. This visualization became a standard tool of acoustic phonetics in the 1950s.
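A minimal sketch of that axis convention, mapping a vowel's (F1, F2) pair to a position on a chart whose origin is the top-left corner. The function name, the axis ranges, and the F2 value used for [a] are assumptions made for illustration.

```python
def chart_position(f1, f2, f1_range=(200.0, 1000.0), f2_range=(500.0, 3000.0)):
    """Map (F1, F2) in Hz to (x, y) in [0, 1], origin at the top-left.

    x grows as F2 falls, so front vowels land on the left.
    y grows as F1 rises, so close vowels land at the top.
    The default axis ranges are assumed, not taken from any dataset.
    """
    f1_lo, f1_hi = f1_range
    f2_lo, f2_hi = f2_range
    x = (f2_hi - f2) / (f2_hi - f2_lo)
    y = (f1 - f1_lo) / (f1_hi - f1_lo)
    return x, y

# Illustrative values: F1 and the F2 of [i] and [u] sit inside the ranges
# quoted in the earlier answers; the F2 of [a] (1300 Hz) is an assumption.
vowels = {"i": (300.0, 2300.0), "u": (300.0, 800.0), "a": (800.0, 1300.0)}
positions = {v: chart_position(f1, f2) for v, (f1, f2) in vowels.items()}
```

With these values [i] lands left of [a], which lands left of [u], and [a] sits lower on the chart than either close vowel.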

Question 5

How are formants measured?

Accepted Answer

A spectrogram shows energy as a function of frequency and time; during voiced speech, formants appear on it as dark horizontal bands. Linear Predictive Coding (LPC) algorithms extract formant frequencies automatically. Praat (Boersma and Weenink) is the standard free phonetics tool. Manual correction is often needed, because automatic tracking fails near silences, fricatives, or rapid transitions.
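One step of LPC-based measurement can be sketched compactly: once the poles of the LPC filter are known, each complex pole maps to a candidate formant frequency (from its angle) and bandwidth (from its radius). The sketch below assumes the poles have already been computed elsewhere; the function names and the filtering thresholds are hypothetical, and this is not a description of any particular tool's implementation.

```python
import cmath
import math

def pole_to_formant(pole, sample_rate):
    """Convert one complex LPC pole to (frequency_hz, bandwidth_hz)."""
    freq = cmath.phase(pole) * sample_rate / (2.0 * math.pi)
    bandwidth = -math.log(abs(pole)) * sample_rate / math.pi
    return freq, bandwidth

def formant_candidates(poles, sample_rate, min_freq=90.0, max_bandwidth=400.0):
    """Keep upper-half-plane poles sharp enough to count as formants.

    The thresholds are assumed defaults for illustration only.
    """
    found = []
    for p in poles:
        if p.imag <= 0:
            continue  # skip real poles and the conjugate of each pair
        freq, bw = pole_to_formant(p, sample_rate)
        if freq > min_freq and bw < max_bandwidth:
            found.append((freq, bw))
    return sorted(found)

# A hand-made pole at 500 Hz for a 10 kHz sampling rate (not real speech).
fs = 10000.0
pole = cmath.rect(0.97, 2.0 * math.pi * 500.0 / fs)
candidates = formant_candidates([pole, pole.conjugate()], fs)
```

Poles close to the unit circle give narrow bandwidths; wide-bandwidth poles are usually discarded as spectral tilt rather than formants, which is one reason trackers misbehave in noisy or transitional frames.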

Question 6

Why are children's formants higher?

Accepted Answer

Vocal tract length largely determines formant frequencies: the shorter the tract, the higher the formants. Adult males have vocal tracts of about 17 cm, adult females about 15 cm, and children 10-13 cm. Children's formants are 30-50% higher than adult males'. This poses a problem for speech recognition: the same vowel has different absolute frequencies across speakers. Vocal tract length normalization handles this in ASR.
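The length effect can be illustrated with the standard idealization of the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1) * c / (4 * L). This is a sketch under that assumption only; real tracts are not uniform tubes, and the function names, the speed of sound (35,000 cm/s), and the 12 cm child tract are values chosen for illustration.

```python
def tube_formants(length_cm, count=3, speed_of_sound_cm_s=35000.0):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other."""
    return [(2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
            for n in range(1, count + 1)]

def formant_scale(reference_cm, other_cm):
    """Ratio of the other tract's formants to the reference tract's,
    assuming every resonance scales inversely with length."""
    return reference_cm / other_cm

adult = tube_formants(17.0)        # assumed adult male tract length
child = tube_formants(12.0)        # assumed child tract length
ratio = formant_scale(17.0, 12.0)  # how much higher the child's formants are
```

Under this idealization a 12 cm tract has formants about 42% higher than a 17 cm tract, which falls inside the 30-50% range quoted above.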

Question 7

What about consonant formants?

Accepted Answer

Consonants influence the formants of adjacent vowels through transitions. Locus theory (Delattre, Liberman, and Cooper 1955) holds that each place of articulation has a characteristic F2 starting point, or locus. Bilabial /b/ pulls F2 toward ~700 Hz; alveolar /d/ toward ~1700 Hz; velar /g/ toward ~3000 Hz. These transitions are a major perceptual cue to consonant place: the vowel carries the consonant's signature.
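To make the locus idea concrete, the sketch below compares each consonant's approximate F2 locus with a vowel's F2 target and reports whether F2 rises or falls into the vowel. It treats the locus as the literal starting point of the transition, which is a simplification; the function name and the vowel targets (2300 Hz for [i], 800 Hz for [u]) are assumptions for illustration.

```python
# Approximate F2 loci in Hz, as quoted in the answer above.
F2_LOCUS = {"b": 700.0, "d": 1700.0, "g": 3000.0}

def f2_transition(consonant, vowel_f2):
    """Describe the F2 movement from the consonant's locus into the vowel."""
    locus = F2_LOCUS[consonant]
    change = vowel_f2 - locus
    if change > 0:
        direction = "rising"
    elif change < 0:
        direction = "falling"
    else:
        direction = "flat"
    return direction, change

# Assumed vowel targets: F2 of 2300 Hz for [i] and 800 Hz for [u].
into_i = {c: f2_transition(c, 2300.0) for c in F2_LOCUS}
into_u = {c: f2_transition(c, 800.0) for c in F2_LOCUS}
```

The same consonant can therefore produce a rising transition before one vowel and a falling one before another, which is why the locus, rather than the transition's direction, is taken as the invariant.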

Formants

Why formants matter

Common misconceptions

Frequently asked questions

Related concepts