Morphology

Compounding

Building new words by combining existing ones — blackboard, grandfather, doorknob

Compounding is the morphological process of forming new words by joining two or more existing words or stems into a single lexical item with its own meaning. English "blackboard" is not just a board that is black: it has a specialized meaning as a teaching surface. Compounding is attested in nearly all languages, but its productivity varies widely: English forms compounds moderately, German rampantly (Donaudampfschiffahrtsgesellschaftskapitän), Mandarin extensively (火車 "fire-vehicle" = train), and French sparingly. Compounds are distinguished from phrases by stress pattern, semantic non-compositionality, and the inability to insert modifiers between their elements. Headedness, meaning which element determines the syntactic category of the whole, is a key parameter: compounds are right-headed in English and German and typically left-headed in Romance languages.

  • Stress test: English compounds stress the first element (BLACKboard); phrases stress the second (black BOARD)
  • Headedness: English/German right-headed; French/Italian left-headed
  • Endocentric vs. exocentric: Endocentric has an internal head (apple pie); exocentric does not (pickpocket)
  • Famous example: German Donaudampfschiffahrtsgesellschaftskapitän
  • Mandarin: Extensive compounding, since most morphemes are monosyllabic
  • Productivity: Variable across languages; English moderate
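
The headedness parameter can be illustrated with a minimal Python sketch. It assumes endocentric compounds only (exocentric ones such as "pickpocket" have no internal head), and the names `Stem`, `HEAD_POSITION`, and `head_of` are hypothetical, introduced here purely for illustration.

```python
from dataclasses import dataclass

# Headedness settings as stated in the text; this table is an illustrative
# assumption, not an exhaustive typology.
HEAD_POSITION = {
    "english": "right",
    "german": "right",
    "french": "left",
    "italian": "left",
}


@dataclass(frozen=True)
class Stem:
    form: str
    category: str  # e.g. "N" (noun), "A" (adjective), "V" (verb)


def head_of(parts, language):
    """Return the head stem of an endocentric compound.

    The head is the rightmost stem in right-headed languages and the
    leftmost stem in left-headed ones. The compound as a whole inherits
    the head's syntactic category.
    """
    if HEAD_POSITION[language] == "right":
        return parts[-1]
    return parts[0]


blackboard = [Stem("black", "A"), Stem("board", "N")]
timbre_poste = [Stem("timbre", "N"), Stem("poste", "N")]  # French "postage stamp"

print(head_of(blackboard, "english").form)   # rightmost stem
print(head_of(timbre_poste, "french").form)  # leftmost stem
```

Under these assumptions "blackboard" comes out as a noun because its right-hand stem is a noun, while French "timbre-poste" is a kind of "timbre" (stamp), its left-hand stem.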

Why compounding matters

  • Vocabulary growth. Many new English words are compounds (smartphone, doomscroll, deepfake).
  • Morphology theory. Tests theories of word vs. phrase boundary.
  • Translation. German and Mandarin compounds often translate as English phrases.
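  • NLP tokenization. Solid compounds (German, Finnish) call for subword segmentation; byte-pair encoding (BPE), originally a data-compression algorithm, was adapted for machine translation partly to handle rare words such as these.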
  • Cognitive psychology. Compound processing reveals lexical access architecture (Libben, Jarema).
  • Lexicography. Dictionaries must decide which compounds get headword status — productivity vs. fixity.
  • Language acquisition. Children produce novel compounds (Clark 1981) showing they grasp the productive process early.
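
The tokenization point can be made concrete with a small Python sketch of dictionary-based compound splitting. This is a simpler technique than BPE, shown only to illustrate why solid compounds need segmentation. The lexicon is a toy list invented for this example, the function name `split_compound` is hypothetical, and German linking elements (such as the -s- in many compounds) are deliberately ignored.

```python
from functools import lru_cache

# Toy lexicon invented for illustration; a real splitter would rely on a
# large word list or corpus frequencies.
LEXICON = frozenset({
    "fuß", "ball", "fußball",
    "welt", "meisterschaft", "weltmeisterschaft",
})


def split_compound(word, lexicon=LEXICON):
    """Return every segmentation of `word` into lexicon entries.

    Each segmentation is a list of strings. An empty result means the
    word cannot be fully covered by the lexicon.
    """
    word = word.lower()

    @lru_cache(maxsize=None)
    def segment(start):
        # Base case: the whole word is consumed, one (empty) segmentation.
        if start == len(word):
            return [[]]
        results = []
        for end in range(start + 1, len(word) + 1):
            piece = word[start:end]
            if piece in lexicon:
                for rest in segment(end):
                    results.append([piece] + rest)
        return results

    return segment(0)


for parts in split_compound("Fußballweltmeisterschaft"):
    print(" + ".join(parts))
```

With this toy lexicon the word has several segmentations, from the coarse "fußball + weltmeisterschaft" down to four short stems. Choosing among them is the hard part in practice and usually requires frequency information.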

Common misconceptions

  • Compounds are always written solid. Orthography varies: "ice cream" is open, "mother-in-law" hyphenated, "newspaper" solid; all are compounds.
  • Compounds equal sum of parts. Many are not fully compositional; a "honeymoon" is neither honey nor a moon.
  • English does not compound much. It compounds productively but often writes the results open or hyphenated, hiding the morphology.
  • Compound = derived word. Derivation adds affixes (un-, -ity); compounding combines stems.
  • The first element is always a modifier. Left-headed languages reverse this; even English has exocentric exceptions.
  • Long compounds are pathological. German speakers parse them effortlessly; length is a feature, not a bug.

Frequently asked questions

How are compounds different from phrases?

Stress placement (BLACKboard vs. black BOARD), semantic specialization (a "greenhouse" is not a green house), and insertability (you cannot say "very blackboard" or "black large board" preserving meaning). Compounds also resist modification of internal parts. Some compounds are written solid (notebook), some hyphenated (mother-in-law), some open (high school) — orthography is unreliable.

What is endocentric vs. exocentric?

Endocentric compounds have a head element whose category and basic meaning the whole inherits. "Doghouse" is a kind of house (head = house). Exocentric compounds have a referent outside their constituents. "Redhead" is not a kind of head: it is a person. "Pickpocket" is not a kind of pocket. Possessive exocentric compounds like "redhead" are called bahuvrihi, a term from the Sanskrit grammatical tradition in which the word is itself an example of the type ("having much rice").

Why does German compound so freely?

German allows recursive nominal compounding without spaces, and morphological licensing is permissive. "Fußball" (football) + "Weltmeisterschaft" (world championship) yields "Fußballweltmeisterschaft". The grammar sets no upper bound on length; the limits are practical. Twain's essay "The Awful German Language" (1880) famously mocked compounds like "Generalstaatsverordnetenversammlungen". This productivity is often linked to consistent right-headedness and to the weakness of competing phrasal alternatives.

How does Mandarin compounding work?

With monosyllabic morphemes and few inflectional resources, Mandarin builds new vocabulary by compounding. 電 (electric) + 腦 (brain) = 電腦 (computer). 飛 (fly) + 機 (machine) = 飛機 (airplane). Most modern Mandarin words are bisyllabic compounds. Packard (2000) catalogs the productive patterns: V-V (resultative), N-N (modificational), V-N, N-V, A-N.

Are noun-noun phrases compounds?

Contested. English "apple pie" behaves as a lexical unit and is non-compositional in some uses, suggesting a compound, yet it takes phrasal stress (apple PIE), whereas "apple cake" takes compound stress (APPLE cake). Stress, meaning, and syntactic tests therefore do not always agree on whether a noun-noun sequence is a compound or a phrase. Selkirk (1982), Lieber (2004), and others propose mixed analyses. The line between syntax and morphology is unclear here and may be inherently gradient.

What is incorporation?

A relative of compounding, typical of polysynthetic languages, in which a noun is incorporated into a verb. Mohawk "wa-k-nuhs-ahninu" = "I bought a house" with "house" inside the verb. Mithun (1984) distinguished four types. Incorporation is distinct from English incorporation-like compounds (babysit, sightsee), which are lexicalized items rather than the output of productive syntax. It is cross-linguistically rare but well attested in Iroquoian, Inuit, and some Australian languages.

Can compounds be ambiguous?

Yes. "Toy factory" can mean a factory that makes toys or a factory that is a toy. "Woman doctor" can mean a doctor who is a woman or a doctor who treats women. Such ambiguity is resolved by context, prosody, or community convention. In longer compounds, bracketing differences ([toy [stove fire]] vs. [[toy stove] fire]) yield different readings.
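
The bracketing point can be sketched in Python by enumerating every binary bracketing of a multi-word compound. This is a minimal illustration that assumes strictly binary structure; the function names `bracketings` and `show` are hypothetical, and nothing here decides which reading a speaker intends.

```python
def bracketings(words):
    """Return all binary bracketings of a word sequence as nested tuples."""
    words = tuple(words)
    if len(words) == 1:
        return [words[0]]
    trees = []
    # Try every split point, then combine each left subtree with each
    # right subtree.
    for i in range(1, len(words)):
        for left in bracketings(words[:i]):
            for right in bracketings(words[i:]):
                trees.append((left, right))
    return trees


def show(tree):
    """Render a nested tuple in the square-bracket notation used above."""
    if isinstance(tree, str):
        return tree
    return "[" + " ".join(show(t) for t in tree) + "]"


for tree in bracketings(["toy", "stove", "fire"]):
    print(show(tree))
```

A three-word compound has two bracketings and a four-word compound has five, so the number of possible structures grows quickly with length.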