Morphology

Nonconcatenative Morphology (Templatic)

When morphemes interleave instead of chaining — Arabic √k-t-b, Hebrew binyanim, and the autosegmental tier

Most languages build words by chaining morphemes end-to-end: walk + -ed = walked. Nonconcatenative morphology breaks that pattern. The Arabic root √k-t-b carries the meaning "writing"; its three consonants weave through different vowel templates to produce kataba (he wrote), yaktubu (he writes), kitaab (book), kaatib (writer), maktuub (written), maktaba (library). The morpheme boundaries cross each other rather than line up. John McCarthy's 1979 autosegmental analysis explained the puzzle by putting consonants and vowels on separate phonological tiers.

Theoretical foundation: McCarthy 1979 (autosegmental)
Canonical example: Semitic root-and-pattern
Other instances: Ablaut, infixation, reduplication, subtractive
Arabic verb Forms: Ten classical templates (I-X)
Hebrew binyanim: Seven verbal patterns
Root size: Mostly 3 consonants; some 2 or 4

The puzzle: discontinuous morphemes

English builds unbelievable by chaining: un- + believe + -able. The morphemes line up like beads on a string. You can draw boundaries between them, and on either side of a boundary the segments belong to one morpheme or the other. This is concatenative morphology.

Now consider the Arabic word kataba ("he wrote"). The three consonants k, t, b belong to the root √k-t-b ("writing"). The a vowels fill the vowel slots of the perfect active pattern, cited here as CaCaCa; in a finer analysis the stem is katab- and the final -a is the third-person masculine singular suffix. Either way, there is no way to chop the stem into linear segments where each segment is one morpheme. The root and the vowel pattern are interwoven: the consonants occupy slots 1, 3, 5 and the vowels occupy slots 2, 4, 6.

The same root √k-t-b plugged into different templates gives an entire derivational family. The pattern is called root-and-pattern or templatic morphology, and it is the textbook case of nonconcatenative morphology.

Worked example: the Arabic root √k-t-b

Surface form	Template	Form	Gloss
kataba	CaCaCa	I, perfect active	"he wrote"
kutiba	CuCiCa	I, perfect passive	"it was written"
yaktubu	yaCCuCu	I, imperfect active	"he writes"
kitaab	CiCaaC	maṣdar (verbal noun)	"book"
kaatib	CaaCiC	active participle	"writer"
maktuub	maCCuuC	passive participle	"written, letter"
maktab	maCCaC	locative noun	"office, desk"
maktaba	maCCaCa	locative + feminine	"library"
kattaba	CaCCaCa	Form II (causative)	"he made write"
istaktaba	istaCCaCa	Form X (request)	"he asked to write"

Memorize the root and you have not learned one word — you have learned a slot in a productive grid. Classical Arabic dictionaries are organized by root: to look up maktaba a student turns to √k-t-b and reads the whole family. Modern Standard Arabic learners drill verb tables that lay templates side by side.
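The grid in the table above can be sketched in a few lines of Python. This is a minimal illustration, not a morphological analyzer: the digit notation (1, 2, 3 for the root consonants) and the names `TEMPLATES` and `interdigitate` are assumptions made for this example, chosen so that Form II can repeat slot 2 for its geminated middle consonant.

```python
# Hypothetical slot notation: digits 1-3 stand for the root consonants;
# every other character belongs to the pattern itself.
# Glosses are the ones the table gives for the root k-t-b.
TEMPLATES = {
    "1a2a3a": "he wrote",
    "1u2i3a": "it was written",
    "ya12u3u": "he writes",
    "1i2aa3": "book",
    "1aa2i3": "writer",
    "ma12uu3": "written, letter",
    "ma12a3": "office, desk",
    "ma12a3a": "library",
    "1a22a3a": "he made write",       # Form II: slot 2 appears twice (gemination)
    "ista12a3a": "he asked to write",
}


def interdigitate(root, template):
    """Fill each numbered slot of the template with the matching root consonant."""
    return "".join(
        root[int(ch) - 1] if ch.isdigit() else ch
        for ch in template
    )


if __name__ == "__main__":
    for template, gloss in TEMPLATES.items():
        print(f"{interdigitate('ktb', template):10} {template:10} {gloss}")
```

Swapping in another root shows why the grid is productive: `interdigitate("drs", "ma12a3a")` gives madrasa ("school"). Real roots do not all occur in every pattern, and details such as the imperfect stem vowel vary by root, so the sketch overgenerates.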

McCarthy's autosegmental analysis

How can three consonants k-t-b form a morpheme when they are not adjacent on the surface? John McCarthy's 1979 MIT thesis answered with autosegmental phonology. Three independent tiers represent the word:

Root tier:        k       t       b
                  |       |       |
CV skeleton:      C   V   C   V   C   V      ← the template
                      |       |       |
Vocalic tier:         a       a       a      ← vowels of the perfect active pattern

The root consonants associate to the C-slots; the template vowels associate to the V-slots. The surface string kataba is the concatenation of skeletal slots, but the morphemes themselves live on separate tiers. Discontinuity dissolves: each tier is concatenative; only the interleaving on the skeleton looks discontinuous.
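The association of tiers to the skeleton can be sketched as follows. This is a simplified model under stated assumptions: the function name `associate`, the string encoding of the tiers, and the rule that the last element of a tier spreads to leftover slots are illustrative choices, loosely modeled on left-to-right association, and not a full statement of McCarthy's conventions.

```python
def associate(skeleton, root_tier, vocalic_tier):
    """Link each tier to its slots left to right and read off the surface string."""

    def link(tier, n_slots):
        if not tier:
            raise ValueError("a tier needs at least one element")
        # One-to-one, left to right; the final element spreads to leftover slots.
        # Extra tier elements beyond n_slots are simply left unlinked.
        return [tier[min(i, len(tier) - 1)] for i in range(n_slots)]

    consonants = iter(link(root_tier, skeleton.count("C")))
    vowels = iter(link(vocalic_tier, skeleton.count("V")))

    surface = []
    for slot in skeleton:
        if slot == "C":
            surface.append(next(consonants))
        elif slot == "V":
            surface.append(next(vowels))
        else:
            raise ValueError(f"unknown skeleton slot: {slot!r}")
    return "".join(surface)


print(associate("CVCVCV", "ktb", "aaa"))  # every V slot has its own vowel
print(associate("CVCVCV", "ktb", "a"))    # a single vowel spreads to all V slots
print(associate("CVCVCV", "ktb", "uia"))  # a different vocalic tier, same root
```

Each tier is an ordinary left-to-right sequence; only the read-off along the skeleton interleaves them. The sketch does not handle patterns such as Form II kattaba, where the middle consonant rather than the last one is doubled.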

The multi-tier idea was originally developed for tone by John Goldsmith (1976); McCarthy's application of it to Semitic helped launch modern prosodic morphology. McCarthy and Alan Prince extended the approach to reduplication, infixation, and templatic patterns generally, and their work later became a foundation for Optimality Theory analyses of templatic phenomena.

Cross-linguistic data

Language	Family	Mechanism	Example	Gloss
Arabic	Semitic	Root-and-pattern	√k-t-b → kataba, kitaab, maktuub	write/book/written
Hebrew	Semitic	Binyanim (7 patterns)	√k-t-b → katav, miktav, ktav	wrote/letter/handwriting
Tigrinya	Ethiopic Semitic	Templates + reduplication	√s-b-r → säbärä, säbäbärä	break/repeatedly break
Tashlhiyt Berber	Berber	Templatic + apophony	√rgl → argal, irgəl, argal	close (verbal forms)
English	Germanic	Ablaut	sing / sang / sung	present/past/participle
German	Germanic	Ablaut + umlaut	singen / sang / gesungen / Sänger	sing/sang/sung/singer
Tagalog	Austronesian	Infixation	sulat → sumulat	write → wrote
Yawelmani	Yokuts	Subtractive	panat → pana	aorist subtraction
Indonesian	Austronesian	Reduplication	buku → buku-buku	book → books

The world's nonconcatenative phenomena cluster into roughly four mechanisms: root-and-pattern (Semitic), internal change/ablaut (Germanic), infixation (Austronesian), and reduplication (very widespread). Subtractive morphology, where a piece is removed rather than added, is rarer but well-attested in Yawelmani and several Pacific languages.

Hebrew binyanim: a smaller-scale Semitic system

Hebrew organizes its verb morphology around seven binyanim (literally "buildings"). Each binyan combines a fixed CV template with a particular voice and valence. The same root √l-m-d ("learn") appears in:

lamad (paʕal, "he learned"), basic active
limmed (piʕel, "he taught"), causative
lummad (puʕal, "he was taught"), passive of piʕel
hitlammed (hitpaʕel, "he taught himself"), reflexive
nilmad (nifʕal, "it was learned"), middle/passive

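Crossing seven binyanim with three tenses gives up to twenty-one stem patterns per root, each of which then inflects for person, number, and gender. Few roots occur in all seven binyanim, and the meaning a binyan contributes is only partly predictable, but the forms themselves are regular. Children acquire the system early, and literate adults parse novel roots fluently.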

English ablaut: residual templatic morphology

English is mostly concatenative, but it preserves a small templatic remnant: strong verb ablaut. Roughly 200 verbs change their internal vowel to mark tense:

Present	Past	Participle	Pattern
sing	sang	sung	ɪ / æ / ʌ
ring	rang	rung	ɪ / æ / ʌ
drink	drank	drunk	ɪ / æ / ʌ
swim	swam	swum	ɪ / æ / ʌ
write	wrote	written	aɪ / oʊ / ɪ
break	broke	broken	eɪ / oʊ / oʊ
foot	feet	—	ʊ / iː (umlaut; noun, singular / plural)
man	men	—	æ / ɛ (umlaut; noun, singular / plural)

The verb alternations descend from Proto-Indo-European ablaut, the same vowel-alternation system that gives Greek leip-/loip-/lip- and Latin tego/toga. Comparative reconstruction shows the alternation was once productive across the family. The noun pairs foot/feet and man/men have a different source: Germanic umlaut, in which a vowel in a later-lost suffix fronted the stem vowel. English preserves only fragments; Sanskrit grammarians (Pāṇini, ~400 BCE) had already systematized the equivalent vowel grades for their language two and a half millennia ago.
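Internal change can be made visible by stripping whatever the principal parts of a verb share at their edges and keeping what differs. The sketch below does this on spelling only, so it is a rough proxy for the phonemic patterns in the table; the names `VERBS`, `changing_part`, and `group_by_alternation` are invented for this example.

```python
import os
from collections import defaultdict

# Principal parts taken from the table above (spelling, not IPA).
VERBS = [
    ("sing", "sang", "sung"),
    ("ring", "rang", "rung"),
    ("drink", "drank", "drunk"),
    ("swim", "swam", "swum"),
    ("break", "broke", "broken"),
]


def changing_part(forms):
    """Strip the shared prefix and shared suffix; return what differs in each form."""
    prefix = os.path.commonprefix(list(forms))
    # Compute the shared suffix on the remainders so it cannot overlap the prefix.
    remainders = [form[len(prefix):] for form in forms]
    suffix = os.path.commonprefix([r[::-1] for r in remainders])[::-1]
    return tuple(form[len(prefix):len(form) - len(suffix)] for form in forms)


def group_by_alternation(verbs):
    """Group verbs (by their first form) under the alternation they share."""
    groups = defaultdict(list)
    for forms in verbs:
        groups[changing_part(forms)].append(forms[0])
    return dict(groups)


if __name__ == "__main__":
    for pattern, members in group_by_alternation(VERBS).items():
        print(" / ".join(pattern), "->", ", ".join(members))
```

The sing, ring, drink, swim class falls out cleanly as i / a / u because the consonant frame stays fixed while one vowel letter changes. For break the participle suffix -en and English spelling get in the way, which is a reminder that ablaut is a fact about sounds, not letters.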

Infixation: morphemes inside the root

An infix breaks into the middle of a root. Tagalog uses the infix -um- after the first consonant of a verb stem to mark perfective/agent-focus:

sulat (write) → s-um-ulat (wrote)
bili (buy) → b-um-ili (bought)
tawag (call) → t-um-awag (called)

The infix splits the host word. McCarthy and Prince analyzed Tagalog -um- as a prefix: it is drawn toward the left edge of the word, but syllable-structure constraints favor placing it after the initial onset, so it lands after the first consonant. English has the famous expletive infix: speakers insert profanity before a stressed syllable to give abso-bloody-lutely or fan-fucking-tastic. The placement is governed by stress, not by morpheme boundaries, which makes it a prosodic phenomenon.
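The placement rule for -um- can be written as "insert before the first vowel of the stem". The following is a minimal sketch on romanized spelling; the function name `infix_um` and the five-letter vowel set are assumptions for the example, and stems beginning with consonant clusters (mostly loans) show variation that this rule does not model.

```python
VOWELS = set("aeiou")


def infix_um(stem):
    """Insert -um- immediately before the first vowel of the stem."""
    for i, ch in enumerate(stem):
        if ch in VOWELS:
            return stem[:i] + "um" + stem[i:]
    raise ValueError(f"no vowel found in stem: {stem!r}")


for stem in ("sulat", "bili", "tawag", "alis"):
    print(stem, "->", infix_um(stem))
```

With a consonant-initial stem the affix ends up inside the word (sulat gives sumulat). With a vowel-initial stem such as alis ("leave") the same rule puts it at the left edge (umalis), which fits the view that -um- is a prefix displaced by the onset.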

Variants and edge cases

Biconsonantal roots. Some Semitic roots have two consonants (Arabic √y-d, "hand"). They behave templatically with reduced patterns.
Quadriliteral roots. Other Semitic roots have four consonants (Arabic √t-r-j-m, "translate", a loan via Aramaic). The CV templates expand to accommodate.
Loanword integration. Modern Hebrew borrows verbs by extracting consonants and inserting them into binyanim — "to fax" becomes √f-k-s in piʕel, surfacing as fikses. The templatic system swallows new vocabulary.
Frequentative reduplication. Tigrinya doubles a consonant inside the template to mark repeated action. The reduplicant is morphologically motivated but lives on the same skeletal tier.
Apophonic chains. Some Berber and Cushitic systems use multi-step vowel chains (a → i → u) governed by aspect.
Tone as a templatic morpheme. Many Bantu languages use tone melodies as floating tonal templates that associate to syllables, formally parallel to the vocalic melodies of Semitic.

Common pitfalls and misconceptions

"Nonconcatenative = Semitic." Semitic is the showcase, but ablaut, infixation, reduplication, and subtractive morphology occur worldwide. The pattern is general; root-and-pattern is one variety.
"Templatic morphology is exotic." English has a remnant of it (sing/sang/sung), and most Indo-European languages carry vestiges of ablaut. Semitic languages simply have a productive system rather than fossils.
"The root is just the consonants." The root is an abstract morpheme that surfaces as consonants. The vowel template is also a morpheme: it carries voice, aspect, and category. Treating one as more morpheme-like than the other distorts the analysis.
"You can write Arabic with vowels removed." Roughly true, but only for short vowels: everyday Arabic script writes consonants and long vowels and leaves short vowels to optional diacritics. Literate readers parse unvoweled text by recognizing templates. The script reflects the morphology: root letters carry most of the lexical information, and the short vowels are largely predictable from context.
"Root-and-pattern means concatenation never happens." Semitic has plenty of concatenation too: case suffixes, definite articles, pronominal clitics. The templatic mechanism handles stem formation; concatenation handles inflection on top of stems.
"Children acquiring Semitic languages must learn enormous paradigms." They do not memorize each form. They learn the template system as a productive grammar and apply it to novel roots, just as English-speaking children apply -ed to novel verbs.

Frequently asked questions

What is the Arabic root √k-t-b and why is it the textbook example?

Three consonants k-t-b carry the abstract semantic core of writing. Slot them into different vowel templates and you get a whole derivational family: kataba (he wrote), yaktubu (he writes), kitaabun (book), kaatibun (writer), maktuubun (written), maktab (office, place of writing), maktaba (library), istaktaba (he asked someone to write). The nouns kitaabun, kaatibun, and maktuubun are cited with the indefinite nominative ending -un; without it they are kitaab, kaatib, and maktuub. Literate Arabic speakers and learners internalize the template grid; lookup in classical Arabic dictionaries is by root, not surface form.
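Lookup by root runs the template grid in reverse: match a surface form against known patterns and read the consonants out of the C slots. The sketch below does this with regular expressions. It is an illustration under assumptions: the names `TEMPLATES`, `template_to_regex`, and `extract_roots` are invented here, the pattern list is limited to the triconsonantal templates without gemination, and forms are expected without case endings (maktaba, not maktabatun).

```python
import re

# C marks a root-consonant slot; all other letters belong to the pattern.
TEMPLATES = [
    "CaCaCa", "CuCiCa", "yaCCuCu", "CiCaaC", "CaaCiC",
    "maCCuuC", "maCCaC", "maCCaCa", "istaCCaCa",
]

# In this romanization anything that is not a, i, or u counts as a consonant.
CONSONANT = "([^aiu])"


def template_to_regex(template):
    """Turn a template such as 'maCCaCa' into a regex with one group per C slot."""
    return re.compile(
        "".join(CONSONANT if ch == "C" else re.escape(ch) for ch in template)
    )


def extract_roots(surface, templates=TEMPLATES):
    """Return every (root, template) pair whose template matches the whole form."""
    hits = []
    for template in templates:
        match = template_to_regex(template).fullmatch(surface)
        if match:
            hits.append(("-".join(match.groups()), template))
    return hits


for word in ("maktaba", "kaatib", "madrasa"):
    print(word, "->", extract_roots(word))
```

The function returns a list because a real analyzer has to expect more than one candidate analysis per form. Weak roots, geminated patterns such as Form II, and unvoweled input all need machinery this sketch leaves out.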

Is English ablaut (sing/sang/sung) nonconcatenative?

Yes — it is a smaller-scale instance of internal change. The morpheme of past tense modifies a vowel inside the root rather than appending an affix. English has roughly 200 such verbs (irregular strong verbs). In Semitic the same principle scales: vowel changes carry productive grammatical meaning across thousands of roots. English ablaut is residual; Semitic templatic morphology is the language's primary mechanism.

How did John McCarthy's autosegmental analysis change the field?

McCarthy's 1979 MIT dissertation argued the consonantal root and the vowel melody live on separate phonological tiers, associated by independent rules to a CV skeleton. This dissolved the puzzle of how three discontinuous consonants form a morpheme: they are discontinuous only in the surface string, while on their own tier they are adjacent. The model launched modern templatic morphology and influenced prosodic morphology, reduplication theory, and Optimality Theory.

Do non-Semitic languages have templatic morphology?

A few. Berber languages (Tamazight, Tashlhiyt) have templatic patterns. Cushitic languages (Somali, Oromo) show partial templates. Tagalog and other Philippine languages use infixation (-um-, -in-) and reduplication, which are nonconcatenative. Yawelmani and several Penutian languages use subtractive morphology. But the Semitic three-consonant root-and-pattern system is unusually elaborate, and most languages stick with prefix-and-suffix concatenation.

Are Hebrew binyanim the same as Arabic forms?

Cognate but reorganized. Hebrew has seven binyanim (verbal patterns): pa'al, nif'al, pi'el, pu'al, hif'il, huf'al, hitpa'el. Arabic has ten classical Forms (I-X), each with a stem template, voice pattern, and meaning shift. Both descend from Proto-Semitic templatic morphology. Hebrew lost some forms and re-purposed others; Modern Hebrew also borrowed from Arabic and Aramaic. The general logic — root is meaning, template is grammar — is identical.

What is infixation and why is it nonconcatenative?

An infix is an affix inserted inside a root rather than at its edges. Tagalog sulat ("write") becomes sumulat ("wrote") with -um- infixed after the first consonant. English profanity exhibits expletive infixation: abso-bloody-lutely. Because the morpheme breaks into the middle of the host word, the result is not a simple chain — the operation is nonconcatenative.

Can statistical/neural models learn templatic morphology?

Modern neural systems handle Arabic and Hebrew morphology well after training on enough data, but in low-resource settings they tend to struggle without explicit root-and-pattern features. Rule-based analyzers remain competitive. Finite-state systems can encode root-and-pattern interdigitation explicitly, while the Buckwalter Arabic Morphological Analyzer is, as generally described, lexicon-driven: it lists stems together with their compatible prefixes and suffixes. Models that separate the consonantal tier from the vowel tier have been reported to generalize better across paradigms than plain sequence models, though results depend on the task and the data.