Principles and Parameters (Chomsky)

Universal Grammar as a fixed core plus a handful of binary switches

Principles and Parameters is Noam Chomsky's theory that Universal Grammar consists of invariant principles plus a small set of binary switches (parameters) that languages set differently. Articulated in his 1979 Pisa Lectures and published as Lectures on Government and Binding (1981), the framework treats first-language acquisition as parameter-setting from limited input rather than rule-learning. Setting the head parameter once explains why English says eat sushi and Japanese says sushi tabe-ru; setting the pro-drop parameter explains why Italian drops subject pronouns and English does not.

  • Originator: Noam Chomsky, Pisa Lectures 1979
  • Foundational text: Lectures on Government and Binding (1981)
  • Core idea: Invariant principles + finite binary parameters
  • Classic parameters: Head, pro-drop, wh-movement, configurationality
  • Acquisition model: Switch-setting from positive evidence
  • Successor: Minimalist Program (Chomsky 1995)

How the framework works

The puzzle Chomsky's 1979 Pisa Lectures set out to solve was Plato's problem: how do children acquire a system as complex as a natural-language grammar from input that is finite, fragmentary, and free of correction? Pre-1979 generative grammar answered with rule systems — long lists of phrase-structure rules and transformations a child had to induce. By the late 1970s the rule lists had grown unwieldy and language-specific, and the acquisition story was failing.

The Principles and Parameters framework reframed the problem. Suppose Universal Grammar (UG) is a fixed core of principles — laws of grammatical computation — plus a small finite set of parameters — binary switches the child must set from input. Then acquisition is not rule-learning but parameter-fixing. A handful of switches, once set, generate a complete grammar.

The architecture has three parts:

  1. Principles are invariant. Structure-dependence says operations target hierarchical units, not linear positions. Subjacency prevents a single movement step from crossing more than one bounding (cyclic) node. The Binding Theory (Chomsky 1981) constrains the distribution of anaphors (himself), pronouns (him), and referring expressions (John). The Empty Category Principle licenses traces of movement.
  2. Parameters vary across languages. The head parameter sets phrase-internal order. The null-subject (pro-drop) parameter governs whether subject pronouns may be dropped. The wh-movement parameter distinguishes English-type fronting from Mandarin-type in-situ. The configurationality parameter distinguishes English from non-configurational Warlpiri.
  3. Lexicon holds idiosyncratic word-specific information — argument structure, irregular morphology, exceptions.

The cross-linguistic payoff is that a parameter setting cascades. Once a child fixes head-initial, every projection — verb-object, preposition-noun, noun-complement — falls into line. The economy of the explanation drove the framework's adoption.
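
The cascade can be illustrated with a toy linearizer. This is a minimal sketch rather than any part of the theory's formal machinery: the names `linearize`, `PHRASES`, and `grammar_output` are hypothetical, and each phrase is reduced to a bare (head, complement) pair with hyphens standing in for internal structure.

```python
# Toy illustration: one binary setting orders head and complement in every category.
# All names are hypothetical; phrases are simplified to (head, complement) pairs.

def linearize(head: str, complement: str, head_initial: bool) -> str:
    """Order a head and its complement according to the head parameter."""
    return f"{head} {complement}" if head_initial else f"{complement} {head}"

# The same abstract pairs, keyed by the category of the phrase.
PHRASES = {
    "VP": ("eat", "sushi"),
    "PP": ("on", "the-table"),
    "NP": ("book", "about-syntax"),
}

def grammar_output(head_initial: bool) -> dict:
    """Apply a single parameter value to every phrase type at once."""
    return {cat: linearize(h, c, head_initial) for cat, (h, c) in PHRASES.items()}

english_like = grammar_output(head_initial=True)
japanese_like = grammar_output(head_initial=False)
print(english_like)   # every category comes out head-first
print(japanese_like)  # every category comes out head-last
```

Flipping the single boolean reorders all three phrase types together, which is the sense in which one setting is said to cascade.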

Worked example: setting the head parameter

Greenberg's typological surveys (1963) revealed that languages tend to be either consistently head-initial (like English) or consistently head-final (like Japanese). The Principles and Parameters answer is that a single binary switch determines the orientation of all heads.

Consider the input a child receives:

  • English child hears: "drink milk", "on the chair", "kicked the ball", "happy with the cake". Verb before object, preposition before noun, adjective followed by complement.
  • Japanese child hears: miruku-o nomu (milk-OBJ drink), isu-no ue-ni (chair-GEN top-on), booru-o ketta (ball-OBJ kicked), keeki-de ureshii (cake-with happy). Object before verb, noun before postposition, complement before adjective.

A handful of utterances is enough to fix the parameter. In Lila Gleitman and colleagues' acquisition work, English-learning toddlers respect verb-object order from the earliest two-word stage; Japanese-learning toddlers respect object-verb order from the same stage. The cascade across categories is in place essentially from first multi-word combinations.
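
How a few utterances could fix the switch can be sketched as a simple evidence counter. This is an illustrative assumption, not a model taken from the acquisition literature: the encoding of each utterance as "HC" (head before complement) or "CH" (complement before head), the function name `set_head_parameter`, and the threshold of three are all hypothetical.

```python
from typing import List, Optional

# Toy learner: fix the head parameter from a few observed head-complement orders.
# The input encoding and the threshold are assumptions made for illustration only.

def set_head_parameter(observations: List[str], threshold: int = 3) -> Optional[str]:
    """Return "head-initial" or "head-final" once one order leads by `threshold`.

    Each observation is "HC" (head before complement) or "CH" (complement before head).
    Returns None if the evidence never becomes decisive.
    """
    balance = 0
    for order in observations:
        balance += 1 if order == "HC" else -1
        if balance >= threshold:
            return "head-initial"
        if balance <= -threshold:
            return "head-final"
    return None

# "drink milk", "on the chair", "kicked the ball", "happy with the cake"
english_input = ["HC", "HC", "HC", "HC"]
# miruku-o nomu, isu-no ue-ni, booru-o ketta, keeki-de ureshii
japanese_input = ["CH", "CH", "CH", "CH"]

print(set_head_parameter(english_input))   # head-initial
print(set_head_parameter(japanese_input))  # head-final
```

The learner never needs correction: consistent positive evidence alone pushes the balance past the threshold.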

Subsequent typological work has complicated the simple binary picture. Mixed-headedness languages exist (German is head-final in VP but head-initial in PP). Mark Baker (The Atoms of Language, 2001) recast this as a hierarchy of parameters, with high-level macroparameters constraining clusters of microparameters. The head parameter remains the textbook illustration.

Worked example: English do-support as a parametric consequence

English negation and questions look bizarre cross-linguistically. "John does not eat sushi" — why the auxiliary does? Why not "John eats not sushi", parallel to French Jean ne mange pas de sushi?

The Principles and Parameters answer threads several pieces. In Romance and Old English, the lexical verb raises to a higher functional head (T or Infl) to combine with tense and negation. In Modern English, that verb-raising has been lost — only auxiliaries (be, have, modals) raise. When negation or question fronting demands an element in T, English inserts a dummy do. This is Pollock's (1989) classic analysis in Linguistic Inquiry.

The parametric switch is not an idiosyncratic English oddity. It is a single setting (loss of V-to-T movement) with multiple visible consequences: do-support, subject-auxiliary inversion patterns, the placement of adverbs ("often eats" vs French "mange souvent"), and the position of negation. One parametric change in late Middle English explains all four phenomena. This is the kind of bundling the framework was built to capture.
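
The bundling can be made concrete with a toy generator in which one boolean controls both negation and adverb placement. This is a rough sketch under stated simplifications: `negate`, `with_adverb`, and the `v_to_t` flag are hypothetical names, morphology is hard-coded, and English words stand in for both settings.

```python
# Toy sketch: a single V-to-T setting drives both negation and adverb placement.
# Function and parameter names are hypothetical; morphology is hard-coded for brevity.

def negate(subject, verb_finite, verb_bare, obj, v_to_t, neg="not", dummy="does"):
    """Build a negated clause as a list of words.

    v_to_t=True : the lexical verb raises past negation (verb-raising order).
    v_to_t=False: the verb stays low, so a dummy auxiliary carries tense (do-support).
    """
    if v_to_t:
        return [subject, verb_finite, neg, obj]
    return [subject, dummy, neg, verb_bare, obj]

def with_adverb(subject, verb_finite, adverb, obj, v_to_t):
    """Place the adverb after a raised verb, or before a verb that stays low."""
    if v_to_t:
        return [subject, verb_finite, adverb, obj]
    return [subject, adverb, verb_finite, obj]

for setting in (False, True):
    print(" ".join(negate("John", "eats", "eat", "sushi", v_to_t=setting)))
    print(" ".join(with_adverb("John", "eats", "often", "sushi", v_to_t=setting)))
# v_to_t=False prints: John does not eat sushi / John often eats sushi
# v_to_t=True  prints: John eats not sushi / John eats often sushi
```

With `v_to_t=False` the output matches the Modern English pattern; with `v_to_t=True` it matches the verb-raising order described for French.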

Principles and Parameters vs other syntactic frameworks

| | Principles & Parameters | Minimalism | LFG | HPSG | Construction Grammar | Systemic Functional |
|---|---|---|---|---|---|---|
| Originators | Chomsky 1981 | Chomsky 1995 | Bresnan, Kaplan 1982 | Pollard, Sag 1994 | Goldberg 1995, 2006 | Halliday 1985 |
| Universal Grammar | Rich, innate | Slimmer, mostly Merge | Yes, modular | Yes, schemas | Rejected (emergentist) | Rejected (social) |
| Cross-linguistic variation | Parameter setting | Lexical features | Lexical entries | Type hierarchy | Construction inventory | Register / context |
| Movement | Move-α | Internal Merge | Functional structure (no movement) | Feature percolation | None | None |
| Acquisition story | Switch setting | Feature acquisition | Lexical learning | Schema abstraction | Item-based learning | Functional learning |
| Computational tractability | Limited | Limited | Strong | Strong | Strong | Moderate |
| Major texts | GB Lectures (1981) | Minimalist Program (1995) | Bresnan 1982, 2001 | Pollard & Sag 1994 | Goldberg 1995 | IFG (1985, 2014) |

Parameter setting and the acquisition argument

The framework's strongest selling point was its solution to the acquisition problem. Children master grammar by age four or five from input that, by Chomsky's argument, is too impoverished to support induction of arbitrary rule systems — the Poverty of the Stimulus argument. Switch-setting circumvents the difficulty: there is nothing to induce, only a finite set of switches to fix.

Steven Pinker's The Language Instinct (1994) popularized the case. Pinker's argument: complex inferences (auxiliary inversion, anaphora, constraints on movement) are mastered without explicit instruction, despite input that does not unambiguously cue the right rule. A built-in parameter space explains the speed and uniformity of acquisition.

Charles Yang's variational model (Knowledge and Learning in Natural Language, 2002) made the acquisition mechanism formal. The child entertains multiple candidate grammars; each parses incoming input with some probability of success; weights shift toward grammars that fit. Parameters with strong signal (head direction) converge in months; parameters with weak signal (residual V2, German) converge over years. Yang's model fits acquisition data closely and motivates a cleaner version of switch-setting than Chomsky's original triggering-experience proposal.
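
The weight-shifting idea can be sketched for a single binary parameter. The simulation below assumes a linear reward-penalty update, one common way of formalizing this kind of learner; it is not a reproduction of Yang's implementation. The function name `variational_learner`, the learning rate `gamma`, and the `signal` values are hypothetical, and the target grammar is assumed to parse every input sentence.

```python
import random

def variational_learner(signal, steps=2000, gamma=0.01, seed=0):
    """Simulate one binary parameter under a linear reward-penalty update.

    p is the weight of the target grammar; 1 - p is the weight of its competitor.
    `signal` is the share of input sentences that only the target grammar can parse.
    Assumption: the target grammar parses every sentence in the input.
    """
    rng = random.Random(seed)
    p = 0.5  # start with no preference between the two grammars
    for _ in range(steps):
        unambiguous = rng.random() < signal   # a sentence the competitor cannot parse
        chose_target = rng.random() < p       # sample a grammar according to its weight
        # The target gains weight if it was chosen (it always parses),
        # or if the competitor was chosen and failed on an unambiguous sentence.
        if chose_target or unambiguous:
            p = p + gamma * (1.0 - p)
        else:
            # The competitor was chosen and parsed an ambiguous sentence: reward it.
            p = (1.0 - gamma) * p
    return p

strong = variational_learner(signal=0.5)    # e.g. a robustly cued parameter
weak = variational_learner(signal=0.02)     # e.g. a sparsely cued parameter
print(f"strong signal: p = {strong:.3f}")
print(f"weak signal:   p = {weak:.3f}")
```

In this setup the expected gap between `p` and 1 shrinks by a factor of `1 - gamma * signal` per sentence, so the robustly cued parameter approaches 1 far sooner than the sparsely cued one, mirroring the months-versus-years contrast described above.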

Counterarguments and the construction-grammar challenge

Adele Goldberg's Constructions: A Construction Grammar Approach to Argument Structure (1995) launched the most serious modern alternative. Goldberg argued grammar is not principles-plus-parameters but an inventory of constructions — pairings of form with meaning at every level from morpheme to clause. The ditransitive construction Subj V Obj1 Obj2 contributes its own meaning ("X causes Y to receive Z") independent of the verb. Argument structure is constructional, not lexical.

Functionalists (Givón, Bybee, Croft) argue grammatical regularities emerge from frequency and use, not innate switches. Statistical learning over corpora reproduces many grammatical patterns without postulating Universal Grammar.

Typologists raise empirical worries. Rizzi's classic pro-drop cluster (null subjects, free inversion, absence of that-trace effects, and expletive drop) fails to bundle in many languages. Greek allows null subjects without the inversion pattern. Brazilian Portuguese is losing pro-drop while keeping inversion. Evans and Levinson's "The Myth of Language Universals" (Behavioral and Brain Sciences, 2009) marshalled cross-linguistic data to argue that the parameter space is too small to capture observed diversity.

Defenders respond that parameters were never meant as exhaustive descriptive devices. They are heuristics organizing variation. The Minimalist reformulation — variation lives in lexical features on functional heads — addresses the typological challenges by giving up the macro-cluster claim while keeping the principles-plus-variation architecture.

Variants and successors

  • Government and Binding (1981–1995). The classic articulation. Modules: X-bar theory, theta-theory, case theory, government, binding theory, bounding theory, control theory, ECP. Rich, intricate, and the basis of most graduate syntax courses through the 1990s.
  • Minimalist Program (1995–). Chomsky's The Minimalist Program dismantled GB modules, replacing them with Merge plus interface conditions. Parameters are recast as features on functional heads. The architecture is leaner; the empirical coverage debated.
  • Cartographic syntax (Cinque, Rizzi 1990s–2000s). Builds elaborate hierarchies of functional projections (TopP, FocP, FinP) with fixed cross-linguistic order. Parameters select which projections are pronounced.
  • Macroparameters (Baker 2001). The Polysynthesis Parameter, the Head-Directionality Parameter, and others organized into hierarchies. Makes clear empirical predictions about clusters.
  • Microparametric variation (Kayne 2000–). Variation lives in tiny lexical-feature differences between dialects. Even closely related Italian dialects can differ in dozens of microparameters.

Common pitfalls in interpreting the framework

  • Treating parameters as literal psychological switches. Chomsky's original talk of "switches" is heuristic; the cognitive reality is open. Parameters are research tools that organize cross-linguistic generalizations.
  • Expecting all classical clusters to bundle. Rizzi's pro-drop cluster fails empirically in several languages. Modern work treats clustering as gradient, not categorical.
  • Conflating Government and Binding with Minimalism. They are stages of one research program but differ substantively in machinery. GB has government, traces, indices, and modules; Minimalism reduces to Merge plus features.
  • Reading "Universal Grammar" as identical sentences cross-linguistically. UG is the architecture of grammar — what kinds of rules are possible, how acquisition proceeds — not a set of universal sentences. Surface patterns vary; the system that generates them is constrained.
  • Assuming parameters explain everything in typology. Word-frequency effects, areal contact, sociolinguistic conditioning, and semantic shifts shape language change in ways orthogonal to parametric architecture.
  • Mistaking critique of clusters for refutation of UG. Even if classical parameter clusters fail, the architectural claim — invariant computation plus finite variation — survives multiple reformulations.

Legacy and current status

Principles and Parameters reorganized syntactic theory in the 1980s and remains the dominant framework in generative linguistics, in its Minimalist successor form. Comparative syntactic work in Romance, Germanic, and Bantu languages continues to be parametric in spirit. The framework's acquisition story shaped psycholinguistics for two decades. Its typological program, with Mark Baker's macroparameters and Roberts and Holmberg's parameter hierarchies, is active research.

The competitor frameworks — LFG, HPSG, Construction Grammar, Cognitive Grammar, Systemic Functional Grammar — have grown alongside and offer real alternatives, but no single replacement has achieved the explanatory unification of acquisition, typology, and theoretical syntax that Principles and Parameters offered. The 2009 Evans–Levinson critique sharpened the empirical questions; the 2010s saw nuanced parameter-hierarchy responses. The debate is unresolved and active.

Frequently asked questions

What is the difference between a principle and a parameter?

A principle is invariant — it holds in every language. Structure-dependence (operations target hierarchical units, not linear positions), the binding conditions (Conditions A, B, C on anaphors, pronouns, and names), and the Empty Category Principle are principles. A parameter is a small, finite choice the child fixes from input — typically binary. The head parameter (head-initial vs head-final), the pro-drop parameter (subject pronouns optional vs obligatory), the wh-movement parameter (overt vs in-situ), and the configurationality parameter (fixed vs free word order) are classic examples. Principles explain what is universal; parameters explain what varies.

What is the head parameter?

The head parameter sets whether a phrase places its head before or after its complements. English is head-initial: verbs precede objects ("eat sushi"), prepositions precede nouns ("on the table"), nouns precede their PP complements ("book about syntax"). Japanese is head-final: verbs follow objects (sushi-o tabe-ru, literally "sushi-OBJ eat"), postpositions follow nouns (teeburu-no ue-ni, "table-GEN top-on"), nouns follow their relative clauses. A child needs only a handful of head-complement examples to fix the value. Once set, the parameter cascades across categories: the same setting is taken to govern V, P, N, and A simultaneously, which explains the cross-categorial consistency Greenberg (1963) observed.

What is the pro-drop parameter?

The pro-drop (or null-subject) parameter governs whether a language permits subject pronouns to be omitted. Italian and Spanish are pro-drop: "Parla italiano" (speaks Italian) is grammatical without an overt subject; rich verbal agreement recovers person and number. English is non-pro-drop: "*Speaks Italian" is ungrammatical without "He." Luigi Rizzi's Issues in Italian Syntax (1982) developed the parameter in detail. It bundles related properties: free subject inversion, empty expletives, and the absence of that-trace effects. The cluster is now regarded as looser than Rizzi proposed: Greek allows subject drop without the inversion pattern, which complicates the original claim that one switch controls all of these properties.

How do children set parameters?

The child encounters input and fixes parameters from positive evidence, that is, the utterances they actually hear. Negative evidence (correction) is sparse and unreliable: Brown and Hanlon (1970) found that parents of English-learning children rarely corrected grammatical form. Charles Yang's variational learning model (Knowledge and Learning in Natural Language, 2002) has children entertain multiple grammars in parallel, weighting them by parsing success on input. Parameters with robust signal (head direction is signaled by every VP and PP) set quickly; parameters with subtle signal (verb-second in residual V2 contexts) set later. Acquisition order roughly tracks signal strength.

Why did Chomsky abandon Government and Binding for Minimalism?

Government and Binding (GB) accumulated rich theoretical machinery — government, the binding theory, theta-theory, case theory, the empty-category principle. By the early 1990s Chomsky began asking which parts were necessary. The Minimalist Program (1995) reduced syntactic operations toward Merge alone and shifted parametric variation to the lexicon ("the lexical parameterization hypothesis") and the morphology-phonology interface. Many linguists view Minimalism as a continuation rather than a rejection — principles persist, but parameters are recast as features on functional heads. Critics argue Minimalism trades empirical coverage for elegance.

What are the main objections to Principles and Parameters?

Functionalists (Givón, Bybee) argue grammar is shaped by use, not innate switches; statistical regularities suffice. Construction Grammar (Goldberg 1995, 2006) holds that constructions, not parameters, are the units of grammatical knowledge. Typologists (Croft, 2001) note the predicted parameter clusters often fail empirically — pro-drop properties don't bundle as Rizzi predicted. Evans and Levinson's "The Myth of Language Universals" (2009) argued cross-linguistic diversity is too deep for a small parameter space. Defenders respond that the framework's value is heuristic, not literal; parameters are research tools, not psychological reality.

Is Principles and Parameters still active research?

Yes, though largely absorbed into Minimalism. Mark Baker's The Atoms of Language (2001) proposed a parameter hierarchy organizing variation. Roberts and Holmberg's Parameter Hierarchies (2010s) elaborated this. Comparative work in Romance, Germanic, and Bantu syntax continues to test parametric predictions. The framework remains the dominant generative explanation for cross-linguistic variation in graduate syntax curricula, even where Minimalist machinery has replaced GB-era technicalities.