Set Theory

Zorn's Lemma

Every partially ordered set in which every chain has an upper bound contains a maximal element

Zorn's Lemma states: if every totally ordered subset (chain) of a partially ordered set P has an upper bound in P, then P contains at least one maximal element. Logically equivalent to the Axiom of Choice (AC) and the Well-Ordering Theorem — proven by Zermelo (1904), Kuratowski (1922), and named after Max Zorn (1935). Used to prove existence of bases for infinite-dimensional vector spaces, maximal ideals in rings (every nonzero ring has one), Hahn-Banach theorem, Tychonoff's theorem (product of compact spaces is compact). All three statements are independent of ZF — Cohen 1963 proved AC is not derivable from ZF; Gödel 1938 proved it is not refutable.

  • EquivalentAC, Well-Ordering
  • Named afterMax Zorn (1935)
  • First provedHausdorff 1914, Kuratowski 1922
  • Independent of ZFCohen 1963, Gödel 1938
  • Used forMaximal ideals, vector basis, Hahn-Banach
  • SymbolUsually unstated assumption in algebra

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

Why Zorn's Lemma matters

  • Algebraic existence theorems. Every nonzero ring has a maximal ideal; every field has an algebraic closure; every vector space has a basis. None of these admit constructive proofs in the infinite case.
  • Functional analysis. The Hahn-Banach theorem extends a bounded linear functional from a subspace to the whole space — the engine behind duality, separation theorems, and the geometry of Banach spaces.
  • Hamel basis. Every infinite-dimensional vector space has a basis. ℝ over ℚ has cardinality 2^ℵ₀ and is unimaginable without Zorn — yet algebraic arguments routinely invoke it.
  • Topology. Tychonoff's theorem — the product of any family of compact spaces is compact — is itself equivalent to AC. Zorn organizes the proof around maximal filters and ultrafilters.
  • Model theory. The compactness theorem for first-order logic and the existence of ultrafilters used in nonstandard analysis both lean on Zorn's Lemma.
  • Order theory. Hausdorff's maximal principle, the Bourbaki-Witt theorem, and Tukey's lemma are all reformulations — Zorn is the most usable form for working algebraists.
  • Daily working tool. When a textbook says "by a standard application of Zorn's Lemma," it means: (1) define a partial order whose maximal elements solve the problem, (2) check chains have upper bounds, (3) apply Zorn.

Precise statement

Let (P, ≤) be a partially ordered set. Suppose every chain C ⊆ P has an upper bound in P — that is, there exists u ∈ P with c ≤ u for all c ∈ C. Then P contains at least one maximal element m, where maximal means there is no x ∈ P with m < x.

Three subtleties trip up newcomers. First, "chain has an upper bound" requires the bound to live inside P, not in some larger ambient set. Second, the empty chain counts: its upper bound is any element of P, so Zorn implicitly requires P to be nonempty. Third, "maximal" is not "maximum" — there can be many maximal elements that are mutually incomparable, and there is no requirement that m be greater than every element of P, only that nothing exceeds it.

Proving the three are equivalent

The classical chain of implications goes Axiom of Choice → Zorn's Lemma → Well-Ordering Theorem → Axiom of Choice. We sketch the harder direction, AC implies Zorn.

Suppose every chain in P has an upper bound but P has no maximal element. Then for every x ∈ P there exists some y(x) with x < y(x). Let f be a choice function on the nonempty sets {y : x < y}. Define a transfinite sequence by p₀ chosen arbitrarily, p_{α+1} = f({y : p_α < y}), and at limit ordinals λ let p_λ be the upper bound of the chain {p_α : α < λ}, then strictly above it. This builds an injective function from the class of all ordinals into the set P — impossible by Hartogs' theorem, contradiction. Hence P has a maximal element.

Canonical applications

Hamel basis. Let V be a vector space. Order the linearly independent subsets of V by inclusion. The union of a chain of linearly independent sets is linearly independent (any finite dependence would already lie in some chain element). Hence chains have upper bounds. Zorn yields a maximal linearly independent set B. Maximality means no v ∈ V can be added without breaking independence, which means v ∈ span(B) for every v — so B spans V and is therefore a basis.

Maximal ideal. Let R be a ring with 1 ≠ 0. Order the proper ideals of R by inclusion. The union of a chain of proper ideals is an ideal (closed under sums and absorption because the operations have finite arity), and is proper because 1 cannot belong to it (it would have to belong to some chain ideal, which would not be proper). Zorn yields a maximal proper ideal M, so R/M is a field.

Hahn-Banach. Let f be a bounded linear functional on a subspace Y of a normed space X. Order the pairs (Z, g) where Y ⊆ Z ⊆ X and g extends f with the same norm bound, ordered by extension. Chains glue together pointwise. Zorn yields a maximal extension; an algebraic argument shows the maximum domain must be all of X.

Independence from ZF

Gödel constructed the constructible universe L in 1938 — every set in L is built by transfinite iteration of definable operations. Inside L, ZF holds and so does AC. This shows ZF cannot disprove AC: if it could, the contradiction would replicate inside L. Cohen invented forcing in 1963 to construct models of ZF in which AC fails — the Cohen model has a set with no choice function. Together: AC is independent of ZF, neither provable nor refutable.

Many specialists accept AC by default and write ZFC = ZF + AC. A smaller set of constructive mathematicians work in ZF + Dependent Choice and a smaller still group reject all forms of choice. The 2020s consensus among working mathematicians is to use AC freely while flagging when a result holds without it.

Common misconceptions

  • Maximal equals maximum. A maximum is greater than every element; a maximal element merely has nothing strictly above it. The set of all proper subsets of {1, 2} ordered by inclusion has two maximal elements ({1} and {2}) and no maximum.
  • Always usable. Constructive and intuitionistic mathematics reject Zorn outright. Effective and computable mathematics also avoid it because Zorn produces objects with no algorithmic description — you can prove existence without ever exhibiting an example.
  • Needed for finite cases. Finite-dimensional vector spaces have bases by Gaussian elimination. Finite rings have maximal ideals by direct enumeration. Zorn's role is exclusively in the infinite world; it adds nothing in finitely many steps.
  • Equivalent to mathematical induction. Induction proves statements about ℕ. Zorn proves existence in arbitrary partial orders. The mechanism is transfinite, not finite, and the conclusions are about objects rather than properties.
  • Always gives a unique maximal element. Zorn guarantees existence, not uniqueness. The lattice of subspaces of a Hilbert space has many maximal proper subspaces, the lattice of orderings on ℝ has many maximal extensions, and so on.
  • Equivalent to "every set has a maximum." No — well-ordered nonempty sets without an upper bound (such as ℕ) have no maximum. Zorn requires the chain hypothesis, which fails for ℕ as a chain in itself.

A short history

Zermelo proved the Well-Ordering Theorem in 1904 using the principle that would later be christened the Axiom of Choice. Kuratowski stated a maximal-element principle in 1922; Hausdorff had given a chain version in 1914. Max Zorn published "A remark on method in transfinite algebra" in 1935 emphasizing the power of the maximal-element formulation as a working tool for algebraists who wanted to avoid transfinite induction. The name stuck despite Zorn not being the first to discover the principle — a recurring pattern in mathematical naming.

Frequently asked questions

How is Zorn's Lemma equivalent to the Axiom of Choice?

Inside ZF, the three statements Axiom of Choice, Well-Ordering Theorem, and Zorn's Lemma are mutually equivalent. AC implies Well-Ordering by transfinite recursion: pick the choice function on nonempty subsets of X, then build a well-order by repeatedly picking an unused element. Well-Ordering implies Zorn: well-order the set, climb the chain greedily until trapped at a maximal element. Zorn implies AC: for any family of nonempty sets, the partial order of partial choice functions ordered by extension has a maximal element which must be defined on every set.

What is a chain in a partial order?

A chain is a totally ordered subset. Given a partial order (P, ≤), a subset C ⊆ P is a chain if for every pair x, y ∈ C either x ≤ y or y ≤ x — there are no incomparable elements inside C. Examples: in the subset lattice of a set, any tower of nested subsets ∅ ⊂ A₁ ⊂ A₂ ⊂ ... is a chain. In the divisibility order on positive integers, {1, 2, 4, 8, 16} is a chain. The hypothesis of Zorn's Lemma asks every such chain to have an upper bound — an element u ∈ P with c ≤ u for every c in the chain.

Why is Zorn's Lemma needed for infinite-dimensional vector spaces to have a basis?

For finite-dimensional spaces, Gaussian elimination explicitly constructs a basis — no choice is needed. For infinite-dimensional spaces such as ℝ over ℚ, no constructive procedure produces a basis. Apply Zorn to the partial order of linearly independent subsets ordered by inclusion. Any chain of linearly independent sets has the union as upper bound, which is itself linearly independent because linear independence is a finite-witness property. Zorn yields a maximal linearly independent set, and maximality forces it to span — this is a Hamel basis. Without AC, Cohen-style models exist where ℝ over ℚ has no basis.

What is a maximal ideal and how does Zorn give it?

In a ring R with unity, an ideal M is maximal if M ≠ R and no ideal lies strictly between M and R. Equivalently, R/M is a field. To prove every nonzero ring has a maximal ideal, take the partial order of proper ideals ordered by inclusion. The union of any chain of proper ideals is again a proper ideal because 1 cannot suddenly appear (it would have to lie in some chain element). Zorn gives a maximal element. The same argument shows every proper ideal is contained in a maximal ideal, which underlies most of commutative algebra and algebraic geometry.

Why is the Axiom of Choice controversial — Banach-Tarski paradox?

The Banach-Tarski paradox (1924) uses AC to decompose the unit ball in ℝ³ into 5 pieces and reassemble them by rigid motions into two unit balls. The pieces are not Lebesgue-measurable — AC produces sets that no constructive procedure can describe. Critics view this as evidence that AC creates objects with no physical or computational content. Defenders point out that infinite-dimensional functional analysis, algebra, and topology depend on AC for their cleanest theorems, and that rejecting AC leaves a cramped mathematical universe where ℝ might not have a basis over ℚ and ultrafilters might not exist.

Can analysis be done without the Axiom of Choice?

Yes, but with concessions. ZF + Dependent Choice (DC) suffices for most of classical real analysis: every Cauchy sequence converges, the Heine-Borel theorem on ℝⁿ holds, separable Banach spaces behave normally. Solovay's model (1970) shows ZF + DC + 'every set of reals is Lebesgue measurable' is consistent — Banach-Tarski disappears. What you lose without full AC: Hahn-Banach in nonseparable spaces, Tychonoff for general products, well-orderings of ℝ, ultrafilters on ℕ, and a basis for ℝ over ℚ. Constructive mathematicians reject even DC, but most working analysts accept ZFC.