Set Theory

Cardinality of Infinite Sets

Cantor's hierarchy of infinities

Two sets have the same cardinality when there is a bijection between them. Cantor (1874-1891) discovered that this seemingly modest definition produces strictly different sizes of infinity: |ℕ| = |ℤ| = |ℚ| = ℵ₀, but |ℝ| = 2^ℵ₀ is strictly larger, and the hierarchy continues without end. The continuum hypothesis — that |ℝ| is the second-smallest infinite size — is provably independent of the standard ZFC axioms.

  • Same size meansA bijection exists
  • Smallest infiniteℵ₀ = |ℕ|
  • |ℝ|2^ℵ₀ = c
  • Cantor's theorem|2^X| > |X| always
  • CHIndependent of ZFC

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

Cardinality and bijections

Two sets A and B have the same cardinality, written |A| = |B|, if there exists a bijection f : A → B — a function that is both injective (no two elements of A go to the same element of B) and surjective (every element of B has a preimage). Cardinality is then a generalisation of "counting": for finite sets, |A| = n iff A has exactly n elements; for infinite sets, the bijection definition extends naturally.

Define |A| ≤ |B| if there is an injection A → B. The Cantor-Bernstein theorem: if |A| ≤ |B| and |B| ≤ |A|, then |A| = |B|. This proves cardinal comparison is anti-symmetric and lets us reason about cardinalities without explicit bijections.

Countable infinity — ℵ₀

A set A is countable (or countably infinite) if |A| = |ℕ|. The cardinality is denoted ℵ₀ (aleph-null), the first transfinite cardinal. Countable sets are exactly those that can be listed in a single infinite sequence a₀, a₁, a₂, … with no repeats and no omissions.

Surprising countable sets:

  • via the bijection 0 → 0, 1 → 1, −1 → 2, 2 → 3, −2 → 4, …; alternately, n ≥ 0 → 2n and n < 0 → 2|n| − 1.
  • ℕ × ℕ via Cantor's pairing function π(m, n) = (m + n)(m + n + 1)/2 + n, which lists pairs by anti-diagonal.
  • = positive and negative fractions p/q. Lay out p/q on the integer lattice; walk a zigzag through it; skip any fraction not in lowest terms; you have an enumeration of ℚ.
  • The set of finite words over a finite alphabet. List all length-1 words first, then length-2, etc.
  • The set of algebraic numbers. Each algebraic number is a root of an integer polynomial; integer polynomials of degree d form a countable set; each polynomial has finitely many roots; countable union of finite sets is countable.
  • Computable real numbers. Each is specified by a Turing machine description, and Turing machines can be enumerated.

The pattern: any set whose elements can be specified by a finite-length description from a countable alphabet is itself countable.

Proving |ℕ| = |ℤ| = |ℚ|

|ℕ| = |ℤ|. Define f : ℕ → ℤ by f(0) = 0, f(2k − 1) = k, f(2k) = −k for k ≥ 1. This is a bijection: every integer appears exactly once.

|ℕ| = |ℕ × ℕ|. The Cantor pairing function π : ℕ × ℕ → ℕ given by π(m, n) = (m + n)(m + n + 1)/2 + n is bijective. Geometrically, it walks the lattice along anti-diagonals m + n = 0, 1, 2, … and within each anti-diagonal increments n.

|ℕ| = |ℚ|. First, |ℕ| = |ℕ × ℕ|. Map (m, n) ↦ m/n for n > 0 to surject onto ℚ⁺; combine with negatives and zero. Skipping non-lowest-terms gives a bijection; alternatively, take a surjection ℕ → ℚ and use the fact that any infinite subset of ℕ is in bijection with ℕ (a consequence of well-ordering). Either way: |ℚ| = ℵ₀.

Cantor's diagonal argument — |ℝ| > ℵ₀

Theorem (Cantor, 1891). The set of real numbers in [0, 1] is uncountable.

Proof. Suppose, for contradiction, that the reals in [0, 1] could be listed: r₁, r₂, r₃, …, where each rᵢ has a decimal expansion

r₁ = 0.d₁₁ d₁₂ d₁₃ d₁₄ …
r₂ = 0.d₂₁ d₂₂ d₂₃ d₂₄ …
r₃ = 0.d₃₁ d₃₂ d₃₃ d₃₄ …
r₄ = 0.d₄₁ d₄₂ d₄₃ d₄₄

Define a new number x = 0.x₁x₂x₃… by choosing each digit xₙ ≠ dₙₙ (and not equal to 9 or 0, to avoid the dyadic-expansion ambiguity). Specifically: xₙ = 5 if dₙₙ ≠ 5, else xₙ = 6.

This x is in [0, 1]. But x ≠ rₙ for any n, because x and rₙ disagree in the n-th decimal digit. So x is not on the list — contradicting the assumption that the list was complete. The reals in [0, 1] cannot be enumerated. By the obvious bijection [0, 1] ↔ ℝ (via, e.g., x ↦ tan(π(x − 1/2))), |ℝ| > ℵ₀. □

|ℝ| = 2^ℵ₀

Every real x ∈ [0, 1] has a binary expansion 0.b₁b₂b₃…, which is a function ℕ → {0, 1}, equivalently a subset of ℕ (positions where bᵢ = 1). So

|[0, 1]| ≤ |2^ℕ| = 2^ℵ₀.

Conversely, given any subset S ⊆ ℕ, the indicator binary expansion 0.b₁b₂… (bᵢ = 1 iff i ∈ S) defines a real in [0, 1], so |[0, 1]| ≥ |2^ℕ|. Cantor-Bernstein gives equality (modulo countably many points where dyadic rationals have two binary expansions, which doesn't affect cardinality).

The cardinality of ℝ is therefore c = 2^ℵ₀, the "cardinality of the continuum". Cantor's theorem (next section) confirms 2^ℵ₀ > ℵ₀, so this is genuinely larger than countable infinity.

Cantor's theorem and the power set

Theorem (Cantor, 1891). For every set X, |X| < |2^X|, where 2^X denotes the set of all subsets of X.

Proof. Suppose f : X → 2^X is any function. Define D = { x ∈ X : x ∉ f(x) }. We claim D is not in the range of f. If D = f(y) for some y ∈ X, then by definition y ∈ D iff y ∉ f(y) = D, a contradiction. So no surjection X → 2^X exists. Combined with the obvious injection x ↦ {x} that gives |X| ≤ |2^X|, we conclude |X| < |2^X|. □

This produces an unbounded ascending chain of cardinalities:

ℵ₀ < 2^ℵ₀ < 2^(2^ℵ₀) < 2^(2^(2^ℵ₀)) < …

There is no largest infinite cardinality. The class of all cardinals is itself "too big" to be a set — the Burali-Forti paradox lurks here, resolved in ZFC by treating ordinals (and hence cardinals) as a proper class.

ℵ₀ vs ℵ₁ vs c

SymbolDefinitionConcrete examplesEquals what?
ℵ₀ (aleph-null)Cardinality of ℕ; smallest transfinite cardinalℕ, ℤ, ℚ, finite words over a finite alphabet, algebraic numbers, computable reals|ℕ|; ℵ₀ = ℵ₀ + ℵ₀ = ℵ₀ · ℵ₀
ℵ₁Smallest cardinal strictly larger than ℵ₀; the next alephSet of countable ordinals ω₁; provably uncountable, provably no smaller uncountable cardinal existsℵ₁ ≤ 2^ℵ₀; equality is CH
c (continuum)Cardinality of ℝ; equals 2^ℵ₀ℝ, ℂ, irrationals, transcendentals, ℝⁿ, infinite binary sequences {0,1}^ℕ, P(ℕ)2^ℵ₀
ℵ₂Smallest cardinal larger than ℵ₁Set of all ordinals smaller than ω₂ℵ₂ ≤ 2^ℵ₁
ℶ₀ (beth-null)= ℵ₀; the beth and aleph hierarchies start at the same pointSame as ℵ₀ℵ₀
ℶ₁ (beth-one)= 2^ℵ₀ = c; the cardinality of the continuumℝ, P(ℕ), continuous functions ℝ → ℝc = 2^ℵ₀; CH says ℶ₁ = ℵ₁
ℶ₂2^c = 2^(2^ℵ₀)P(ℝ), all functions ℝ → ℝ, the entire space of subsets of the real line2^c

The aleph and beth hierarchies are different: ℵₙ₊₁ is "the next infinite cardinal" while ℶₙ₊₁ is 2^ℶₙ. They agree at ℶ₀ = ℵ₀. The Generalized Continuum Hypothesis (GCH) asserts ℵₙ = ℶₙ for all n; like CH, it is independent of ZFC.

The continuum hypothesis

Cantor's question, first stated 1878: Is there a set whose cardinality is strictly between ℵ₀ and 2^ℵ₀?

The continuum hypothesis (CH) says no: 2^ℵ₀ = ℵ₁. Equivalently, every infinite subset of ℝ either has cardinality ℵ₀ (countable) or cardinality c (continuum) — no "intermediate" subset exists.

CH appeared as Hilbert's first problem in 1900. The resolution was unusual:

  • Gödel (1940). Constructed the inner model L (constructible universe) inside ZFC. L satisfies CH (and even GCH). Hence CH is consistent with ZFC — it cannot be disproved.
  • Cohen (1963). Invented forcing to build a model of ZFC + ¬CH. Hence ¬CH is consistent with ZFC — CH cannot be proved.

CH is therefore independent of ZFC. It is the first concrete mathematical statement shown to be independent of the standard foundation. Set theorists since have explored extensions of ZFC (large cardinal axioms, forcing axioms, Woodin's Ω-logic) that aim to decide CH, but no such extension has won broad consensus.

Cardinal arithmetic

For infinite cardinals κ, λ:

  • Sum. κ + λ = max(κ, λ). So ℵ₀ + ℵ₀ = ℵ₀, ℵ₀ + 1 = ℵ₀.
  • Product. κ · λ = max(κ, λ) for κ, λ ≥ ℵ₀ at least one infinite. So ℵ₀ · ℵ₀ = ℵ₀, c · ℵ₀ = c.
  • Exponentiation. Genuinely different. 2^ℵ₀ > ℵ₀, ℵ₀^ℵ₀ = 2^ℵ₀ = c, c^ℵ₀ = c, c^c = 2^c.
  • König's theorem. If κᵢ < λᵢ for all i ∈ I, then Σ κᵢ < ∏ λᵢ. Implies κ < κ^cf(κ) — exponentiation always escapes upward.

The strangeness: addition and multiplication collapse, but exponentiation doesn't. Adding or multiplying countably infinite sets stays countable; raising 2 to ℵ₀ does not.

Why it matters

  • Foundations of analysis. Knowing |ℝ| > |ℕ| means most real numbers are not specifiable by any finite description. The transcendentals are uncountable; the algebraics are countable. Almost every real number is transcendental, but explicitly producing one (Liouville's number, π, e) is non-trivial.
  • Measure and probability. Lebesgue measure assigns positive measure only to uncountable sets — every countable set has measure zero. The fact that ℚ has measure zero in ℝ underwrites most of integration theory.
  • Computability. The set of Turing machines is countable; the set of total functions ℕ → ℕ is uncountable; therefore most functions are uncomputable. This counting argument is the easiest proof of the existence of uncomputable problems.
  • Logic. Löwenheim-Skolem theorem says every consistent first-order theory has a countable model. Skolem's "paradox": a countable model of ZFC contains a set the model believes is uncountable, even though externally only countably many sets exist. Cardinality is model-relative.
  • Topology. The Baire category theorem distinguishes "fat" (residual) from "thin" (meager) sets, a topological analogue of cardinal-theoretic distinctions. Many genericity arguments in analysis depend on the cardinality difference between countable and uncountable.

Sketching the diagonal

Diagonalization is a single visual idea applied repeatedly. Picture an infinite list of binary sequences, one per row, indexed by row number 1, 2, 3, …

Row 1: 0 1 0 1 1 0 1 …
Row 2: 1 0 1 0 0 1 1 …
Row 3: 0 1 1 0 1 0 1 …
Row 4: 1 1 0 1 0 1 0 …
Row 5: 0 0 1 1 0 1 1 …

Read the diagonal: 0, 0, 1, 1, 0, … Now flip every bit: 1, 1, 0, 0, 1, … This new sequence differs from row n in column n, so it cannot equal any row in the list. Hence no list of all binary sequences exists — |2^ℕ| > ℵ₀.

The same trick proves Russell's paradox, the halting problem's undecidability, Gödel's incompleteness, Tarski's undefinability of truth, and Cantor's theorem in full generality. Every "size" or "describability" barrier in mathematics is, ultimately, a diagonal.

Common mistakes

  • Believing |ℕ| < |ℚ| because ℚ is dense. Density is metric, cardinality is combinatorial. ℚ is dense in ℝ but |ℚ| = ℵ₀ = |ℕ|. The two notions of "size" are unrelated.
  • Believing 2 · ∞ > ∞. In cardinal arithmetic, 2 · ℵ₀ = ℵ₀. Doubling ℕ (taking pairs) doesn't increase the cardinality.
  • Confusing "uncountable" with "very large". Uncountable means strictly larger than ℕ. There are uncountable sets that are perfect, nowhere-dense, and have measure zero — e.g., the Cantor set. "Uncountable" doesn't imply "spread out".
  • Thinking the continuum hypothesis is a question we'll eventually settle. Independence is a permanent fact about ZFC. Settling CH requires extending ZFC with new axioms — which set theorists do (large cardinals, forcing axioms), but no extension has won consensus as "the right way" to settle it.
  • Confusing cardinality with order type. ω + 1 (the natural numbers followed by one extra element) and ω + ω (two copies of ℕ joined) and ω · ω (lexicographic ℕ × ℕ) are all distinct ordinals but all have cardinality ℵ₀. Order is finer than cardinality.
  • Believing Cantor's argument is "circular" or "construct a hard list" pseudo-argument. It's a fully constructive proof: given any specific listing, the diagonal element is computed digit-by-digit and provably differs from every listed real. The argument is bulletproof and has stood for 130 years.

Frequently asked questions

How can ℕ and ℚ have the same size when ℚ is dense and ℕ has gaps?

"Same size" for infinite sets means "there exists a bijection", not "looks the same on a number line". Cantor's pairing function lists all positive rationals p/q in a zigzag through the integer lattice, hitting every rational exactly once. Density is a metric/topological property; cardinality only counts. ℚ is dense in ℝ but still countable; ℝ \ ℚ (the irrationals) is uncountable.

What does Cantor's diagonal argument actually show?

It proves that no list of real numbers can include every real. Suppose someone claims to have listed all reals in [0, 1] as r₁, r₂, r₃, … Construct a new real x by taking the n-th decimal digit different from the n-th digit of rₙ. Then x differs from every rₙ in at least one decimal place, so x isn't on the list. By contradiction, no such list exists. ℝ is uncountable.

What is the continuum hypothesis?

Cantor's question: is there an infinite set whose cardinality is strictly between ℵ₀ (size of ℕ) and 2^ℵ₀ (size of ℝ)? CH says no — that 2^ℵ₀ = ℵ₁, the smallest cardinal larger than ℵ₀. Gödel (1940) proved CH is consistent with ZFC; Cohen (1963) proved its negation is consistent. CH is independent of the standard foundation.

Is ℵ₀ the smallest infinity?

Yes. ℵ₀ = |ℕ| is the smallest infinite cardinal: every infinite set has a countably infinite subset (in ZFC; in ZF + countable choice). Smaller infinities don't exist; larger ones — ℵ₁, ℵ₂, …, ℵω, … — form a proper class of transfinite cardinals.

How is cardinal arithmetic different from regular arithmetic?

Strange. ℵ₀ + ℵ₀ = ℵ₀, ℵ₀ · ℵ₀ = ℵ₀, ℵ₀ + 1 = ℵ₀, but ℵ₀^ℵ₀ = 2^ℵ₀ > ℵ₀. Multiplication by a non-zero finite or countable factor doesn't change ℵ₀, but exponentiation jumps to a strictly larger cardinality. König's theorem: κ < κ^cf(κ) — exponentiation always escapes to a larger cardinal.

Why is |ℝ| = 2^ℵ₀?

Each real number in [0, 1] corresponds to a binary expansion 0.b₁b₂b₃… — a function ℕ → {0, 1}, equivalently a subset of ℕ (positions where bᵢ = 1). The set of all subsets of ℕ has cardinality 2^|ℕ| = 2^ℵ₀, modulo a countable correction for dyadic rationals having two expansions. Cantor's theorem 2^κ > κ for any cardinal κ then explains why |ℝ| > |ℕ|.