Topology
Metric Space
Distance, axiomatized
A metric space is a set X equipped with a distance function d : X × X → ℝ≥0 satisfying three axioms: identity (d = 0 only at equal points), symmetry, and the triangle inequality. From these three rules — and nothing else — emerge open and closed sets, convergence, continuity, completeness, and the entire substrate of real analysis.
- Axioms3 (identity, symmetry, △)
- d : X × X →ℝ≥0
- Open ballB(x, r) = {y : d(x,y) < r}
- CompleteAll Cauchy seq's converge
- GeneralizesEuclidean ℝⁿ to any set
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
The three axioms
A metric space is a pair (X, d) where X is a non-empty set and d : X × X → ℝ is a function satisfying, for all x, y, z ∈ X:
- Identity of indiscernibles — d(x, y) = 0 if and only if x = y. Distinct points are at strictly positive distance; a point is at zero distance only from itself.
- Symmetry — d(x, y) = d(y, x). The distance from London to Paris equals the distance from Paris to London.
- Triangle inequality — d(x, z) ≤ d(x, y) + d(y, z). Detours never shorten the journey.
Non-negativity (d ≥ 0) is sometimes listed as a fourth axiom, but it is a consequence of the other three: from 0 = d(x, x) ≤ d(x, y) + d(y, x) = 2d(x, y), so d(x, y) ≥ 0.
That's it. No vector space, no inner product, no algebraic operations. Any set you can equip with such a function becomes a metric space, and you immediately inherit a vast library of analytic theorems — Banach's fixed-point theorem, Baire's category theorem, Stone-Weierstrass — that apply uniformly to every concrete instantiation.
What the metric buys you
Once you have d, you can define everything analysis needs. The chain is short and powerful:
- Open ball — B(x, r) = { y ∈ X : d(x, y) < r }. The set of all points within radius r of x.
- Open set — U ⊆ X is open if every x ∈ U has some r > 0 with B(x, r) ⊆ U. (Open balls themselves are open.)
- Closed set — complement of an open set; equivalently, contains all its limit points.
- Convergence — xₙ → x means d(xₙ, x) → 0.
- Cauchy sequence — for every ε > 0 there is an N with d(xₘ, xₙ) < ε for all m, n > N.
- Continuity — f : X → Y is continuous at x if for every ε > 0 there is δ > 0 with d_X(x, x') < δ ⟹ d_Y(f(x), f(x')) < ε. Equivalently, preimages of open sets are open.
- Completeness — every Cauchy sequence converges in X.
- Compactness — every open cover has a finite subcover (or, equivalently in metric spaces, every sequence has a convergent subsequence).
The whole edifice rests on three axioms about a single function.
Canonical examples
- Euclidean ℝⁿ. d(x, y) = √Σ (xᵢ − yᵢ)². The prototype.
- Taxicab metric on ℝ². d(x, y) = |x₁ − y₁| + |x₂ − y₂|. Distance you'd walk on a grid of streets. Same topology as Euclidean, different geometry — open balls are diamonds, not disks.
- Chebyshev (max) metric. d(x, y) = max |xᵢ − yᵢ|. King's distance on a chessboard; balls are squares.
- Discrete metric. d(x, y) = 1 if x ≠ y, else 0. Every set carries this metric. The induced topology is discrete: every singleton is open.
- Hamming metric on {0, 1}ⁿ. d(x, y) = number of positions where x and y differ. Used in coding theory; minimum distance of a code controls error correction.
- Edit (Levenshtein) distance on strings. Minimum number of insertions, deletions, and substitutions to convert one string to another. Powers spell-checkers and DNA alignment.
- Sup metric on C([a, b]). d(f, g) = sup |f(x) − g(x)|. Convergence is uniform convergence; the resulting space is complete.
- L¹ metric on integrable functions. d(f, g) = ∫|f − g|. Convergence is L¹ convergence; need to identify functions equal almost everywhere to satisfy the identity axiom.
- p-adic metric on ℚ. d_p(x, y) = p^{−v_p(x − y)} where v_p is p-adic valuation. Numbers are "close" if their difference is divisible by a high power of p.
Metric vs pseudometric vs ultrametric
| Metric | Pseudometric | Quasimetric | Ultrametric | Premetric | |
|---|---|---|---|---|---|
| d(x, x) = 0 | Yes | Yes | Yes | Yes | Yes |
| d(x, y) = 0 ⟹ x = y | Yes | No (allowed) | Yes | Yes | No |
| Symmetry d(x, y) = d(y, x) | Yes | Yes | No (allowed asymmetric) | Yes | Optional |
| Triangle inequality | d(x, z) ≤ d(x, y) + d(y, z) | Same | Same | Stronger: d(x, z) ≤ max(d(x, y), d(y, z)) | Often dropped |
| Induces Hausdorff topology | Yes | No | Possibly not | Yes (totally disconnected) | No |
| Canonical example | Euclidean d(x, y) = ‖x − y‖ | Seminorm, e.g. d(f, g) = |∫ (f − g)| | Earth-mover with one-way costs; directed graph distance | p-adic |x − y|_p; word metric on rooted trees | Sometimes weakened distance functions for clustering heuristics |
| Recovery to a metric | — | Quotient by d(x, y) = 0 equivalence | Symmetrize: max(d(x,y), d(y,x)) | Already a metric | Add missing axioms case by case |
The ultrametric inequality d(x, z) ≤ max(d(x, y), d(y, z)) has a strange consequence: every triangle is isoceles with the two longer sides equal, and every point of an open ball is its centre. This is why p-adic numbers and rooted-tree distances behave so differently from Euclidean intuition.
From norm to metric — and not back
Every normed vector space (V, ‖·‖) becomes a metric space under d(x, y) = ‖x − y‖. The metric inherits two extra properties from the algebra:
- Translation invariance. d(x + a, y + a) = d(x, y) for any a ∈ V.
- Homogeneity. d(αx, αy) = |α| d(x, y) for any scalar α.
A general metric on a vector space need not satisfy either: the discrete metric on ℝ² is neither translation-invariant nor homogeneous. So normed spaces are a strict subclass of metric spaces, and the discrete metric is the cleanest counter-example showing why "every metric is a norm" is false.
Completeness and Cauchy sequences
A sequence (xₙ) in (X, d) is Cauchy if its terms eventually cluster: ∀ε > 0 ∃N such that d(xₘ, xₙ) < ε whenever m, n ≥ N. Every convergent sequence is Cauchy, but the converse can fail.
(X, d) is complete if every Cauchy sequence converges to a point in X. The most important examples:
- (ℝ, |·|) is complete; ℚ with the same metric is not — the Cauchy sequence of rational truncations of √2 has no rational limit.
- ℝⁿ with any standard norm is complete.
- C([a, b]) with the sup metric is complete (uniform limits of continuous functions are continuous).
- C([a, b]) with the L¹ metric is not complete; its completion is L¹([a, b]), which contains discontinuous functions.
Completeness is the technical hypothesis that powers most existence theorems in analysis: the Banach contraction principle, Picard's theorem on ODE solutions, the open mapping theorem, and the construction of the real numbers from the rationals (as the equivalence classes of Cauchy sequences in ℚ).
Open sets and the induced topology
The collection τ_d of all open subsets of X (in the metric sense above) satisfies the three axioms of a topology:
- ∅ and X are open.
- Arbitrary unions of open sets are open.
- Finite intersections of open sets are open.
So every metric space carries an induced topology. Two metrics d₁, d₂ on the same set X are called topologically equivalent if τ_{d₁} = τ_{d₂} — they declare the same sets open. They are uniformly equivalent if there are constants c, C > 0 with c · d₁ ≤ d₂ ≤ C · d₁; this is stronger and preserves Cauchy sequences. On finite-dimensional ℝⁿ, all standard norms (Euclidean, taxicab, Chebyshev, any ℓ^p) are uniformly equivalent.
Not every topology comes from a metric. A topological space is metrizable if there exists some metric inducing its topology. The Urysohn metrization theorem gives a sufficient condition: a regular Hausdorff space with a countable basis is metrizable. But the long line, the Sorgenfrey plane, and the cofinite topology on an uncountable set are all non-metrizable.
Why it matters
- Foundation of analysis. Convergence, continuity, differentiability, and integration are all metric-space concepts long before they are calculus concepts. The clean axiomatic treatment generalizes calculus from ℝ to function spaces, infinite-dimensional vector spaces, and probability spaces.
- Functional analysis and PDEs. Banach and Hilbert spaces are complete metric spaces; the Picard-Lindelöf theorem proves existence of ODE solutions by applying the Banach fixed-point theorem in a complete metric space of continuous functions.
- Probability theory. Random variables form metric spaces under the L^p, total variation, Kolmogorov, or Wasserstein metrics, each tracking a different notion of "two distributions are close".
- Computer science. Edit distance, tree edit distance, graph metrics, embedding lemmas (Bourgain, Johnson-Lindenstrauss) are all metric-space results that drive algorithms in DNA alignment, approximate nearest neighbors, clustering, and dimensionality reduction.
- Number theory. The p-adic metric on ℚ leads to ℚ_p, ℤ_p, and the entire local-global perspective of arithmetic geometry. Ostrowski's theorem says the only nontrivial absolute values on ℚ are the usual one and the p-adic ones.
Open balls under different metrics
What does B((0, 0), 1) — the open unit ball — look like in ℝ² for different metrics? The shape is a Hasse-style ladder of how the metric is shaped:
- Euclidean (ℓ²): a round disk x² + y² < 1.
- Taxicab (ℓ¹): a rotated square (diamond) |x| + |y| < 1.
- Chebyshev (ℓ^∞): a square max(|x|, |y|) < 1.
- ℓ^p, 1 < p < 2: a "rounded diamond" interpolating between taxicab and Euclidean.
- ℓ^p, 2 < p < ∞: a "rounded square" interpolating between Euclidean and Chebyshev.
- Discrete: {(0, 0)} alone for r ≤ 1, the whole plane for r > 1.
Different ball shapes encode different penalties: Euclidean penalizes outliers smoothly; taxicab penalizes coordinates additively; Chebyshev penalizes only the largest coordinate. In machine learning, choosing a metric is choosing a notion of "small error".
Theorems that hold for every metric space
- Banach fixed-point theorem. A contraction T : X → X (Lipschitz constant < 1) on a complete metric space has a unique fixed point, found by iterating from any starting point.
- Baire category theorem. A complete metric space cannot be written as a countable union of nowhere-dense sets. Used to prove the uniform boundedness principle in functional analysis.
- Heine-Cantor theorem. A continuous function on a compact metric space is uniformly continuous.
- Bolzano-Weierstrass (metric form). A subset of a complete metric space is compact iff it is closed, totally bounded, and (equivalently) every sequence has a convergent subsequence.
- Stone-Weierstrass theorem. An algebra of continuous real functions on a compact metric space that separates points is dense in C(X) under the sup metric.
Common mistakes
- Assuming every metric is a norm. The discrete metric is the canonical counter-example: it's a metric on any set, not just a vector space, and it isn't homogeneous even when the set is a vector space.
- Forgetting the identity axiom (d(x, y) = 0 ⟹ x = y). A function satisfying only non-negativity, symmetry, and triangle inequality, with possibly d(x, y) = 0 for distinct x, y, is a pseudometric. The induced topology is non-Hausdorff and disasters quickly follow.
- Confusing equivalent metrics with equal metrics. Euclidean and taxicab give the same topology on ℝⁿ but very different geometries — angles, isometries, and ball shapes all differ.
- Believing ℚ is complete. The rationals are missing all irrational limits — every real number except the rationals is a hole. ℝ is the completion of ℚ.
- Treating the L¹ "metric" on continuous functions as already complete. It isn't — completing it gives L¹([a, b]), which requires Lebesgue integration and equivalence classes of functions equal almost everywhere.
- Confusing total boundedness with boundedness. In ℝⁿ they coincide, but in infinite dimensions a closed bounded set can fail to be totally bounded — the unit ball of ℓ² is bounded but not compact precisely because it isn't totally bounded.
Frequently asked questions
Why does the triangle inequality matter so much?
It is what makes a metric a usable notion of distance. Without it, "closeness" has no transitive structure: if d(x,y) and d(y,z) are small, you have no guarantee d(x,z) is small, and so convergence and continuity break down. The triangle inequality is also the linchpin of nearly every analytic estimate — error bounds, Lipschitz continuity, fixed-point theorems all rely on it.
Is every metric a norm?
No. A norm requires an underlying vector space and is homogeneous: ‖αv‖ = |α|‖v‖. The discrete metric d(x, y) = 1 if x ≠ y, 0 otherwise is a perfectly valid metric on any set, but it isn't a norm — it's defined on sets without algebraic structure and fails homogeneity. The implication runs only one way: every norm gives a metric d(x, y) = ‖x − y‖.
What is a complete metric space?
A metric space is complete if every Cauchy sequence (terms eventually arbitrarily close to each other) actually converges to a limit inside the space. ℝ is complete; ℚ is not (the sequence 3, 3.1, 3.14, 3.141, … is Cauchy in ℚ but converges to π ∉ ℚ). Completeness is what lets you do calculus: the Banach fixed-point theorem and the existence of solutions to ODEs both require it.
How is a metric different from a topology?
A topology is just a family of "open" subsets satisfying union and finite-intersection rules — no numerical distance. A metric induces a topology (the open balls form a basis), but topologies are more general: there are useful spaces (like the Zariski topology in algebraic geometry, or any non-first-countable space) that no metric can produce. Spaces whose topology comes from some metric are called "metrizable".
What is the p-adic metric?
On ℚ, fix a prime p. Write a non-zero rational as p^n · (a/b) with p not dividing a or b, and define |x|_p = p^{−n}. The p-adic metric is d_p(x, y) = |x − y|_p. It is an ultrametric — d(x, z) ≤ max(d(x, y), d(y, z)), strictly stronger than the triangle inequality — and completing ℚ under it produces the p-adic numbers ℚ_p, foundational in number theory.
Can a set carry more than one metric?
Yes, and they may give different topologies. On ℝ², the Euclidean, taxicab, and Chebyshev metrics yield the same topology (they're equivalent, with constants relating their values). The discrete metric on ℝ² gives a strictly finer topology in which every set is open. Choosing a metric is a modeling decision: "nearness" depends on what kind of nearness you mean.