
Inner Product Space

Where geometry comes from algebra

An inner product space is a vector space equipped with an inner product ⟨·,·⟩: a positive-definite pairing of vectors that produces a scalar, bilinear and symmetric over ℝ, sesquilinear and conjugate-symmetric over ℂ. From this one structure we get lengths, angles, orthogonality, and projections, along with the geometric backbone of Hilbert spaces, Fourier analysis, and quantum mechanics.

  • Field: ℝ or ℂ
  • Induced norm: ‖v‖ = √⟨v,v⟩
  • Cauchy-Schwarz: |⟨u,v⟩| ≤ ‖u‖·‖v‖
  • Orthogonality: ⟨u,v⟩ = 0
  • Complete ⇒ Hilbert space

The four axioms

An inner product space is a pair (V, ⟨·,·⟩) where V is a vector space over the field 𝔽 (either ℝ or ℂ) and ⟨·,·⟩ : V × V → 𝔽 is a map satisfying four axioms for all u, v, w ∈ V and all scalars α ∈ 𝔽:

  1. Linearity in the first argument — ⟨αu + w, v⟩ = α⟨u, v⟩ + ⟨w, v⟩.
  2. Conjugate symmetry — ⟨u, v⟩ = \overline{⟨v, u⟩}, the complex conjugate of ⟨v, u⟩. Over ℝ the conjugate does nothing and the pairing is plainly symmetric; over ℂ it is essential.
  3. Positive-definiteness — ⟨v, v⟩ ≥ 0, with equality iff v = 0. This is what guarantees a real, non-negative length.
  4. Non-degeneracy — implied by positive-definiteness above; if ⟨v, w⟩ = 0 for all w, then v = 0.

Combining axioms 1 and 2 gives conjugate-linearity in the second slot: ⟨v, αw⟩ = ᾱ⟨v, w⟩. The pairing is then called sesquilinear over ℂ. Over ℝ it is plainly bilinear.
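
A quick numerical sanity check can make the axioms concrete. The sketch below assumes the standard Hermitian product on ℂ² (linear in the first slot, matching the convention above); the helper names inner, add, and scale are illustrative, not from any library.

```python
# Hermitian inner product on C^n, linear in the first argument.
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def scale(alpha, u):
    return [alpha * a for a in u]

u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]
w = [0.5j, 4 + 0j]
alpha = 2 - 3j
tol = 1e-12

# Axiom 1: linearity in the first argument.
linear_first = abs(inner(add(scale(alpha, u), w), v)
                   - (alpha * inner(u, v) + inner(w, v))) < tol

# Axiom 2: conjugate symmetry.
conj_symmetric = abs(inner(u, v) - inner(v, u).conjugate()) < tol

# Axiom 3: <v, v> is real and strictly positive for v != 0.
vv = inner(v, v)
positive = abs(vv.imag) < tol and vv.real > 0

# Consequence: conjugate-linearity in the second argument.
conj_linear_second = abs(inner(v, scale(alpha, w))
                         - alpha.conjugate() * inner(v, w)) < tol

print(linear_first, conj_symmetric, positive, conj_linear_second)
```

Passing on one triple of vectors proves nothing in general, of course; the point is only to see each axiom as a computable identity.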

From algebra to geometry — what the pairing buys you

The four axioms look modest. Their consequences are not. Once you have an inner product, you immediately recover the entire toolbox of Euclidean geometry:

  • Length — define ‖v‖ ≔ √⟨v, v⟩. Positive-definiteness makes the square root real and the length zero only at the zero vector.
  • Angle — for non-zero u, v in a real inner product space, the Cauchy-Schwarz inequality bounds ⟨u, v⟩ / (‖u‖·‖v‖) within [−1, 1], so we can define θ ∈ [0, π] by cos θ = ⟨u, v⟩ / (‖u‖·‖v‖).
  • Orthogonality — write u ⊥ v when ⟨u, v⟩ = 0. The Pythagorean theorem follows directly: ‖u + v‖² = ‖u‖² + ‖v‖² whenever u ⊥ v.
  • Projection — the projection of u onto span(v) is proj_v(u) = (⟨u, v⟩ / ⟨v, v⟩) v. Subtracting it gives the orthogonal residual.
  • Orthonormal bases — a basis e₁, …, eₙ with ⟨eᵢ, eⱼ⟩ = δᵢⱼ; coordinates become inner products: v = Σ ⟨v, eᵢ⟩ eᵢ.
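
The toolbox above fits in a few lines of code. This is a minimal sketch for ℝⁿ with the dot product; the function names (norm, angle, project) are chosen here for illustration.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(inner(v, v))

def angle(u, v):
    # Cauchy-Schwarz keeps the ratio in [-1, 1]; clamp to absorb rounding error.
    c = inner(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))

def project(u, v):
    # Projection of u onto span(v), for v != 0.
    c = inner(u, v) / inner(v, v)
    return [c * b for b in v]

u = [3.0, 4.0]
v = [1.0, 0.0]

p = project(u, v)                      # component of u along v
r = [a - b for a, b in zip(u, p)]      # orthogonal residual u - proj_v(u)

print(norm(u))                         # 5.0
print(math.degrees(angle(u, v)))       # angle between u and v, in degrees
print(p, r)                            # [3.0, 0.0] [0.0, 4.0]
print(inner(r, v))                     # 0.0, so the residual is orthogonal to v
```

Since p ⊥ r, the Pythagorean theorem gives ‖u‖² = ‖p‖² + ‖r‖², here 25 = 9 + 16.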

Proof of the Cauchy-Schwarz inequality

For all u, v in an inner product space, |⟨u, v⟩| ≤ ‖u‖ · ‖v‖, with equality iff u, v are linearly dependent.

Proof (real case). If v = 0 the inequality reads 0 ≤ 0; assume v ≠ 0. Define the function f(t) = ⟨u − tv, u − tv⟩ for t ∈ ℝ. By positive-definiteness, f(t) ≥ 0 for every t. Expanding:

f(t) = ‖u‖² − 2t⟨u, v⟩ + t²‖v‖² ≥ 0.

This is a quadratic in t with leading coefficient ‖v‖² > 0 that is everywhere non-negative. A non-negative quadratic has non-positive discriminant:

Δ = (2⟨u, v⟩)² − 4‖v‖²‖u‖² ≤ 0 ⟹ ⟨u, v⟩² ≤ ‖u‖²‖v‖².

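Taking square roots gives |⟨u, v⟩| ≤ ‖u‖·‖v‖. Equality forces Δ = 0, so the quadratic has a double root t₀ with f(t₀) = ‖u − t₀v‖² = 0; positive-definiteness then gives u = t₀v, so u and v are linearly dependent. In the complex case, evaluate ⟨u − tv, u − tv⟩ ≥ 0 at the single value t = ⟨u, v⟩ / ‖v‖²: the expansion, with conjugates tracked, collapses to ‖u‖² − |⟨u, v⟩|² / ‖v‖² ≥ 0, which is the inequality.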
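
The real-case argument can be traced numerically. The sketch below assumes the dot product on ℝ⁵ and two arbitrary random vectors (the seed is only there for reproducibility); it builds the quadratic f(t), checks the sign of its discriminant, and then looks at the equality case.

```python
import math
import random

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def f(t, u, v):
    # f(t) = <u - t v, u - t v>, a sum of squares and hence non-negative.
    d = [a - t * b for a, b in zip(u, v)]
    return inner(d, d)

random.seed(0)
u = [random.uniform(-1, 1) for _ in range(5)]
v = [random.uniform(-1, 1) for _ in range(5)]

# Coefficients of f(t) = a t^2 + b t + c, as in the expansion above.
a = inner(v, v)
b = -2 * inner(u, v)
c = inner(u, u)
disc = b * b - 4 * a * c

t_min = inner(u, v) / inner(v, v)      # vertex of the parabola
print(disc <= 0)                       # non-positive discriminant
print(f(t_min, u, v) >= 0)             # the minimum value is still >= 0
print(abs(inner(u, v)) <= math.sqrt(c) * math.sqrt(a))

# Equality case: u_dep is a scalar multiple of v, so the gap closes.
u_dep = [2.5 * x for x in v]
gap = math.sqrt(inner(u_dep, u_dep)) * math.sqrt(a) - abs(inner(u_dep, v))
print(abs(gap) < 1e-9)
```

The value t_min is the real analogue of the substitution used in the complex case: it is the t at which u − tv is shortest.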

The triangle inequality, almost for free

From Cauchy-Schwarz, the induced norm automatically satisfies the triangle inequality. Compute:

‖u + v‖² = ⟨u + v, u + v⟩ = ‖u‖² + 2 Re⟨u, v⟩ + ‖v‖²
≤ ‖u‖² + 2|⟨u, v⟩| + ‖v‖²
≤ ‖u‖² + 2‖u‖·‖v‖ + ‖v‖²
= (‖u‖ + ‖v‖)².

Square-rooting: ‖u + v‖ ≤ ‖u‖ + ‖v‖. So an inner product not only gives a length, it gives a length that obeys the geometric intuition that detours are longer than direct paths.

Inner product vs normed vs metric spaces

Property                          Inner product space            Normed space                           Metric space
Primitive structure               ⟨u, v⟩ ∈ 𝔽                     ‖v‖ ∈ ℝ≥0                              d(x, y) ∈ ℝ≥0
Defines lengths                   Yes (‖v‖ = √⟨v,v⟩)             Yes (primitive)                        Indirectly, via d(x, 0) when a zero element exists
Defines angles                    Yes (cos θ = ⟨u,v⟩/(‖u‖‖v‖))   No                                     No
Defines orthogonality             Yes (⟨u, v⟩ = 0)               No general notion                      No
Parallelogram law holds           Always                         Iff norm comes from an inner product   N/A
Required vector-space structure   Yes                            Yes                                    No (any set will do)
Strength ordering                 Strongest (richest)            Middle                                 Weakest (most general)
Canonical complete example        L²(ℝ), ℓ²                      Lᵖ, ℓᵖ (p ≠ 2)                         (ℝ, |·|)

The hierarchy is strict: every inner product space is a normed space (use the induced norm), and every normed space is a metric space (use d(x, y) = ‖x − y‖). Going the other way fails — see the next section.

ℓᵖ vs Lᵖ norms — which come from an inner product?

Norm                    Formula           From an inner product?   Why or why not
ℓ¹                      Σ |xᵢ|            No                       Fails the parallelogram law: with x = (1,0), y = (0,1), LHS = 2² + 2² = 8 but RHS = 2·1 + 2·1 = 4.
ℓ²                      √Σ |xᵢ|²          Yes                      Inner product is the standard dot product Σ xᵢ ȳᵢ. The unique p for which ℓᵖ is Hilbert.
ℓᵖ (1 < p < ∞, p ≠ 2)   (Σ |xᵢ|ᵖ)^{1/p}   No                       Parallelogram law fails for any p ≠ 2; ℓᵖ is uniformly convex but not Hilbert.
ℓ^∞                     sup |xᵢ|          No                       Unit ball is a cube, not round; the parallelogram law fails.
L¹([a,b])               ∫|f|              No                       Same parallelogram failure as ℓ¹.
L²([a,b])               √∫|f|²            Yes                      Inner product ⟨f, g⟩ = ∫ f(x) g(x) dx, with g conjugated for complex-valued functions. The setting for Fourier series.
Lᵖ([a,b]) (p ≠ 2)       (∫|f|ᵖ)^{1/p}     No                       Banach but not Hilbert.

Jordan-von Neumann theorem (1935). A norm comes from an inner product if and only if it satisfies the parallelogram law ‖u+v‖² + ‖u−v‖² = 2‖u‖² + 2‖v‖². This is the cleanest test: any single counter-example pair (u, v) rules out the inner product origin instantly.
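
The parallelogram test is easy to run by hand or by machine. As a rough sketch, the code below evaluates the gap LHS − RHS of the parallelogram law for several ℓᵖ norms on ℝ² at the single pair u = (1,0), v = (0,1); the helpers p_norm and parallelogram_gap are illustrative names.

```python
def p_norm(x, p):
    # l^p norm of a finite vector; p = inf gives the max norm.
    if p == float("inf"):
        return max(abs(a) for a in x)
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

def parallelogram_gap(u, v, p):
    # (||u+v||^2 + ||u-v||^2) - (2||u||^2 + 2||v||^2); zero iff the law holds here.
    s = [a + b for a, b in zip(u, v)]
    d = [a - b for a, b in zip(u, v)]
    lhs = p_norm(s, p) ** 2 + p_norm(d, p) ** 2
    rhs = 2 * p_norm(u, p) ** 2 + 2 * p_norm(v, p) ** 2
    return lhs - rhs

u, v = [1.0, 0.0], [0.0, 1.0]
for p in (1, 1.5, 2, 3, float("inf")):
    print(p, parallelogram_gap(u, v, p))
```

Only p = 2 gives a gap of zero (up to rounding). A non-zero gap at one pair is already a disproof; a zero gap at one pair is not a proof, since the law must hold for every pair.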

Canonical examples

  • ℝⁿ with the dot product. ⟨u, v⟩ = u₁v₁ + … + uₙvₙ. The induced norm is the Euclidean length and angles match high school trigonometry.
  • ℂⁿ with the Hermitian dot product. ⟨u, v⟩ = Σ uᵢ v̄ᵢ. The conjugate is essential: without it, ⟨v, v⟩ need not be real.
  • ℓ², the space of square-summable sequences. All real (or complex) sequences (xₙ) with Σ|xₙ|² < ∞, with ⟨x, y⟩ = Σ xₙ ȳₙ. The prototypical infinite-dimensional Hilbert space.
  • L²([a, b]). Square-integrable functions with ⟨f, g⟩ = ∫ₐᵇ f(x) g(x) dx. The orthonormal system {1, cos x, sin x, cos 2x, sin 2x, …} (suitably normalized) gives the Fourier expansion of any L² function.
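  • Matrices with the Frobenius inner product. ⟨A, B⟩ = tr(B* A) = Σᵢⱼ Aᵢⱼ B̄ᵢⱼ, which for real matrices reduces to tr(Aᵀ B) = Σᵢⱼ Aᵢⱼ Bᵢⱼ. (Many texts write tr(A* B) instead, which is conjugate-linear in the first slot.) The induced norm ‖A‖_F is the entrywise ℓ² norm.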
  • Polynomials of degree ≤ n with ⟨p, q⟩ = ∫₋₁¹ p(x) q(x) dx. Gram-Schmidt on 1, x, x², … gives the Legendre polynomials.
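
The last example can be carried out exactly. The sketch below runs Gram-Schmidt on 1, x, x², x³ with ⟨p, q⟩ = ∫₋₁¹ p(x) q(x) dx, using rational arithmetic. It assumes polynomials are stored as coefficient lists (lowest degree first) and rescales so that each polynomial equals 1 at x = 1, the usual Legendre normalization; all helper names are illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    # Integral of p(x) q(x) over [-1, 1]: x^k contributes 2/(k+1) for even k, 0 for odd k.
    prod = poly_mul(p, q)
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)

def gram_schmidt(basis):
    ortho = []
    for p in basis:
        q = list(p)
        for e in ortho:
            coeff = inner(p, e) / inner(e, e)   # projection coefficient onto e
            for k, c in enumerate(e):
                q[k] -= coeff * c
        ortho.append(q)
    return ortho

n = 3
# Monomials 1, x, x^2, ..., x^n as coefficient lists.
monomials = [[Fraction(int(i == k)) for i in range(k + 1)] for k in range(n + 1)]
ortho = gram_schmidt(monomials)

# sum(q) is q(1); dividing by it enforces P_k(1) = 1.
legendre = [[c / sum(q) for c in q] for q in ortho]
for k, P in enumerate(legendre):
    print(k, [str(c) for c in P])
```

The degree-2 and degree-3 results are (3x² − 1)/2 and (5x³ − 3x)/2, the familiar Legendre polynomials P₂ and P₃.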

Why it matters in physics, statistics, and analysis

  • Quantum mechanics. The state space of a quantum system is a complex Hilbert space ℋ. A unit vector |ψ⟩ ∈ ℋ encodes a state; the inner product ⟨ψ|φ⟩ is a probability amplitude; |⟨ψ|φ⟩|² is the probability that measuring a system in state |ψ⟩ yields the outcome associated with |φ⟩. The Schrödinger, Heisenberg, and interaction pictures are all built on this single primitive.
  • Fourier analysis. Decomposing a periodic function into sines and cosines is exactly the projection of f ∈ L²([0, 2π]) onto an orthonormal basis. The Fourier coefficient ĉₙ = ⟨f, eⁱⁿˣ⟩/(2π) is an inner product.
  • Statistics and least squares. Linear regression minimizes ‖y − Xβ‖² in an inner product space; the fitted vector Xβ̂ is the orthogonal projection of y onto the column space of X, and the optimal β̂ holds its coordinates with respect to the columns. Correlation coefficients are normalized inner products of mean-centered variables.
  • Machine learning. The kernel trick replaces explicit feature maps φ(x) with kernels K(x, y) = ⟨φ(x), φ(y)⟩, computing inner products in (sometimes infinite-dimensional) feature spaces without ever materializing the features.
  • Numerical analysis. Conjugate-gradient and Krylov methods rely on A-orthogonality: ⟨u, v⟩_A = uᵀAv with A symmetric positive-definite.
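
To make the kernel trick from the list above concrete, here is a small sketch. It assumes the homogeneous degree-2 polynomial kernel K(x, y) = (x·y)² on ℝ², whose explicit feature map is φ(x) = (x₁², √2·x₁x₂, x₂²); the names phi and kernel are chosen for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    # Explicit feature map into R^3 for the degree-2 homogeneous polynomial kernel.
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

def kernel(x, y):
    # Same inner product, computed without ever building phi(x) or phi(y).
    return dot(x, y) ** 2

x = [1.0, 2.0]
y = [3.0, -1.0]

print(kernel(x, y))            # 1.0, since x . y = 3 - 2 = 1
print(dot(phi(x), phi(y)))     # agrees with the kernel value up to rounding
```

Here the feature space is only three-dimensional, so nothing is saved. The same identity is what lets kernels such as the Gaussian one work with feature spaces that cannot be written down explicitly.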

The parallelogram and polarization identities

Two identities tie norms to inner products. For any inner product space:

Parallelogram law: ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².

Polarization (real case): ⟨u, v⟩ = ¼ (‖u + v‖² − ‖u − v‖²).

Polarization (complex case): ⟨u, v⟩ = ¼ Σ_{k=0}^{3} iᵏ ‖u + iᵏv‖².

The parallelogram identity is geometric: in the Euclidean plane, the sum of the squared diagonals of a parallelogram equals the sum of the squares of its four sides, that is, twice the sum of the squares of two adjacent sides. Polarization is striking: it says the inner product is fully recoverable from the norm. So the norm carries all the inner-product data, but only when the parallelogram law holds.
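
Polarization can be checked directly: compute the inner product once from its definition and once from norms alone. The sketch below assumes the Hermitian product on ℂ² that is linear in the first slot, the convention under which the complex formula above holds as written; the function names are illustrative.

```python
def inner(u, v):
    # Linear in the first argument, conjugate-linear in the second.
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(v):
    return sum(abs(a) ** 2 for a in v)

def polarize_complex(u, v):
    # (1/4) * sum over k of i^k * ||u + i^k v||^2, using only the norm.
    total = 0
    for unit in (1, 1j, -1, -1j):
        shifted = [a + unit * b for a, b in zip(u, v)]
        total += unit * norm_sq(shifted)
    return total / 4

def polarize_real(u, v):
    plus = [a + b for a, b in zip(u, v)]
    minus = [a - b for a, b in zip(u, v)]
    return (norm_sq(plus) - norm_sq(minus)) / 4

u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]
print(inner(u, v), polarize_complex(u, v))   # both equal -1 + 2j up to rounding

x = [1.0, -2.0, 0.5]
y = [4.0, 1.0, 2.0]
print(inner(x, y), polarize_real(x, y))      # both equal 3.0
```

With the opposite convention (linear in the second slot), the complex formula recovers ⟨v, u⟩ instead, so the sign of the imaginary part flips.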

Bessel's inequality and Parseval's identity

Suppose {e₁, e₂, …} is an orthonormal sequence in an inner product space V. For any v ∈ V:

Bessel: Σₙ |⟨v, eₙ⟩|² ≤ ‖v‖².

That is, the energy in the projections never exceeds the energy of v. If the orthonormal system is also complete (its span is dense in V), Bessel becomes an equality:

Parseval: Σₙ |⟨v, eₙ⟩|² = ‖v‖².

Parseval is the Pythagorean theorem in infinite dimensions and the conservation-of-energy statement of Fourier analysis: the L² norm of a signal equals the ℓ² norm of its Fourier coefficients.
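
A finite-dimensional toy case shows the difference between the two statements. The sketch below uses ℝ³ with the dot product and a hand-picked orthonormal basis e1, e2, e3 (an assumption of this example): dropping e3 leaves an incomplete system, where only Bessel's inequality holds.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

s = 1 / math.sqrt(2)
e1 = [s, s, 0.0]
e2 = [s, -s, 0.0]
e3 = [0.0, 0.0, 1.0]

v = [2.0, -1.0, 3.0]
energy = inner(v, v)                                     # ||v||^2 = 14

# Incomplete orthonormal system {e1, e2}: Bessel gives an inequality.
partial = sum(inner(v, e) ** 2 for e in (e1, e2))        # 0.5 + 4.5 = 5

# Complete system {e1, e2, e3}: Parseval gives equality.
full = sum(inner(v, e) ** 2 for e in (e1, e2, e3))

print(partial, "<=", energy)
print(full, "==", energy)
```

The missing energy 14 − 5 = 9 is exactly the squared component of v along e3, the direction the incomplete system cannot see.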

Common mistakes

  • Assuming every norm comes from an inner product. Among the ℓᵖ and Lᵖ norms, only p = 2 does. The taxicab (ℓ¹) and supremum (ℓ^∞) norms are perfectly valid norms, but no inner product induces them, so they carry no notion of angle or orthogonality.
  • Forgetting the conjugate over ℂ. Without it, ⟨v, v⟩ is generally complex and you can't take a real square root. The conjugate is what makes positive-definiteness make sense over ℂ.
  • Treating "inner product space" and "Hilbert space" as synonyms. Every Hilbert space is an inner product space, but the converse fails in infinite dimensions: completeness is an extra hypothesis, and most natural pre-Hilbert constructions (continuous functions with the L² inner product) are not complete.
  • Using indefinite bilinear forms as inner products. The Minkowski form ⟨u, v⟩_M = u₀v₀ − u₁v₁ − u₂v₂ − u₃v₃ on Minkowski spacetime is symmetric but indefinite (some non-zero vectors have ⟨v, v⟩ ≤ 0). It is an "inner product" in the indefinite-form sense of relativity, not the positive-definite sense of this article.
  • Forgetting positive-definiteness when constructing weighted inner products. uᵀAv is an inner product iff A is symmetric and positive-definite. If A has a zero or negative eigenvalue, you get a degenerate or indefinite form, not an inner product.
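
The last two pitfalls are easy to demonstrate. The sketch below evaluates uᵀAv for small hand-picked matrices and, for the 2×2 case only, tests positive-definiteness with Sylvester's criterion (positive leading minors); the matrices and helper names are assumptions of this example.

```python
def form(u, A, v):
    # The bilinear form u^T A v.
    return sum(u[i] * A[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def is_spd_2x2(A):
    # A symmetric 2x2 matrix is positive-definite iff A[0][0] > 0 and det(A) > 0.
    symmetric = A[0][1] == A[1][0]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return symmetric and A[0][0] > 0 and det > 0

A_good = [[2.0, 1.0], [1.0, 3.0]]    # leading minors 2 and 5: positive-definite
A_bad = [[1.0, 2.0], [2.0, 1.0]]     # symmetric, but eigenvalues are 3 and -1

v = [1.0, -1.0]
print(is_spd_2x2(A_good), form(v, A_good, v))   # True 3.0
print(is_spd_2x2(A_bad), form(v, A_bad, v))     # False -2.0: a negative "squared length"

# Minkowski form with signature (+, -, -, -): a non-zero vector of "length" zero.
eta = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]
light = [1.0, 1.0, 0.0, 0.0]
print(form(light, eta, light))                  # 0.0
```

A negative or zero value of ⟨v, v⟩ at a non-zero v is all it takes to disqualify a form: the induced "norm" √⟨v, v⟩ would be imaginary or would vanish off the origin.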

Frequently asked questions

What's the difference between an inner product space and a normed space?

Every inner product induces a norm via ‖v‖ = √⟨v,v⟩, but not every norm comes from an inner product. The test is the parallelogram law: ‖u+v‖² + ‖u−v‖² = 2‖u‖² + 2‖v‖². The Euclidean norm satisfies it; the ℓ¹ taxicab norm and ℓ^∞ max norm do not, so they're normed but not inner-product spaces.

Why does the Cauchy-Schwarz inequality hold for every inner product?

Consider the non-negative quantity ⟨u−tv, u−tv⟩ ≥ 0 for any real t. Expanding gives ‖u‖² − 2t⟨u,v⟩ + t²‖v‖² ≥ 0, a quadratic in t that is non-negative for all t. Its discriminant must therefore be non-positive: 4⟨u,v⟩² − 4‖u‖²‖v‖² ≤ 0, which rearranges to |⟨u,v⟩| ≤ ‖u‖·‖v‖.

Is the dot product the only inner product on ℝⁿ?

No. Any symmetric positive-definite matrix A defines a valid inner product ⟨u,v⟩_A = uᵀAv. The standard dot product corresponds to A = I. Weighted inner products (diagonal A with positive entries) appear in statistics, where coordinates have different variances. The Minkowski form of relativity has the same uᵀAv shape but fails positive-definiteness, so it gives spacetime a non-Euclidean geometry rather than an inner product in this sense.

What makes a Hilbert space different from a generic inner product space?

Completeness. A Hilbert space is an inner product space in which every Cauchy sequence converges to a limit inside the space. Finite-dimensional inner product spaces are automatically complete (and so automatically Hilbert), but infinite-dimensional ones — like the space of continuous functions with the L² inner product — usually aren't. You must complete them by adding limits, producing L²([a,b]) from C([a,b]).

What does ⟨f, g⟩ = ∫ f(x)g(x) dx actually compute?

It's the function-space analogue of the dot product. Two functions are orthogonal when their integrated product is zero. This is exactly why the functions sin(nx) and cos(mx), for integers n, m ≥ 1, form an orthogonal system on [0, 2π] (orthonormal once each is scaled by 1/√π) and why Fourier series work. The L² inner product turns "how similar are these signals?" into a geometric question.
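
These integrals are simple to approximate. The sketch below estimates the L² inner product on [0, 2π] with a midpoint rule, which is a numerical stand-in chosen for this example rather than an exact integral; l2_inner, sin_n, and cos_m are illustrative names.

```python
import math

def l2_inner(f, g, a=0.0, b=2 * math.pi, n=2000):
    # Midpoint-rule approximation of the integral of f(x) g(x) over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n))

def sin_n(n):
    return lambda x: math.sin(n * x)

def cos_m(m):
    return lambda x: math.cos(m * x)

cross = l2_inner(sin_n(2), cos_m(3))          # sine against cosine
same_family = l2_inner(sin_n(2), sin_n(5))    # two different sines
norm_sq = l2_inner(sin_n(2), sin_n(2))        # squared length of sin(2x)

print(cross, same_family)                     # both approximately 0
print(norm_sq, math.pi)                       # approximately equal
```

The squared length comes out as π rather than 1, which is why each function needs the factor 1/√π before the system is orthonormal.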

Why does quantum mechanics live on Hilbert spaces?

Quantum states are unit vectors in a complex Hilbert space, and the inner product ⟨ψ|φ⟩ encodes probability amplitudes: |⟨ψ|φ⟩|² is the probability that a system prepared in state φ is found in state ψ. Orthogonality (⟨ψ|φ⟩ = 0) corresponds to perfectly distinguishable states, and an orthonormal basis corresponds to the mutually exclusive outcomes of a single measurement.
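
For a single qubit this is a two-dimensional computation. The sketch below assumes the usual computational basis states and their equal superpositions, and follows the physics convention in which the bra slot is the conjugated one; braket and prob are illustrative names.

```python
import math

def braket(psi, phi):
    # <psi|phi>: conjugate the first (bra) slot, as in Dirac notation.
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def prob(psi, phi):
    return abs(braket(psi, phi)) ** 2

s = 1 / math.sqrt(2)
zero = [1 + 0j, 0j]
one = [0j, 1 + 0j]
plus = [s + 0j, s + 0j]
minus = [s + 0j, -s + 0j]

print(prob(zero, plus))                     # about 0.5
print(prob(plus, minus))                    # 0.0: orthogonal, perfectly distinguishable
print(prob(zero, plus) + prob(one, plus))   # about 1.0: the basis outcomes exhaust the possibilities
```

The probabilities over an orthonormal basis summing to 1 is Parseval's identity applied to a unit vector.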