Linear Algebra
Inner Product Space
Where geometry comes from algebra
An inner product space is a vector space equipped with an inner product ⟨·,·⟩: a bilinear (over ℂ, sesquilinear), conjugate-symmetric, positive-definite pairing of vectors that produces a scalar. From this one structure we get lengths, angles, orthogonality, projections, and the geometric backbone of Hilbert spaces, Fourier analysis, and quantum mechanics.
- Field: ℝ or ℂ
- Induced norm: ‖v‖ = √⟨v,v⟩
- Cauchy-Schwarz: |⟨u,v⟩| ≤ ‖u‖·‖v‖
- Orthogonality: ⟨u,v⟩ = 0
- Complete ⇒ Hilbert space
The four axioms
An inner product space is a pair (V, ⟨·,·⟩) where V is a vector space over the field 𝔽 (either ℝ or ℂ) and ⟨·,·⟩ : V × V → 𝔽 is a map satisfying four axioms for all u, v, w ∈ V and all scalars α ∈ 𝔽:
- Linearity in the first argument: ⟨αu + w, v⟩ = α⟨u, v⟩ + ⟨w, v⟩.
- Conjugate symmetry: ⟨u, v⟩ is the complex conjugate of ⟨v, u⟩. Over ℝ the conjugate does nothing and the pairing is plainly symmetric; over ℂ the conjugate is essential.
- Positive-definiteness: ⟨v, v⟩ ≥ 0, with equality iff v = 0. This is what guarantees a real, non-negative length.
- Non-degeneracy: if ⟨v, w⟩ = 0 for all w, then v = 0. This is implied by positive-definiteness (take w = v), so it is a consequence rather than an independent requirement.
Combining axioms 1 and 2 gives conjugate-linearity in the second slot: ⟨v, αw⟩ = ᾱ⟨v, w⟩. The pairing is then called sesquilinear over ℂ. Over ℝ it is plainly bilinear.
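The axioms can be spot-checked numerically. The sketch below assumes the standard Hermitian pairing on ℂ² and uses illustrative vectors; the helper names `inner` and `close` are not from the text, and a numerical check on a few vectors is a sanity test, not a proof.

```python
# Standard inner product on C^n: linear in the first slot,
# conjugate-linear in the second (the convention used in this article).
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

def close(a, b, tol=1e-12):
    return abs(a - b) <= tol

# Illustrative vectors and scalar (assumptions, chosen arbitrarily).
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]
w = [0.5j, -1 + 0j]
alpha = 2 - 3j

# Axiom 1: linearity in the first argument.
lin_first = close(
    inner([alpha * a + c for a, c in zip(u, w)], v),
    alpha * inner(u, v) + inner(w, v),
)

# Axiom 2: conjugate symmetry.
conj_sym = close(inner(u, v), inner(v, u).conjugate())

# Axiom 3: <v, v> is real and strictly positive for v != 0.
vv = inner(v, v)
pos_def = close(vv.imag, 0.0) and vv.real > 0

# Consequence: conjugate-linearity in the second argument.
conj_lin_second = close(
    inner(v, [alpha * c for c in w]),
    alpha.conjugate() * inner(v, w),
)

print(lin_first, conj_sym, pos_def, conj_lin_second)
```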
From algebra to geometry — what the pairing buys you
The four axioms look modest. Their consequences are not. Once you have an inner product, you immediately recover the entire toolbox of Euclidean geometry:
- Length — define ‖v‖ ≔ √⟨v, v⟩. Positive-definiteness makes the square root real and the length zero only at the zero vector.
- Angle — for non-zero u, v in a real inner product space, the Cauchy-Schwarz inequality bounds ⟨u, v⟩ / (‖u‖·‖v‖) within [−1, 1], so we can define θ ∈ [0, π] by cos θ = ⟨u, v⟩ / (‖u‖·‖v‖).
- Orthogonality — write u ⊥ v when ⟨u, v⟩ = 0. The Pythagorean theorem follows directly: ‖u + v‖² = ‖u‖² + ‖v‖² whenever u ⊥ v.
- Projection — the projection of u onto span(v) is proj_v(u) = (⟨u, v⟩ / ⟨v, v⟩) v. Subtracting it gives the orthogonal residual.
- Orthonormal bases — a basis e₁, …, eₙ with ⟨eᵢ, eⱼ⟩ = δᵢⱼ; coordinates become inner products: v = Σ ⟨v, eᵢ⟩ eᵢ.
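The toolbox above can be sketched in a few lines for ℝⁿ with the dot product. The function names `norm`, `angle`, and `project` are illustrative choices, and the vectors are picked so the arithmetic is easy to follow by hand.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(inner(v, v))

def angle(u, v):
    # Cauchy-Schwarz keeps the ratio in [-1, 1]; clamp to absorb rounding.
    c = inner(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))

def project(u, v):
    # Projection of u onto span(v); v must be non-zero.
    c = inner(u, v) / inner(v, v)
    return [c * x for x in v]

u = [3.0, 4.0]
v = [1.0, 0.0]

p = project(u, v)                    # component of u along v
r = [a - b for a, b in zip(u, p)]    # orthogonal residual

print(norm(u), angle(u, v), p, r)
print(inner(r, v))                            # residual is orthogonal to v
print(norm(p) ** 2 + norm(r) ** 2, norm(u) ** 2)  # Pythagorean theorem
```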
Proof of the Cauchy-Schwarz inequality
For all u, v in an inner product space, |⟨u, v⟩| ≤ ‖u‖ · ‖v‖, with equality iff u, v are linearly dependent.
Proof (real case). If v = 0 the inequality reads 0 ≤ 0; assume v ≠ 0. Define the function f(t) = ⟨u − tv, u − tv⟩ for t ∈ ℝ. By positive-definiteness, f(t) ≥ 0 for every t. Expanding:
f(t) = ‖u‖² − 2t⟨u, v⟩ + t²‖v‖² ≥ 0.
This is a quadratic in t with leading coefficient ‖v‖² > 0 that is everywhere non-negative. A non-negative quadratic has non-positive discriminant:
Δ = (2⟨u, v⟩)² − 4‖v‖²‖u‖² ≤ 0 ⟹ ⟨u, v⟩² ≤ ‖u‖²‖v‖².
Take square roots and you have |⟨u, v⟩| ≤ ‖u‖·‖v‖. Equality forces Δ = 0, meaning the quadratic has a (double) root t₀ with f(t₀) = ‖u − t₀v‖² = 0, so u = t₀v and the two vectors are linearly dependent. In the complex case, substitute t = ⟨u, v⟩ / ‖v‖² directly: expanding ⟨u − tv, u − tv⟩ ≥ 0 with the conjugates in place gives ‖u‖² − |⟨u, v⟩|² / ‖v‖² ≥ 0, which is the same inequality.
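The proof can be traced on concrete numbers. This is a small sketch with illustrative real vectors: it builds the quadratic f(t), evaluates its discriminant, and confirms that the minimum of f stays non-negative.

```python
def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

# Illustrative vectors (assumptions).
u = [1.0, 2.0, 3.0]
v = [4.0, -1.0, 2.0]

uu, uv, vv = inner(u, u), inner(u, v), inner(v, v)

def f(t):
    # f(t) = <u - t v, u - t v> = ||u||^2 - 2 t <u, v> + t^2 ||v||^2
    return uu - 2 * t * uv + t * t * vv

discriminant = (2 * uv) ** 2 - 4 * vv * uu
t_min = uv / vv  # vertex of the parabola

print(discriminant)      # non-positive, as the proof requires
print(f(t_min))          # smallest value of f, still >= 0
print(uv ** 2, uu * vv)  # <u, v>^2 versus ||u||^2 ||v||^2
```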
The triangle inequality, almost for free
From Cauchy-Schwarz, the induced norm automatically satisfies the triangle inequality. Compute:
‖u + v‖² = ⟨u + v, u + v⟩ = ‖u‖² + 2 Re⟨u, v⟩ + ‖v‖²
≤ ‖u‖² + 2|⟨u, v⟩| + ‖v‖²
≤ ‖u‖² + 2‖u‖·‖v‖ + ‖v‖²
= (‖u‖ + ‖v‖)².
Square-rooting: ‖u + v‖ ≤ ‖u‖ + ‖v‖. So an inner product not only gives a length, it gives a length that obeys the geometric intuition that detours are longer than direct paths.
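Each line of the chain above can be evaluated for a concrete pair of complex vectors. The sketch below assumes the Hermitian pairing on ℂ² and illustrative vectors; the names `step0` through `step4` simply label the successive lines of the derivation.

```python
import math

def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(v):
    return math.sqrt(inner(v, v).real)

# Illustrative vectors (assumptions).
u = [1 + 1j, 2 + 0j]
v = [-1j, 1 + 2j]
s = [a + b for a, b in zip(u, v)]

step0 = norm(s) ** 2                                           # ||u + v||^2
step1 = norm(u) ** 2 + 2 * inner(u, v).real + norm(v) ** 2     # expansion
step2 = norm(u) ** 2 + 2 * abs(inner(u, v)) + norm(v) ** 2     # Re z <= |z|
step3 = norm(u) ** 2 + 2 * norm(u) * norm(v) + norm(v) ** 2    # Cauchy-Schwarz
step4 = (norm(u) + norm(v)) ** 2                               # perfect square

print(step0, step1, step2, step3, step4)
```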
Inner product vs normed vs metric spaces
| | Inner product space | Normed space | Metric space |
|---|---|---|---|
| Primitive structure | ⟨u, v⟩ ∈ 𝔽 | ‖v‖ ∈ ℝ≥0 | d(x, y) ∈ ℝ≥0 |
| Defines lengths | Yes (‖v‖ = √⟨v,v⟩) | Yes (primitive) | No; only distances d(x, y), since a general metric space has no origin |
| Defines angles | Yes (cos θ = ⟨u,v⟩/(‖u‖‖v‖)) | No | No |
| Defines orthogonality | Yes (⟨u, v⟩ = 0) | No general notion | No |
| Parallelogram law holds | Always | Iff norm comes from an inner product | N/A |
| Required vector-space structure | Yes | Yes | No (any set will do) |
| Strength ordering | Strongest (richest) | Middle | Weakest (most general) |
| Canonical complete example | L²(ℝ), ℓ² | L^p, ℓ^p (p ≠ 2) | ℝ with d(x, y) = \|x − y\|; any set with the discrete metric |
The hierarchy is strict: every inner product space is a normed space (use the induced norm), and every normed space is a metric space (use d(x, y) = ‖x − y‖). Going the other way fails — see the next section.
ℓᵖ vs Lᵖ norms — which come from an inner product?
| Norm | Formula | Comes from inner product? | Why / why not |
|---|---|---|---|
| ℓ¹ | Σ \|xᵢ\| | No | Fails the parallelogram law: with x = (1,0), y = (0,1), LHS = 4 + 4 = 8 but RHS = 2·1 + 2·1 = 4. |
| ℓ² | √Σ \|xᵢ\|² | Yes | Inner product is the standard dot product Σ xᵢ ȳᵢ. The unique p for which ℓᵖ is Hilbert. |
| ℓᵖ (1 < p < ∞, p ≠ 2) | (Σ \|xᵢ\|ᵖ)^{1/p} | No | Parallelogram law fails for any p ≠ 2 (in dimension ≥ 2); ℓᵖ is uniformly convex but not Hilbert. |
| ℓ^∞ | sup \|xᵢ\| | No | Unit ball is a cube, not round; with x = (1,0), y = (0,1), LHS = 1 + 1 = 2 but RHS = 2 + 2 = 4. |
| L¹([a,b]) | ∫\|f\| | No | Same parallelogram failure as ℓ¹. |
| L²([a,b]) | √∫\|f\|² | Yes | Inner product ⟨f, g⟩ = ∫ f(x) g(x) dx (conjugate g(x) for complex-valued functions). The setting for Fourier series. |
| Lᵖ([a,b]) (p ≠ 2) | (∫\|f\|ᵖ)^{1/p} | No | Banach but not Hilbert. |
Jordan-von Neumann theorem (1935). A norm comes from an inner product if and only if it satisfies the parallelogram law ‖u+v‖² + ‖u−v‖² = 2‖u‖² + 2‖v‖². This is the cleanest test: any single counter-example pair (u, v) rules out the inner product origin instantly.
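The parallelogram test is easy to run by hand or by machine. The sketch below computes LHS − RHS for one pair of vectors under several p-norms; `p_norm` and `parallelogram_gap` are illustrative names. Note the asymmetry: a non-zero gap for a single pair rules out an inner product, while a zero gap for one pair proves nothing on its own.

```python
def p_norm(x, p):
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def parallelogram_gap(u, v, p):
    # LHS - RHS of ||u+v||^2 + ||u-v||^2 = 2||u||^2 + 2||v||^2 in the p-norm.
    s = [a + b for a, b in zip(u, v)]
    d = [a - b for a, b in zip(u, v)]
    lhs = p_norm(s, p) ** 2 + p_norm(d, p) ** 2
    rhs = 2 * p_norm(u, p) ** 2 + 2 * p_norm(v, p) ** 2
    return lhs - rhs

u = [1.0, 0.0]
v = [0.0, 1.0]

for p in (1, 2, 3, float("inf")):
    print(p, parallelogram_gap(u, v, p))
```

For this pair the gap is positive for p = 1, zero (up to rounding) for p = 2, and negative for p = 3 and p = ∞.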
Canonical examples
- ℝⁿ with the dot product. ⟨u, v⟩ = u₁v₁ + … + uₙvₙ. The induced norm is the Euclidean length and angles match high school trigonometry.
- ℂⁿ with the Hermitian dot product. ⟨u, v⟩ = Σ uᵢ v̄ᵢ. The conjugate is essential: without it ⟨v, v⟩ wouldn't be real.
- ℓ², the space of square-summable sequences. All real (or complex) sequences (xₙ) with Σ|xₙ|² < ∞, with ⟨x, y⟩ = Σ xₙ ȳₙ. The prototypical infinite-dimensional Hilbert space.
- L²([a, b]). Square-integrable functions with ⟨f, g⟩ = ∫ₐᵇ f(x) g(x) dx (conjugate g(x) for complex-valued functions). On an interval of length 2π, the orthogonal system {1, cos x, sin x, cos 2x, sin 2x, …}, suitably normalized, gives the Fourier expansion of any L² function.
- Matrices with the Frobenius inner product. ⟨A, B⟩ = tr(B*A) = Σᵢⱼ Aᵢⱼ B̄ᵢⱼ, which over ℝ reduces to tr(AᵀB) = Σᵢⱼ Aᵢⱼ Bᵢⱼ. (Texts that take the inner product to be linear in the second argument write tr(A*B) instead.) The induced norm ‖A‖_F is the entrywise ℓ² norm.
- Polynomials of degree ≤ n with ⟨p, q⟩ = ∫₋₁¹ p(x) q(x) dx. Gram-Schmidt on 1, x, x², … gives the Legendre polynomials.
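The last example can be carried out exactly. The sketch below represents a polynomial by its coefficient list [c₀, c₁, …], uses the fact that ∫₋₁¹ xᵏ dx is 2/(k+1) for even k and 0 for odd k, and runs Gram-Schmidt on 1, x, x², x³ with exact rational arithmetic. The rescaling to p(1) = 1 is the usual Legendre normalization; the function names are illustrative.

```python
from fractions import Fraction

def inner(p, q):
    # <p, q> = integral of p(x) q(x) over [-1, 1], computed term by term.
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                total += a * b * Fraction(2, i + j + 1)
    return total

def gram_schmidt(basis):
    ortho = []
    for b in basis:
        r = list(b)
        for e in ortho:
            c = inner(b, e) / inner(e, e)  # projection coefficient onto e
            r = [x - c * y for x, y in zip(r, e)]
        ortho.append(r)
    return ortho

n = 3
monomials = [[Fraction(int(i == k)) for i in range(n + 1)] for k in range(n + 1)]
ortho = gram_schmidt(monomials)

# p(1) is the sum of the coefficients; rescale so that p(1) = 1.
legendre = [[c / sum(p) for c in p] for p in ortho]

for p in legendre:
    print([str(c) for c in p])
```

The third and fourth rows are the coefficients of (3x² − 1)/2 and (5x³ − 3x)/2.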
Why it matters in physics, statistics, and analysis
- Quantum mechanics. The state space of a quantum system is a complex Hilbert space ℋ. A unit vector |ψ⟩ ∈ ℋ encodes a state; the inner product ⟨ψ|φ⟩ encodes a probability amplitude; |⟨ψ|φ⟩|² is the probability that a system prepared in state |ψ⟩ is found in state |φ⟩ upon measurement. The Schrödinger, Heisenberg, and interaction pictures are all built on this single primitive.
- Fourier analysis. Decomposing a periodic function into sines and cosines is exactly the projection of f ∈ L²([0, 2π]) onto an orthonormal basis. The Fourier coefficient ĉₙ = ⟨f, eⁱⁿˣ⟩/(2π) is an inner product.
- Statistics and least squares. Linear regression minimizes ‖y − Xβ‖² in an inner product space; the fitted vector Xβ̂ is the orthogonal projection of y onto the column space of X, so the residual y − Xβ̂ is orthogonal to every column of X. Correlation coefficients are normalized inner products of mean-centered variables.
- Machine learning. The kernel trick replaces explicit feature maps φ(x) with kernels K(x, y) = ⟨φ(x), φ(y)⟩, computing inner products in (sometimes infinite-dimensional) feature spaces without ever materializing the features.
- Numerical analysis. Conjugate-gradient and Krylov methods rely on A-orthogonality: ⟨u, v⟩_A = uᵀAv with A symmetric positive-definite.
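As one concrete instance, the least-squares bullet can be checked on a toy data set. The sketch below fits a line with an intercept by solving the 2×2 normal equations, whose entries are themselves inner products, and then confirms that the residual is orthogonal to both columns of the design matrix. The data values are made up for illustration.

```python
def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

# Columns of the design matrix X: a constant column and the predictor.
ones = [1.0, 1.0, 1.0, 1.0]
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 5.0]  # illustrative responses (an assumption)

# Normal equations (X^T X) beta = X^T y.
a11, a12, a22 = inner(ones, ones), inner(ones, x), inner(x, x)
b1, b2 = inner(ones, y), inner(x, y)

det = a11 * a22 - a12 * a12
beta0 = (a22 * b1 - a12 * b2) / det  # intercept
beta1 = (a11 * b2 - a12 * b1) / det  # slope

fitted = [beta0 + beta1 * t for t in x]            # projection of y onto col(X)
residual = [yi - fi for yi, fi in zip(y, fitted)]

print(beta0, beta1)
print(inner(residual, ones), inner(residual, x))   # both zero up to rounding
```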
The parallelogram and polarization identities
Two identities tie norms to inner products. For any inner product space:
Parallelogram law: ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².
Polarization (real case): ⟨u, v⟩ = ¼ (‖u + v‖² − ‖u − v‖²).
Polarization (complex case): ⟨u, v⟩ = ¼ Σ_{k=0}^{3} iᵏ ‖u + iᵏv‖².
The parallelogram identity is geometric: in a flat Euclidean plane, the sum of the squared diagonals of a parallelogram equals twice the sum of the squared sides. Polarization is striking: it says the inner product is fully recoverable from the norm. So the norm carries all the inner-product data, but only when the parallelogram law holds.
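The complex polarization identity can be verified directly: compute four squared norms and recombine them. The sketch assumes the Hermitian pairing on ℂ² (linear in the first slot) and illustrative vectors; `polarize` is a hypothetical helper name.

```python
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(v):
    return inner(v, v).real

def polarize(u, v):
    # <u, v> = (1/4) * sum over k of i^k * ||u + i^k v||^2, k = 0..3
    total = 0j
    for ik in (1, 1j, -1, -1j):
        total += ik * norm_sq([a + ik * b for a, b in zip(u, v)])
    return total / 4

# Illustrative vectors (assumptions).
u = [1 + 1j, 2 + 0j]
v = [-1j, 1 + 2j]

print(inner(u, v))     # computed from the pairing
print(polarize(u, v))  # recovered from norms alone
```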
Bessel's inequality and Parseval's identity
Suppose {e₁, e₂, …} is an orthonormal sequence in an inner product space V. For any v ∈ V:
Bessel: Σₙ |⟨v, eₙ⟩|² ≤ ‖v‖².
That is, the energy in the projections never exceeds the energy of v. If the orthonormal system is also complete (its span is dense in V), Bessel becomes an equality:
Parseval: Σₙ |⟨v, eₙ⟩|² = ‖v‖².
Parseval is the Pythagorean theorem in infinite dimensions and the conservation-of-energy statement of Fourier analysis: the L² norm of a signal equals the ℓ² norm of its Fourier coefficients.
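A finite-dimensional sketch makes the difference visible. In ℝ³, two orthonormal vectors form an incomplete system and give a strict Bessel inequality; adding a third completes the basis and gives Parseval's equality. The vectors below are illustrative.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

r = 1 / math.sqrt(2)
e1 = [r, r, 0.0]
e2 = [r, -r, 0.0]
e3 = [0.0, 0.0, 1.0]

v = [3.0, 1.0, 2.0]
energy = inner(v, v)  # ||v||^2

# Incomplete orthonormal system {e1, e2}: Bessel's inequality.
partial = sum(inner(v, e) ** 2 for e in (e1, e2))

# Complete orthonormal basis {e1, e2, e3}: Parseval's identity.
full = sum(inner(v, e) ** 2 for e in (e1, e2, e3))

print(partial, energy)  # partial sum falls short of ||v||^2
print(full, energy)     # equal up to rounding
```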
Common mistakes
- Assuming every norm comes from an inner product. Among the ℓᵖ and Lᵖ norms, only p = 2 does. The taxicab (ℓ¹) and supremum (ℓ^∞) norms are perfectly valid norms but generate no inner product, no angles, no orthogonality.
- Forgetting the conjugate over ℂ. Without it, ⟨v, v⟩ is generally complex and you can't take a real square root. The conjugate is what makes positive-definiteness make sense over ℂ.
- Treating "inner product space" and "Hilbert space" as synonyms. Every Hilbert space is an inner product space, but the converse fails in infinite dimensions: completeness is an extra hypothesis, and most natural pre-Hilbert constructions (continuous functions with the L² inner product) are not complete.
- Using indefinite symmetric forms as inner products. The Minkowski form ⟨u, v⟩_M = u₀v₀ − u₁v₁ − u₂v₂ − u₃v₃ on Minkowski spacetime is symmetric but indefinite (some non-zero vectors have ⟨v, v⟩ ≤ 0). It is an "inner product" in the indefinite-form sense of relativity, not the positive-definite sense of this article.
- Forgetting positive-definiteness when constructing weighted inner products. uᵀAv is an inner product iff A is symmetric and positive-definite. If A has a zero or negative eigenvalue, you get a degenerate or indefinite form, not an inner product.
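The last two mistakes can be illustrated with the same helper. The sketch below evaluates uᵀAv for a symmetric positive-definite 2×2 matrix and for the Minkowski matrix diag(1, −1, −1, −1); the sample vectors are illustrative, and checking a handful of vectors only suggests positive-definiteness rather than proving it.

```python
def form(u, A, v):
    # Evaluate u^T A v for a square matrix A given as nested lists.
    return sum(u[i] * A[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

# Symmetric positive-definite: leading minors are 2 > 0 and 2*2 - 1*1 = 3 > 0.
A_spd = [[2.0, 1.0], [1.0, 2.0]]
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [-3.0, 2.0]]
spd_values = [form(s, A_spd, s) for s in samples]

# Minkowski form: symmetric but indefinite.
eta = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]
lightlike = [1.0, 1.0, 0.0, 0.0]
spacelike = [0.0, 1.0, 0.0, 0.0]

print(spd_values)                        # all strictly positive
print(form(lightlike, eta, lightlike))   # zero for a non-zero vector
print(form(spacelike, eta, spacelike))   # negative
```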
Frequently asked questions
What's the difference between an inner product space and a normed space?
Every inner product induces a norm via ‖v‖ = √⟨v,v⟩, but not every norm comes from an inner product. The test is the parallelogram law: ‖u+v‖² + ‖u−v‖² = 2‖u‖² + 2‖v‖². The Euclidean norm satisfies it; the ℓ¹ taxicab norm and ℓ^∞ max norm do not, so they're normed but not inner-product spaces.
Why does the Cauchy-Schwarz inequality hold for every inner product?
Consider the non-negative quantity ⟨u−tv, u−tv⟩ ≥ 0 for any real t. Expanding gives ‖u‖² − 2t⟨u,v⟩ + t²‖v‖² ≥ 0, a quadratic in t that is non-negative for all t. Its discriminant must therefore be non-positive: 4⟨u,v⟩² − 4‖u‖²‖v‖² ≤ 0, which rearranges to |⟨u,v⟩| ≤ ‖u‖·‖v‖.
Is the dot product the only inner product on ℝⁿ?
No. Any symmetric positive-definite matrix A defines a valid inner product ⟨u,v⟩_A = uᵀAv. The standard dot product corresponds to A = I. Weighted inner products (diagonal A with positive entries) appear in statistics, where coordinates have different variances. By contrast, the Minkowski form of relativity fails positive-definiteness, so it gives spacetime its non-Euclidean geometry without being an inner product in this sense.
What makes a Hilbert space different from a generic inner product space?
Completeness. A Hilbert space is an inner product space in which every Cauchy sequence converges to a limit inside the space. Finite-dimensional inner product spaces are automatically complete (and so automatically Hilbert), but infinite-dimensional ones — like the space of continuous functions with the L² inner product — usually aren't. You must complete them by adding limits, producing L²([a,b]) from C([a,b]).
What does ⟨f, g⟩ = ∫ f(x)g(x) dx actually compute?
It's the function-space analogue of the dot product. Two functions are orthogonal when their integrated product is zero. This is exactly why the functions sin(nx) and cos(mx) with n, m ≥ 1 form an orthogonal system on [0, 2π] (orthonormal once each is divided by its norm √π) and why Fourier series work. The L² inner product turns "how similar are these signals?" into a geometric question.
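The integral can be approximated numerically to see the orthogonality. The sketch below uses a midpoint rule on [0, 2π]; `l2_inner` is an illustrative name and the number of sample points is an arbitrary choice. It also shows that ∫ sin²(2x) dx is π rather than 1, so these functions need rescaling by 1/√π to have unit norm.

```python
import math

def l2_inner(f, g, a=0.0, b=2 * math.pi, n=2000):
    # Midpoint-rule approximation of the integral of f(x) g(x) over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n))

def sin2(x):
    return math.sin(2 * x)

def sin3(x):
    return math.sin(3 * x)

def cos3(x):
    return math.cos(3 * x)

print(l2_inner(sin2, cos3))  # approximately 0: orthogonal
print(l2_inner(sin2, sin3))  # approximately 0: orthogonal
print(l2_inner(sin2, sin2))  # approximately pi: squared norm
```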
Why does quantum mechanics live on Hilbert spaces?
Quantum states are unit vectors in a complex Hilbert space, and the inner product ⟨ψ|φ⟩ encodes probability amplitudes: |⟨ψ|φ⟩|² is the probability that a system prepared in state ψ is found in state φ. Orthogonality (⟨ψ|φ⟩ = 0) corresponds to perfectly distinguishable states, and an orthonormal basis corresponds to the mutually exclusive outcomes of a single measurement.