Linear Algebra

Spectral Theorem

Every Hermitian (or real symmetric) matrix has an orthonormal eigenbasis with real eigenvalues

The spectral theorem: every Hermitian matrix H ∈ ℂ^(n×n) (i.e. H = H*) has an orthonormal basis of eigenvectors with real eigenvalues — equivalently, H = UDU* where U is unitary and D is real diagonal. For real symmetric matrices, U is orthogonal (U^T = U⁻¹). Generalizes to compact self-adjoint operators on Hilbert spaces (Hilbert-Schmidt theorem, 1907) and to bounded normal operators (von Neumann 1929). The theorem underlies Principal Component Analysis (eigendecomposition of covariance matrices), the discrete Fourier transform (eigenvalues of circulant matrices = DFT coefficients), and quantum mechanics (every observable is Hermitian, eigenvalues = measured values).

  • StatementH = UDU* (U unitary, D real diagonal)
  • Real symmetricorthogonal diagonalization
  • Eigenvaluesalways real
  • Generalizationcompact self-adjoint operators (Hilbert)
  • ApplicationPCA, DFT, QM
  • Min-maxCourant-Fischer

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

Why spectral theorem matters

  • Principal Component Analysis. Covariance matrix is real symmetric — spectral theorem guarantees orthonormal principal components ranked by variance. Without it, "uncorrelated directions" might not exist; PCA's geometric guarantee comes directly from the theorem.
  • Quantum mechanics. Observables are Hermitian operators; spectral theorem says they have real eigenvalues (the only possible measurement outcomes) and orthonormal eigenstates (the post-measurement states). The whole measurement formalism rests on this result.
  • Vibration analysis. Mass-stiffness systems M ẍ + K x = 0 with M, K symmetric positive-definite have natural frequencies = square roots of generalized eigenvalues. The spectral theorem says the modes are M-orthogonal — they decouple into independent oscillators.
  • Signal processing. Circulant matrices (constant on diagonals modulo n) are diagonalized by the discrete Fourier transform: their eigenvalues are exactly the DFT of the first row. This is why convolution becomes multiplication in the frequency domain.
  • Optimization. Quadratic forms x^T A x with A symmetric have a maximum on the unit sphere equal to the largest eigenvalue (Courant-Fischer min-max theorem). Newton's method needs the Hessian's eigenvalues to determine convergence — symmetric Hessian + spectral theorem give the analysis.
  • Statistics. Multivariate Gaussian's density depends on Σ^(−1) where Σ is the covariance — diagonalize Σ via the spectral theorem to get independent standard normals; this is how Mahalanobis distance and whitening transforms work.
  • Numerical linear algebra. Symmetric eigenvalue problems are easier than general ones — Lanczos algorithm, divide-and-conquer, MRRR all exploit the orthogonality guarantee from the spectral theorem to achieve accurate decompositions in O(n³) time with backward stability.

Common misconceptions

  • All matrices diagonalize. No — only normal matrices (those with AA* = A*A) are unitarily diagonalizable. Hermitian and unitary matrices are normal; general matrices are not. The Jordan canonical form handles non-normal matrices but with non-orthogonal bases and possibly off-diagonal 1s.
  • Real eigenvalues = real eigenvectors. A complex Hermitian matrix has real eigenvalues but eigenvectors generally lie in ℂⁿ. Only real symmetric matrices guarantee real eigenvectors. Example: i·[[0,1],[−1,0]] is Hermitian-like but the eigenvectors are (1, ±i)/√2.
  • Spectral = eigen. "Spectrum" is the set of eigenvalues for matrices, but extends to "values for which (A − λI) fails to be invertible" for unbounded operators. Continuous spectrum exists for some operators on infinite-dim Hilbert spaces (e.g. multiplication by x on L²(ℝ)) where there are no eigenvectors at all but plenty of "spectral values."
  • Spectral theorem is one statement. It's a family — finite-dim Hermitian, compact self-adjoint, bounded normal, unbounded self-adjoint. Each requires more sophisticated tools (projection-valued measures for bounded; Stone's theorem for unbounded). The infinite-dimensional cases were a major project of 20th-century analysis.
  • Symmetric ≠ Hermitian for complex. A complex symmetric matrix (A = A^T without conjugation) is generally not diagonalizable. The Hermitian condition A = A* (with conjugation) is what makes spectral theorem work over ℂ. Confusing the two is one of the most common bugs in numerical code.
  • Diagonalization requires distinct eigenvalues. Distinct eigenvalues guarantee diagonalizability (any matrix), but Hermitian matrices are diagonalizable even with repeated eigenvalues — the eigenspace for a repeated eigenvalue is multi-dimensional and Gram-Schmidt provides the orthonormal basis.

Worked example

Take A = [[2, 1], [1, 2]], a real symmetric matrix. Characteristic polynomial: (2−λ)² − 1 = λ² − 4λ + 3 = (λ−1)(λ−3). Eigenvalues: λ₁ = 1, λ₂ = 3 — both real, as the spectral theorem guarantees.

For λ₁ = 1: solve (A − I)v = 0, giving v₁ = (1, −1)/√2. For λ₂ = 3: solve (A − 3I)v = 0, giving v₂ = (1, 1)/√2. Check orthogonality: v₁·v₂ = (1·1 + (−1)·1)/2 = 0. The spectral decomposition is A = QDQ^T where Q has columns v₁, v₂ and D = diag(1, 3). Verify: QDQ^T = (1/2)[[1,1],[−1,1]] · diag(1,3) · (1/2)[[1,−1],[1,1]] = [[2,1],[1,2]]. The unitary U is just orthogonal Q here because A is real.

Application: x^T A x = 2x₁² + 2x₁x₂ + 2x₂² is a quadratic form. Substitute y = Q^T x: x^T A x = y^T D y = y₁² + 3y₂². Now the quadratic form is diagonal — a sum of independent squares. The level sets x^T A x = c are ellipses with axes along v₁, v₂ and semi-axes √c, √(c/3). This is what "diagonalizing a quadratic form" means geometrically.

Frequently asked questions

What does Hermitian mean exactly?

A complex matrix H is Hermitian if H = H*, where H* is the conjugate transpose: take the transpose, then complex-conjugate each entry. So H_ij = conj(H_ji). Diagonal entries must be real (since H_ii = conj(H_ii) means imaginary part is zero). Real symmetric matrices are the special case where all entries are real and H = H^T. Hermitian matrices are the matrix analog of real numbers — they correspond to real-valued quadratic forms x*Hx and to physical observables in quantum mechanics.

Why are eigenvalues of Hermitian matrices real?

Suppose Hv = λv with v nonzero. Then v*Hv = λ v*v = λ‖v‖². But also v*Hv = (Hv)*v / wait, more carefully: take conjugate transpose of Hv = λv to get v*H* = conj(λ) v*. Multiply on the right by v: v*H*v = conj(λ) ‖v‖². Since H* = H (Hermitian), v*Hv = conj(λ) ‖v‖². Combining: λ‖v‖² = conj(λ)‖v‖², so λ = conj(λ), meaning λ is real. The same argument works for any self-adjoint operator on an inner product space.

Why are eigenvectors orthogonal?

Suppose Hv = λv and Hw = μw with λ ≠ μ. Then v*Hw = μ v*w. But also v*Hw = (Hv)*w = λ* v*w = λ v*w (eigenvalues real). So λ v*w = μ v*w, meaning (λ−μ)v*w = 0. Since λ ≠ μ, v*w = 0 — orthogonal. For repeated eigenvalues, the eigenspace is multi-dimensional but you can apply Gram-Schmidt within it to get an orthonormal basis. The full result: an orthonormal basis of ℂⁿ consisting entirely of H-eigenvectors.

How does PCA use the spectral theorem?

Given data matrix X (rows = observations, columns = features), the covariance matrix C = X^T X / (n−1) is real symmetric. Spectral theorem gives C = QΛQ^T where Q is orthogonal, Λ diagonal with sorted eigenvalues. Columns of Q are orthonormal — the principal components. Largest eigenvalue = direction of maximum variance; smallest = direction of minimum variance. Project data onto top k columns of Q to get the best k-dimensional linear approximation in least-squares sense. Without spectral theorem there'd be no guarantee that orthonormal directions even exist.

How does it extend to operators on Hilbert space?

Three increasingly general versions. Compact self-adjoint operators on a Hilbert space have a discrete spectrum of real eigenvalues accumulating only at zero, with an orthonormal eigenbasis (Hilbert-Schmidt 1907). Bounded self-adjoint operators may have continuous spectrum; the eigendecomposition becomes an integral against a projection-valued measure (the spectral measure). Unbounded self-adjoint operators (like position and momentum in quantum mechanics) require careful domain considerations and use the same projection-valued-measure formalism (von Neumann 1929).

What is the singular value decomposition's relation?

SVD generalizes spectral decomposition to non-square, non-Hermitian matrices. For any A ∈ ℂ^(m×n): A = UΣV* where U, V are unitary and Σ has nonnegative real values (singular values) on the diagonal. The singular values are square roots of eigenvalues of A*A — and A*A is Hermitian positive-semidefinite, so the spectral theorem applies. Columns of V are eigenvectors of A*A; columns of U are eigenvectors of AA*. SVD = spectral theorem applied twice. For Hermitian A, SVD and spectral coincide up to sign of eigenvalues.