Linear Algebra

Rank–Nullity Theorem

The conservation law every linear map obeys

For any linear map T : V → W, dim(ker T) + dim(im T) = dim V. For matrices, rank + nullity equals the number of columns. The conservation law of dimension that pins down every linear transformation.

  • Statementdim(ker T) + dim(im T) = dim V
  • Matrix formrank(A) + nullity(A) = n (columns of A)
  • Computed viaRow-reduce to RREF — pivots count rank, free columns count nullity
  • CostO(m·n·min(m,n)) for an m×n matrix
  • Numerical rankCount of singular values σᵢ > σ₁ · 10⁻¹² (SVD-based)
  • Used inSolvability of Ax = b, degrees of freedom, system identification, control theory

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

What it says, in one sentence

Take any linear map T from a finite-dimensional vector space V into another vector space W. Look at two subspaces:

  • The kernel ker T = { v ∈ V : T(v) = 0 } — everything that gets crushed to zero.
  • The image im T = { T(v) : v ∈ V } — everything that comes out the other side.

The rank–nullity theorem says:

dim(ker T) + dim(im T) = dim V

Or in matrix language, for an m × n matrix A:

rank(A) + nullity(A) = n   (the number of columns)

The rank is dim(im A) — the dimension of the column space. The nullity is dim(ker A) — the dimension of the null space.

Why it is a conservation law

Pick any basis v₁, …, v_n of V. After T acts, each basis vector either goes to zero (and contributes to the kernel) or contributes a non-trivial direction in the image. There is no third option. Every input direction is accounted for, and no output direction appears that wasn't already in the image. The bookkeeping is exact — dim V dimensions go in, exactly dim V dimensions are accounted for as kernel-plus-image.

This is why a "wide" matrix (more columns than rows) always has a non-trivial kernel — rank ≤ m < n forces nullity ≥ n − m. And a "tall" matrix (more rows than columns) can have nullity zero, but its image cannot fill the codomain.

Worked example — a 3×5 matrix

Find rank and nullity of

A = [ 1   2   1   3   1 ]
    [ 2   4   3   8   3 ]
    [ 3   6   2   7   2 ]

Row-reduce. Subtract 2·R1 from R2, 3·R1 from R3:

[ 1   2   1   3   1 ]
[ 0   0   1   2   1 ]
[ 0   0  −1  −2  −1 ]

Add R2 to R3:

[ 1   2   1   3   1 ]
[ 0   0   1   2   1 ]
[ 0   0   0   0   0 ]

Clean up R1 ← R1 − R2 to get RREF:

[ 1   2   0   1   0 ]
[ 0   0   1   2   1 ]
[ 0   0   0   0   0 ]

Read off. Pivots sit in columns 1 and 3 — rank = 2. Free columns are 2, 4, 5 — nullity = 3. Check: 2 + 3 = 5 = number of columns. ✓

Basis of the kernel (set each free variable to 1 in turn and read the pivot rows):

[ −2,  1,  0,  0,  0 ]    (x₂ = 1)
[ −1,  0, −2,  1,  0 ]    (x₄ = 1)
[  0,  0, −1,  0,  1 ]    (x₅ = 1)

Three vectors, dimension 3 — matches the nullity. The image is spanned by columns 1 and 3 of the original A — two vectors, dimension 2, matches the rank.

Variants of the statement

  • Row rank = column rank. A separate theorem, but a corollary in disguise — rank(A) = rank(Aᵀ). Together with rank–nullity on A and Aᵀ, it pins down all four fundamental subspaces.
  • For linear maps between modules. Over a field, the statement is clean. Over a ring (Z-modules, for example), one must use the rank of a finitely generated module and torsion appears — the identity needs modification.
  • For Fredholm operators on infinite-dimensional Hilbert spaces. Kernel is finite-dimensional, image has finite codimension, and the integer-valued index dim(ker T) − codim(im T) replaces the naive identity. The index is invariant under compact perturbations — the keystone of Atiyah–Singer index theory.
  • For exact sequences. A short exact sequence 0 → A → B → C → 0 of finite-dimensional vector spaces gives dim B = dim A + dim C. Rank–nullity is the special case for one map T : V → W, written as 0 → ker T → V → im T → 0.

Algorithm — compute rank and nullity

function rankAndNullity(A):
    m, n = dimensions of A
    R    = row_reduce_to_RREF(A)
    pivot_cols = []
    j = 0
    for i in 0 .. m-1:
        while j < n and R[i][j] == 0:
            j += 1
        if j == n: break
        pivot_cols.append(j)
        j += 1
    rank    = length(pivot_cols)
    nullity = n - rank
    return rank, nullity

Row reduction is O(m·n·min(m,n)). For numerical stability use a rank-revealing factorization: SVD (most reliable; O(mn²)), QR with column pivoting (cheaper; O(mn²) with a smaller constant), or rank-revealing LU. Plain Gaussian elimination gives the right answer only in exact arithmetic; in floating point it can over- or under-estimate the numerical rank when singular values are close to machine epsilon.

Useful corollaries

StatementWhy it follows
A wide matrix (n > m) always has nontrivial kernelrank ≤ m < n, so nullity = n − rank ≥ n − m > 0
A square matrix is invertible iff nullity = 0nullity 0 ⇒ rank = n ⇒ columns span ⇒ A is bijective
Ax = b solvable iff b ∈ col(A); solution unique iff nullity = 0solutions form a particular + ker(A) coset
If T : V → V and dim V is finite, T injective ⇔ T surjectiveboth equivalent to nullity = 0 ⇔ rank = dim V
rank(AB) ≤ min(rank A, rank B)im(AB) ⊆ im(A) and ker(B) ⊆ ker(AB)
rank(A) + rank(B) − n ≤ rank(A + B) ≤ rank(A) + rank(B)Sylvester rank inequality, proven via rank–nullity on stacked maps

Common mistakes

  • Counting on the wrong side. Rank + nullity is always the dimension of the domain (the number of columns), not the codomain. Beginners write rank + nullity = m and get spurious answers for non-square matrices.
  • Confusing the column space with the row space. They have the same dimension (rank), but they live in different ambient spaces — column space in R^m, row space in R^n. The kernel is orthogonal to the row space, not to the column space.
  • Trusting Gaussian elimination on noisy data. Round-off in finite-precision arithmetic can create spurious nonzero entries that look like extra pivots, inflating the apparent rank. For numerical rank, always use SVD with a tolerance.
  • Forgetting nullity counts free variables, not zero rows. The number of zero rows in RREF is m − rank, not the nullity. Free columns are what give the kernel its dimension.
  • Reading the kernel basis from the wrong columns. The pivot rows of RREF tell you the dependence of pivot variables on free variables. The kernel basis is built one free variable at a time, with the corresponding negative entries in the pivot positions.
  • Applying it over rings instead of fields. Z-modules (lattices) have torsion, so the naive count fails — rank of a Z-module is defined only after modding out torsion. The theorem is a vector-space identity first, ring-module generalisations need care.

Where the theorem shows up

  • Solvability of linear systems. Existence: b ∈ col(A), a space of dimension rank(A). Uniqueness: solutions differ by a vector in ker(A), a space of dimension nullity(A). Both questions are answered by the same identity.
  • Degrees of freedom in mechanical and circuit systems. A constraint matrix A specifies which configurations are allowed; nullity(A) counts the free degrees of motion. Mechanical engineers literally compute "DOF = 6N − rank(constraint matrix)" for a system of N rigid bodies.
  • Control theory — observability and controllability. A state-space system (A, B, C) is controllable iff the controllability matrix [B, AB, A²B, …, A^(n−1)B] has rank n; rank–nullity tells you exactly how many uncontrollable directions remain.
  • Compressed sensing and sparse recovery. If A is an m × n sensing matrix with m < n, the kernel has dimension at least n − m, so x is non-unique without prior structure. The job of compressed sensing is to use sparsity to pick out a unique element of an enormous-dimensional null space.
  • Singular value decomposition. The rank of A is the number of nonzero singular values, the image is spanned by the first rank columns of U, and the kernel is spanned by the last n − rank columns of V. SVD is the rank–nullity decomposition made constructive and orthogonal.
  • Algebraic topology. The Euler characteristic of a chain complex is an alternating sum of ranks of boundary maps; rank–nullity is the engine that turns the alternating sum into Betti numbers via the formula β_k = dim(ker ∂_k) − dim(im ∂_(k+1)).

Frequently asked questions

What does the rank–nullity theorem actually say?

For a linear map T : V → W on a finite-dimensional V, dim(ker T) + dim(im T) = dim V. For matrices it reads rank(A) + nullity(A) = number of columns. The input dimensions split cleanly into a piece that gets crushed to zero (the kernel) and a piece that survives as the image — none get lost, none appear out of nowhere.

Why is it called a conservation law?

Because dim V is fixed and the theorem says no dimension is ever created or destroyed by a linear map — every input direction is either flattened into the kernel or survives in the image, in exactly the right proportions to sum to dim V. It is the linear-algebra analogue of conservation of energy: a bookkeeping identity that holds regardless of how complicated the map looks.

How do I compute rank and nullity in practice?

Row-reduce the matrix to reduced row echelon form. The number of pivot columns is the rank; the number of free columns (those without a pivot) is the nullity. They always sum to the number of columns. For numerical work, the rank is the number of singular values above a small tolerance — typically σ_i > σ_1 · 10^(-12) — because exact zero rarely shows up in floating point.

Does the theorem hold for non-square matrices?

Yes — the count is always over the number of columns, so it applies to any m × n matrix. A 3 × 5 matrix maps R^5 → R^3 and has rank + nullity = 5, regardless of the codomain dimension. Tall matrices (m > n) can have full column rank (nullity 0); wide matrices (n > m) always have nullity ≥ n − m.

What does it tell me about solutions of Ax = b?

Two things at once. Existence: Ax = b has a solution if and only if b lies in the column space — a space of dimension rank(A). Uniqueness: if it has a solution, the full set is one particular solution plus the entire kernel — a space of dimension nullity(A). So nullity = 0 means at most one solution, and rank = m means at least one solution; together they give the unique-solution case.

Is there an infinite-dimensional version?

It needs care. For finite-dimensional V the theorem is clean. For infinite-dimensional Banach or Hilbert spaces, the right replacement is the Fredholm alternative: a Fredholm operator has finite-dimensional kernel and finite-codimensional image, and the index dim(ker T) − codim(im T) is a topological invariant. The rank–nullity identity becomes a statement about that index.

How is it related to the four fundamental subspaces?

The four subspaces are kernel and column space of A, and kernel and column space of A^T. Rank–nullity gives two identities — rank(A) + nullity(A) = n (columns) and rank(A^T) + nullity(A^T) = m (rows). Combined with rank(A) = rank(A^T), they fully account for every direction in R^n and R^m and decompose both spaces into pairs of orthogonal complements.