Linear Algebra
Cramer's Rule
Solve Ax = b with one big determinant per unknown
Cramer's rule solves Ax = b for an invertible square matrix A by expressing each unknown as a ratio of determinants: xi = det(Ai) / det(A), where Ai is A with column i replaced by b. Elegant and closed-form — useful in symbolic computation and theoretical proofs — but O(n!) by naive cofactor expansion and O(n⁴) even with optimisation, so Gaussian elimination at O(n³) wins for numerical work.
- Formulaxi = det(Ai) / det(A)
- Ai isA with column i replaced by b
- RequiresA square, det A ≠ 0
- CostO(n⁴) with LU on each det; O(n!) naive
- Compare toGaussian elimination: O(n³)
- Named afterGabriel Cramer (1750)
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
What Cramer's rule says
Take a system of n linear equations in n unknowns, written as the matrix equation Ax = b with A square. If det A ≠ 0 the system has a unique solution, and Cramer's rule gives a closed-form expression for each component:
xi = det(Ai) / det(A)
where Ai is the matrix obtained from A by replacing its i-th column with the vector b. To find n unknowns you compute n + 1 determinants — det A in the denominator (computed once) plus det Ai for each i.
The rule is elegant because every unknown gets the same form. It is theoretically beautiful because it expresses the solution explicitly in the entries of A and b. It is practically slow because computing many determinants is much more work than running Gaussian elimination once.
Worked example — 2×2 system
Solve
3x + 2y = 7
5x + 4y = 11
which is Ax = b with A = [[3, 2], [5, 4]] and b = [7, 11]T.
Step 1 — det A. det A = 3·4 − 2·5 = 12 − 10 = 2.
Step 2 — det A₁. Replace column 1 of A with b: A₁ = [[7, 2], [11, 4]]. det A₁ = 7·4 − 2·11 = 28 − 22 = 6.
Step 3 — det A₂. Replace column 2 of A with b: A₂ = [[3, 7], [5, 11]]. det A₂ = 3·11 − 7·5 = 33 − 35 = −2.
Step 4 — apply Cramer.
x = det A₁ / det A = 6 / 2 = 3, y = det A₂ / det A = −2 / 2 = −1.
Verify: 3·3 + 2·(−1) = 9 − 2 = 7 ✓ and 5·3 + 4·(−1) = 15 − 4 = 11 ✓.
Worked example — 3×3 system
Solve
x + 2y + 3z = 5
2x + 5y + 3z = 3
x + 8z = 17
with A = [[1, 2, 3], [2, 5, 3], [1, 0, 8]] and b = [5, 3, 17]T.
det A: Expanding along the first row, det A = 1·(5·8 − 3·0) − 2·(2·8 − 3·1) + 3·(2·0 − 5·1) = 40 − 2·13 + 3·(−5) = 40 − 26 − 15 = −1.
det A₁ (column 1 = b): det of [[5, 2, 3], [3, 5, 3], [17, 0, 8]] = 5·(40 − 0) − 2·(24 − 51) + 3·(0 − 85) = 200 + 54 − 255 = −1. So x = −1 / −1 = 1.
det A₂ (column 2 = b): det of [[1, 5, 3], [2, 3, 3], [1, 17, 8]] = 1·(24 − 51) − 5·(16 − 3) + 3·(34 − 3) = −27 − 65 + 93 = 1. So y = 1 / −1 = −1.
det A₃ (column 3 = b): det of [[1, 2, 5], [2, 5, 3], [1, 0, 17]] = 1·(85 − 0) − 2·(34 − 3) + 5·(0 − 5) = 85 − 62 − 25 = −2. So z = −2 / −1 = 2.
Solution: x = 1, y = −1, z = 2. Verify: 1 − 2 + 6 = 5 ✓; 2 − 5 + 6 = 3 ✓; 1 + 0 + 16 = 17 ✓.
When the rule fails — det A = 0
If det A = 0 the matrix A is singular, the system has either no solution or infinitely many, and Cramer's rule cannot apply because the denominator is zero. Three diagnostic cases:
- det A = 0 and every det Ai = 0: System is consistent but underdetermined — infinitely many solutions parameterised by the kernel of A.
- det A = 0 and some det Ai ≠ 0: System is inconsistent — no solution.
- Numerically near-singular: det A is small but nonzero. Cramer technically applies but the answer is dominated by floating-point error. Use Gaussian elimination with partial pivoting instead.
For the singular case, Cramer's rule gives no information about the solution structure. Gaussian elimination, by contrast, produces the row-reduced echelon form which exposes both the inconsistency or the parametric family directly.
Cramer's rule vs Gaussian elimination
| Cramer's rule | Gaussian elimination | |
|---|---|---|
| Output form | Closed-form ratio of determinants | Algorithmic — row reduction to echelon form |
| Cost (n×n system) | O(n⁴) using LU per determinant; O(n!) naive | O(n³) |
| Numerical stability | Poor — division by det amplifies error | Good with partial pivoting |
| Handles singular A | No — denominator is zero | Yes — exposes inconsistency or parametric family |
| Symbolic computation | Excellent — entries can be formulas | Possible but answers are harder to compare |
| Many right-hand sides (same A) | Recompute n det Ai per b | Factor A once, solve cheaply for each b |
| By-hand for 2×2 / 3×3 | Fast and clean | Slightly more bookkeeping |
| Use in proofs | Common — gives explicit formula for inverse-function and Jacobian arguments | Rare |
The trade-off is closed-form elegance against computational efficiency. For a 2×2 system Cramer wins on simplicity. For a 1000×1000 system Gaussian elimination is roughly 1000× faster and dramatically more accurate.
Why it works — the cofactor adjugate connection
Cramer's rule is not a fluke. It is the cofactor adjugate formula for A⁻¹ written out component by component.
The inverse satisfies A⁻¹ = (1/det A) · adj(A), where adj(A) is the transpose of the cofactor matrix. Then x = A⁻¹b expands as:
xi = (1/det A) · Σj [adj(A)]ij·bj = (1/det A) · Σj Cji·bj
The sum Σj Cji·bj is exactly the cofactor expansion of det(Ai) along its i-th column — the column that has been replaced by b. So
xi = det(Ai) / det(A).
The rule and the inverse formula are the same fact stated differently.
Classical applications
- Symbolic computer algebra. Systems of equations whose entries are polynomials, rational functions, or algebraic expressions are easier to solve symbolically with Cramer than with Gaussian pivoting (which can introduce ugly conditional branches when pivots become zero).
- Implicit function theorem. Cramer's rule appears explicitly in the proof: when ∂F/∂y is invertible, the partial derivatives of the implicit function are ratios of Jacobian determinants — Cramer's rule applied to the linearised system.
- Cramer–Rao bound (statistics). The lower bound on the variance of an unbiased estimator involves the inverse Fisher information matrix and inherits the determinant ratio structure.
- Geometric proofs. Volume-ratio arguments in n-dimensional geometry often use Cramer's rule to express coordinates of a point relative to n vectors as volume fractions.
- Engineering hand calculations. Any time an exam or textbook asks to solve a 2×2 or 3×3 system on paper, Cramer is often the cleanest option.
- Bezout-style theorems. Some classical proofs in algebraic geometry leverage Cramer's rule when reducing intersection problems to linear ones.
Common mistakes
- Replacing rows instead of columns. Ai is A with the i-th column swapped for b — not the i-th row. Doing rows produces a different determinant and the wrong answer.
- Forgetting to divide by det A. The numerator det Ai is not the answer by itself. Always divide.
- Applying Cramer when det A = 0. The rule does not generalise to the singular case. Switch to Gaussian elimination.
- Using Cramer numerically for large n. O(n⁴) cost and amplified rounding error make it the wrong choice past about n = 4 unless you are doing symbolic algebra.
- Sign errors in cofactor expansion. The (−1)i+j factor in cofactor expansion is the most common source of arithmetic mistakes; double-check the checkerboard pattern.
- Confusing Cramer's rule with the matrix inverse formula. They are equivalent but distinct in usage: Cramer gives x directly, the adjugate formula gives A⁻¹ for any future right-hand side.
Frequently asked questions
Why does Cramer's rule actually work?
It is the cofactor adjugate formula for A⁻¹ applied to Ax = b. Writing x = A⁻¹b = (1/det A)·adj(A)·b and expanding the i-th component gives xi = (1/det A) · Σj Cji·bj. The sum on the right is exactly the cofactor expansion of det(Ai) along its i-th column — the column that has been replaced by b. So xi = det(Ai) / det(A).
What happens when det A = 0?
Cramer's rule fails. If det A = 0 the system Ax = b is either inconsistent (no solution) or underdetermined (infinitely many solutions), and there is no unique x to compute. The denominator division is undefined and the numerator det(Ai) does not encode the right answer either. To handle the singular case use Gaussian elimination, which reveals which rows become zero and exposes the kernel.
Why is Cramer's rule slow numerically?
Solving an n×n system by Cramer requires computing n + 1 determinants (det A and det Ai for each i). Naive cofactor expansion is O(n!), and even using LU each determinant is O(n³), so total cost is O(n⁴). Gaussian elimination solves the same system in O(n³). For n = 100, Cramer is roughly 100× slower; for n = 1000, 1000× slower.
When is Cramer's rule still useful?
Three places. First, symbolic computation — when entries are formulas rather than numbers, having a closed form for each xi is more useful than an algorithmic LU. Second, theoretical proofs — Cramer's rule appears in derivations of inverse-function theorems, integration formulas (Jacobian determinants), and Cramer–Rao bounds in statistics. Third, very small systems (2×2 or 3×3) by hand, where it is faster than full elimination.
Is there a geometric interpretation?
Yes. det A is the volume of the parallelepiped spanned by the columns of A. det Ai is the volume of the same parallelepiped after column i is replaced by b. So xi = det(Ai) / det(A) is the ratio of these two volumes — the factor by which b "looks like" the i-th basis direction in the column-space coordinates. This is sometimes called the volume-ratio interpretation of Cramer.
Does Cramer's rule generalise beyond linear systems?
The pattern shows up wherever inverses are expressed as adjugate-over-determinant. The implicit function theorem and inverse function theorem in multivariable calculus use Jacobian determinants in essentially the Cramer pattern. The Cramer–Rao lower bound in statistics uses the inverse of the Fisher information matrix and inherits the determinant ratio in its expression. The name attaches to the linear case but the idea is broader.