Linear Algebra

Cramer's Rule

Solve Ax = b with one big determinant per unknown

Cramer's rule solves Ax = b for an invertible square matrix A by expressing each unknown as a ratio of determinants: x_i = det(A_i) / det(A), where A_i is A with column i replaced by b. Elegant and closed-form — useful in symbolic computation and theoretical proofs — but O(n!) by naive cofactor expansion and O(n⁴) even with optimisation, so Gaussian elimination at O(n³) wins for numerical work.

Formulax_i = det(A_i) / det(A)
A_i isA with column i replaced by b
RequiresA square, det A ≠ 0
CostO(n⁴) with LU on each det; O(n!) naive
Compare toGaussian elimination: O(n³)
Named afterGabriel Cramer (1750)

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

What Cramer's rule says

Take a system of n linear equations in n unknowns, written as the matrix equation Ax = b with A square. If det A ≠ 0 the system has a unique solution, and Cramer's rule gives a closed-form expression for each component:

x_i = det(A_i) / det(A)

where A_i is the matrix obtained from A by replacing its i-th column with the vector b. To find n unknowns you compute n + 1 determinants — det A in the denominator (computed once) plus det A_i for each i.

The rule is elegant because every unknown gets the same form. It is theoretically beautiful because it expresses the solution explicitly in the entries of A and b. It is practically slow because computing many determinants is much more work than running Gaussian elimination once.

Worked example — 2×2 system

Solve

3x + 2y = 7
5x + 4y = 11

which is Ax = b with A = [[3, 2], [5, 4]] and b = [7, 11]^T.

Step 1 — det A. det A = 3·4 − 2·5 = 12 − 10 = 2.

Step 2 — det A₁. Replace column 1 of A with b: A₁ = [[7, 2], [11, 4]]. det A₁ = 7·4 − 2·11 = 28 − 22 = 6.

Step 3 — det A₂. Replace column 2 of A with b: A₂ = [[3, 7], [5, 11]]. det A₂ = 3·11 − 7·5 = 33 − 35 = −2.

Step 4 — apply Cramer.

x = det A₁ / det A = 6 / 2 = 3, y = det A₂ / det A = −2 / 2 = −1.

Verify: 3·3 + 2·(−1) = 9 − 2 = 7 ✓ and 5·3 + 4·(−1) = 15 − 4 = 11 ✓.

Worked example — 3×3 system

Solve

x + 2y + 3z = 5
2x + 5y + 3z = 3
x + 8z = 17

with A = [[1, 2, 3], [2, 5, 3], [1, 0, 8]] and b = [5, 3, 17]^T.

det A: Expanding along the first row, det A = 1·(5·8 − 3·0) − 2·(2·8 − 3·1) + 3·(2·0 − 5·1) = 40 − 2·13 + 3·(−5) = 40 − 26 − 15 = −1.

det A₁ (column 1 = b): det of [[5, 2, 3], [3, 5, 3], [17, 0, 8]] = 5·(40 − 0) − 2·(24 − 51) + 3·(0 − 85) = 200 + 54 − 255 = −1. So x = −1 / −1 = 1.

det A₂ (column 2 = b): det of [[1, 5, 3], [2, 3, 3], [1, 17, 8]] = 1·(24 − 51) − 5·(16 − 3) + 3·(34 − 3) = −27 − 65 + 93 = 1. So y = 1 / −1 = −1.

det A₃ (column 3 = b): det of [[1, 2, 5], [2, 5, 3], [1, 0, 17]] = 1·(85 − 0) − 2·(34 − 3) + 5·(0 − 5) = 85 − 62 − 25 = −2. So z = −2 / −1 = 2.

Solution: x = 1, y = −1, z = 2. Verify: 1 − 2 + 6 = 5 ✓; 2 − 5 + 6 = 3 ✓; 1 + 0 + 16 = 17 ✓.

When the rule fails — det A = 0

If det A = 0 the matrix A is singular, the system has either no solution or infinitely many, and Cramer's rule cannot apply because the denominator is zero. Three diagnostic cases:

det A = 0 and every det A_i = 0: System is consistent but underdetermined — infinitely many solutions parameterised by the kernel of A.
det A = 0 and some det A_i ≠ 0: System is inconsistent — no solution.
Numerically near-singular: det A is small but nonzero. Cramer technically applies but the answer is dominated by floating-point error. Use Gaussian elimination with partial pivoting instead.

For the singular case, Cramer's rule gives no information about the solution structure. Gaussian elimination, by contrast, produces the row-reduced echelon form which exposes both the inconsistency or the parametric family directly.

Cramer's rule vs Gaussian elimination

	Cramer's rule	Gaussian elimination
Output form	Closed-form ratio of determinants	Algorithmic — row reduction to echelon form
Cost (n×n system)	O(n⁴) using LU per determinant; O(n!) naive	O(n³)
Numerical stability	Poor — division by det amplifies error	Good with partial pivoting
Handles singular A	No — denominator is zero	Yes — exposes inconsistency or parametric family
Symbolic computation	Excellent — entries can be formulas	Possible but answers are harder to compare
Many right-hand sides (same A)	Recompute n det A_i per b	Factor A once, solve cheaply for each b
By-hand for 2×2 / 3×3	Fast and clean	Slightly more bookkeeping
Use in proofs	Common — gives explicit formula for inverse-function and Jacobian arguments	Rare

The trade-off is closed-form elegance against computational efficiency. For a 2×2 system Cramer wins on simplicity. For a 1000×1000 system Gaussian elimination is roughly 1000× faster and dramatically more accurate.

Why it works — the cofactor adjugate connection

Cramer's rule is not a fluke. It is the cofactor adjugate formula for A⁻¹ written out component by component.

The inverse satisfies A⁻¹ = (1/det A) · adj(A), where adj(A) is the transpose of the cofactor matrix. Then x = A⁻¹b expands as:

x_i = (1/det A) · Σ_j [adj(A)]_ij·b_j = (1/det A) · Σ_j C_ji·b_j

The sum Σ_j C_ji·b_j is exactly the cofactor expansion of det(A_i) along its i-th column — the column that has been replaced by b. So

x_i = det(A_i) / det(A).

The rule and the inverse formula are the same fact stated differently.

Classical applications

Symbolic computer algebra. Systems of equations whose entries are polynomials, rational functions, or algebraic expressions are easier to solve symbolically with Cramer than with Gaussian pivoting (which can introduce ugly conditional branches when pivots become zero).
Implicit function theorem. Cramer's rule appears explicitly in the proof: when ∂F/∂y is invertible, the partial derivatives of the implicit function are ratios of Jacobian determinants — Cramer's rule applied to the linearised system.
Cramer–Rao bound (statistics). The lower bound on the variance of an unbiased estimator involves the inverse Fisher information matrix and inherits the determinant ratio structure.
Geometric proofs. Volume-ratio arguments in n-dimensional geometry often use Cramer's rule to express coordinates of a point relative to n vectors as volume fractions.
Engineering hand calculations. Any time an exam or textbook asks to solve a 2×2 or 3×3 system on paper, Cramer is often the cleanest option.
Bezout-style theorems. Some classical proofs in algebraic geometry leverage Cramer's rule when reducing intersection problems to linear ones.

Common mistakes

Replacing rows instead of columns. A_i is A with the i-th column swapped for b — not the i-th row. Doing rows produces a different determinant and the wrong answer.
Forgetting to divide by det A. The numerator det A_i is not the answer by itself. Always divide.
Applying Cramer when det A = 0. The rule does not generalise to the singular case. Switch to Gaussian elimination.
Using Cramer numerically for large n. O(n⁴) cost and amplified rounding error make it the wrong choice past about n = 4 unless you are doing symbolic algebra.
Sign errors in cofactor expansion. The (−1)^i+j factor in cofactor expansion is the most common source of arithmetic mistakes; double-check the checkerboard pattern.
Confusing Cramer's rule with the matrix inverse formula. They are equivalent but distinct in usage: Cramer gives x directly, the adjugate formula gives A⁻¹ for any future right-hand side.

Frequently asked questions

Why does Cramer's rule actually work?

It is the cofactor adjugate formula for A⁻¹ applied to Ax = b. Writing x = A⁻¹b = (1/det A)·adj(A)·b and expanding the i-th component gives x_i = (1/det A) · Σ_j C_ji·b_j. The sum on the right is exactly the cofactor expansion of det(A_i) along its i-th column — the column that has been replaced by b. So x_i = det(A_i) / det(A).

What happens when det A = 0?

Cramer's rule fails. If det A = 0 the system Ax = b is either inconsistent (no solution) or underdetermined (infinitely many solutions), and there is no unique x to compute. The denominator division is undefined and the numerator det(A_i) does not encode the right answer either. To handle the singular case use Gaussian elimination, which reveals which rows become zero and exposes the kernel.

Why is Cramer's rule slow numerically?

Solving an n×n system by Cramer requires computing n + 1 determinants (det A and det A_i for each i). Naive cofactor expansion is O(n!), and even using LU each determinant is O(n³), so total cost is O(n⁴). Gaussian elimination solves the same system in O(n³). For n = 100, Cramer is roughly 100× slower; for n = 1000, 1000× slower.

When is Cramer's rule still useful?

Three places. First, symbolic computation — when entries are formulas rather than numbers, having a closed form for each x_i is more useful than an algorithmic LU. Second, theoretical proofs — Cramer's rule appears in derivations of inverse-function theorems, integration formulas (Jacobian determinants), and Cramer–Rao bounds in statistics. Third, very small systems (2×2 or 3×3) by hand, where it is faster than full elimination.

Is there a geometric interpretation?

Yes. det A is the volume of the parallelepiped spanned by the columns of A. det A_i is the volume of the same parallelepiped after column i is replaced by b. So x_i = det(A_i) / det(A) is the ratio of these two volumes — the factor by which b "looks like" the i-th basis direction in the column-space coordinates. This is sometimes called the volume-ratio interpretation of Cramer.

Does Cramer's rule generalise beyond linear systems?

The pattern shows up wherever inverses are expressed as adjugate-over-determinant. The implicit function theorem and inverse function theorem in multivariable calculus use Jacobian determinants in essentially the Cramer pattern. The Cramer–Rao lower bound in statistics uses the inverse of the Fisher information matrix and inherits the determinant ratio in its expression. The name attaches to the linear case but the idea is broader.