Matrix Inverse

The matrix that undoes A — when one exists

The inverse of a square matrix A is the unique matrix A⁻¹ satisfying A·A⁻¹ = A⁻¹·A = I. It exists if and only if det A ≠ 0 (equivalently, A's columns are linearly independent). It can be computed via the cofactor adjugate formula A⁻¹ = (1/det A)·adj(A), via Gauss–Jordan elimination on [A | I], or via LU decomposition. The inverse undoes the linear transformation A and is the matrix analogue of dividing by a number.

Defining property: A·A⁻¹ = A⁻¹·A = I
Exists iff: A is square and det A ≠ 0
2×2 formula: (1/(ad − bc)) · [[d, −b], [−c, a]]
General formula: A⁻¹ = (1/det A) · adj(A)
Numerical cost: O(n³) via LU or Gauss–Jordan
Order rule: (AB)⁻¹ = B⁻¹A⁻¹

What an inverse means

A matrix A acts on vectors by left multiplication: x ↦ A·x. The inverse A⁻¹ is the matrix that reverses that action. If A rotates the plane by 30°, A⁻¹ rotates it back by −30°. If A doubles the x-component and halves the y-component, A⁻¹ halves the x-component and doubles the y-component.

The defining equation is A·A⁻¹ = A⁻¹·A = I, where I is the identity matrix (1s on the diagonal, 0s elsewhere). For this equation to make sense, A must be square. For an inverse to exist, A must be nonsingular — det A ≠ 0.

The geometric reason singular matrices have no inverse: if det A = 0, the transformation A collapses some direction to zero, so multiple input vectors map to the same output. There is no way to recover the input uniquely; no inverse function exists.
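
The two ideas above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the matrices `A`, `A_inv`, and `S` and the helper names `matvec` and `det2` are illustrative choices, not part of the original text.

```python
from fractions import Fraction

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def det2(M):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = M
    return a * d - b * c

# A doubles the x-component and halves the y-component; A_inv reverses that.
A = [[2, 0], [0, Fraction(1, 2)]]
A_inv = [[Fraction(1, 2), 0], [0, 2]]
round_trip = matvec(A_inv, matvec(A, [3, 4]))   # back to [3, 4]

# Hypothetical singular matrix: the second row is twice the first, so det = 0.
S = [[1, 2], [2, 4]]
image_u = matvec(S, [2, 0])   # [2, 4]
image_v = matvec(S, [0, 1])   # [2, 4] as well: two inputs, one output
```

Because `S` sends both `[2, 0]` and `[0, 1]` to `[2, 4]`, no function can map `[2, 4]` back to a single input, which is exactly why `S` has no inverse.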

Worked example — 2×2 inverse

Take A = [[2, 1], [3, 4]]. Compute det A = 2·4 − 1·3 = 5.

The 2×2 inverse formula is A⁻¹ = (1/det A) · [[d, −b], [−c, a]] (swap diagonal, negate off-diagonal, divide by det):

A⁻¹ = (1/5) · [[4, −1], [−3, 2]] = [[4/5, −1/5], [−3/5, 2/5]].

Verify: A·A⁻¹ = [[2, 1], [3, 4]] · [[4/5, −1/5], [−3/5, 2/5]] = [[8/5 − 3/5, −2/5 + 2/5], [12/5 − 12/5, −3/5 + 8/5]] = [[1, 0], [0, 1]]. ✓
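
The same computation can be sketched in Python with exact rational arithmetic. The function names `inverse_2x2` and `matmul` are hypothetical helpers written for this example; `fractions.Fraction` is from the standard library.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Invert [[a, b], [c, d]]: swap the diagonal, negate the off-diagonal, divide by det."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular (det = 0)")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[2, 1], [3, 4]]
A_inv = inverse_2x2(A)        # [[4/5, -1/5], [-3/5, 2/5]]
product = matmul(A, A_inv)    # [[1, 0], [0, 1]]
```

Using `Fraction` keeps the entries exact, so the check `product == [[1, 0], [0, 1]]` succeeds without any floating-point tolerance.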

Worked example — 3×3 inverse via cofactor adjugate

Take A = [[1, 2, 3], [0, 1, 4], [5, 6, 0]].

Step 1 — det A. Expand along the first row:

det A = 1·(1·0 − 4·6) − 2·(0·0 − 4·5) + 3·(0·6 − 1·5) = 1·(−24) − 2·(−20) + 3·(−5) = −24 + 40 − 15 = 1.

Step 2 — cofactor matrix C. C_ij = (−1)^(i+j)·M_ij, where M_ij is the minor (determinant of the 2×2 submatrix from deleting row i, column j).

C = [[−24, 20, −5], [18, −15, 4], [5, −4, 1]]

Step 3 — adjugate. adj(A) = C^T (transpose of the cofactor matrix):

adj(A) = [[−24, 18, 5], [20, −15, −4], [−5, 4, 1]]

Step 4 — divide by det. det A = 1, so

A⁻¹ = adj(A) = [[−24, 18, 5], [20, −15, −4], [−5, 4, 1]].

Quick sanity check: row 1 of A times column 1 of A⁻¹ = 1·(−24) + 2·20 + 3·(−5) = −24 + 40 − 15 = 1. ✓
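
The four steps translate directly into code. The sketch below assumes integer (or `Fraction`) entries and uses recursive cofactor expansion, so it is only sensible for small matrices; all function names are illustrative.

```python
from fractions import Fraction

def minor(A, i, j):
    """Submatrix of A with row i and column j removed (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def cofactor_matrix(A):
    """C[i][j] = (-1)^(i+j) * det(minor); the sign parity is the same for 0- and 1-based indices."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)] for i in range(n)]

def adjugate(A):
    """Transpose of the cofactor matrix."""
    return [list(col) for col in zip(*cofactor_matrix(A))]

def inverse_by_adjugate(A):
    """A^-1 = adj(A) / det(A), with exact rational entries."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular (det = 0)")
    return [[Fraction(entry, d) for entry in row] for row in adjugate(A)]

A = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]
C = cofactor_matrix(A)            # [[-24, 20, -5], [18, -15, 4], [5, -4, 1]]
A_inv = inverse_by_adjugate(A)    # [[-24, 18, 5], [20, -15, -4], [-5, 4, 1]]
```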

Methods of finding A⁻¹

	Cofactor adjugate	Gauss–Jordan elimination	LU decomposition
Idea	A⁻¹ = (1/det A) · adj(A)	Row-reduce [A | I] until A becomes I; the right block becomes A⁻¹	Factor A = LU; solve LU·X = I column by column
Best for	Symbolic 2×2 and 3×3 by hand	4×4 and larger by hand	Numerical computation, especially for many right-hand sides
Asymptotic cost	O(n!) by recursive cofactor expansion; O(n⁵) if each of the n² minors is computed by elimination	O(n³)	O(n³) factorisation + O(n²) per solve
Numerical stability	Poor — division by det amplifies errors	Decent with partial pivoting	Excellent with partial pivoting (PA = LU)
Closed-form output	Yes — explicit formula in entries	No — algorithmic	Yes (factors), but not a single closed form
Reusable for many right-hand sides?	Yes (compute A⁻¹ once)	Yes	Yes — most efficient: factor once, solve many
Detects singularity	det A = 0	No nonzero pivot can be found in some column	U has a zero on its diagonal

Practical advice: by hand on small symbolic matrices, use cofactors. Numerically, almost never compute A⁻¹ explicitly. Solve the system Ax = b directly via LU instead: it needs about a third of the arithmetic (roughly 2n³/3 operations versus 2n³ to form the full inverse) and is far more stable.
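
The Gauss–Jordan column of the table can be sketched as follows. This version uses exact `Fraction` arithmetic, so any nonzero pivot is acceptable; a floating-point version would instead pick the largest-magnitude pivot in each column (partial pivoting). The function name `gauss_jordan_inverse` is a hypothetical helper for this example.

```python
from fractions import Fraction

def gauss_jordan_inverse(A):
    """Row-reduce [A | I] until the left block is I; return the right block."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular (no nonzero pivot)")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot becomes 1.
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

A_inv = gauss_jordan_inverse([[1, 2, 3], [0, 1, 4], [5, 6, 0]])
# A_inv == [[-24, 18, 5], [20, -15, -4], [-5, 4, 1]]
```

A singular input such as `[[1, 2], [2, 4]]` runs out of nonzero pivots in the second column and raises `ValueError`, which is the "detects singularity" row of the table in action.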

Algebraic properties

(A⁻¹)⁻¹ = A. Inverting twice is the identity operation.
(AB)⁻¹ = B⁻¹A⁻¹. Reverse order — undo the most recent transformation first.
(Aᵀ)⁻¹ = (A⁻¹)ᵀ. Transpose and inverse commute.
(cA)⁻¹ = (1/c)·A⁻¹ for scalar c ≠ 0.
det(A⁻¹) = 1 / det A. Inverse scales volumes by the reciprocal factor.
Eigenvalues of A⁻¹ are reciprocals of eigenvalues of A. Same eigenvectors.
A is orthogonal ⇔ A⁻¹ = Aᵀ. Hugely useful — orthogonal matrices invert by transposition.
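
Several of these identities can be spot-checked on a concrete pair of 2×2 matrices. The sketch below is a numerical check, not a proof; the matrices `A` and `B` and the helper names are arbitrary choices for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det2(A):
    (a, b), (c, d) = A
    return a * d - b * c

def inv2(A):
    """Exact inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = Fraction(det2(A))
    if det == 0:
        raise ValueError("matrix is singular (det = 0)")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [3, 4]]
B = [[1, 2], [0, 1]]   # chosen so that AB != BA

double_inverse = inv2(inv2(A)) == A
reverse_order = inv2(matmul(A, B)) == matmul(inv2(B), inv2(A))
same_order = inv2(matmul(A, B)) == matmul(inv2(A), inv2(B))   # False here
transpose_commutes = inv2(transpose(A)) == transpose(inv2(A))
scalar_rule = inv2([[3 * x for x in row] for row in A]) == [[x / 3 for x in row] for row in inv2(A)]
det_reciprocal = det2(inv2(A)) == Fraction(1, det2(A))
```

Every flag except `same_order` comes out `True`; `same_order` is `False` because `A` and `B` do not commute.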

Classical applications

Solving linear systems. Ax = b has the formal solution x = A⁻¹b. In practice you do not compute A⁻¹; you solve directly.
Linear regression (normal equations). The least-squares estimator is β̂ = (XᵀX)⁻¹·Xᵀy. Every regression textbook displays the inverse, but real software computes β̂ via a QR or SVD factorisation of X for stability.
Change of basis. If P is the change-of-basis matrix from one coordinate system to another, the same linear map represented in the new basis is P⁻¹·A·P. Diagonalisation A = P·D·P⁻¹ is the canonical example.
Computer graphics. Camera and model matrices are inverted to map between world, view, and screen coordinates. Orthogonal rotation matrices are inverted just by transposing.
Cryptography (Hill cipher). Encrypts blocks of letters as vectors multiplied by an invertible matrix; decryption uses the inverse modulo 26.
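Markov chain analysis. For an absorbing chain, the fundamental matrix N = (I − Q)⁻¹, where Q is the transient-to-transient block of the transition matrix, gives the expected number of visits to each transient state and, from those, absorption times and probabilities.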
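
As one concrete instance from the list above, here is a minimal sketch of a 2×2 Hill cipher. It assumes uppercase A–Z text of even length and Python 3.8 or later (for `pow(x, -1, m)`); the key `K` and the function names are illustrative. The key is invertible modulo 26 exactly when its determinant shares no factor with 26.

```python
def inverse_2x2_mod(K, m=26):
    """Inverse of a 2x2 integer matrix modulo m, via the adjugate formula."""
    (a, b), (c, d) = K
    det = (a * d - b * c) % m
    det_inv = pow(det, -1, m)   # raises ValueError if det is not invertible mod m
    return [[(d * det_inv) % m, (-b * det_inv) % m],
            [(-c * det_inv) % m, (a * det_inv) % m]]

def apply_key(K, text, m=26):
    """Multiply each pair of letters (as a column vector) by K, modulo m."""
    nums = [ord(ch) - ord("A") for ch in text]
    out = []
    for i in range(0, len(nums), 2):
        x, y = nums[i], nums[i + 1]
        out.append((K[0][0] * x + K[0][1] * y) % m)
        out.append((K[1][0] * x + K[1][1] * y) % m)
    return "".join(chr(n + ord("A")) for n in out)

K = [[3, 3], [2, 5]]                 # det = 9, and 9 * 3 = 27 = 1 (mod 26)
K_inv = inverse_2x2_mod(K)           # [[15, 17], [20, 9]]
cipher = apply_key(K, "HELP")        # "HIAT"
plain = apply_key(K_inv, cipher)     # "HELP"
```

Encryption and decryption are the same routine; only the matrix changes, which is the point of having an inverse.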

Common mistakes

Writing (AB)⁻¹ = A⁻¹B⁻¹. Wrong order. The correct identity is (AB)⁻¹ = B⁻¹A⁻¹. Get this wrong and every later step of a derivation inherits the error.
Trying to invert a non-square matrix. Only square matrices have classical inverses. For rectangular matrices, use the Moore–Penrose pseudoinverse A⁺.
Computing A⁻¹ explicitly when you only need A⁻¹b. Slower and less stable than solving Ax = b directly.
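Forgetting to check det ≠ 0. If det A = 0, no inverse exists. Numerical software does not reliably refuse: some libraries raise an error when elimination hits an exactly zero pivot (NumPy raises LinAlgError), but rounding often hides singularity, and a singular or nearly singular matrix can come back as a wildly inaccurate "inverse", at best with a warning.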
Using the cofactor formula for large matrices. O(n!) growth makes it impractical beyond about 5×5. Switch to Gauss–Jordan or LU.
Confusing A⁻¹ with 1/A. Division by a matrix is not a defined operation, and taking the reciprocal of each entry does not give the inverse. Always write A⁻¹.
Mixing right and left inverses. For square matrices, left inverse = right inverse = the inverse. For rectangular matrices, one-sided inverses can exist without the other.

Frequently asked questions

Why does A⁻¹ exist iff det A ≠ 0?

Det A measures how A scales volumes. Det A = 0 means A collapses some direction to zero: multiple inputs map to the same output, so A is not one-to-one and cannot be undone. Algebraically, the cofactor formula A⁻¹ = (1/det A)·adj(A) divides by det A, which is undefined when det A = 0. For an n×n matrix, the condition det A ≠ 0 is equivalent to each of the following: the columns are linearly independent, the rank is n, the kernel contains only the zero vector, and Ax = b has a unique solution for every b.

What is the formula for the inverse of a 2×2 matrix?

For A = [[a, b], [c, d]] with det A = ad − bc ≠ 0, the inverse is A⁻¹ = (1 / (ad − bc)) · [[d, −b], [−c, a]]. Swap a and d on the diagonal, negate b and c off the diagonal, divide by det. This is the cofactor adjugate formula specialised to 2×2.

Is (AB)⁻¹ = A⁻¹B⁻¹?

No. The correct identity is (AB)⁻¹ = B⁻¹A⁻¹. You unwind composed transformations in reverse order — like taking off shoes and socks. Verify by multiplying: AB·B⁻¹A⁻¹ = A·(BB⁻¹)·A⁻¹ = A·I·A⁻¹ = AA⁻¹ = I. The wrong order gives AB·A⁻¹B⁻¹, which only equals I when A and B commute.

Which method should I use to compute A⁻¹ in practice?

For 2×2 matrices, the cofactor formula is fastest by hand. For 3×3, cofactor still works but Gauss–Jordan on [A | I] starts to be competitive. For n ≥ 4 by hand, Gauss–Jordan wins. Numerically, a computer almost never explicitly inverts — it computes an LU or QR decomposition once and uses forward/back substitution to solve Ax = b. Cofactor expansion is O(n!), Gauss–Jordan and LU are O(n³).
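
The "factor once, solve many" pattern can be sketched in plain Python. This is an illustrative floating-point implementation of LU with partial pivoting (PA = LU), not the routine any particular library uses; `lu_factor` and `lu_solve` here are local helper names.

```python
def lu_factor(A):
    """Return (LU, perm): L (below the diagonal, unit diagonal implied) and U
    (on and above the diagonal) packed together, plus the row permutation."""
    n = len(A)
    LU = [[float(x) for x in row] for row in A]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: use the largest-magnitude entry in column k.
        p = max(range(k, n), key=lambda r: abs(LU[r][k]))
        if LU[p][k] == 0.0:
            # Only catches exact zeros; nearly singular matrices pass this check.
            raise ValueError("matrix is singular")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]              # multiplier, stored in L's slot
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU, perm

def lu_solve(LU, perm, b):
    """Solve Ax = b using the factors from lu_factor, in O(n^2) operations."""
    n = len(LU)
    # Forward substitution with L, applied to the permuted right-hand side.
    x = [float(b[p]) for p in perm]
    for i in range(n):
        for j in range(i):
            x[i] -= LU[i][j] * x[j]
    # Back substitution with U.
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= LU[i][j] * x[j]
        x[i] /= LU[i][i]
    return x

A = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]
LU, perm = lu_factor(A)               # O(n^3), done once
x1 = lu_solve(LU, perm, [6, 5, 11])   # approximately [1, 1, 1]
x2 = lu_solve(LU, perm, [1, 0, 0])    # approximately the first column of A^-1
```

Each extra right-hand side reuses `LU` and `perm`, so only the cheap substitution step is repeated.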

What is the difference between an inverse and a pseudoinverse?

The classical inverse only exists for square nonsingular matrices. The Moore–Penrose pseudoinverse A⁺ generalises it to any rectangular or singular matrix and reduces to A⁻¹ when A is invertible. The pseudoinverse gives the least-squares solution to Ax = b: x = A⁺b minimises ‖Ax − b‖, and among all minimisers it is the one with the smallest norm ‖x‖. It is the workhorse of linear regression and is computed via the singular value decomposition.
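
For a small illustration, the sketch below uses the closed form A⁺ = (AᵀA)⁻¹Aᵀ, which is valid only when A has full column rank. It is a stand-in for the SVD route that libraries take, chosen because it can be done exactly with fractions; the helper `inv2` limits it to matrices with two columns, and the data points are made up for the example.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inv2(A):
    """Exact inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular (det = 0)")
    return [[d / det, -b / det], [-c / det, a / det]]

def pinv_full_column_rank(A):
    """A+ = (A^T A)^-1 A^T, assuming A has exactly two independent columns."""
    At = transpose(A)
    return matmul(inv2(matmul(At, A)), At)

# Fit y = b0 + b1 * t through the made-up points (0, 1), (1, 2), (2, 2).
X = [[1, 0], [1, 1], [1, 2]]
y = [[1], [2], [2]]

X_pinv = pinv_full_column_rank(X)
beta = matmul(X_pinv, y)       # [[7/6], [1/2]]: intercept 7/6, slope 1/2
left = matmul(X_pinv, X)       # the 2x2 identity: X_pinv is a left inverse
right = matmul(X, X_pinv)      # 3x3 but not the identity: no right inverse
```

The asymmetry between `left` and `right` shows why a rectangular matrix can have a one-sided inverse without having an inverse in the classical sense.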

Why do numerical libraries warn against explicit inversion?

Computing A⁻¹ explicitly and then multiplying A⁻¹·b loses precision and costs roughly three times the arithmetic of solving Ax = b directly with LU decomposition (about 2n³ operations versus 2n³/3). For nearly singular A, the explicit inverse can have huge entries, so rounding errors in A⁻¹ are magnified in the product A⁻¹·b. Both NumPy and MATLAB documentation recommend a direct solve (numpy.linalg.solve(A, b) in NumPy, A\b in MATLAB) over inv(A)·b for this reason.