Linear Algebra

QR Decomposition

An orthonormal basis for the columns, and a triangular record of how to get there

QR decomposition factors a matrix into an orthogonal Q and an upper-triangular R, exposing an orthonormal basis for the column space. It is the backbone of least-squares fitting and the QR algorithm for computing eigenvalues.

  • FactorizationA = Q · R
  • QOrthogonal — QTQ = I
  • RUpper-triangular
  • Householder cost (m×n, m ≥ n)2mn² − 2n³/3
  • StabilityBackward-stable; preserves conditioning
  • Used inLeast squares, eigensolvers, orthonormal bases, Gram–Schmidt

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

What QR is doing

Given an m×n matrix A with linearly independent columns, QR finds an orthonormal basis q1, q2, …, qn for the same column space and records each original column as a triangular combination of those basis vectors:

aₖ = R₁ₖ · q₁ + R₂ₖ · q₂ + ⋯ + Rₖₖ · qₖ

The rows above the diagonal of R describe how each new column overlaps previously orthogonalized directions; the diagonal entry Rkk is the length of whatever residue is left orthogonal to those directions. Stacked into matrix form, that bookkeeping is exactly A = Q·R.

Method 1 — Gram–Schmidt orthogonalization

The textbook construction. For k = 1, 2, …, n:

  1. Project ak onto each previous qj: Rjk = qjT·ak.
  2. Subtract those projections: uk = ak − Σj<k Rjk·qj.
  3. Normalize: Rkk = ‖uk‖ and qk = uk / Rkk.

This is intuitive but numerically fragile in finite precision. Modified Gram–Schmidt fixes it by re-orthogonalizing one column at a time as new vectors come in, so accumulated rounding errors do not pile up. Use modified, never classical, in code.

Method 2 — Householder reflections

Householder is the production-grade method. The idea: design a single reflection matrix Hk that maps the column-k subdiagonal entries to zero in one shot, then compose those reflections.

For column k, let x be the part of column k from row k downward. Define

v = x + sign(x₁) · ‖x‖ · e₁
H = I − 2 · v vᵀ / (vᵀv)

H is a reflection across the hyperplane orthogonal to v. By construction H·x = ∓‖x‖·e1 — every entry below the first becomes zero. Apply H to the trailing block of A and you have killed column k below the diagonal in one matrix-block operation. After n−1 such reflections, A is upper-triangular = R, and the product Hn−1 ⋯ H1 = QT.

Method 3 — Givens rotations

Where Householder zeros a whole column at a time, a Givens rotation zeros one entry at a time. A Givens rotation is a 2×2 plane rotation embedded in an n×n identity:

[ c  s ]   acts on rows i, j to mix them and zero
[ −s c ]   one chosen entry of the matrix.

Choose c = aii/r and s = aji/r where r = √(aii² + aji²); applying the rotation zeroes aji. Givens is more expensive on dense matrices but ideal when the matrix is sparse or banded — only the rows involved in the rotation are touched, leaving structural zeros undisturbed.

Worked example — Gram–Schmidt on three columns

Factor A whose columns are a₁ = [1, 1, 0]T, a₂ = [1, 0, 1]T, a₃ = [0, 1, 1]T.

Column 1. R₁₁ = ‖a₁‖ = √2. q₁ = [1/√2, 1/√2, 0]T.

Column 2. R₁₂ = q₁T·a₂ = (1/√2)·1 + (1/√2)·0 + 0·1 = 1/√2.

u₂ = a₂ − R₁₂·q₁ = [1, 0, 1]T − (1/√2)·[1/√2, 1/√2, 0]T = [1/2, −1/2, 1]T.

R₂₂ = ‖u₂‖ = √(1/4 + 1/4 + 1) = √(3/2). q₂ = [1/√6, −1/√6, 2/√6]T.

Column 3. R₁₃ = q₁T·a₃ = 0·(1/√2) + 1·(1/√2) + 1·0 = 1/√2.

R₂₃ = q₂T·a₃ = 0·(1/√6) + 1·(−1/√6) + 1·(2/√6) = 1/√6.

u₃ = a₃ − R₁₃·q₁ − R₂₃·q₂ = [0, 1, 1]T − (1/√2)·[1/√2, 1/√2, 0]T − (1/√6)·[1/√6, −1/√6, 2/√6]T.

That works out to [−1/2 − 1/6, 1/2 + 1/6, 1 − 2/6]T = [−2/3, 2/3, 2/3]T.

R₃₃ = ‖u₃‖ = √(4/9 + 4/9 + 4/9) = 2/√3. q₃ = [−1/√3, 1/√3, 1/√3]T.

Result.

Q = [ 1/√2    1/√6   −1/√3 ]
    [ 1/√2  −1/√6    1/√3 ]
    [ 0      2/√6    1/√3 ]

R = [ √2    1/√2   1/√2 ]
    [  0   √(3/2)  1/√6 ]
    [  0     0    2/√3  ]

Sanity check: Q has orthonormal columns, R is upper triangular with positive diagonal, and Q·R reproduces A.

QR by Gram–Schmidt vs Householder vs Givens

Modified Gram–SchmidtHouseholderGivens
OperationProject and subtract column-by-columnReflect a whole column to e1Plane rotation zeroing one entry
Cost (dense, m ≥ n)2mn²2mn² − 2n³/33mn² − n³ (denser work per entry)
MemoryStores Q explicitlyStores reflections compactly in placeStores rotation pairs (c, s)
Numerical stabilityGood with re-orthogonalization, bad classicalExcellent — backward stableExcellent — backward stable
Best forTheory, teaching, building intuitionDense LAPACK-style factorizationsSparse/banded matrices and rank-1 updates
Forms Q explicitlyYesOptional — Q is implicit in the v vectorsOptional — Q is implicit in the (c, s) pairs
ParallelizableHard (sequential dependencies)Yes — block Householder is BLAS-3 friendlyYes — independent rotations can run together

Where QR shows up

  • Linear least squares. Min ‖Ax − b‖. Compute A = QR, then solve Rx = QTb by back-substitution. The numerical workhorse for regression, calibration, and data fitting.
  • The QR algorithm for eigenvalues. Iterating Ak+1 = RkQk converges to a quasi-triangular form whose diagonal blocks reveal the eigenvalues. Modern variants with shifts and Hessenberg reduction power LAPACK's geev.
  • Orthonormal basis construction. Q's first k columns are an orthonormal basis for span(a1, …, ak) — the workhorse of subspace methods like Krylov, GMRES, and Lanczos.
  • Updating regressions. When a new data row arrives, a few Givens rotations update the existing QR factor in O(n²) without redoing the whole factorization.
  • Numerically robust rank determination. Pivoted QR (which permutes columns to push the largest pivot to position k) gives a rank-revealing factorization useful for ill-conditioned problems.
  • Computer graphics and robotics. Orthonormalizing a slightly drifted rotation matrix back into SO(3) is a one-step QR in disguise.

Common mistakes

  • Using classical Gram–Schmidt in code. It looks beautiful in lecture notes and produces a non-orthogonal Q in finite precision. Modified Gram–Schmidt or Householder is what you ship.
  • Treating Q as cheap to materialize. Householder stores Q implicitly as a sequence of reflectors. Asking for the explicit Q matrix wastes memory and arithmetic; apply Q or QT as an operator instead.
  • Forgetting the sign convention. R's diagonal is signed — pick "positive diagonal" to make QR unique. Different libraries pick different conventions, so test before relying on signs.
  • Ignoring rank deficiency. If A has linearly dependent columns, R has zeros (or near-zeros) on the diagonal. Plain QR cannot tell which columns; use column-pivoted QR or fall back to SVD.
  • Computing the normal equations to solve least squares. Forming ATA squares the condition number and erases information. Always go through QR (or SVD) instead of ATAx = ATb.
  • Confusing QR with eigendecomposition. QR's R is triangular, not diagonal, and Q is not made of eigenvectors. The QR algorithm repeatedly applies QR factorizations to compute eigenvalues, but a single QR is not itself an eigendecomposition.

Frequently asked questions

What is Q and what is R?

Q is an orthogonal (or, in the complex case, unitary) matrix — its columns are orthonormal, meaning each has unit length and they are pairwise perpendicular. R is upper triangular, with the diagonal entries equal to the lengths of the orthogonalized columns. Together they reconstruct A as A = Q·R: a rotation/reflection followed by a triangular scaling.

Why use QR instead of LU?

QR is the right tool for least squares (overdetermined systems) and for problems where preserving orthogonality matters more than minimizing arithmetic. LU is faster (n³/3 vs 2n³/3 for Householder QR) but multiplies the condition number when the matrix is ill-conditioned; QR's orthogonal Q is perfectly conditioned, so the residual error is bounded by the original conditioning of A, not its square.

Why is classical Gram–Schmidt numerically unstable?

Each new vector is orthogonalized against an old Q in one shot, but rounding errors mean the old columns are not exactly orthogonal. The new vector inherits and amplifies those errors, and after a few steps Q is far from orthogonal. Modified Gram–Schmidt re-orthogonalizes against each previous column one at a time and dramatically reduces the loss of orthogonality.

When should I use Householder vs Givens?

Householder reflections zero a whole column at a time and are the default for dense factorizations — that is what LAPACK's geqrf does. Givens rotations zero one entry at a time and are the right choice for sparse or banded matrices, or when updating an existing QR after adding a single row.

How does QR solve a least-squares problem?

Given A x ≈ b for tall A, write A = QR. Multiplying both sides by Q^T gives R x = Q^T b, and R is upper triangular so back-substitution finishes the job in O(n²). The orthogonality of Q means it does not change vector lengths, so the residual minimization carries through exactly.

Is QR unique?

Once you require R to have positive diagonal entries (and A to have linearly independent columns), QR is unique. Without that sign convention, you can flip the sign of any column of Q and the corresponding row of R, so any library that returns QR will document its sign convention.