What is Q and what is R?

Q is an orthogonal (or, in the complex case, unitary) matrix — its columns are orthonormal, meaning each has unit length and they are pairwise perpendicular. R is upper triangular, with the diagonal entries equal to the lengths of the orthogonalized columns. Together they reconstruct A as A = Q·R: a rotation/reflection followed by a triangular scaling.

Why use QR instead of LU?

QR is the right tool for least squares (overdetermined systems) and for problems where preserving orthogonality matters more than minimizing arithmetic. LU is faster (n³/3 vs 2n³/3 for Householder QR) but multiplies the condition number when the matrix is ill-conditioned; QR's orthogonal Q is perfectly conditioned, so the residual error is bounded by the original conditioning of A, not its square.

Why is classical Gram–Schmidt numerically unstable?

Each new vector is orthogonalized against an old Q in one shot, but rounding errors mean the old columns are not exactly orthogonal. The new vector inherits and amplifies those errors, and after a few steps Q is far from orthogonal. Modified Gram–Schmidt re-orthogonalizes against each previous column one at a time and dramatically reduces the loss of orthogonality.

When should I use Householder vs Givens?

Householder reflections zero a whole column at a time and are the default for dense factorizations — that is what LAPACK's geqrf does. Givens rotations zero one entry at a time and are the right choice for sparse or banded matrices, or when updating an existing QR after adding a single row.

How does QR solve a least-squares problem?

Given A x ≈ b for tall A, write A = QR. Multiplying both sides by Q^T gives R x = Q^T b, and R is upper triangular so back-substitution finishes the job in O(n²). The orthogonality of Q means it does not change vector lengths, so the residual minimization carries through exactly.

Once you require R to have positive diagonal entries (and A to have linearly independent columns), QR is unique. Without that sign convention, you can flip the sign of any column of Q and the corresponding row of R, so any library that returns QR will document its sign convention.

QR Decomposition — Gram–Schmidt, Householder & Worked Example

What QR is doing

Given an m×n matrix A with linearly independent columns, QR finds an orthonormal basis q₁, q₂, …, q_n for the same column space and records each original column as a triangular combination of those basis vectors:

aₖ = R₁ₖ · q₁ + R₂ₖ · q₂ + ⋯ + Rₖₖ · qₖ

The rows above the diagonal of R describe how each new column overlaps previously orthogonalized directions; the diagonal entry R_kk is the length of whatever residue is left orthogonal to those directions. Stacked into matrix form, that bookkeeping is exactly A = Q·R.

Method 1 — Gram–Schmidt orthogonalization

The textbook construction. For k = 1, 2, …, n:

Project a_k onto each previous q_j: R_jk = q_j^T·a_k.
Subtract those projections: u_k = a_k − Σ_j<k R_jk·q_j.
Normalize: R_kk = ‖u_k‖ and q_k = u_k / R_kk.

This is intuitive but numerically fragile in finite precision. Modified Gram–Schmidt fixes it by re-orthogonalizing one column at a time as new vectors come in, so accumulated rounding errors do not pile up. Use modified, never classical, in code.

Method 2 — Householder reflections

Householder is the production-grade method. The idea: design a single reflection matrix H_k that maps the column-k subdiagonal entries to zero in one shot, then compose those reflections.

For column k, let x be the part of column k from row k downward. Define

v = x + sign(x₁) · ‖x‖ · e₁
H = I − 2 · v vᵀ / (vᵀv)

H is a reflection across the hyperplane orthogonal to v. By construction H·x = ∓‖x‖·e₁ — every entry below the first becomes zero. Apply H to the trailing block of A and you have killed column k below the diagonal in one matrix-block operation. After n−1 such reflections, A is upper-triangular = R, and the product H_n−1 ⋯ H₁ = Q^T.

Method 3 — Givens rotations

Where Householder zeros a whole column at a time, a Givens rotation zeros one entry at a time. A Givens rotation is a 2×2 plane rotation embedded in an n×n identity:

[ c  s ]   acts on rows i, j to mix them and zero
[ −s c ]   one chosen entry of the matrix.

Choose c = a_ii/r and s = a_ji/r where r = √(a_ii² + a_ji²); applying the rotation zeroes a_ji. Givens is more expensive on dense matrices but ideal when the matrix is sparse or banded — only the rows involved in the rotation are touched, leaving structural zeros undisturbed.

Worked example — Gram–Schmidt on three columns

Factor A whose columns are a₁ = [1, 1, 0]^T, a₂ = [1, 0, 1]^T, a₃ = [0, 1, 1]^T.

Column 1. R₁₁ = ‖a₁‖ = √2. q₁ = [1/√2, 1/√2, 0]^T.

Column 2. R₁₂ = q₁^T·a₂ = (1/√2)·1 + (1/√2)·0 + 0·1 = 1/√2.

u₂ = a₂ − R₁₂·q₁ = [1, 0, 1]^T − (1/√2)·[1/√2, 1/√2, 0]^T = [1/2, −1/2, 1]^T.

R₂₂ = ‖u₂‖ = √(1/4 + 1/4 + 1) = √(3/2). q₂ = [1/√6, −1/√6, 2/√6]^T.

Column 3. R₁₃ = q₁^T·a₃ = 0·(1/√2) + 1·(1/√2) + 1·0 = 1/√2.

R₂₃ = q₂^T·a₃ = 0·(1/√6) + 1·(−1/√6) + 1·(2/√6) = 1/√6.

u₃ = a₃ − R₁₃·q₁ − R₂₃·q₂ = [0, 1, 1]^T − (1/√2)·[1/√2, 1/√2, 0]^T − (1/√6)·[1/√6, −1/√6, 2/√6]^T.

That works out to [−1/2 − 1/6, 1/2 + 1/6, 1 − 2/6]^T = [−2/3, 2/3, 2/3]^T.

R₃₃ = ‖u₃‖ = √(4/9 + 4/9 + 4/9) = 2/√3. q₃ = [−1/√3, 1/√3, 1/√3]^T.

Result.

Q = [ 1/√2    1/√6   −1/√3 ]
    [ 1/√2  −1/√6    1/√3 ]
    [ 0      2/√6    1/√3 ]

R = [ √2    1/√2   1/√2 ]
    [  0   √(3/2)  1/√6 ]
    [  0     0    2/√3  ]

Sanity check: Q has orthonormal columns, R is upper triangular with positive diagonal, and Q·R reproduces A.

QR by Gram–Schmidt vs Householder vs Givens

	Modified Gram–Schmidt	Householder	Givens
Operation	Project and subtract column-by-column	Reflect a whole column to e₁	Plane rotation zeroing one entry
Cost (dense, m ≥ n)	2mn²	2mn² − 2n³/3	3mn² − n³ (denser work per entry)
Memory	Stores Q explicitly	Stores reflections compactly in place	Stores rotation pairs (c, s)
Numerical stability	Good with re-orthogonalization, bad classical	Excellent — backward stable	Excellent — backward stable
Best for	Theory, teaching, building intuition	Dense LAPACK-style factorizations	Sparse/banded matrices and rank-1 updates
Forms Q explicitly	Yes	Optional — Q is implicit in the v vectors	Optional — Q is implicit in the (c, s) pairs
Parallelizable	Hard (sequential dependencies)	Yes — block Householder is BLAS-3 friendly	Yes — independent rotations can run together

Where QR shows up

Linear least squares. Min ‖Ax − b‖. Compute A = QR, then solve Rx = Q^Tb by back-substitution. The numerical workhorse for regression, calibration, and data fitting.
The QR algorithm for eigenvalues. Iterating A_k+1 = R_kQ_k converges to a quasi-triangular form whose diagonal blocks reveal the eigenvalues. Modern variants with shifts and Hessenberg reduction power LAPACK's geev.
Orthonormal basis construction. Q's first k columns are an orthonormal basis for span(a₁, …, a_k) — the workhorse of subspace methods like Krylov, GMRES, and Lanczos.
Updating regressions. When a new data row arrives, a few Givens rotations update the existing QR factor in O(n²) without redoing the whole factorization.
Numerically robust rank determination. Pivoted QR (which permutes columns to push the largest pivot to position k) gives a rank-revealing factorization useful for ill-conditioned problems.
Computer graphics and robotics. Orthonormalizing a slightly drifted rotation matrix back into SO(3) is a one-step QR in disguise.

Common mistakes

Using classical Gram–Schmidt in code. It looks beautiful in lecture notes and produces a non-orthogonal Q in finite precision. Modified Gram–Schmidt or Householder is what you ship.
Treating Q as cheap to materialize. Householder stores Q implicitly as a sequence of reflectors. Asking for the explicit Q matrix wastes memory and arithmetic; apply Q or Q^T as an operator instead.
Forgetting the sign convention. R's diagonal is signed — pick "positive diagonal" to make QR unique. Different libraries pick different conventions, so test before relying on signs.
Ignoring rank deficiency. If A has linearly dependent columns, R has zeros (or near-zeros) on the diagonal. Plain QR cannot tell which columns; use column-pivoted QR or fall back to SVD.
Computing the normal equations to solve least squares. Forming A^TA squares the condition number and erases information. Always go through QR (or SVD) instead of A^TAx = A^Tb.
Confusing QR with eigendecomposition. QR's R is triangular, not diagonal, and Q is not made of eigenvectors. The QR algorithm repeatedly applies QR factorizations to compute eigenvalues, but a single QR is not itself an eigendecomposition.

QR Decomposition

Watch the 60-second explainer

What QR is doing

Method 1 — Gram–Schmidt orthogonalization

Method 2 — Householder reflections

Method 3 — Givens rotations

Worked example — Gram–Schmidt on three columns

QR by Gram–Schmidt vs Householder vs Givens

Where QR shows up

Common mistakes

Frequently asked questions

Watch the 60-second explainer

What QR is doing

Method 1 — Gram–Schmidt orthogonalization

Method 2 — Householder reflections

Method 3 — Givens rotations

Worked example — Gram–Schmidt on three columns

QR by Gram–Schmidt vs Householder vs Givens

Where QR shows up

Common mistakes

Frequently asked questions

Related concepts