Linear Algebra

Linear Transformations

Functions that preserve addition and scalar multiplication — every matrix is one

A linear transformation is a function T: V → W satisfying T(u + v) = T(u) + T(v) and T(c·v) = c·T(v). Geometrically, it keeps the origin fixed and sends each line through the origin to a line through the origin (or collapses it to the origin): straight lines stay straight and nothing bends. Every matrix represents a linear transformation, and every linear transformation between finite-dimensional spaces is represented by a matrix once bases are chosen.

  • Defining properties: T(u + v) = T(u) + T(v); T(c·v) = c·T(v)
  • Equivalent in one line: T(c₁v₁ + c₂v₂) = c₁T(v₁) + c₂T(v₂)
  • Origin preservation: T(0) = 0 always
  • Standard examples: rotation, scaling, shearing, projection, reflection
  • Matrix representation: columns are images of standard basis vectors
  • Affine transformations: linear + translation; translations are NOT linear (they don't preserve the origin)

The definition

A linear transformation is a function T from one vector space V to another vector space W satisfying two properties:

  1. Additivity. T(u + v) = T(u) + T(v) for all u, v in V.
  2. Homogeneity. T(c·v) = c·T(v) for all scalars c and vectors v.

Combined into one rule: T(c₁v₁ + c₂v₂) = c₁·T(v₁) + c₂·T(v₂) for all scalars c₁, c₂ and vectors v₁, v₂. In other words, a linear transformation preserves linear combinations.

From homogeneity, plugging in c = 0 gives T(0) = T(0·v) = 0·T(v) = 0. The origin always maps to the origin under a linear transformation. This is what separates linear maps from general affine maps, which may move the origin.

Standard examples

Transformation            2×2 Matrix                      Effect
Identity                  [[1,0],[0,1]]                   Does nothing
Scaling by k              [[k,0],[0,k]]                   Stretches all directions by k
Rotation by θ (CCW)       [[cos θ,−sin θ],[sin θ,cos θ]]  Rotates around origin by angle θ
Reflection across x-axis  [[1,0],[0,−1]]                  Flips y-coordinate
Reflection across y = x   [[0,1],[1,0]]                   Swaps x and y
Horizontal shear by k     [[1,k],[0,1]]                   Slants vertical lines
Projection onto x-axis    [[1,0],[0,0]]                   Drops y component
Zero map                  [[0,0],[0,0]]                   Maps everything to zero

Composition of linear transformations is linear — and corresponds to matrix multiplication. Rotate then scale = scale matrix × rotation matrix.
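A few rows of the table can be tried directly. The sketch below is illustrative only: `rotation` is a hypothetical helper name, `apply` mirrors the matrix-times-vector helper from the JavaScript implementation section, and the shear factor k = 2 is an arbitrary choice.

```javascript
// Build the 2×2 matrix for a counterclockwise rotation by theta (radians).
function rotation(theta) {
  const c = Math.cos(theta), s = Math.sin(theta);
  return [[c, -s], [s, c]];
}

// Multiply a matrix (array of rows) by a vector.
function apply(M, v) {
  return M.map(row => row.reduce((sum, a, j) => sum + a * v[j], 0));
}

const quarterTurn = rotation(Math.PI / 2);
const shear = [[1, 2], [0, 1]];     // horizontal shear, k = 2 (arbitrary)
const projectX = [[1, 0], [0, 0]];  // projection onto the x-axis

console.log(apply(quarterTurn, [1, 0])); // approximately [0, 1], up to rounding
console.log(apply(shear, [0, 1]));       // [2, 1]
console.log(apply(projectX, [3, 4]));    // [3, 0]
```

The rotation result is only approximately (0, 1) because Math.cos(Math.PI / 2) is a tiny nonzero number in floating point.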

From transformation to matrix

To find the matrix of a linear transformation T : ℝⁿ → ℝᵐ:

  1. Apply T to each standard basis vector e₁, e₂, ..., eₙ.
  2. Each image T(eⱼ) is a vector in ℝᵐ — write it as a column.
  3. The matrix is the m × n matrix with these columns.

So if T(e₁) = (3, 1, 0)ᵀ and T(e₂) = (−1, 2, 4)ᵀ, the matrix is:

M = [3  −1]
    [1   2]
    [0   4]

To compute T(v), multiply M · v. Linearity guarantees this gives the right answer for any v.
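The three steps above can be sketched in code. This is a minimal illustration, not a general-purpose routine: `matrixOf` is a hypothetical helper name, and the function `T` below is an assumed formula chosen so that T(e₁) = (3, 1, 0) and T(e₂) = (−1, 2, 4), matching the example.

```javascript
// Build the matrix of a linear map T: R^n -> R^m, column by column.
function matrixOf(T, n) {
  // Step 1: images of the standard basis vectors e_1, ..., e_n.
  const columns = [];
  for (let j = 0; j < n; j++) {
    const e = Array.from({ length: n }, (_, i) => (i === j ? 1 : 0));
    columns.push(T(e));
  }
  // Steps 2 and 3: entry (i, j) of the matrix is component i of T(e_j).
  const m = columns[0].length;
  return Array.from({ length: m }, (_, i) => columns.map(col => col[i]));
}

// Multiply a matrix (array of rows) by a vector.
function apply(M, v) {
  return M.map(row => row.reduce((sum, a, j) => sum + a * v[j], 0));
}

// Assumed map R^2 -> R^3 with T(e1) = (3, 1, 0) and T(e2) = (-1, 2, 4).
const T = ([x, y]) => [3 * x - y, x + 2 * y, 4 * y];

const M = matrixOf(T, 2);
console.log(M);                // [[3, -1], [1, 2], [0, 4]]
console.log(apply(M, [2, 5])); // [1, 12, 20]
console.log(T([2, 5]));        // [1, 12, 20], the same vector
```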

Kernel and image

Two key subspaces associated with any linear transformation T : V → W:

  • Kernel (also "null space") — kernel(T) = {v ∈ V : T(v) = 0}. The vectors that get crushed to zero.
  • Image (also "range") — image(T) = {T(v) : v ∈ V}. The set of all possible outputs.

The dimension of the kernel is the "nullity"; the dimension of the image is the "rank." The rank-nullity theorem relates them:

rank(T) + nullity(T) = dim(V)

Intuitively — every dimension of input either contributes to output (rank) or gets squashed (nullity). They sum to the input dimension.

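T is injective iff nullity = 0 (equivalently, rank = dim(V)), and surjective iff rank = dim(W). T is invertible iff both hold, which requires dim(V) = dim(W). For a square matrix the two conditions coincide: if anything gets squashed, it's not invertible.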
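Rank and nullity can be computed numerically. The following is a rough sketch using Gaussian elimination with partial pivoting; the function names and the tolerance `eps` are assumptions made for illustration, and a production library would more likely use a singular value decomposition.

```javascript
// Rank via Gaussian elimination with partial pivoting.
// eps is an assumed tolerance for treating a pivot as zero.
function rank(M, eps = 1e-10) {
  const A = M.map(row => row.slice()); // work on a copy
  const rows = A.length, cols = A[0].length;
  let r = 0; // number of pivots found so far
  for (let c = 0; c < cols && r < rows; c++) {
    // Pick the largest entry in column c at or below row r.
    let p = r;
    for (let i = r + 1; i < rows; i++) {
      if (Math.abs(A[i][c]) > Math.abs(A[p][c])) p = i;
    }
    if (Math.abs(A[p][c]) < eps) continue; // no pivot in this column
    const tmp = A[r]; A[r] = A[p]; A[p] = tmp;
    // Eliminate the entries below the pivot.
    for (let i = r + 1; i < rows; i++) {
      const f = A[i][c] / A[r][c];
      for (let j = c; j < cols; j++) A[i][j] -= f * A[r][j];
    }
    r++;
  }
  return r;
}

// Rank-nullity: dim(V) is the number of columns.
const nullity = M => M[0].length - rank(M);

const projectX = [[1, 0], [0, 0]];
const rotate90 = [[0, -1], [1, 0]];
const zero = [[0, 0], [0, 0]];

console.log(rank(projectX), nullity(projectX)); // 1 1
console.log(rank(rotate90), nullity(rotate90)); // 2 0
console.log(rank(zero), nullity(zero));         // 0 2
```

In each case rank plus nullity equals 2, the dimension of the input space.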

Composition and matrix multiplication

If S : U → V and T : V → W are linear transformations, the composition T ∘ S : U → W defined by (T ∘ S)(u) = T(S(u)) is also linear. The matrix of the composition is:

M_{T ∘ S} = M_T · M_S

This is the conceptual foundation of matrix multiplication. The order matters: apply S first, then T, so T's matrix multiplies S's on the left. Composing two functions in the opposite order generally gives a different function, which is why matrix multiplication is non-commutative.
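A small numeric check makes the ordering concrete. In this sketch `matmul` is a hypothetical helper, and the shear S and rotation T are arbitrary choices for illustration.

```javascript
// Multiply two matrices given as arrays of rows:
// (A·B)[i][j] = sum over k of A[i][k] * B[k][j].
function matmul(A, B) {
  return A.map(row =>
    B[0].map((_, j) => row.reduce((sum, a, k) => sum + a * B[k][j], 0))
  );
}

// Multiply a matrix (array of rows) by a vector.
function apply(M, v) {
  return M.map(row => row.reduce((sum, a, j) => sum + a * v[j], 0));
}

const S = [[1, 1], [0, 1]];  // horizontal shear, k = 1
const T = [[0, -1], [1, 0]]; // rotation by 90° counterclockwise
const v = [1, 2];

console.log(apply(T, apply(S, v)));  // [-2, 3]: S first, then T
console.log(apply(matmul(T, S), v)); // [-2, 3]: same result from the product T·S
console.log(apply(matmul(S, T), v)); // [-1, 1]: the other order differs
```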

Worked examples

Example 1 — verify linearity

Is T(x, y) = (2x + y, x − 3y) linear? Check the two properties:

T((x₁, y₁) + (x₂, y₂)) = T(x₁+x₂, y₁+y₂) = (2(x₁+x₂) + (y₁+y₂), (x₁+x₂) − 3(y₁+y₂))
                       = (2x₁+y₁ + 2x₂+y₂, x₁−3y₁ + x₂−3y₂)
                       = T(x₁, y₁) + T(x₂, y₂)  ✓

T(c·(x, y)) = T(cx, cy) = (2cx + cy, cx − 3cy) = c·(2x + y, x − 3y) = c·T(x, y)  ✓

Both axioms hold, so T is linear. To find its matrix, apply T to e₁ = (1, 0) to get (2, 1) and to e₂ = (0, 1) to get (1, −3). These are the columns:

M = [2   1]
    [1  −3]

Example 2 — non-linear transformation

Is T(x, y) = (x², y) linear? Check additivity:

T((1, 0) + (1, 0)) = T(2, 0) = (4, 0)
T(1, 0) + T(1, 0) = (1, 0) + (1, 0) = (2, 0)

Different — T is not linear. Squaring is non-linear; it breaks additivity.

Example 3 — affine transformation

Is T(x, y) = (x + 1, y) linear? Check origin preservation — T(0, 0) = (1, 0) ≠ (0, 0). Not linear (it's affine — linear plus translation).

JavaScript implementation

// Apply a linear transformation (represented as a matrix, an array of rows) to a vector
function apply(M, v) {
  return M.map(row => row.reduce((s, a, i) => s + a * v[i], 0));
}

// Compare two vectors in every component, with a tolerance scaled to their size
function approxEqual(a, b, tol = 1e-9) {
  return a.length === b.length &&
    a.every((x, i) => Math.abs(x - b[i]) <= tol * Math.max(1, Math.abs(x), Math.abs(b[i])));
}

// Check linearity of T: R^2 -> R^m by sampling.
// Passing is evidence of linearity, not a proof; a single failure is a disproof.
function isLinear(T, trials = 10) {
  for (let t = 0; t < trials; t++) {
    const u = [Math.random() * 10, Math.random() * 10];
    const v = [Math.random() * 10, Math.random() * 10];
    const c = Math.random() * 10;
    const Tu = T(u), Tv = T(v);

    // Additivity: T(u + v) = T(u) + T(v)
    const sum = T(u.map((x, i) => x + v[i]));
    if (!approxEqual(sum, Tu.map((x, i) => x + Tv[i]))) return false;

    // Homogeneity: T(c·u) = c·T(u)
    const scaled = T(u.map(x => c * x));
    if (!approxEqual(scaled, Tu.map(x => c * x))) return false;
  }
  return true;
}

const linear = ([x, y]) => [2*x + y, x - 3*y];
const nonlinear = ([x, y]) => [x*x, y];

console.log(isLinear(linear));     // true
console.log(isLinear(nonlinear));  // false (with overwhelming probability)

Where linear transformations appear

  • Computer graphics. Rotation, scaling, and shearing are linear; translation is affine; perspective projection is neither, but becomes a matrix operation in homogeneous coordinates. Composing them by matrix multiplication is the heart of the rendering pipeline.
  • Solving linear systems. Ax = b is "find x mapped to b under the linear transformation A." Matrix inverse undoes the transformation when possible.
  • Signal processing. Linear filters and the Fourier transform are linear maps on signals. Linear time-invariant (LTI) discrete-time systems are exactly those that act by convolution with an impulse response.
  • Differential equations. Homogeneous linear ODEs and PDEs obey superposition: any linear combination of solutions is a solution. Solution methods such as separation of variables and eigenfunction expansions rely on it.
  • Quantum mechanics. Operators are linear. Schrödinger evolution, observables, and quantum gates are all linear transformations.
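  • Statistics. Linear regression, multivariate normal distributions, and principal component analysis are all built on linear maps: least-squares fitting is an orthogonal projection, a linear image of a multivariate normal is again multivariate normal, and PCA is an orthogonal change of basis.
  • Machine learning. A single neural network layer without its activation is linear (affine, once a bias is added). Activation functions are what break linearity; without them, a deep network would collapse to a single linear (or affine) transformation.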

Common mistakes

  • Confusing linear with continuous or smooth. Linearity is much stricter than continuity. f(x) = x is linear; f(x) = x² is smooth and continuous but NOT linear.
  • Treating affine as linear. Translation (adding a constant) is NOT linear — it doesn't preserve the origin. Use homogeneous coordinates to handle affine transformations as matrix operations.
  • Forgetting basis-dependence of the matrix. The matrix representation depends on the chosen basis. Two matrices can represent the same linear transformation in different bases — they're "similar" matrices (related by P⁻¹AP).
  • Computing matrix in the wrong direction. The j-th column is the image of the j-th input basis vector. Easy to mix up rows and columns.
  • Assuming all linear transformations are invertible. Projections, zero maps, transformations that crush dimensions — all linear, all not invertible. Check rank or determinant.
  • Confusing linear transformation with linear function (high school sense). "Linear function" in high school means y = mx + b — actually affine, since the +b term breaks linearity. The mathematician's "linear" requires y = mx (no constant term).

Frequently asked questions

Why do linear transformations preserve the origin?

From T(c·v) = c·T(v) with c = 0 — T(0) = T(0·v) = 0·T(v) = 0. Plug in any vector v; the rule forces T(0) = 0. So translations (which move the origin) are not linear. This is why "linear" is stricter than "straight-line-preserving" — the origin must stay put.

How is a linear transformation represented as a matrix?

Pick a basis for V and a basis for W. The j-th column of the matrix is the image of the j-th basis vector of V, written in coordinates relative to the basis of W. For T: ℝ² → ℝ² with the standard basis, the matrix is [T(e₁) | T(e₂)] where e₁ = [1, 0]ᵀ and e₂ = [0, 1]ᵀ. To compute T(v), write v in basis coordinates and multiply by the matrix. Every linear transformation between finite-dimensional spaces has such a matrix; the matrix depends on the bases chosen.

What's the kernel of a linear transformation?

The set of vectors that map to 0: kernel(T) = {v : T(v) = 0}. Geometrically, it is what gets crushed to the origin. For rotations, kernel = {0}. For the orthogonal projection of ℝ² onto a line through the origin, the kernel is the perpendicular line through the origin. The dimension of the kernel is the "nullity"; the dimension of the image is the "rank." rank + nullity = dim of input space (rank-nullity theorem).

What does it mean for a transformation to be invertible?

T is invertible if there's another linear transformation S such that S(T(v)) = v for all v in V and T(S(w)) = w for all w in W. Equivalently, T is bijective: every output has exactly one input. For a square matrix, this means the matrix is invertible (det ≠ 0), and either of the two conditions alone is then enough. Non-invertible square matrices crush some directions to zero (the kernel is nontrivial); they can't be undone.
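For 2×2 matrices the check is short enough to write out. This is a sketch under stated assumptions: `inverse2` is a hypothetical helper, the tolerance `eps` is an arbitrary choice, and the matrix M is an example picked because its determinant is 1.

```javascript
// Multiply a matrix (array of rows) by a vector.
function apply(M, v) {
  return M.map(row => row.reduce((sum, a, j) => sum + a * v[j], 0));
}

// Inverse of a 2×2 matrix, or null when the determinant is numerically zero.
// eps is an assumed tolerance.
function inverse2([[a, b], [c, d]], eps = 1e-12) {
  const det = a * d - b * c;
  if (Math.abs(det) < eps) return null;
  return [[d / det, -b / det], [-c / det, a / det]];
}

const M = [[2, 1], [1, 1]]; // det = 1, so M is invertible
const Minv = inverse2(M);   // [[1, -1], [-1, 2]]
const v = [3, 4];

console.log(apply(Minv, apply(M, v)));   // [3, 4]: the inverse undoes M
console.log(apply(M, apply(Minv, v)));   // [3, 4]: and M undoes the inverse
console.log(inverse2([[1, 0], [0, 0]])); // null: projection onto the x-axis
```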

How do I tell if T is linear?

Two checks. (1) T(u + v) = T(u) + T(v) for all u, v. (2) T(c·v) = c·T(v) for all scalars c and vectors v. If both hold, T is linear. Common non-linear functions — T(v) = v + (1, 0) (translation; fails origin preservation), T(v) = |v| (absolute value; fails additivity), any squared term. Check both axioms; failing either kills linearity.

What's the difference between a linear transformation and an affine transformation?

A linear map fixes the origin and preserves linear combinations: T(c₁v₁ + c₂v₂) = c₁T(v₁) + c₂T(v₂). An affine map adds a translation: T(v) = Av + b. Affine maps send lines to lines (or to single points) but need not fix the origin. In computer graphics, "linear" is rotation + scaling + shear; "affine" adds translation. Homogeneous coordinates encode an affine map of ℝⁿ as an (n+1)×(n+1) matrix (4×4 for 3D graphics), so affine maps compose by matrix multiplication.
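The homogeneous-coordinate trick can be sketched in 2D, where the matrices are 3×3. The helper names `affineMatrix` and `applyAffine` are hypothetical, and the translation is the map T(x, y) = (x + 1, y) from Example 3.

```javascript
// Build the 3×3 homogeneous matrix for the 2D affine map v -> A·v + b.
function affineMatrix(A, b) {
  return [
    [A[0][0], A[0][1], b[0]],
    [A[1][0], A[1][1], b[1]],
    [0, 0, 1],
  ];
}

// Multiply a matrix (array of rows) by a vector.
function apply(M, v) {
  return M.map(row => row.reduce((sum, a, j) => sum + a * v[j], 0));
}

// Apply a homogeneous 3×3 matrix to a 2D point by appending a 1,
// then dropping the extra coordinate from the result.
function applyAffine(H, [x, y]) {
  const [hx, hy] = apply(H, [x, y, 1]);
  return [hx, hy];
}

const identity = [[1, 0], [0, 1]];
const translate = affineMatrix(identity, [1, 0]); // T(x, y) = (x + 1, y)

console.log(applyAffine(translate, [0, 0])); // [1, 0]: the origin moves
console.log(applyAffine(translate, [2, 3])); // [3, 3]
```

The 3×3 matrix is linear on ℝ³; the translation only appears after restricting to points whose third coordinate is 1.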

How do change-of-basis transformations work?

Let P be the change-of-basis matrix whose columns are the new basis vectors written in the old basis. P converts new coordinates to old ones (v_old = P · v_new), so to go from old coordinates to new, left-multiply by P⁻¹ (v_new = P⁻¹ · v_old). The matrix of a linear transformation T changes accordingly: M_new = P⁻¹ · M_old · P. This "similarity transformation" is how diagonalization works: find a basis in which T's matrix is diagonal.
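A small 2×2 case shows the similarity transformation producing a diagonal matrix. Everything here is an illustrative assumption: the matrix `Mold`, the new basis (1, 1), (1, −1) stored as the columns of `P`, and the helper names. The convention is that P's columns hold the new basis vectors in old coordinates.

```javascript
// Multiply two matrices given as arrays of rows.
function matmul(A, B) {
  return A.map(row =>
    B[0].map((_, j) => row.reduce((sum, a, k) => sum + a * B[k][j], 0))
  );
}

// Inverse of a 2×2 matrix; assumes a nonzero determinant.
function inverse2([[a, b], [c, d]]) {
  const det = a * d - b * c;
  return [[d / det, -b / det], [-c / det, a / det]];
}

const Mold = [[2, 1], [1, 2]];
// Columns of P: the new basis vectors (1, 1) and (1, -1) in old coordinates.
const P = [[1, 1], [1, -1]];
const Pinv = inverse2(P);

// M_new = P⁻¹ · M_old · P
const Mnew = matmul(Pinv, matmul(Mold, P));
console.log(Mnew); // [[3, 0], [0, 1]]: diagonal in the new basis

// Old coordinates convert to new ones with P⁻¹: (3, 1) = 2·(1, 1) + 1·(1, -1).
const vOld = [[3], [1]]; // column vector stored as a 2×1 matrix
console.log(matmul(Pinv, vOld)); // [[2], [1]]
```

The diagonal entries 3 and 1 are the eigenvalues of `Mold`, and the columns of `P` are corresponding eigenvectors.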