Linear Algebra
Linear Transformations
Functions that preserve addition and scalar multiplication — every matrix is one
A linear transformation is a function T: V → W satisfying T(u + v) = T(u) + T(v) and T(c·v) = c·T(v). Geometrically, it fixes the origin and keeps straight lines straight: a line through the origin is sent to a line through the origin (or collapsed onto the origin itself), and nothing bends. Every matrix represents a linear transformation, and every linear transformation between finite-dimensional spaces is represented by a matrix once bases are chosen.
- Defining properties: T(u + v) = T(u) + T(v); T(c·v) = c·T(v)
- Equivalent in one line: T(c₁v₁ + c₂v₂) = c₁T(v₁) + c₂T(v₂)
- Origin preservation: T(0) = 0 always
- Standard examples: Rotation, scaling, shearing, projection, reflection
- Matrix representation: Columns are images of standard basis vectors
- Affine transformations: Linear + translation; translations are NOT linear (they don't preserve the origin)
The definition
A linear transformation is a function T from one vector space V to another vector space W satisfying two properties:
- Additivity. T(u + v) = T(u) + T(v) for all u, v in V.
- Homogeneity. T(c·v) = c·T(v) for all scalars c and vectors v.
Combined into one rule: T(c₁v₁ + c₂v₂) = c₁·T(v₁) + c₂·T(v₂) for all scalars c₁, c₂ and vectors v₁, v₂. In words, a linear transformation preserves linear combinations: transforming a combination gives the same combination of the transformed vectors.
From these — plug in c = 0 to get T(0) = 0. The origin always maps to the origin under a linear transformation. This is what distinguishes linear from affine.
Standard examples
| Transformation | 2×2 Matrix | Effect |
|---|---|---|
| Identity | [[1,0],[0,1]] | Does nothing |
| Scaling by k | [[k,0],[0,k]] | Stretches all directions by k |
| Rotation by θ (CCW) | [[cos θ,−sin θ],[sin θ,cos θ]] | Rotates around origin by angle θ |
| Reflection across x-axis | [[1,0],[0,−1]] | Flips y-coordinate |
| Reflection across y = x | [[0,1],[1,0]] | Swaps x and y |
| Horizontal shear by k | [[1,k],[0,1]] | Slants vertical lines |
| Projection onto x-axis | [[1,0],[0,0]] | Drops y component |
| Zero map | [[0,0],[0,0]] | Maps everything to zero |
Composition of linear transformations is linear — and corresponds to matrix multiplication. Rotate then scale = scale matrix × rotation matrix.
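The table entries can be tried out directly. The following is a minimal sketch, not a library API: the helper names `apply`, `rotation`, and `shear` are chosen here for illustration, and angles are assumed to be in radians.

```javascript
// Matrix-vector product: each output entry is a row of M dotted with v.
const apply = (M, v) => M.map(row => row.reduce((s, a, i) => s + a * v[i], 0));

// Counterclockwise rotation by theta radians (the table's rotation entry).
const rotation = theta => [
  [Math.cos(theta), -Math.sin(theta)],
  [Math.sin(theta), Math.cos(theta)],
];

// Horizontal shear by k (the table's shear entry).
const shear = k => [[1, k], [0, 1]];

// A quarter turn sends e1 = (1, 0) to (0, 1), up to floating-point error.
const quarterTurn = apply(rotation(Math.PI / 2), [1, 0]);
console.log(quarterTurn.map(x => Math.round(x * 1e9) / 1e9)); // [0, 1]

// A shear by 2 slants the vertical unit vector (0, 1) over to (2, 1).
console.log(apply(shear(2), [0, 1])); // [2, 1]
```

The rounding step is only for display; `Math.cos(Math.PI / 2)` is a tiny nonzero number rather than exactly 0.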
From transformation to matrix
To find the matrix of a linear transformation T : ℝⁿ → ℝᵐ:
- Apply T to each standard basis vector e₁, e₂, ..., eₙ.
- Each image T(eⱼ) is a vector in ℝᵐ — write it as a column.
- The matrix is the m × n matrix with these columns.
So if T(e₁) = (3, 1, 0)ᵀ and T(e₂) = (−1, 2, 4)ᵀ, the matrix is:
M = [[3, −1], [1, 2], [0, 4]]
To compute T(v), multiply M · v. Linearity guarantees this gives the right answer for any v.
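The recipe above can be sketched in a few lines. This is an illustrative sketch under assumptions: `matrixOf` is a hypothetical helper name, and the map `T` below is one concrete function chosen so that T(e₁) = (3, 1, 0) and T(e₂) = (−1, 2, 4), matching the example.

```javascript
// Build the m x n matrix of a linear map T: R^n -> R^m from its action
// on the standard basis vectors.
function matrixOf(T, n) {
  const columns = [];
  for (let j = 0; j < n; j++) {
    // e_j has a 1 in position j and 0 everywhere else.
    const e = Array.from({ length: n }, (_, i) => (i === j ? 1 : 0));
    columns.push(T(e)); // column j is the image T(e_j)
  }
  // Transpose: row i collects the i-th entry of every column.
  return columns[0].map((_, i) => columns.map(col => col[i]));
}

// Matrix-vector product.
const apply = (M, v) => M.map(row => row.reduce((s, a, i) => s + a * v[i], 0));

// Hypothetical map with T(e1) = (3, 1, 0) and T(e2) = (-1, 2, 4).
const T = ([x, y]) => [3 * x - y, x + 2 * y, 4 * y];

const M = matrixOf(T, 2);
console.log(M);                // [[3, -1], [1, 2], [0, 4]]
console.log(apply(M, [2, 5])); // [1, 12, 20]
console.log(T([2, 5]));        // [1, 12, 20], the same vector
```

Note that `matrixOf` only produces a faithful matrix when `T` really is linear; for a non-linear function the columns are still computed, but M · v will not match T(v) in general.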
Kernel and image
Two key subspaces associated with any linear transformation T : V → W:
- Kernel (also "null space") — kernel(T) = {v ∈ V : T(v) = 0}. The vectors that get crushed to zero.
- Image (also "range") — image(T) = {T(v) : v ∈ V}. The set of all possible outputs.
The dimension of the kernel is the "nullity"; the dimension of the image is the "rank." For finite-dimensional V, the rank-nullity theorem states:
rank(T) + nullity(T) = dim(V)
Intuitively — every dimension of input either contributes to output (rank) or gets squashed (nullity). They sum to the input dimension.
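T is injective iff nullity = 0, and surjective iff rank = dim(W). T is invertible iff both hold, which forces dim(V) = dim(W); when the dimensions already agree (for example, a square matrix), nullity = 0 alone is enough. If anything gets squashed, T is not invertible.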
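Rank and nullity can be computed numerically for a matrix. The following is a rough sketch using Gaussian elimination with partial pivoting; the function names and the tolerance `tol` are assumptions made here, and the tolerance in particular is a judgment call for floating-point input.

```javascript
// Rank of a matrix = number of pivots found by Gaussian elimination.
function rank(M, tol = 1e-10) {
  const A = M.map(row => row.slice()); // work on a copy
  const rows = A.length;
  const cols = A[0].length;
  let r = 0; // pivots found so far
  for (let c = 0; c < cols && r < rows; c++) {
    // Choose the largest entry in column c, at or below row r, as pivot.
    let p = r;
    for (let i = r + 1; i < rows; i++) {
      if (Math.abs(A[i][c]) > Math.abs(A[p][c])) p = i;
    }
    if (Math.abs(A[p][c]) <= tol) continue; // no pivot in this column
    [A[r], A[p]] = [A[p], A[r]];
    // Eliminate column c from every row below the pivot row.
    for (let i = r + 1; i < rows; i++) {
      const f = A[i][c] / A[r][c];
      for (let j = c; j < cols; j++) A[i][j] -= f * A[r][j];
    }
    r++;
  }
  return r;
}

// Rank-nullity: nullity = (number of columns) - rank.
const nullity = M => M[0].length - rank(M);

const projection = [[1, 0], [0, 0]]; // projection onto the x-axis
const quarterTurn = [[0, -1], [1, 0]]; // rotation by 90 degrees

console.log(rank(projection), nullity(projection));   // 1 1
console.log(rank(quarterTurn), nullity(quarterTurn)); // 2 0
```

The projection keeps one input dimension and squashes the other, while the rotation squashes nothing.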
Composition and matrix multiplication
If S : U → V and T : V → W are linear transformations, the composition T ∘ S : U → W defined by (T ∘ S)(u) = T(S(u)) is also linear. The matrix of the composition is:
M_{T ∘ S} = M_T · M_S
This is the conceptual foundation of matrix multiplication. The order matters: apply S first, then T, so T's matrix multiplies S's on the left. Because applying S then T generally gives a different result from applying T then S, matrix multiplication is not commutative.
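A small numeric check makes the ordering concrete. This is a sketch with hypothetical helper names (`apply`, `matMul`) and two matrices picked for illustration: S is a quarter-turn rotation and T stretches the x-direction by 2.

```javascript
// Matrix-vector product.
const apply = (M, v) => M.map(row => row.reduce((s, a, i) => s + a * v[i], 0));

// Matrix product A·B: entry (i, j) is row i of A dotted with column j of B.
const matMul = (A, B) =>
  A.map(row => B[0].map((_, j) => row.reduce((s, a, k) => s + a * B[k][j], 0)));

const S = [[0, -1], [1, 0]]; // rotate 90 degrees counterclockwise
const T = [[2, 0], [0, 1]];  // stretch the x-direction by 2
const u = [1, 0];

// Applying S first and then T agrees with the single matrix M_T · M_S.
console.log(apply(T, apply(S, u)));  // [0, 1]
console.log(apply(matMul(T, S), u)); // [0, 1]

// Reversing the order gives a different transformation.
console.log(apply(matMul(S, T), u)); // [0, 2]
```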
Worked examples
Example 1 — verify linearity
Is T(x, y) = (2x + y, x − 3y) linear? Check the two properties:
T((x₁, y₁) + (x₂, y₂)) = T(x₁+x₂, y₁+y₂) = (2(x₁+x₂) + (y₁+y₂), (x₁+x₂) − 3(y₁+y₂))
= (2x₁+y₁ + 2x₂+y₂, x₁−3y₁ + x₂−3y₂)
= T(x₁, y₁) + T(x₂, y₂) ✓
T(c·(x, y)) = T(cx, cy) = (2cx + cy, cx − 3cy) = c·(2x + y, x − 3y) = c·T(x, y) ✓
Both axioms hold. T is linear. Its matrix — apply to e₁ = (1, 0) gives (2, 1); apply to e₂ = (0, 1) gives (1, −3). Matrix:
M = [[2, 1], [1, −3]]
Example 2 — non-linear transformation
Is T(x, y) = (x², y) linear? Check additivity:
T((1, 0) + (1, 0)) = T(2, 0) = (4, 0)
T(1, 0) + T(1, 0) = (1, 0) + (1, 0) = (2, 0)
Different — T is not linear. Squaring is non-linear; it breaks additivity.
Example 3 — affine transformation
Is T(x, y) = (x + 1, y) linear? Check origin preservation — T(0, 0) = (1, 0) ≠ (0, 0). Not linear (it's affine — linear plus translation).
JavaScript implementation
// Apply a linear transformation (represented as a matrix M) to a vector v
function apply(M, v) {
  return M.map(row => row.reduce((s, a, i) => s + a * v[i], 0));
}

// Compare two vectors component by component, within a relative tolerance
function close(a, b, tol = 1e-9) {
  return a.every((x, i) =>
    Math.abs(x - b[i]) <= tol * Math.max(1, Math.abs(x), Math.abs(b[i])));
}

// Random number in [-10, 10); negative samples catch maps such as |x|
const sample = () => Math.random() * 20 - 10;

// Check linearity by sampling: this can disprove linearity, never prove it
function isLinear(T) {
  for (let i = 0; i < 10; i++) {
    const u = [sample(), sample()];
    const v = [sample(), sample()];
    const c = sample();
    const Tu = T(u);
    const Tv = T(v);
    // Additivity: T(u + v) must equal T(u) + T(v) in every component
    const sum = T(u.map((x, j) => x + v[j]));
    const indiv = Tu.map((x, j) => x + Tv[j]);
    if (!close(sum, indiv)) return false;
    // Homogeneity: T(c·u) must equal c·T(u) in every component
    const scaled = T(u.map(x => c * x));
    const expected = Tu.map(x => c * x);
    if (!close(scaled, expected)) return false;
  }
  return true;
}

const linear = ([x, y]) => [2*x + y, x - 3*y];
const nonlinear = ([x, y]) => [x*x, y];
console.log(isLinear(linear)); // true
console.log(isLinear(nonlinear)); // false (with overwhelming probability)
Where linear transformations appear
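- Computer graphics. Rotation, scaling, and shearing are linear; translation is affine; perspective projection is neither in ordinary coordinates, but becomes a matrix multiplication (followed by a divide) in homogeneous coordinates. Composing these by matrix multiplication is the heart of the rendering pipeline.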
- Solving linear systems. Ax = b is "find x mapped to b under the linear transformation A." Matrix inverse undoes the transformation when possible.
- Signal processing. Filtering and Fourier transforms are linear. Discrete-time systems that are both linear and time-invariant are exactly the ones that act by convolution with an impulse response.
- Differential equations. Linear ODEs and linear PDEs have superposition — the sum of solutions is a solution. Solution methods (separation of variables, eigenfunctions) all rely on linearity.
- Quantum mechanics. Operators are linear. Schrödinger evolution, observables, and quantum gates are all linear transformations.
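- Statistics. Linear regression fits a linear map to data; principal component analysis is an orthogonal change of basis; and a linear transformation of a multivariate normal vector is again multivariate normal.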
- Machine learning. Single neural network layers (without activation) are linear. Activation functions are what break linearity; without them, deep networks would collapse to single linear transformations.
Common mistakes
- Confusing linear with continuous or smooth. Linearity is much stricter than continuity. f(x) = x is linear; f(x) = x² is smooth and continuous but NOT linear.
- Treating affine as linear. Translation (adding a constant) is NOT linear — it doesn't preserve the origin. Use homogeneous coordinates to handle affine transformations as matrix operations.
- Forgetting basis-dependence of the matrix. The matrix representation depends on the chosen basis. Two matrices can represent the same linear transformation in different bases — they're "similar" matrices (related by P⁻¹AP).
- Computing matrix in the wrong direction. The j-th column is the image of the j-th input basis vector. Easy to mix up rows and columns.
- Assuming all linear transformations are invertible. Projections, zero maps, transformations that crush dimensions — all linear, all not invertible. Check rank or determinant.
- Confusing linear transformation with linear function (high school sense). "Linear function" in high school means y = mx + b — actually affine, since the +b term breaks linearity. The mathematician's "linear" requires y = mx (no constant term).
Frequently asked questions
Why do linear transformations preserve the origin?
From T(c·v) = c·T(v) with c = 0 — T(0) = T(0·v) = 0·T(v) = 0. Plug in any vector v; the rule forces T(0) = 0. So translations (which move the origin) are not linear. This is why "linear" is stricter than "straight-line-preserving" — the origin must stay put.
How is a linear transformation represented as a matrix?
Pick a basis for V and a basis for W. The j-th column of the matrix is the image of the j-th basis vector of V, written in coordinates with respect to the basis of W. For T: ℝ² → ℝ² with the standard bases, the matrix is [T(e₁) | T(e₂)] where e₁ = [1, 0]ᵀ and e₂ = [0, 1]ᵀ. To compute T(v), write v in basis coordinates and multiply by the matrix. Every linear transformation between finite-dimensional spaces has such a matrix; the matrix depends on the bases chosen.
What's the kernel of a linear transformation?
The set of vectors that map to 0: kernel(T) = {v : T(v) = 0}. Geometrically, it is what gets crushed to the origin. For rotations, kernel = {0}. For orthogonal projection onto a line through the origin in ℝ², the kernel is the perpendicular line through the origin. The dimension of the kernel is the "nullity"; the dimension of the image is the "rank." rank + nullity = dim of input space (rank-nullity theorem).
What does it mean for a transformation to be invertible?
T is invertible if there is a map S : W → V with S(T(v)) = v for all v in V and T(S(w)) = w for all w in W; such an S is automatically linear. Equivalently, T is bijective: every element of W is the image of exactly one input. For a square matrix, this means det ≠ 0. A non-invertible transformation of a space to itself crushes some direction to zero (the kernel is nontrivial), so it can't be undone.
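For 2×2 matrices the determinant test and the inverse have closed forms. The sketch below assumes that case only; `det2`, `inverse2`, and the tolerance `tol` are names and choices made here for illustration.

```javascript
// Matrix-vector product.
const apply = (M, v) => M.map(row => row.reduce((s, a, i) => s + a * v[i], 0));

// Determinant of a 2x2 matrix [[a, b], [c, d]].
const det2 = ([[a, b], [c, d]]) => a * d - b * c;

// Inverse of a 2x2 matrix, or null when the determinant is (numerically) zero.
function inverse2(M, tol = 1e-12) {
  const [[a, b], [c, d]] = M;
  const det = det2(M);
  if (Math.abs(det) <= tol) return null; // something gets crushed: no inverse
  return [[d / det, -b / det], [-c / det, a / det]];
}

const shear = [[1, 2], [0, 1]];      // det = 1, invertible
const projection = [[1, 0], [0, 0]]; // det = 0, not invertible

// Undo the shear: applying the inverse after the shear returns the input.
const v = [3, 4];
console.log(apply(inverse2(shear), apply(shear, v))); // [3, 4]
console.log(inverse2(projection));                    // null
```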
How do I tell if T is linear?
Two checks. (1) T(u + v) = T(u) + T(v) for all u, v. (2) T(c·v) = c·T(v) for all scalars c and vectors v. If both hold, T is linear. Common non-linear functions — T(v) = v + (1, 0) (translation; fails origin preservation), T(v) = |v| (absolute value; fails additivity), any squared term. Check both axioms; failing either kills linearity.
What's the difference between a linear transformation and an affine transformation?
Linear preserves the origin and respects linear combinations: T(c₁v₁ + c₂v₂) = c₁T(v₁) + c₂T(v₂). Affine adds a translation: T(v) = Av + b. An affine map still sends lines to lines but does not fix the origin unless b = 0. In computer graphics, "linear" covers rotation, scaling, and shear; "affine" adds translation. Homogeneous coordinates encode a 3D affine map as a 4×4 matrix (a 2D one as 3×3), so affine maps compose by matrix multiplication too.
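The homogeneous-coordinate trick is easiest to see in 2D, where an affine map becomes a 3×3 matrix acting on (x, y, 1). The following is a small sketch under that assumption; `affine` is a hypothetical helper name, and the translation is the one from Example 3.

```javascript
// Matrix-vector product.
const apply = (M, v) => M.map(row => row.reduce((s, a, i) => s + a * v[i], 0));

// Pack the 2D affine map v -> A·v + b into one 3x3 matrix.
// The extra coordinate stays 1, so the last column adds b to the result.
const affine = (A, b) => [
  [A[0][0], A[0][1], b[0]],
  [A[1][0], A[1][1], b[1]],
  [0, 0, 1],
];

// Example 3 as a matrix: (x, y) -> (x + 1, y).
const translate = affine([[1, 0], [0, 1]], [1, 0]);

// Lift (2, 5) to (2, 5, 1), multiply, then drop the trailing 1.
const [x, y] = apply(translate, [2, 5, 1]);
console.log([x, y]); // [3, 5]
```

The 3×3 matrix is a genuinely linear map on ℝ³; the translation only appears when the result is read back on the plane where the third coordinate equals 1.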
How do change-of-basis transformations work?
Let P be the change-of-basis matrix whose columns are the new basis vectors written in old coordinates. P converts new coordinates to old ones, v_old = P · v_new, so a vector given in the old basis is converted by v_new = P⁻¹ · v_old. The matrix of a linear transformation T changes accordingly: M_new = P⁻¹ · M_old · P. This "similarity transformation" is how diagonalization works: find a basis in which T's matrix is diagonal.
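A worked 2×2 case shows the similarity transformation producing a diagonal matrix. This is a sketch under stated assumptions: the columns of P are taken to be the new basis vectors in old coordinates (so v_old = P · v_new), the matrix `Mold` and the basis (1, 1), (1, −1) are chosen for illustration, and the helper names are hypothetical.

```javascript
// Matrix-vector and matrix-matrix products.
const apply = (M, v) => M.map(row => row.reduce((s, a, i) => s + a * v[i], 0));
const matMul = (A, B) =>
  A.map(row => B[0].map((_, j) => row.reduce((s, a, k) => s + a * B[k][j], 0)));

// Closed-form inverse of a 2x2 matrix (assumes a nonzero determinant).
function inverse2([[a, b], [c, d]]) {
  const det = a * d - b * c;
  return [[d / det, -b / det], [-c / det, a / det]];
}

const Mold = [[2, 1], [1, 2]]; // matrix of T in the standard basis
const P = [[1, 1], [1, -1]];   // columns: new basis vectors (1, 1) and (1, -1)
const Pinv = inverse2(P);

// M_new = P^-1 · M_old · P is diagonal because the new basis vectors
// are eigenvectors of Mold (eigenvalues 3 and 1).
const Mnew = matMul(Pinv, matMul(Mold, P));
console.log(Mnew); // [[3, 0], [0, 1]]

// Coordinates convert with P^-1: (2, 0) = 1·(1, 1) + 1·(1, -1).
console.log(apply(Pinv, [2, 0])); // [1, 1]
```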