Computer Graphics

Perlin Noise

The function that makes computers draw mountains, clouds, and marble

Perlin noise is a gradient-noise function that turns a lattice of random gradient vectors into smooth, natural-looking randomness — the math behind procedural terrain, clouds, marble textures, and fire, computed in O(1) per sample.

  • Invented: Ken Perlin, 1983
  • Cost per sample (3D): O(1), 8 corners
  • Fade curve: 6t⁵ − 15t⁴ + 10t³
  • Output range (2D): ±0.707 with unit gradients · ~±1 with the standard gradient set
  • Won: Technical Oscar, 1997

How Perlin noise works

Before 1983, "random" in computer graphics meant calling a pseudo-random number generator per pixel. That gives you television static: every value independent of its neighbours, harsh and useless for anything natural. Real mountains, clouds, and wood grain are not white noise; they vary smoothly, with structure at many scales at once. Ken Perlin built the function that captures exactly that in 1983, after working on the 1982 film Tron, and it has been the workhorse of procedural graphics ever since.

The trick is to make the randomness live on a coarse integer lattice and then interpolate smoothly between lattice points. Perlin's specific insight — what makes it gradient noise rather than value noise — is that each lattice point stores a random gradient vector, not a random value. To evaluate the noise at a continuous point p:

  1. Find the cell. Take floor(p) to get the integer corner; the four corners (in 2D) of the unit square containing p are your influence points.
  2. Look up a gradient at each corner. Hash the corner's integer coordinates through a fixed permutation table to pick one of a small fixed set of gradient vectors (Perlin's improved 2002 set is the 12 vectors from the centre of a cube to the midpoints of its edges). Same corner, same gradient, always.
  3. Take the dot product. For each corner, compute the dot product of its gradient with the offset vector from that corner to p. This is the "ramp" each gradient contributes: zero at its own corner, rising or falling as you move away.
  4. Smoothly blend. Interpolate the four dot products together using the fade curve, weighting by how close p is to each corner.

Because every gradient's contribution is exactly zero at its own lattice point, and the fade weights give the other corners zero weight there, the noise passes through zero at every integer lattice point. Only the slopes at those crossings differ, which is why gradient noise largely avoids the blocky, axis-aligned look of the cheaper value noise.
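
The four steps can be sketched with explicit gradient vectors rather than the compact hash trick. This is a minimal illustration, not Perlin's reference code: the names `gradient_at` and `gradient_noise2` and the integer hash are assumptions made for the example, and any repeatable mapping from corner to unit vector would serve.

```python
import math

def gradient_at(ix, iy, seed=0):
    """Return a repeatable unit gradient for lattice corner (ix, iy).

    Assumption: a simple integer hash picks an angle. Perlin's reference
    code uses a permutation table and a small fixed gradient set instead.
    """
    h = (ix * 374761393 + iy * 668265263 + seed * 144665) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    angle = (h / 0x100000000) * 2.0 * math.pi
    return math.cos(angle), math.sin(angle)

def fade(t):
    return t * t * t * (t * (t * 6 - 15) + 10)

def lerp(a, b, t):
    return a + t * (b - a)

def gradient_noise2(x, y, seed=0):
    # Step 1: find the cell and the fractional offset inside it.
    ix, iy = math.floor(x), math.floor(y)
    fx, fy = x - ix, y - iy

    # Steps 2 and 3: gradient at each corner, dotted with the offset to p.
    def ramp(cx, cy):
        gx, gy = gradient_at(ix + cx, iy + cy, seed)
        return gx * (fx - cx) + gy * (fy - cy)

    # Step 4: blend the four ramps with the faded weights.
    u, v = fade(fx), fade(fy)
    bottom = lerp(ramp(0, 0), ramp(1, 0), u)
    top = lerp(ramp(0, 1), ramp(1, 1), u)
    return lerp(bottom, top, v)
```

Calling `gradient_noise2(3.0, 5.0)` returns exactly zero because both fractional offsets vanish, while a point such as `gradient_noise2(3.4, 5.7)` returns a seed-dependent value that stays within the ±0.707 bound for unit gradients.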

The fade curve, and why linear interpolation fails

The obvious way to blend the corner contributions is linear interpolation (lerp). It does not work. Lerp is continuous in value but its first derivative is discontinuous at every cell boundary, so wherever a sample crosses a lattice line you get a visible crease — a faceted, low-poly look that ruins clouds and terrain.

Perlin's fix is to remap the fractional offset t ∈ [0,1] through a smoothing curve before lerping. His original 1983 curve was the smoothstep 3t² − 2t³, which zeroes the first derivative at the endpoints. In 2002 he upgraded to the quintic fade:

fade(t) = 6t⁵ − 15t⁴ + 10t³

This curve has zero first and second derivatives at t = 0 and t = 1. Matching the second derivative too means the noise is C² continuous across cell boundaries — important the moment you compute analytic normals or curvature from the noise field, as lighting and bump-mapping do. The quintic costs a handful of extra multiplies per axis and is universally worth it.
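
The endpoint behaviour of the two curves can be checked by differentiating them by hand and evaluating at t = 0 and t = 1. The function names below are illustrative; the derivatives follow directly from the two polynomials quoted above.

```python
def smoothstep_d1(t):      # d/dt (3t^2 - 2t^3) = 6t - 6t^2
    return 6 * t - 6 * t**2

def smoothstep_d2(t):      # second derivative: 6 - 12t
    return 6 - 12 * t

def quintic_d1(t):         # d/dt (6t^5 - 15t^4 + 10t^3) = 30t^4 - 60t^3 + 30t^2
    return 30 * t**4 - 60 * t**3 + 30 * t**2

def quintic_d2(t):         # second derivative: 120t^3 - 180t^2 + 60t
    return 120 * t**3 - 180 * t**2 + 60 * t

for name, d1, d2 in (("smoothstep", smoothstep_d1, smoothstep_d2),
                     ("quintic", quintic_d1, quintic_d2)):
    print(name, [d1(0), d1(1)], [d2(0), d2(1)])
# smoothstep [0, 0] [6, -6]
# quintic [0, 0] [0, 0]
```

Both curves flatten the slope at the cell edges, but only the quintic also flattens the curvature. The nonzero second derivative of smoothstep at the endpoints is what shows up as a shading discontinuity once normals are derived from the field.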

When to use Perlin noise — and when not to

  • Procedural terrain and heightmaps — sum a few octaves and you have infinite, seamless, seed-reproducible landscapes. This is Minecraft's world generation, simplified.
  • Volumetric clouds, smoke, and fire — 3D noise advected over time gives believable turbulence without a fluid simulation.
  • Natural surface textures — marble, wood, granite, and rust all come from running noise through a remap (e.g. sin(x + turbulence) gives marble veins).
  • Organic motion and displacement — wobbling a vertex or a camera by noise reads as "alive" where a sine wave reads as "mechanical."
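
As one concrete case, the marble remap from the list above can be sketched as a sine stripe whose phase is pushed around by turbulence. The function takes any 2D noise function as a parameter; `stand_in_noise` is only a placeholder so the sketch runs on its own, and the parameter names and defaults (`vein_freq`, `strength`) are assumptions chosen for illustration.

```python
import math

def marble(x, y, noise2, octaves=4, vein_freq=6.0, strength=3.0):
    """Marble-style stripes: sin(x + turbulence), remapped to [0, 1]."""
    turb, amp, freq = 0.0, 1.0, 1.0
    for _ in range(octaves):
        turb += amp * abs(noise2(x * freq, y * freq))  # abs() gives the creases
        amp *= 0.5
        freq *= 2.0
    return 0.5 + 0.5 * math.sin(vein_freq * x + strength * turb)

# Placeholder so the example is self-contained; swap in a real perlin2.
def stand_in_noise(x, y):
    return math.sin(1.7 * x + 0.3) * math.cos(2.3 * y - 0.9)

row = [round(marble(i * 0.1, 0.25, stand_in_noise), 2) for i in range(8)]
print(row)
```

With `strength=0` the result is a plain sine stripe; raising it bends the stripes into veins.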

Reach for something else when you need true cryptographic or statistical randomness (Perlin is deterministic and visibly correlated), when you need cellular / cracked patterns (use Worley/Voronoi noise instead), or when you need more than 2–3 dimensions cheaply (use simplex noise, which samples n + 1 corners per point rather than 2ⁿ).

Perlin vs other procedural noise

| Property | Perlin (gradient) | Value noise | Simplex | OpenSimplex2 | Worley (cellular) |
|---|---|---|---|---|---|
| What's stored per lattice point | random gradient vector | random scalar value | random gradient vector | random gradient vector | random feature points |
| Corners sampled (3D) | 8 (2ⁿ) | 8 (2ⁿ) | 4 (n+1) | 4 (n+1) | 27 neighbour cells searched |
| Cost scaling with dimension | O(2ⁿ) | O(2ⁿ) | O(n²) per eval, no 2ⁿ blowup | O(n²) | O(3ⁿ) cells |
| Directional (grid) artifacts | slight, axis-aligned | strong, blocky | none | none | n/a |
| Look | smooth rolling | smooth but blocky | smooth rolling | smooth rolling | cells, cracks, scales |
| Patent history | free | free | 3D+ patented until 2022 | free (clean-room) | free |
| Typical use | terrain, clouds, marble | quick prototypes | modern terrain engines | Godot, FastNoiseLite default | stone, water, scales, cracks |

The headline trade-off: value noise is the easiest to implement but looks blocky; classic Perlin fixed the blockiness with gradients but pays 2ⁿ corners; simplex fixed the dimensional cost but carried a patent for two decades. Today most engines default to a simplex variant and keep Perlin around because everyone already understands it.

What the numbers actually say

  • Cost per sample is O(1) in the lattice size — it does not matter whether your world is 64 units or 64 million units across. A 2D Perlin sample touches 4 corners; 3D touches 8; 4D touches 16. This is why you can sample an infinite world lazily, one chunk at a time.
  • Know your actual output range. With unit gradient vectors the theoretical bound is ±√(n)/2 — ±0.707 in 2D and ±0.866 in 3D. But the standard reference implementation (used in the code below) picks gradients pointing to the cube edges, which have length √2, so its real range stretches to roughly ±1. Either way, the practical values cluster well inside the bound, so blindly assuming a fixed range and dividing by it leaves you with washed-out, low-contrast noise — measure or normalize empirically.
  • fBm cost is linear in octaves. Each octave doubles frequency and roughly halves amplitude; 6 octaves means 6× the per-sample work. Past ~8 octaves the high frequencies sit below one pixel and add nothing but cost — a classic over-spend.
  • The permutation table is 256 entries, doubled to 512 to avoid an index-wrap branch. Stored as bytes that's 512 bytes: it fits in L1 cache and is shared across every sample, so the entire "randomness" of a planet-sized world is half a kilobyte.
  • Simplex saves real time at high dimension. For 4D noise (3D + animation time), simplex evaluates 5 corners against Perlin's 16 — roughly 3× fewer gradient computations per sample, which matters when you call it once per voxel per frame.
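
The arithmetic behind these figures is small enough to write out. The helper names below are illustrative; the formulas are the ones quoted in the list (2ⁿ corners for classic Perlin, n + 1 for simplex, √n / 2 for the unit-gradient bound).

```python
import math

def perlin_corners(n):        # classic Perlin: every corner of the n-cube
    return 2 ** n

def simplex_corners(n):       # simplex: the n + 1 vertices of one simplex
    return n + 1

def unit_gradient_bound(n):   # |noise| <= sqrt(n) / 2 with unit gradients
    return math.sqrt(n) / 2

for n in (2, 3, 4):
    print(n, perlin_corners(n), simplex_corners(n), round(unit_gradient_bound(n), 3))
# 2 4 3 0.707
# 3 8 4 0.866
# 4 16 5 1.0

# fBm multiplies the per-sample work by the octave count.
print(6 * perlin_corners(3))  # 48 corner evaluations for 6 octaves of 3D Perlin
```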

JavaScript implementation

A complete 2D classic Perlin noise, permutation table and all. This is close to Perlin's reference Java, transcribed to JS:

// Permutation table (Perlin's reference 256 values), doubled to 512.
const PERM = (() => {
  const p = [151,160,137,91,90,15,131,13,201,95,96,53,194,233,7,225,140,36,103,30,69,142,
    8,99,37,240,21,10,23,190,6,148,247,120,234,75,0,26,197,62,94,252,219,203,117,35,11,32,
    57,177,33,88,237,149,56,87,174,20,125,136,171,168,68,175,74,165,71,134,139,48,27,166,77,
    146,158,231,83,111,229,122,60,211,133,230,220,105,92,41,55,46,245,40,244,102,143,54,65,25,
    63,161,1,216,80,73,209,76,132,187,208,89,18,169,200,196,135,130,116,188,159,86,164,100,109,
    198,173,186,3,64,52,217,226,250,124,123,5,202,38,147,118,126,255,82,85,212,207,206,59,227,
    47,16,58,17,182,189,28,42,223,183,170,213,119,248,152,2,44,154,163,70,221,153,101,155,167,
    43,172,9,129,22,39,253,19,98,108,110,79,113,224,232,178,185,112,104,218,246,97,228,251,34,
    242,193,238,210,144,12,191,179,162,241,81,51,145,235,249,14,239,107,49,192,214,31,181,199,
    106,157,184,84,204,176,115,121,50,45,127,4,150,254,138,236,205,93,222,114,67,29,24,72,243,
    141,128,195,78,66,215,61,156,180];
  return p.concat(p);                  // 512 entries — no index-wrap branch
})();

const fade = t => t * t * t * (t * (t * 6 - 15) + 10);   // 6t^5 - 15t^4 + 10t^3
const lerp = (a, b, t) => a + t * (b - a);

// Map a hash to one of 8 gradient directions and dot it with (x, y).
function grad(hash, x, y) {
  switch (hash & 7) {
    case 0: return  x + y;  case 1: return -x + y;
    case 2: return  x - y;  case 3: return -x - y;
    case 4: return  x;      case 5: return -x;
    case 6: return  y;      default: return -y;
  }
}

function perlin2(x, y) {
  const X = Math.floor(x) & 255, Y = Math.floor(y) & 255;
  x -= Math.floor(x); y -= Math.floor(y);          // fractional offset in cell
  const u = fade(x), v = fade(y);

  const aa = PERM[PERM[X]     + Y    ];
  const ab = PERM[PERM[X]     + Y + 1];
  const ba = PERM[PERM[X + 1] + Y    ];
  const bb = PERM[PERM[X + 1] + Y + 1];

  // Blend the 4 corner gradient ramps with the faded weights.
  const x1 = lerp(grad(aa, x,     y    ), grad(ba, x - 1, y    ), u);
  const x2 = lerp(grad(ab, x,     y - 1), grad(bb, x - 1, y - 1), u);
  return lerp(x1, x2, v);              // raw range ~ [-1, 1] (these gradients are not unit length)
}

// Fractional Brownian motion: stack octaves for fractal detail.
function fbm(x, y, octaves = 6, lacunarity = 2, gain = 0.5) {
  let sum = 0, amp = 1, freq = 1, norm = 0;
  for (let o = 0; o < octaves; o++) {
    sum  += amp * perlin2(x * freq, y * freq);
    norm += amp;
    amp  *= gain;                       // each octave quieter
    freq *= lacunarity;                 // ...and finer
  }
  return sum / norm;                    // keep result in a stable range
}

Two details people miss. First, the double-width PERM table lets PERM[X] + Y + 1 reach up to 511 without a modulo — drop the doubling and you index out of bounds. Second, fbm divides by the summed amplitude so the result stays in a predictable range no matter how many octaves you stack; skip that and adding octaves silently brightens the field.

Python implementation

import math
import random

# Build the doubled permutation table from a seed (reproducible worlds).
def make_perm(seed=0):
    p = list(range(256))                 # identity permutation 0..255
    random.Random(seed).shuffle(p)       # private RNG: global random state is untouched
    return p + p                         # 512 entries

PERM = make_perm(seed=1337)

def fade(t):  return t * t * t * (t * (t * 6 - 15) + 10)
def lerp(a, b, t): return a + t * (b - a)

def grad(h, x, y):
    h &= 7
    if h == 0: return  x + y
    if h == 1: return -x + y
    if h == 2: return  x - y
    if h == 3: return -x - y
    if h == 4: return  x
    if h == 5: return -x
    if h == 6: return  y
    return -y

def perlin2(x, y):
    X, Y = int(math.floor(x)) & 255, int(math.floor(y)) & 255
    x -= math.floor(x); y -= math.floor(y)
    u, v = fade(x), fade(y)
    aa = PERM[PERM[X]     + Y]
    ab = PERM[PERM[X]     + Y + 1]
    ba = PERM[PERM[X + 1] + Y]
    bb = PERM[PERM[X + 1] + Y + 1]
    x1 = lerp(grad(aa, x,     y),     grad(ba, x - 1, y),     u)
    x2 = lerp(grad(ab, x,     y - 1), grad(bb, x - 1, y - 1), u)
    return lerp(x1, x2, v)

def fbm(x, y, octaves=6, lacunarity=2.0, gain=0.5):
    total = norm = 0.0
    amp = freq = 1.0
    for _ in range(octaves):
        total += amp * perlin2(x * freq, y * freq)
        norm  += amp
        amp   *= gain
        freq  *= lacunarity
    return total / norm

Note that the Python version seeds the permutation table from an integer, which is the production pattern: ship the seed (one int), regenerate the identical world anywhere. The famous "Minecraft seed" you type in is precisely this — the input that builds the permutation tables behind the terrain noise.
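
A quick check of the seed claim, repeating `make_perm` so the snippet runs on its own. Everything here uses only the standard library `random.Random` class.

```python
import random

def make_perm(seed=0):
    p = list(range(256))
    random.Random(seed).shuffle(p)
    return p + p

a = make_perm(1337)
b = make_perm(1337)
c = make_perm(42)

print(a == b)                              # True: same seed, same table
print(a == c)                              # False: different seed, different table
print(sorted(a[:256]) == list(range(256))) # True: still a permutation of 0..255
```

One caveat worth hedging: the Python documentation only guarantees a stable sequence for `random()` itself across versions, not for `shuffle`. A project that must reproduce worlds across interpreter versions would likely ship its own small shuffle rather than rely on this one.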

Variants worth knowing

Simplex noise (2001). Perlin's own successor. It tiles space with simplices (triangles in 2D, tetrahedra in 3D) instead of a cube grid, so it samples only n + 1 corners and has no directional artifacts. A patent on its gradient-selection scheme kept many open-source engines on classic Perlin until the patent expired in 2022.

OpenSimplex / OpenSimplex2. Clean-room, patent-free replacements for simplex with even fewer artifacts. OpenSimplex2 is the default in FastNoiseLite and Godot's noise tools.

Fractional Brownian motion (fBm). Not a different noise — a way of using it. Sum N octaves at increasing frequency and decreasing amplitude. The lacunarity (frequency multiplier, usually 2) and gain / persistence (amplitude multiplier, usually 0.5) control roughness.

Turbulence and ridged noise. Take abs(noise) before summing octaves and you get sharp creases — turbulence — used for fire and marble. Invert it (1 − abs(noise)) and you get ridged multifractal noise, the look of eroded mountain ridges.
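
Both variants differ from plain fBm only in how each octave is reshaped before summing, which a small sketch makes visible. The helper `octave_sum` and the placeholder `stand_in_noise` are assumptions for illustration, the noise function is assumed to return values in [-1, 1], and the ridged version here is the simple 1 − abs form described above rather than a full multifractal with per-octave weighting.

```python
import math

def octave_sum(noise2, x, y, shape, octaves=5, lacunarity=2.0, gain=0.5):
    """Sum shape(noise) over octaves, normalised by the total amplitude."""
    total, norm, amp, freq = 0.0, 0.0, 1.0, 1.0
    for _ in range(octaves):
        total += amp * shape(noise2(x * freq, y * freq))
        norm += amp
        amp *= gain
        freq *= lacunarity
    return total / norm

def turbulence(noise2, x, y, **kwargs):
    return octave_sum(noise2, x, y, abs, **kwargs)                     # creases at zero crossings

def ridged(noise2, x, y, **kwargs):
    return octave_sum(noise2, x, y, lambda n: 1.0 - abs(n), **kwargs)  # creases become peaks

# Placeholder so the example is self-contained; swap in a real perlin2.
def stand_in_noise(x, y):
    return math.sin(1.7 * x + 0.3) * math.cos(2.3 * y - 0.9)

print(turbulence(stand_in_noise, 0.4, 0.9))
print(ridged(stand_in_noise, 0.4, 0.9))
```

Because both results are normalised by the same amplitude sum, this simple ridged value is exactly one minus the turbulence value at the same point.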

Domain warping. Feed the output of one noise field into the input coordinates of another: noise(x + noise(x,y), y + noise(...)). This produces the swirling, flow-like patterns of Inigo Quilez's famous terrain renders, at the cost of 2–3× the noise calls.
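
A single level of warping can be sketched as follows. The function name `warp`, the `strength` parameter, and the offset constants are assumptions for illustration; the offsets exist only so the two displacement lookups read different parts of the field.

```python
def warp(noise2, x, y, strength=1.0):
    """One level of domain warping: displace the lookup point by noise."""
    dx = noise2(x, y)
    dy = noise2(x + 5.2, y + 1.3)   # arbitrary offset to decorrelate dy from dx
    return noise2(x + strength * dx, y + strength * dy)
```

This costs three noise calls per sample instead of one. Setting `strength` to 0 returns the unwarped field, which makes it easy to compare the two while tuning.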

Common bugs and edge cases

  • Sampling at integer coordinates. Perlin noise is exactly zero on every integer lattice point. Loop for (let x = 0; x < 100; x++) perlin2(x, y) with an integer y and you get a field of zeros. Always sample at a fractional scale, e.g. perlin2(x * 0.05, y * 0.05).
  • Assuming a fixed output range. The bound depends on the gradient set: ±0.707 in 2D for unit gradients, but ~±1 for the standard cube-edge gradients used here. Remap with the bound your implementation actually has, or normalize empirically, or your textures lose contrast.
  • Forgetting to double the permutation table. Indexing PERM[PERM[X+1] + Y + 1] can exceed 255; without the doubled table you read undefined / out of bounds.
  • Too many octaves. Octaves whose wavelength is below one pixel add cost and aliasing, not detail. Cap octaves so the highest frequency stays above your sample spacing (frequency clamping / "octave fade").
  • Animating by scaling, not translating. To animate noise over time, move through it (perlin3(x, y, t)) — don't multiply coordinates by time, which speeds the noise up unboundedly and looks like a zoom, not a flow.
  • Tiling seams. Plain Perlin does not tile. To make a seamless texture, sample the noise around a circle (for a 1D loop) or over a torus embedded in 4D (for a 2D tile), blend wrapped copies, or use a noise library's built-in periodic mode.
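
The circle trick from the last bullet is the simplest of these fixes to show. The sketch below builds a 1D signal that repeats with period 1 by walking once around a circle inside any 2D noise field; `looping_noise1` and its parameters are hypothetical names for the example.

```python
import math

def looping_noise1(noise2, t, radius=1.0, cx=0.0, cy=0.0):
    """1D noise with period 1, read along a circle in a 2D noise field."""
    angle = 2.0 * math.pi * t
    return noise2(cx + radius * math.cos(angle), cy + radius * math.sin(angle))
```

A larger `radius` covers more of the field per loop and so gives a busier signal. Moving `(cx, cy)` picks a different loop from the same noise.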

Frequently asked questions

What is the difference between Perlin noise and value noise?

Value noise stores a random value at each lattice point and interpolates between those values. Perlin noise stores a random gradient (direction) at each lattice point and interpolates the dot products of those gradients with the offset to the sample point. Because gradient noise is zero at every lattice point and only the slopes vary, it shows far less axis-aligned grid bias than value noise (classic Perlin keeps a slight one) and looks more organic.

Why does Perlin noise need a fade function instead of plain linear interpolation?

Linear interpolation between cells leaves a visible crease wherever a sample crosses a lattice boundary, because the first derivative is discontinuous. Ken Perlin's fade curve 6t⁵ − 15t⁴ + 10t³ has zero first and second derivatives at t=0 and t=1, so noise values and their slopes match across every cell boundary and the surface looks smooth — no grid artifacts.

What is fractional Brownian motion (fBm) and how does it relate to Perlin noise?

fBm sums several octaves of the same noise function, each at double the frequency and roughly half the amplitude of the last. One octave gives smooth rolling hills; adding octaves layers in finer and finer detail, producing the fractal roughness of real mountains and clouds. Most terrain generators run 4 to 8 octaves of Perlin or simplex noise as fBm.

Why did Ken Perlin invent simplex noise to replace Perlin noise?

Classic Perlin noise samples 2ⁿ lattice corners per point, so cost grows exponentially with dimension: 4 corners in 2D, 8 in 3D, 16 in 4D. Simplex noise (2001) uses a simplex grid that needs only n+1 corners (3 in 2D, 4 in 3D), so its corner count grows linearly with dimension, and it has no directional artifacts. A patent on simplex's gradient selection led many engines to keep using classic Perlin until 2022.

Is Perlin noise actually random?

No — it is fully deterministic. The same coordinate always returns the same value because the gradients come from a fixed permutation table, not a random number generator. That determinism is the whole point: a game can regenerate the exact same terrain on any machine from a single integer seed, and you can sample any point in any order without storing the result.

What output range does Perlin noise produce?

It depends on the gradient set. With unit gradient vectors the theoretical bound is ±0.707 (1/√2) in 2D and ±0.866 (√3/2) in 3D, because the dot products are bounded by the gradient and offset magnitudes. The standard reference implementation instead uses gradients pointing to the cube edges (length √2), which stretches the real range to roughly ±1. Most implementations remap the output to [0, 1] before use, and a common bug is assuming a fixed range without checking which gradients you use, which clips contrast or wastes dynamic range.
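
Two ways of remapping to [0, 1] are sketched below: one using a known bound, one measuring the range of a batch of samples. Both function names are illustrative.

```python
def remap_fixed(value, bound):
    """Map [-bound, bound] to [0, 1], clamping anything outside."""
    return min(1.0, max(0.0, 0.5 + 0.5 * value / bound))

def normalize_empirical(samples):
    """Stretch a batch of samples so its min maps to 0 and its max to 1."""
    lo, hi = min(samples), max(samples)
    if hi == lo:                      # flat input: avoid dividing by zero
        return [0.5 for _ in samples]
    scale = 1.0 / (hi - lo)
    return [(s - lo) * scale for s in samples]
```

Using the wrong bound is where contrast is lost. Noise that really spans ±0.707 but is remapped with `bound=1.0` only reaches about 0.15 to 0.85. Empirical normalisation fixes the contrast for a single image, but the measured range changes from region to region, so a chunked or infinite world would typically use the fixed bound to keep chunks consistent with each other.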