Computer Graphics

Z-Buffer (Depth Buffer)

Q: Why does the z-buffer beat the painter's algorithm?

The painter's algorithm sorts whole polygons back-to-front and paints them in order, but cyclically overlapping or interpenetrating triangles have no valid sort order — A is in front of B, B in front of C, C in front of A. The z-buffer resolves visibility independently at every pixel, so it never needs a global sort and handles intersecting geometry for free.

Q: What causes z-fighting and how do you fix it?

Z-fighting is flickering on two coplanar or nearly-coplanar surfaces whose stored depths round to the same buffer value, so the depth test result depends on tiny floating-point noise. The standard fixes are to push the near plane out (depth precision is dominated by the near/far ratio), use a 24-bit or floating-point depth buffer, apply a polygon depth offset, or — best — switch to a reversed-Z buffer with a floating-point depth format.

Q: Why is depth precision so uneven across the scene?

A standard perspective projection stores 1/z, not z, in the depth buffer. That hyperbolic mapping crowds almost all of the buffer's resolution into the first few percent of the view frustum near the camera, leaving very little for distant geometry. With a near plane at 0.1 and far at 1000, roughly 90% of a 24-bit buffer's values describe the first meter of depth — and half of them the first 20 centimeters.

Q: Does the z-buffer handle transparency?

Not by itself. The z-buffer keeps exactly one nearest fragment per pixel, but a transparent surface must blend with whatever is behind it, which means keeping more than one fragment. The usual workaround is to render opaque geometry with the z-buffer, then draw transparent surfaces sorted back-to-front with depth writes disabled. Order-independent transparency techniques like depth peeling or per-pixel linked lists generalize this at extra cost.

Q: How much memory does a z-buffer cost?

One depth value per pixel. At 1920×1080 with a 32-bit depth/stencil format that is about 8.3 MB; at 4K it is roughly 33 MB. Multisampling multiplies that by the sample count, so 4× MSAA at 4K needs about 130 MB just for depth — which is why modern GPUs compress the depth buffer and store a hierarchical low-resolution copy (Hi-Z) for early rejection.

Q: What is early-Z and why does draw order still matter?

Early-Z runs the depth test before the fragment shader instead of after, so occluded fragments are killed without ever paying for their shading. It only works when the shader does not write its own depth or use discard, and it pays off most when you draw roughly front-to-back, because each early fragment raises the depth bar and rejects more of what follows. Drawing back-to-front defeats it — every triangle still gets shaded before being overwritten.

One depth value per pixel — and the nearest surface always wins

A z-buffer is a per-pixel depth array that solves hidden-surface removal: as each triangle is rasterized, a fragment is drawn only if its depth beats the value already stored, so the nearest surface wins at every pixel.

Invented byEdwin Catmull, 1974
Per-pixel cost1 read + compare + maybe write
Memory1 depth value / pixel
Sorting requiredNone
Stored value1/z (perspective)

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How the z-buffer decides what you see

Point a camera at two overlapping triangles and you have a question every 3D renderer must answer: at this exact pixel, which surface is in front? Getting it wrong means a wall draws over the character standing in front of it, or a distant mountain punches through a near hill. This is the hidden-surface removal problem, and before 1974 the dominant answer was to sort polygons and paint them back-to-front — the painter's algorithm. It works until two polygons interpenetrate, or three of them overlap in a cycle, at which point no sort order is correct.

Edwin Catmull's z-buffer, introduced in his 1974 PhD thesis, sidesteps sorting entirely. Alongside the color framebuffer, the renderer keeps a second array of the same dimensions — one floating-point or fixed-point depth value per pixel. It is cleared to "infinitely far" (typically 1.0 in normalized device coordinates). Then triangles are rasterized in any order. For each pixel a triangle covers, the rasterizer interpolates a depth from the triangle's vertices and runs the depth test:

if (fragmentDepth < depthBuffer[x][y]) {   // closer than what's there
    depthBuffer[x][y] = fragmentDepth;     // remember the new nearest
    colorBuffer[x][y] = fragmentColor;     // and draw it
}                                          // else: discard, it's hidden

That is the whole idea. Each pixel independently remembers the closest thing seen so far. Draw order does not affect the final image — a triangle drawn last can still lose to one drawn first if it is farther away. The cost is constant per fragment: one buffer read, one comparison, and a conditional write. The genius is that a global, image-wide sort collapses into a local, per-pixel decision.

What actually gets stored: 1/z, not z

A subtle and consequential detail: the depth buffer does not store the linear camera-space distance z. After the perspective projection and the divide by w, the value that lands in the buffer is a function of 1/z. The reason is that perspective-correct interpolation across a triangle is linear in 1/z but nonlinear in z — so the GPU interpolates the quantity it can interpolate cheaply and correctly across screen space.

The mapping from camera depth to a stored value in [0, 1] for a standard OpenGL-style projection is:

z_ndc = (far + near)/(far − near) + (−2·far·near)/((far − near)·z_eye)
z_buffer = (z_ndc + 1) / 2

The practical consequence is that depth resolution is wildly nonuniform. Because the relationship is hyperbolic, the overwhelming majority of the buffer's distinguishable values describe geometry close to the camera, and almost none is left for the far distance. This is exactly why two distant coplanar surfaces flicker while two equally-close ones look fine — and it is the root cause of nearly every depth artifact you will ever debug.

When the z-buffer is the right tool — and when it isn't

Real-time rasterized 3D — games, CAD, simulation, every GPU pipeline. It is hardware-accelerated, order-independent, and handles interpenetrating geometry with zero special-casing.
Scenes with unpredictable overlap — where a correct polygon sort is expensive or impossible to compute.
Deferred shading and shadow mapping — both reuse depth: deferred rendering stores a depth/G-buffer, and a shadow map is a depth buffer rendered from the light's point of view.

Where it struggles: transparency (it keeps only one fragment per pixel, but blending needs many), analytic ray tracing (which computes exact ray-surface intersections and needs no depth buffer at all), and very high dynamic depth ranges where precision runs out. For order-independent transparency you reach for depth peeling or per-pixel linked lists; for offline photorealism you reach for a ray tracer.

Z-buffer vs other visibility methods

	Z-buffer	Painter's algorithm	BSP tree	Ray tracing	A-buffer
Resolution of visibility	Per pixel	Per polygon	Per polygon (exact order)	Per ray / sub-pixel	Per pixel, multi-fragment
Needs a sort?	No	Yes (every frame)	Precomputed once	No	Per-pixel sort
Interpenetrating polygons	Correct	Wrong	Needs splitting	Correct	Correct
Transparency	Single fragment only	Natural (back-to-front)	Natural	Natural	Correct (stores list)
Extra memory	1 depth / pixel	Polygon list	Tree of splits	Acceleration structure	Variable per-pixel lists
Hardware support	Universal (every GPU)	None needed	None	Recent GPUs (RT cores)	Software / extensions
Typical use	Real-time rasterization	Simple/2.5D scenes	Classic Doom/Quake	Offline / film, reflections	Anti-aliased transparency

The headline trade is sorting against memory. The painter's algorithm needs no extra buffer but pays a per-frame sort and breaks on cycles; the z-buffer spends a full screen of memory to make the sort disappear and never breaks. Once VRAM was cheap, the z-buffer won the real-time market outright.

What the numbers actually say

Memory: one depth value per pixel. At 1920×1080 with a 32-bit depth/stencil buffer that is 1920·1080·4 ≈ 8.3 MB. At 3840×2160 (4K) it is ≈ 33 MB. With 4× MSAA the depth buffer is sampled four times per pixel, so 4K MSAA needs ≈ 130 MB for depth alone.
Precision is front-loaded. With near = 0.1 and far = 1000 on a 24-bit fixed-point buffer, about 50% of all representable depth values fall within the first ~0.2 meter of the frustum, and roughly 90% within the first meter. Near the far plane, adjacent buffer values are spaced about half a meter apart on this 24-bit buffer (and tens of meters apart on a 16-bit one) — which is why distant terrain z-fights.
The near plane dominates, not the far plane. Moving the near plane from 0.1 to 1.0 improves usable far-distance precision roughly 10×; pushing the far plane from 1000 to 10000 barely changes anything. The ratio far/near is what matters.
Reversed-Z with a 32-bit float buffer gives near-uniform precision. Mapping near→1.0 and far→0.0 cancels the float exponent against the 1/z curve, turning a notoriously precision-starved scheme into one accurate enough for planetary-scale scenes with a single buffer.
Early-Z can save the entire fragment shader cost for occluded pixels — often the dominant cost in a modern frame, since a fragment shader can run hundreds of instructions while the depth test is a single compare.

JavaScript implementation

A software rasterizer for a single triangle with a z-buffer. Depth is interpolated with barycentric coordinates; a fragment is written only when it is nearer than the stored value.

function createRenderer(width, height) {
  const color = new Uint32Array(width * height);          // packed RGBA
  const depth = new Float32Array(width * height).fill(Infinity); // start "far"
  return { width, height, color, depth, clear() {
    color.fill(0);
    depth.fill(Infinity);
  }, rasterize(tri, rgba) { rasterTriangle(this, tri, rgba); } };
}

// tri = [[x,y,z], [x,y,z], [x,y,z]] in screen space; z is the depth to test
function rasterTriangle(r, tri, rgba) {
  const [a, b, c] = tri;
  const minX = Math.max(0, Math.floor(Math.min(a[0], b[0], c[0])));
  const maxX = Math.min(r.width  - 1, Math.ceil(Math.max(a[0], b[0], c[0])));
  const minY = Math.max(0, Math.floor(Math.min(a[1], b[1], c[1])));
  const maxY = Math.min(r.height - 1, Math.ceil(Math.max(a[1], b[1], c[1])));

  const area = edge(a, b, c);                 // signed area ×2
  if (area === 0) return;                     // degenerate triangle

  for (let y = minY; y <= maxY; y++) {
    for (let x = minX; x <= maxX; x++) {
      const p = [x + 0.5, y + 0.5];
      // barycentric weights
      let w0 = edge(b, c, p), w1 = edge(c, a, p), w2 = edge(a, b, p);
      if (w0 < 0 || w1 < 0 || w2 < 0) continue;   // outside the triangle
      w0 /= area; w1 /= area; w2 /= area;
      const z = w0 * a[2] + w1 * b[2] + w2 * c[2];  // interpolated depth

      const i = y * r.width + x;
      if (z < r.depth[i]) {        // ← the depth test: nearer wins
        r.depth[i] = z;            //   remember new nearest
        r.color[i] = rgba;         //   and draw
      }
    }
  }
}

function edge(p, q, r) {           // 2D cross product of (q−p) and (r−p)
  return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]);
}

Two things to notice. First, the triangles can be passed in any order — the z < depth[i] guard makes draw order irrelevant. Second, this interpolates z linearly in screen space, which is fine for a teaching example but technically wrong for perspective: a real GPU interpolates 1/z and other attributes divided by w, then divides back per pixel.

Python implementation

The same algorithm in NumPy, vectorized over the triangle's bounding box so the depth test runs on whole pixel grids at once.

import numpy as np

class ZBuffer:
    def __init__(self, w, h):
        self.w, self.h = w, h
        self.color = np.zeros((h, w, 3), dtype=np.uint8)
        self.depth = np.full((h, w), np.inf, dtype=np.float32)

    def edge(self, ax, ay, bx, by, px, py):
        return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

    def triangle(self, verts, rgb):
        (ax, ay, az), (bx, by, bz), (cx, cy, cz) = verts
        area = self.edge(ax, ay, bx, by, cx, cy)
        if area == 0:
            return
        x0, x1 = max(0, int(min(ax, bx, cx))), min(self.w - 1, int(max(ax, bx, cx)))
        y0, y1 = max(0, int(min(ay, by, cy))), min(self.h - 1, int(max(ay, by, cy)))
        if x1 < x0 or y1 < y0:
            return

        xs = np.arange(x0, x1 + 1) + 0.5
        ys = np.arange(y0, y1 + 1) + 0.5
        px, py = np.meshgrid(xs, ys)                         # pixel centers

        w0 = self.edge(bx, by, cx, cy, px, py) / area
        w1 = self.edge(cx, cy, ax, ay, px, py) / area
        w2 = self.edge(ax, ay, bx, by, px, py) / area
        inside = (w0 >= 0) & (w1 >= 0) & (w2 >= 0)

        z = w0 * az + w1 * bz + w2 * cz                      # interpolated depth
        sub = self.depth[y0:y1 + 1, x0:x1 + 1]
        nearer = inside & (z < sub)                          # ← the depth test
        sub[nearer] = z[nearer]                              # write depth
        self.color[y0:y1 + 1, x0:x1 + 1][nearer] = rgb       # write color

The nearer = inside & (z < sub) mask is the entire visibility decision, applied to thousands of pixels per call. Render a near red triangle and a far blue one in either order, and the overlap region always comes out red — the buffer enforces it.

Variants worth knowing

Reversed-Z. Swap the depth range so the near plane maps to 1.0 and the far plane to 0.0, and use a 32-bit floating-point depth buffer. The float format packs more precision near 0.0, which now corresponds to far geometry — cancelling the 1/z crowding almost perfectly. This is the single most effective fix for depth precision and is standard in modern engines.

W-buffer. Stores linear eye-space depth (the w component) instead of 1/z. Precision is uniform across the frustum, which avoids near-camera over-allocation, but it is less hardware-friendly and largely superseded by reversed-Z.

Hierarchical Z (Hi-Z). Keep a low-resolution depth pyramid of conservative maximum depths per tile. Before rasterizing a triangle, test it against the coarse level; if the whole tile is already nearer, reject the triangle without touching individual pixels. Every modern GPU does this internally for early rejection.

Depth prepass. Render the scene once writing depth only (no shading), then render again with color and the depth test set to "equal." The second pass shades each visible pixel exactly once, eliminating overdraw in shading. Common in deferred and forward+ renderers.

A-buffer (anti-aliased, area-averaged). Instead of one fragment per pixel, store a per-pixel list of fragments with coverage masks. This recovers correct transparency and high-quality edge anti-aliasing at the cost of unbounded per-pixel memory — the conceptual ancestor of order-independent transparency.

Common bugs and edge cases

Z-fighting. Two coplanar surfaces flicker because their depths round to the same buffer value. Fixes: push the near plane out, use reversed-Z float depth, add a polygon offset, or merge the surfaces.
Forgetting to clear the depth buffer. If you clear color but not depth between frames, last frame's nearest values survive and silently reject this frame's geometry — the screen appears frozen or full of holes.
Near plane set too small. Setting near = 0.001 "to be safe" destroys depth precision everywhere because far/near explodes. Use the largest near plane your scene tolerates.
Transparency drawn with depth writes on. A transparent surface that writes depth blocks everything behind it from blending. Render transparent geometry back-to-front with depth writes off but depth test on.
Depth test direction mismatched with the projection. Reversed-Z needs the comparison flipped to "greater" and the clear value set to 0.0; leaving it at "less" with a 1.0 clear hides the whole scene.
Writing depth in the fragment shader. Outputting gl_FragDepth or calling discard disables early-Z, forcing full shading on fragments that will be thrown away. Only do it when you genuinely need it.

Frequently asked questions

Why does the z-buffer beat the painter's algorithm?