Data Structures

Sparse Table

Precompute power-of-2 intervals; idempotent operations let two intervals cover any range in constant time

A sparse table is a data structure that answers idempotent range queries (min, max, gcd, bitwise-and, bitwise-or) in O(1) per query after O(n log n) preprocessing. It stores, for each position i and each k from 0 to ⌊log₂ n⌋, the answer for the range [i, i + 2ᵏ − 1]. To query a range [l, r], take k = ⌊log₂(r − l + 1)⌋ and combine the two precomputed intervals [l, l+2ᵏ−1] and [r−2ᵏ+1, r] — they overlap, but for idempotent operations that's fine. Static structure: no updates allowed.

  • PreprocessingO(n log n)
  • QueryO(1)
  • MemoryO(n log n)
  • Operationsmin, max, gcd, AND, OR
  • UpdatesNot supported (static)
  • Beat bySegment tree if updates needed

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How a sparse table works

Build a 2D table indexed by (k, i) where:

table[k][i] = combine of a[i], a[i+1], …, a[i + 2^k − 1]

So table[0][i] = a[i] (intervals of length 1), table[1][i] covers [i, i+1] (length 2), table[2][i] covers [i, i+3] (length 4), and so on up to k = ⌊log₂ n⌋.

Build with the doubling recurrence:

table[k][i] = combine(table[k−1][i], table[k−1][i + 2^(k−1)])

Each cell is filled in O(1) given the previous level. Total cells: ⌊log₂ n⌋ × n. Total preprocessing: O(n log n).

Constant-time query

To answer a query [l, r], find the largest k such that 2ᵏ ≤ r − l + 1 — that is k = ⌊log₂(r − l + 1)⌋. Two power-of-2-length intervals starting at l and ending at r cover [l, r] (possibly with overlap):

answer = combine(table[k][l], table[k][r − 2^k + 1])

For idempotent ⊕ (min, max, gcd, AND, OR), the overlap region contributes its values twice, but x ⊕ x = x, so the result is correct.

JavaScript implementation — range min

class SparseTableMin {
  constructor(a) {
    const n = a.length;
    const K = Math.floor(Math.log2(n)) + 1;
    this.log = new Array(n + 1).fill(0);
    for (let i = 2; i <= n; i++) this.log[i] = this.log[i >> 1] + 1;
    this.table = Array.from({ length: K }, () => new Array(n));
    for (let i = 0; i < n; i++) this.table[0][i] = a[i];
    for (let k = 1; k < K; k++) {
      for (let i = 0; i + (1 << k) <= n; i++) {
        this.table[k][i] = Math.min(
          this.table[k - 1][i],
          this.table[k - 1][i + (1 << (k - 1))]
        );
      }
    }
  }
  // Range min over [l, r] inclusive
  query(l, r) {
    const k = this.log[r - l + 1];
    return Math.min(this.table[k][l], this.table[k][r - (1 << k) + 1]);
  }
}

The precomputed log table avoids Math.log2 calls per query — saving a factor of 5-10× on hot inner loops.

Python implementation — range gcd

from math import gcd, log2

class SparseTableGcd:
    def __init__(self, a):
        n = len(a)
        K = int(log2(n)) + 1
        self.log = [0] * (n + 1)
        for i in range(2, n + 1):
            self.log[i] = self.log[i >> 1] + 1
        self.table = [[0] * n for _ in range(K)]
        for i, v in enumerate(a):
            self.table[0][i] = v
        for k in range(1, K):
            half = 1 << (k - 1)
            for i in range(n - (1 << k) + 1):
                self.table[k][i] = gcd(self.table[k - 1][i],
                                       self.table[k - 1][i + half])

    def query(self, l, r):
        k = self.log[r - l + 1]
        return gcd(self.table[k][l], self.table[k][r - (1 << k) + 1])

Sparse table vs segment tree vs prefix sums

StructureBuildQueryUpdateOperations
Sparse tableO(n log n)O(1)Not supportedIdempotent (min, max, gcd, AND, OR)
Disjoint sparse tableO(n log n)O(1)Not supportedAny associative
Segment treeO(n)O(log n)O(log n)Any associative
Fenwick treeO(n log n)O(log n)O(log n)Sum (or invertible)
Prefix sumO(n)O(1)RebuildSum, XOR (invertible)

For n = 10⁶ static range-min queries: sparse table builds in 10⁶ × 20 = 2 × 10⁷ ops, queries in 1 op. Segment tree builds in 10⁶ ops, queries in 20 ops. Sparse table wins on query latency by 20× when no updates are needed.

Famous applications

Range Minimum Query (RMQ)

The canonical sparse-table problem. Static array, q queries each asking the minimum value in a range. O(n log n) build + O(q) total query time. For n = 10⁵, q = 10⁵: 2 × 10⁶ build + 10⁵ query = ~2 ms wall-clock in C++.

Lowest Common Ancestor (LCA) via Euler tour

The classic LCA-via-RMQ reduction. Compute the Euler tour of a tree (length 2n − 1) and the depth of each visited node. LCA(u, v) is the node of minimum depth in the Euler-tour range between u's and v's first occurrences. RMQ on a length-2n array → O(n log n) preprocessing, O(1) per LCA query.

// Sketch: tour = [v0, v1, ...], depth = [d0, d1, ...], first[v] = index in tour
function lca(u, v) {
  let a = first[u], b = first[v];
  if (a > b) [a, b] = [b, a];
  const k = log[b - a + 1];
  // RMQ over depth[a..b], return the vertex at the minimizing position
  const i1 = sparseTable.argmin(a, b);
  return tour[i1];
}

Suffix array LCP queries

The Kasai LCP array gives the longest common prefix between adjacent suffixes in suffix-array order. To compute LCP between any two suffixes, take the minimum of LCP values over the corresponding range — a classic sparse-table application. Used in string-matching and bioinformatics for substring queries.

Path-min queries on static trees

Combine LCA with auxiliary depth-prefix arrays to answer "minimum edge weight on path (u, v)" in O(1) per query after O(n log n) preprocessing. Used in network reliability and competitive programming problems where edges are immutable.

Why sparse table matters

  • Truly static RMQ. Geometric and stringology problems often build immutable data once and query many times. Sparse table is the right tool: O(1) queries, no log factor.
  • LCA in trees. The reduction LCA → RMQ via Euler tour gives O(1) per query LCA — used in tree path queries, generic offline tree algorithms, and persistent-segment-tree-on-tree problems.
  • Suffix array LCP. The full string toolbox (suffix arrays, LCP arrays, sparse-table RMQ) gives O(n log n + q) total time for q suffix-prefix-comparison queries — fundamental in bioinformatics and search engines.
  • Hot inner loops. When millions of range-min queries dominate runtime (computational geometry sweep lines, simulation snapshots), the constant-factor advantage over segment-tree (1 vs 20 ops per query) compounds.
  • Disjoint sparse table for any associative op. Generalize to sum, product, matrix multiply via the disjoint variant — same O(1) query, same O(n log n) build, no idempotency requirement.
  • Plus-Minus-1 RMQ → O(1) build and query. The Euler-tour depth sequence has differences ±1 (adjacent in DFS), enabling Fischer-Heun structures: O(n) build, O(1) query — true linear-time RMQ for this special case.
  • Memory cost is acceptable. O(n log n) is about 20× n for n = 10⁶ — usually fits in cache or RAM. Compare to segment tree's 4n memory.

Common misconceptions

  • "Works for any operation." Only idempotent operations (x ⊕ x = x). Sum, product, average all fail because the overlap doubles their contribution.
  • "Supports point updates." No — a single update potentially affects O(log n) levels at multiple positions; rebuild from scratch when data changes.
  • "O(n²) memory." No — only powers of 2 are stored, giving O(n log n) total. The naive every-(l, r)-pair table would be O(n²); the sparse-table insight is that idempotency lets you skip non-power-of-2 lengths.
  • "Faster than segment tree always." Build is slower (O(n log n) vs O(n)). Query is faster (O(1) vs O(log n)). Pick based on whether you need updates, and whether queries dominate build.
  • "log table is optional." Computing ⌊log₂(r − l + 1)⌋ via floating-point log2 is slow and may have rounding errors. Always precompute log[i] = log[i/2] + 1 for i = 1..n.
  • "Replaces Fenwick tree." No — Fenwick tree handles dynamic prefix sums in O(log n) per op. Sparse table is for static idempotent range queries — different problems.
  • "Returns the index, not the value." Standard sparse table stores values. To get the index of the min/max, store (value, index) pairs and break ties by index — small extra constant.

Frequently asked questions

Why does idempotency matter?

An operation ⊕ is idempotent if x ⊕ x = x for any x. Min, max, gcd, bitwise-AND, bitwise-OR all qualify. The sparse-table query overlaps two precomputed intervals — covering position p in both — and combines them with ⊕. For idempotent ⊕, double-counting position p makes no difference because (value at p) ⊕ (value at p) = (value at p). Sum and product are not idempotent: 5 + 5 ≠ 5. So sparse table cannot answer sum or product range queries; for those, use a Fenwick tree or segment tree.

How does the doubling work?

Build table[k][i] = answer for range [i, i + 2^k − 1]. Base case k = 0: table[0][i] = a[i]. Recurrence: table[k][i] = combine(table[k−1][i], table[k−1][i + 2^(k−1)]). Each level k covers ranges of length 2^k by combining two ranges of length 2^(k−1). Total levels: ⌊log₂ n⌋ + 1. Total cost: O(n log n) cells, each filled in O(1). Query [l, r]: take k = ⌊log₂(r − l + 1)⌋. Two intervals of length 2^k starting at l and ending at r cover [l, r] (with overlap when 2^(k+1) > r − l + 1).

What if I need updates?

Sparse table is purely static. Any point update at position i potentially invalidates O(log n) cells per level for each k that touches i — total O(log² n) updates needed, but the cells are interdependent in ways that make the rebuild costly. The textbook answer: use a segment tree. It supports O(log n) point updates and O(log n) range queries — slightly slower per query than sparse table's O(1), but correct under updates. For mostly-static data with a few updates, rebuild the sparse table from scratch in O(n log n) when a batch of updates is ready.

Why O(n log n) memory and not O(n²)?

A naive RMQ structure storing the answer for every (l, r) pair would take O(n²) cells — wasteful. The sparse-table insight: for idempotent operations, you don't need every length, only powers of two. ⌊log₂ n⌋ + 1 ≈ 20 levels for n = 10⁶ — so the table has roughly 20n cells, not n². Memory: O(n log n) bits or O(n log n) words depending on cell size. For n = 10⁶, that's ~20 million int32 cells = 80 MB — manageable but not free.

Can it answer sum queries?

No — sum is not idempotent. The query trick relies on double-counting positions in the overlap of two power-of-two intervals; for sum, those positions get added twice, giving the wrong answer. Use a prefix-sum array (O(n) preprocessing, O(1) query) for static sum queries, or a Fenwick tree (O(log n) update, O(log n) query) for dynamic. For static range-sum, prefix sums beat sparse table on every metric.

How does Disjoint Sparse Table extend it?

A disjoint sparse table partitions queries differently: for each query [l, r], find the highest bit position where l and r differ; combine prefix from that bit's midpoint going right with suffix going left. This requires non-overlapping segments, so non-idempotent operations like sum, product, modular product, and matrix multiplication work correctly. Preprocessing: O(n log n). Query: O(1). Used for general associative range queries on static arrays — drop-in upgrade from sparse table when idempotency is missing.