Algorithms

Mo's Algorithm

Sort queries by √n-block, sweep with two pointers — answers k offline range queries in batch

Mo's algorithm answers q offline range queries on an array of size n in total time O((n+q)√n) by reordering queries cleverly. Sort queries by (left/√n, right) — Mo's order — then maintain two pointers L and R and incrementally update an answer state (count, sum, distinct values, etc.) as L and R move. Pointer movements total O((n+q)√n) thanks to the bucket structure. Used heavily in competitive programming — particularly powerful for distinct-value range queries, mode queries, and inverse counting where segment-tree solutions are awkward.

  • TimeO((n+q)√n)
  • MemoryO(n + q)
  • Block size√n
  • TypeOffline
  • Pointer cost per queryAmortized √n
  • Notable variantMo's on tree (Euler tour)

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How Mo's algorithm works

Maintain two pointers L and R covering the active subarray a[L..R]. Maintain an answer state — frequency table, current sum, distinct-count, whatever the query needs. Two operations:

  • add(i): include a[i] in the active range. Update state.
  • remove(i): exclude a[i] from the active range. Update state.

Process queries in a clever order. For each query (l, r), move L and R toward the target range, calling add/remove on each step:

while (R < r) add(++R);
while (L > l) add(--L);
while (R > r) remove(R--);
while (L < l) remove(L++);

The wrong order — L first, then R — would extend a range past valid bounds and might require remove on uninitialized state. Always extend before shrinking when both move outward.

Mo's sort order

Bucket queries by L-block. Within each block, sort by R. Concretely, with block size B = √n:

const B = Math.floor(Math.sqrt(n));
queries.sort((a, b) => {
  const ab = Math.floor(a.l / B), bb = Math.floor(b.l / B);
  if (ab !== bb) return ab - bb;
  // Tie-break by R; alternate direction per block for slight speedup
  return ab % 2 === 0 ? a.r - b.r : b.r - a.r;
});

The alternating-direction tie-break (Hilbert-style) saves about half the pointer crossings between blocks — a 30-40% wall-clock speedup with the same asymptotic.

Why √n? Pointer-movement analysis

Total pointer movement, with block size B:

  • L pointer: within one block, queries share L's bucket of size B. L moves at most B per query → q queries × B = qB total.
  • R pointer: sorted by R within each block. R moves monotonically right within a block → at most n per block. There are n/B blocks → total n × n/B = n²/B.

Sum: qB + n²/B. Minimize: differentiate w.r.t. B, set qB = n²/B → B² = n²/q → B = n/√q. For q ≈ n, B = √n. Optimal cost = O((n + q)√n).

JavaScript implementation — distinct count

// Answer "how many distinct values in a[l..r]" for q queries
function moDistinct(a, queries) {
  const n = a.length;
  const B = Math.max(1, Math.floor(Math.sqrt(n)));
  const indexed = queries.map((q, i) => ({ ...q, idx: i }));
  indexed.sort((x, y) => {
    const xb = Math.floor(x.l / B), yb = Math.floor(y.l / B);
    if (xb !== yb) return xb - yb;
    return xb % 2 === 0 ? x.r - y.r : y.r - x.r;
  });

  const cnt = new Map();
  let distinct = 0;
  const add = i => {
    const v = a[i];
    const c = cnt.get(v) ?? 0;
    if (c === 0) distinct++;
    cnt.set(v, c + 1);
  };
  const remove = i => {
    const v = a[i];
    const c = cnt.get(v);
    if (c === 1) distinct--;
    cnt.set(v, c - 1);
  };

  let L = 0, R = -1;
  const ans = new Array(queries.length);
  for (const { l, r, idx } of indexed) {
    while (R < r) add(++R);
    while (L > l) add(--L);
    while (R > r) remove(R--);
    while (L < l) remove(L++);
    ans[idx] = distinct;
  }
  return ans;
}

Python implementation — sum of frequencies-squared

import math

def mo_freq_sq(a, queries):
    n = len(a)
    B = max(1, int(math.sqrt(n)))
    indexed = [(l, r, i) for i, (l, r) in enumerate(queries)]
    indexed.sort(key=lambda x: (x[0] // B, x[1] if (x[0] // B) % 2 == 0 else -x[1]))

    cnt = {}
    cur = 0
    def add(i):
        nonlocal cur
        c = cnt.get(a[i], 0)
        cur += 2 * c + 1   # (c+1)^2 - c^2
        cnt[a[i]] = c + 1
    def remove(i):
        nonlocal cur
        c = cnt[a[i]]
        cur -= 2 * c - 1   # c^2 - (c-1)^2
        cnt[a[i]] = c - 1

    L, R = 0, -1
    ans = [0] * len(queries)
    for l, r, idx in indexed:
        while R < r: R += 1; add(R)
        while L > l: L -= 1; add(L)
        while R > r: remove(R); R -= 1
        while L < l: remove(L); L += 1
        ans[idx] = cur
    return ans

Mo's vs segment tree vs sqrt decomposition

StructureBuildQueryUpdateBest for
Mo's algorithmO(n + q log q)O((n+q)√n) totalOffline onlyDistinct, mode, inverse counts
Segment treeO(n)O(log n)O(log n)Sum, min, gcd, associative merges
Sqrt decompositionO(n)O(√n)O(√n)Mixed query/update, simple aggregates
Persistent segment treeO(n log n)O(log n)O(log n) — copy versionOnline distinct-count, kth element

For n = 10⁵ and q = 10⁵: Mo's ≈ 2 × 10⁵ × √(10⁵) ≈ 6 × 10⁷ operations — fits in 1 second. Segment tree on the same: 10⁵ × 17 ≈ 2 × 10⁶ — instant, but only if the aggregate is mergeable.

Famous Mo's-algorithm problems

SPOJ DQUERY — distinct values in range

Given array a, q queries (l, r) — count distinct values in a[l..r]. Classic Mo's problem. Each add/remove is O(1) on a frequency table. Total O((n+q)√n).

CF Powerful Array

For each query, output Σ K_s × s² where K_s = frequency of value s in the range. Each add changes the answer by 2c + 1 (incremental update of (c+1)² − c²). Same Mo's framework.

Path queries on trees

Convert to Euler tour and run Mo's on the linearized sequence with toggle semantics: add when first seeing a vertex in the range, remove when seeing it again. Path (u, v) maps to a contiguous range on the Euler tour.

Why Mo's matters

  • When standard structures fail. Segment trees and Fenwick trees need an associative merge for range aggregates. Distinct counts, mode, and inversion counts have no clean merge — Mo's sidesteps the problem by avoiding merges entirely.
  • Competitive programming staple. Codeforces, ICPC, and IOI have featured Mo's-friendly problems regularly since 2010. Block-size tuning and Hilbert ordering are well-known meta-tricks among top competitors.
  • Frequency-of-frequencies queries. "How many values appear exactly k times?" or "What is the most-common value's count?" — point updates to a frequency table cost O(1), exactly what Mo's needs per add/remove.
  • Inverse-counting queries. Number of inversions in a range, number of pairs (i, j) with a[i] = a[j] — these have no simple merge but plug into Mo's by maintaining a running pair count.
  • Mo's on tree. Path-aggregate queries on rooted trees become Mo's-on-Euler-tour. Generalizes to subtree queries via DFS-order ranges.
  • Mo's with updates. Add a third sort key (time) to handle point updates between queries; total cost O(n^(5/3)) but conceptually identical.
  • Cache-friendly when q is huge. The pointer sweep is sequential — CPU prefetchers love it. For q ≫ n, Mo's often beats the asymptotically-better segment tree on wall-clock due to memory access pattern.

Common misconceptions

  • "Always √n optimal." Hilbert-curve ordering can be faster by a constant. For q ≪ n, optimal block size is n/√q rather than √n. The √n is the q ≈ n case.
  • "Only for arrays." Extends to trees (via Euler tour), to grids (via space-filling curve linearization), and even to multi-dimensional ranges in some restricted cases.
  • "Add/remove order doesn't matter." When both pointers move outward, you must add before any remove — otherwise you may operate on uninitialized state. The standard "extend then shrink" template is the safe order.
  • "Online version exists." The fundamental analysis depends on reordering. Online range-query structures (persistent segment trees) achieve similar problems but are different algorithms entirely.
  • "O(1) add/remove always available." Some queries need O(log n) per add/remove (e.g. when maintaining a sorted multiset). Total cost becomes O((n+q)√n log n) — still useful but not as tight.
  • "Replaces segment trees." For mergeable aggregates (sum, min, gcd) segment tree is faster (log n per query). Mo's wins only when merges are infeasible.
  • "Block size is universal." n^(2/3) for Mo's-with-updates, √(2n²/q) for q-tuned setups. Always profile.

Frequently asked questions

Why √n block size?

Total pointer movement breaks into two budgets: L moves and R moves. With block size B, L can move at most O(B) per query within a block (since queries in the same block share L's bucket), so total L-cost is O(qB). R moves monotonically inside each block (amortized O(n) per block) for a total of O(n × n/B) = O(n²/B). Sum: qB + n²/B. Minimize over B: differentiate and set q = n²/B², giving B = n/√q. When q ≈ n, B = √n is optimal — total cost O((n+q)√n) = O(n√n). For other q, scale B accordingly.

How does the sort order minimize pointer movements?

Sort queries by (l/√n, r) — primary key is left-pointer's block, secondary is right pointer. Within one block all queries share L's bucket, so L moves at most √n per query (bounded by block size), totaling O(q√n) for L. R moves monotonically within each block (always to the right), so R's total movement per block is O(n) — across n/√n = √n blocks, that's O(n√n) for R. Combined: O((n+q)√n). The trick is that R's monotonicity per block depends on the secondary sort being just r — not (l, r) jointly.

When does Mo's beat segment trees?

When the answer aggregator is hard to merge across two halves. Segment trees require an associative merge — sum, min, gcd are easy. Distinct-element count, mode, and inverse counts are awkward (or impossible) to merge in O(log n) per node. Mo's just adds and removes elements one at a time — any answer that supports O(1) or O(log n) point updates is fair game. For sum/min/gcd, segment trees beat Mo's by a √n factor (log n vs √n per query). For distinct counts, Mo's is the clean answer.

Why is it offline only?

Mo's reorders queries before answering them. Online algorithms must answer each query as it arrives, in input order. Mo's needs to see the full query batch first to sort it; only then does it sweep the pointers. If queries arrive in a stream and must be answered immediately, you cannot reorder, and Mo's analysis breaks down. Workarounds for streaming exist (Mo's-on-tree with persistence, segment tree variants), but plain Mo's is fundamentally offline.

What are typical update functions?

add(i): include element a[i] in the active range — for distinct-count, increment a frequency table cnt[a[i]]; if cnt becomes 1, increment distinct-count answer. remove(i): the inverse — decrement cnt[a[i]]; if cnt becomes 0, decrement answer. For sum: ans += a[i] / ans -= a[i]. For mode (most frequent value): maintain a frequency-of-frequencies bucket and update on each step. The pattern is: any answer aggregator that supports O(1) point insert/remove plugs into Mo's directly.

How does Mo's-on-tree work?

Compute the Euler tour: a sequence of length 2n where each vertex appears at its enter time tin[v] and exit time tout[v]. A path query (u, v) on the tree becomes a range query on the Euler tour with a small adjustment: nodes appearing exactly once in the range [tin[u], tin[v]] are on the path; nodes appearing twice are not. Mo's algorithm runs over Euler-tour ranges with toggle semantics (add when seeing a vertex first time, remove when seeing it again). Time O((n+q)√n) for path-aggregate queries on trees.