Graph Algorithms

Borůvka's MST

The 1926 parallel-friendly MST — every component picks its cheapest neighbor at once

Borůvka's algorithm finds a minimum spanning tree by having every component pick its cheapest outgoing edge simultaneously. Components merge in parallel — O(E log V) — the oldest MST algorithm, predating Prim and Kruskal.

  • TimeO(E log V) — log V rounds × O(E) per round
  • SpaceO(V + E) — union-find + edge list
  • ParallelismEach component runs independently — ideal for GPUs
  • Rounds≤ ⌈log₂ V⌉ — each round halves components
  • PredatesKruskal (1956), Prim (1957) — invented 1926
  • Used incuGraph, Gunrock, Karger-Klein-Tarjan O(E) MST

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How Borůvka's algorithm works

Picture eight villages scattered across the countryside. To bring electricity to all of them you want the cheapest network of cables that connects every village — a minimum spanning tree. Otakar Borůvka, asked exactly this question for the Moravian electrical network in 1926, sketched the world's first MST algorithm. His insight was almost embarrassingly direct: every village, in parallel, picks its cheapest possible connection. The wires don't need a central plan; the village-level decisions collectively converge on the global optimum.

The algorithm:

  1. Initialize. Each vertex is its own component.
  2. Round. For every component, find the cheapest edge with one endpoint inside the component and the other endpoint outside. Add all such "minimum outgoing edges" to the MST simultaneously. Merge the components joined by these edges using union-find.
  3. Repeat until only one component remains (the graph is connected) or no component has any outgoing edges (graph was disconnected — produces minimum spanning forest).

Why does it work? It's the cut property: for any cut of the graph, the lightest edge crossing it is in some MST. Each component's minimum outgoing edge crosses some cut (the cut between the component and the rest of the graph), so it's safe to take. Tie-breaking matters — if two components might pick the same parallel edge, deterministic tie-breaking (by edge ID, lexicographic, etc.) ensures consistency.

Why it's parallel-friendly

Kruskal needs the edges sorted globally — a sequential bottleneck. Prim grows a single tree from a starting vertex — also sequential, even if you parallelize the priority queue. Borůvka asks every component to do its own work simultaneously with no global coordination beyond the union-find at the end of each round. That maps cleanly to:

  • GPU. One thread block per component, each scanning its edges to find the local minimum. The cuGraph and Gunrock GPU graph libraries both implement Borůvka for this reason.
  • Distributed graph (Pregel-style). Each vertex sends its "candidate minimum" to its component leader; leader compares; components merge via broadcast.
  • SIMD vectorization. Per-vertex scans vectorize trivially across edge lists.

The result: Borůvka beats Prim and Kruskal in production on multi-core or accelerator hardware. On a single core it's competitive — slightly slower than Kruskal for sparse graphs because of the per-round overhead, but the gap is small.

Walk-through example

Graph (8 vertices, 12 edges):
  edges (u, v, weight)
  (A, B, 1)   (A, C, 4)   (B, C, 3)   (B, D, 6)
  (C, D, 2)   (D, E, 8)   (E, F, 5)   (E, G, 7)
  (F, G, 9)   (F, H, 10)  (G, H, 11)  (D, F, 12)

Round 1 — each vertex picks its minimum outgoing edge:
  A picks (A, B, 1)
  B picks (A, B, 1)      — same edge, fine
  C picks (C, D, 2)
  D picks (C, D, 2)
  E picks (E, F, 5)
  F picks (E, F, 5)
  G picks (E, G, 7)
  H picks (F, H, 10)

  Edges added to MST: {AB, CD, EF, EG, FH}
  Components after merge: {A,B}, {C,D}, {E,F,G,H}  →  3 components

Round 2 — each component picks its minimum outgoing edge:
  {A,B}      → (B, C, 3)
  {C,D}      → (B, C, 3)
  {E,F,G,H}  → (D, E, 8)

  Edges added: {BC, DE}
  Components after merge: {A,B,C,D,E,F,G,H}  →  1 component  →  done.

MST edges: {AB(1), CD(2), BC(3), EF(5), EG(7), DE(8), FH(10)}
MST total: 1 + 2 + 3 + 5 + 7 + 8 + 10 = 36

7 edges = V - 1.   2 rounds total.

Borůvka vs Kruskal vs Prim

BorůvkaKruskalPrim
ApproachEvery component picks its cheapest outgoing edge in parallelSort all edges, take cheapest non-cyclingGrow one tree, extract-min the cheapest extending edge
Data structureUnion-findSorted edge list + union-findPriority queue (heap)
TimeO(E log V)O(E log E) = O(E log V)O((V+E) log V)
Sequential easeModerate — round logic + tie-breakingEasiestModerate
ParallelismExcellent — rounds parallelizeLimited — global sort bottleneckLimited — sequential growth
Best withGPU / multi-core hardwareSparse graphs, single-threadedDense graphs with adjacency matrix
Year192619561957 (Jarník 1930)
Famous usercuGraph, Gunrock, KKT linear-time MSTLEDA, networkx, intro textbooksNetwork design pedagogy

For pedagogy and single-threaded code, Kruskal remains simplest. For dense graphs with adjacency matrices, Prim's O(V²) variant wins. For GPU and distributed graph compute, Borůvka rules.

Pseudo-code

boruvka(V, E):
    uf = UnionFind(V)
    mst = []
    num_components = V

    while num_components > 1:
        // For each component, find its minimum outgoing edge.
        cheapest = dict()  // component_root -> (u, v, w) or NIL
        for (u, v, w) in E:
            ru, rv = uf.find(u), uf.find(v)
            if ru == rv: continue       // internal edge
            if ru not in cheapest or w < cheapest[ru][2] or
              (w == cheapest[ru][2] and (u, v) < (cheapest[ru][0], cheapest[ru][1])):
                cheapest[ru] = (u, v, w)
            if rv not in cheapest or w < cheapest[rv][2] or
              (w == cheapest[rv][2] and (u, v) < (cheapest[rv][0], cheapest[rv][1])):
                cheapest[rv] = (u, v, w)

        if no cheapest entries: break  // disconnected

        // Add all selected edges, merging components.
        for (u, v, w) in cheapest.values():
            if uf.union(u, v):         // false if already merged this round
                mst.append((u, v, w))
                num_components -= 1

    return mst

Python implementation

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return False
        if self.rank[rx] < self.rank[ry]:
            self.parent[rx] = ry
        elif self.rank[rx] > self.rank[ry]:
            self.parent[ry] = rx
        else:
            self.parent[ry] = rx
            self.rank[rx] += 1
        return True

def boruvka(n, edges):
    """edges: list of (u, v, weight)."""
    uf = UnionFind(n)
    mst = []
    components = n

    while components > 1:
        cheapest = {}  # root -> (idx, weight)
        for idx, (u, v, w) in enumerate(edges):
            ru, rv = uf.find(u), uf.find(v)
            if ru == rv: continue
            for r in (ru, rv):
                cur = cheapest.get(r)
                if cur is None or w < cur[1] or (w == cur[1] and idx < cur[0]):
                    cheapest[r] = (idx, w)

        if not cheapest:
            break  # disconnected

        # Add chosen edges
        added = set()
        for idx, _ in cheapest.values():
            if idx in added: continue
            added.add(idx)
            u, v, w = edges[idx]
            if uf.union(u, v):
                mst.append((u, v, w))
                components -= 1

    return mst

# Example
edges = [
    (0, 1, 1), (0, 2, 4), (1, 2, 3), (1, 3, 6),
    (2, 3, 2), (3, 4, 8), (4, 5, 5), (4, 6, 7),
    (5, 6, 9), (5, 7, 10), (6, 7, 11), (3, 5, 12),
]
mst = boruvka(8, edges)
print(sum(w for _, _, w in mst))  # 36

JavaScript implementation

class UnionFind {
  constructor(n) { this.p = Array.from({ length: n }, (_, i) => i); this.r = new Array(n).fill(0); }
  find(x) {
    let root = x;
    while (this.p[root] !== root) root = this.p[root];
    while (this.p[x] !== root) { const next = this.p[x]; this.p[x] = root; x = next; }
    return root;
  }
  union(x, y) {
    const rx = this.find(x), ry = this.find(y);
    if (rx === ry) return false;
    if (this.r[rx] < this.r[ry]) this.p[rx] = ry;
    else if (this.r[rx] > this.r[ry]) this.p[ry] = rx;
    else { this.p[ry] = rx; this.r[rx]++; }
    return true;
  }
}

function boruvka(n, edges) {
  const uf = new UnionFind(n);
  const mst = [];
  let components = n;

  while (components > 1) {
    const cheapest = new Map();  // root -> { idx, w }
    edges.forEach(([u, v, w], idx) => {
      const ru = uf.find(u), rv = uf.find(v);
      if (ru === rv) return;
      for (const r of [ru, rv]) {
        const cur = cheapest.get(r);
        if (!cur || w < cur.w || (w === cur.w && idx < cur.idx)) {
          cheapest.set(r, { idx, w });
        }
      }
    });
    if (cheapest.size === 0) break;

    const added = new Set();
    for (const { idx } of cheapest.values()) {
      if (added.has(idx)) continue;
      added.add(idx);
      const [u, v, w] = edges[idx];
      if (uf.union(u, v)) {
        mst.push([u, v, w]);
        components--;
      }
    }
  }
  return mst;
}

When to use Borůvka

  • GPU graph compute. cuGraph and Gunrock implement Borůvka because each round is data-parallel — perfect for thousands of GPU cores. A billion-edge MST runs in seconds on a single A100.
  • Distributed graph processing (Pregel, Giraph, GraphX). Per-vertex computation with synchronization barriers between rounds — Borůvka's structure is exactly this.
  • Multi-core CPU. The per-round edge scan parallelizes trivially with OpenMP. For 64-core machines on sparse graphs, Borůvka beats sequential Kruskal.
  • Building block for KKT linear-time MST. Karger-Klein-Tarjan's randomized expected-O(E) algorithm uses three Borůvka phases up front to shrink the graph fast enough that random sampling can finish.
  • Streaming or insertion-only graphs. Borůvka rounds adapt naturally — re-run on new edges without redoing prior work.

Common pitfalls

  • Equal-weight edges without tie-breaking. The most common bug. If two components both pick a different parallel edge of the same weight, they may merge with the wrong neighbor and a cycle can sneak in. Always include a deterministic secondary key (edge index, lexicographic endpoint ids).
  • Double-adding edges. When two components both pick the same edge (typical — A picks edge AB, B picks edge AB), the algorithm should add the edge once, not twice. The union-find `union()` returning false on already-merged sets is the standard guard.
  • Forgetting internal edges become "internal" mid-round. Once components merge, edges that used to be between them become internal. The cheapest-edge scan must use `find()` for both endpoints, not raw vertex IDs.
  • Naive components-as-arrays. Tracking "which vertex belongs to which component" via a separate array, you must rebuild it each round. Union-find with path compression handles this implicitly and is far simpler.
  • Endless loop on disconnected graphs. If a component has no outgoing edges, it'll never merge. Detect this case explicitly — terminate if no component picked any edge this round.
  • Pretending sequential implementation is fast. Borůvka's appeal is parallelism. On one core, Kruskal usually wins. If you're implementing Borůvka sequentially, ask whether you actually need it.

Performance analysis

Each round scans all E edges once: O(E) work per round (the find() calls amortize to O(α(V)) each). Number of rounds is bounded by O(log V) because each round at least halves the component count. Total: O(E log V).

Memory: O(V + E) — union-find array, edge list, the per-round cheapest map.

Real numbers (single-core Python):

  • V = 1000, E = 5000: ~10 × 5000 = 5 × 10⁴ ops — under 5 ms.
  • V = 100,000, E = 1,000,000: ~17 × 10⁶ ops — ~200 ms (Python), ~5 ms (C++).
  • V = 10⁹, E = 10¹⁰ (huge web graph): impossible single-threaded. On a 4-GPU node with cuGraph: ~30 seconds.

Parallel speedup: with p processors, the per-round edge scan finishes in O(E/p) time. Rounds remain log V. Total parallel time: O((E/p) log V). For p = 1024 GPU cores on a billion-edge graph, that's a 1000× speedup over sequential — exactly the regime where Borůvka shines.

History — the first MST algorithm

Otakar Borůvka was a 27-year-old Czech mathematician in 1926. The West Moravian Power Plants asked the Brno Mathematical Society how to design an efficient regional electricity-distribution network. Borůvka responded with the algorithm now bearing his name, published in Czech under the title "O jistém problému minimálním" (On a Certain Minimal Problem).

His paper was prescient: it identified the abstract problem (minimum spanning tree on a weighted graph), proved the algorithm correct, and discussed practical issues like tie-breaking. The English-speaking algorithmic community didn't see the result for 50 years; Kruskal (1956) and Prim (1957) rediscovered MST independently. Only with the parallel-algorithms renaissance of the 1980s did Borůvka's parallel structure get its due. Today, in the GPU era, his 1926 algorithm is the natural choice.

Frequently asked questions

Why is Borůvka's algorithm parallel-friendly?

Every round, each component independently computes its own minimum outgoing edge — no coordination required. With c components and p processors, the per-round work distributes as (E/p) per processor in the simplest implementation. Modern GPU MST solvers use this property to compute MSTs of billion-edge graphs in seconds. Kruskal needs a global sort; Prim grows a single tree sequentially. Borůvka parallelizes.

How many rounds does Borůvka run?

At most ⌈log₂ V⌉ rounds. Each round at least halves the number of components — every component absorbs at least one neighbor when it picks its minimum outgoing edge. Starting from V components, after k rounds at most V/2^k components remain. log₂ V rounds, each O(E) work, gives the O(E log V) total.

Why must edge weights be unique?

If two components pick the same edge as their respective minimums, that's fine — the edge connects them into one merged component. But if two parallel edges between the same two components have the same minimum weight, components could each pick a different parallel edge and create a cycle. Break ties consistently (lexicographic by endpoint IDs is the standard trick). Real implementations either preprocess ties or use deterministic tie-breaking in the edge comparison.

What's the history — really older than Prim and Kruskal?

Yes. Otakar Borůvka published his algorithm in Czech in 1926, motivated by an electric-grid problem in Moravia. Kruskal published in 1956, Prim in 1957 (Jarník had actually published the same idea in 1930). The Borůvka paper sat in obscurity for decades; it was rediscovered in the parallel-algorithms revival of the 1980s when researchers realized it was inherently the most parallel of the three.

How does Borůvka compare to Karger-Klein-Tarjan's randomized linear-time MST?

KKT (1995) gives the best known expected-time MST at O(E). It uses three Borůvka phases as a critical subroutine — those Borůvka phases collapse the graph fast enough that random sampling can then finish the job. So Borůvka is not just a teaching algorithm; it's the warmup for the asymptotically fastest MST algorithm known.

When should I pick Borůvka over Kruskal or Prim?

When you have parallelism — multi-core CPUs, GPUs, distributed graph systems. GPU MST libraries (cuGraph, Gunrock) implement Borůvka because it maps cleanly to data-parallel hardware. For single-threaded code, Kruskal with union-find is simpler. Borůvka also shines on sparse graphs because its per-round work is O(E) without needing a sorted edge list or a heap.

Does Borůvka work on disconnected graphs?

Yes. After log V rounds, the algorithm naturally produces the minimum spanning forest — one MST per connected component. A component with no outgoing edges (the entire graph is contained in it) just stops contributing edges. Final edge count equals V minus the number of connected components.