Algorithms

Kruskal's MST

Sort edges, take cheapest non-cycling — greedy minimum spanning tree

Kruskal's algorithm finds a minimum spanning tree of a weighted graph by processing edges in ascending weight order, adding each one if it doesn't create a cycle. Powered by union-find for fast cycle detection, it runs in O(E log E). Used in network design, clustering, image segmentation, and as a textbook example of how greedy algorithms can be provably optimal.

  • TimeO(E log E) — dominated by edge sort
  • SpaceO(V) — union-find structure
  • ApproachGreedy — sort edges, take if non-cycling
  • Cycle detection viaUnion-find with path compression + union by rank
  • Vs Prim'sBetter for sparse graphs; Prim's wins on dense
  • Used inNetwork design, clustering, image segmentation

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How Kruskal's algorithm works

Three steps:

  1. Sort all edges by weight ascending. O(E log E).
  2. Initialize a union-find structure with one component per vertex. O(V).
  3. Iterate through sorted edges. For each edge (u, v, w): if u and v are in different components (find(u) ≠ find(v)), add the edge to the MST and union the components. O(E α(V)).

Stop when V-1 edges have been added (forms a spanning tree) or when all edges are processed (graph may be disconnected). Total time — O(E log E), dominated by the sort.

Walk-through example

Graph with 6 vertices, 9 edges:

Edges sorted by weight:
  (A, B, 1)   take — A and B in different components → union
  (D, F, 2)   take — D and F different → union
  (B, C, 3)   take — B's component {A,B} different from C's {C} → union
  (D, E, 4)   take — D's component {D,F} different from E's {E} → union
  (E, F, 5)   skip — E and F now in same component (would create cycle)
  (B, F, 6)   take — {A,B,C} different from {D,E,F} → union
              MST has 5 edges (= V - 1) — done.
  (remaining edges skipped — no longer needed)

MST total weight: 1 + 2 + 3 + 4 + 6 = 16

The algorithm took 5 edges (V-1 for V=6), skipping any edge that would create a cycle. Union-find made cycle checks trivial — find(u) and find(v) compared in O(α) per check.

Kruskal vs Prim

KruskalPrim
ApproachSort edges, take lightest non-cyclingGrow tree from start vertex, take lightest extending edge
Data structureSorted edge list + union-findMin-priority queue (heap)
Time (sparse, E ≈ V)O(V log V)O((V + E) log V)
Time (dense, E ≈ V²)O(V² log V)O(V²) with adj matrix
Code complexity~30 lines + union-find~25 lines + heap
Disconnected graphsProduces minimum spanning forest naturallyOnly sees one component
Best forSparse graphs, simple implementationDense graphs, when you have a heap library

For most real graphs (sparse), Kruskal and Prim are competitive. Pick whichever your team is more comfortable with; they produce equivalent results.

JavaScript implementation

// Requires the UnionFind class from the union-find article
function kruskalMST(n, edges) {
  // edges: [[u, v, weight], ...] where u, v are 0..n-1
  edges.sort((a, b) => a[2] - b[2]);

  const uf = new UnionFind(n);
  const mst = [];
  let totalWeight = 0;

  for (const [u, v, w] of edges) {
    if (uf.union(u, v)) {  // returns true iff actually merged (no cycle)
      mst.push([u, v, w]);
      totalWeight += w;
      if (mst.length === n - 1) break; // tree complete
    }
  }

  return { mst, totalWeight, isSpanningTree: mst.length === n - 1 };
}

// Usage
const edges = [
  [0, 1, 1], [0, 5, 6], [1, 2, 3],
  [3, 4, 4], [3, 5, 2], [4, 5, 5],
];
const { mst, totalWeight, isSpanningTree } = kruskalMST(6, edges);
console.log('MST weight:', totalWeight);  // 16
console.log('Spanning tree?', isSpanningTree);  // true

Python implementation

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
    def find(self, x):
        root = x
        while self.parent[root] != root: root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root
    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return False
        if self.rank[rx] < self.rank[ry]: self.parent[rx] = ry
        elif self.rank[rx] > self.rank[ry]: self.parent[ry] = rx
        else: self.parent[ry] = rx; self.rank[rx] += 1
        return True

def kruskal_mst(n, edges):
    edges = sorted(edges, key=lambda e: e[2])
    uf = UnionFind(n)
    mst = []
    total = 0
    for u, v, w in edges:
        if uf.union(u, v):
            mst.append((u, v, w))
            total += w
            if len(mst) == n - 1: break
    return mst, total

When to use Kruskal's

  • Network design. Find the cheapest set of cables to connect every office. Edges are possible cables; weights are installation costs. MST gives the minimum-cost connection.
  • Clustering and segmentation. Hierarchical clustering — sort edges by similarity, merge clusters until you have k components. Image segmentation (Felzenszwalb-Huttenlocher) does this with pixel-similarity edges.
  • Approximation algorithms for the traveling salesman problem. Christofides' algorithm uses MST as a building block — its cost is at most 1.5× the optimal TSP tour for metric instances.
  • Critical-path analysis in networks. Failure of the heaviest edge in the MST disconnects the network minimally.
  • Building blocks for harder problems. Many algorithms reduce to MST — Borůvka's, Steiner trees, partial spanning trees.

Common bugs and pitfalls

  • Forgetting to stop after V-1 edges. Continuing past V-1 doesn't change the MST (subsequent edges all create cycles), but wastes work. Cheap optimization.
  • Not using union-find for cycle detection. Naive cycle detection (DFS each time) is O(V) per edge, total O(V·E) — much slower than union-find's O(α(V)) per edge.
  • Failing on graphs with parallel edges. Multiple edges between the same pair of vertices are fine — only the lightest gets accepted (others create cycles). But check that your edge representation doesn't lose information when sorting.
  • Assuming the graph is connected. On a disconnected graph, Kruskal's produces a minimum spanning forest. Document or check whether the input is connected; report disconnection by checking the final edge count.
  • Using Kruskal's when you need a Steiner tree. MST connects all vertices. Steiner tree connects a SUBSET of vertices, possibly using extra "Steiner points" not in the original set. NP-hard in general; Kruskal's is the wrong tool.

Frequently asked questions

What is a minimum spanning tree?

Given a connected, undirected, weighted graph, an MST is a subset of edges that connects all vertices with the minimum possible total weight and no cycles. For n vertices, the MST has exactly n-1 edges. There may be multiple MSTs of equal weight; the algorithm returns one of them.

Why does Kruskal's greedy approach produce an optimal MST?

The cut property — for any cut of the graph (split into two halves), the lightest edge crossing the cut is in some MST. Kruskal's processes edges in ascending weight; the first edge that crosses any pair-of-components-becoming-one-component is the lightest edge across that cut, so taking it is safe. Formal proof by exchange argument — any MST can be transformed into Kruskal's output without increasing total weight.

How is Kruskal's different from Prim's?

Both find MSTs. Kruskal's processes edges in weight order, joining disjoint components. Prim's grows a single tree from a starting vertex, repeatedly adding the cheapest edge from the tree to outside it. Time complexity — Kruskal O(E log E), Prim O((V+E) log V) with binary heap. For sparse graphs (E ≈ V), Kruskal is simpler and competitive. For dense graphs (E ≈ V²), Prim with adjacency matrix is O(V²), beating Kruskal's O(V² log V).

Why does Kruskal's use union-find?

To check whether adding an edge would create a cycle. An edge (u, v) creates a cycle iff u and v are already in the same connected component. Union-find checks this in O(α(n)) amortized — practically constant. After taking the edge, union the two components. Across all m edges, total cycle-check + union work is O(m α(n)) — virtually linear.

What's the cut property?

For any cut C (partition of vertices into two non-empty sets), the lightest edge crossing C must be in some MST. Why — if an MST didn't include the lightest crossing edge, you could swap a heavier edge in the MST with the lighter one, reducing total weight (contradiction). Kruskal's processes edges in weight order, so when it processes the lightest cross-cut edge, it's safe to take.

Does Kruskal's work on disconnected graphs?

Sort of. Run on a disconnected graph, it produces a minimum spanning forest — one MST per connected component. Each component's MST has (component_size - 1) edges. Total edges in the forest = V - (number of components). Detecting the disconnection — at the end of the algorithm, check whether you have V-1 edges; if fewer, the graph wasn't connected.

How does Kruskal's relate to image segmentation?

The Felzenszwalb-Huttenlocher segmentation algorithm builds a graph where pixels are vertices and edge weights measure pixel similarity (low weight = similar). Kruskal's processes edges in similarity order, merging components as long as the merge doesn't violate a similarity threshold. The final set of components is the segmentation. Linear time in pixel count via union-find.