Graph Algorithms

Hopcroft-Karp

Bipartite matching in O(E√V) — many augmenting paths per phase

Hopcroft-Karp finds a maximum bipartite matching in O(E√V) — better than naive O(VE). Layered BFS picks the level structure; DFS finds vertex-disjoint augmenting paths in parallel. The 1973 classic that bipartite assignment problems still hinge on.

TimeO(E√V) — best known for bipartite
SpaceO(V + E) — adjacency + level array + match array
ApproachPhased: BFS layers + DFS vertex-disjoint paths
Phases run≤ √V — paths grow strictly longer each phase
Vs naiveO(E√V) instead of O(VE) — √V faster
PublishedHopcroft & Karp 1973

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How Hopcroft-Karp works

A bipartite graph has its vertices split into two disjoint sides, L and R, with every edge connecting a vertex in L to one in R. A matching is a set of edges such that no vertex appears in more than one edge. A maximum matching has as many edges as possible.

The classical algorithm — repeated augmenting paths via DFS — runs in O(VE). Hopcroft and Karp's 1973 insight was that you can augment along many vertex-disjoint shortest paths in each phase, not just one. Each phase:

BFS layers. Starting from every unmatched L-vertex simultaneously, BFS down through R-side and back through matched edges. Each vertex gets a level number. Stop at the first layer that contains an unmatched R-vertex — that's the target depth.
DFS in layers. From each unmatched L-vertex, DFS along edges that go from layer i to layer i+1 only. If you reach an unmatched R-vertex at the target depth, augment along the path (flip matched/unmatched on every edge). Mark all visited vertices as used; the next DFS starts fresh from a different unmatched L-vertex.

The two-step structure has a beautiful property: each phase strictly increases the shortest-augmenting-path length. After √V phases, paths are at least √V long, but at most V/√V = √V more augmentations fit — so the algorithm completes in at most O(√V) phases. Each phase costs O(E). Total: O(E√V).

Walk-through example

Left side L = {a, b, c, d}, right side R = {1, 2, 3, 4}, edges:

a — 1
a — 2
b — 1
b — 2
c — 2
c — 3
d — 3
d — 4

Phase 1:
  All L-vertices unmatched. BFS from {a, b, c, d}.
  Level 0: L₀ = {a, b, c, d}      (unmatched L)
  Level 1: R₁ = {1, 2, 3, 4}      (their neighbors, all unmatched)
  Target depth: 1
  DFS:
    a → 1  ✓ (path length 1, augment)
    b → 2  ✓ (1 was used by 'a', so try 2)
    c → 3  ✓ (2 was used)
    d → 4  ✓ (3 was used)
  Matching after phase 1: {a-1, b-2, c-3, d-4}.  Size = 4 (maximum).

Total phases: 1.   Total work: O(E + DFS) ≈ O(E).

This example was too easy. Realistic instances need 2-4 phases on small graphs and ~√V on adversarial ones.

Hopcroft-Karp vs alternatives

	Hopcroft-Karp	Naive augmenting	Hungarian (assignment)	Push-Relabel
Problem	Max bipartite matching	Max bipartite matching	Min-cost bipartite matching	Max bipartite matching (max-flow)
Time	O(E√V)	O(VE)	O(V³)	O(V²√E) practically
Path strategy	BFS layers + DFS, many paths/phase	One DFS path at a time	Reweighting + Hungarian forest	No paths — local pushes
Code complexity	~80 lines	~30 lines	~100 lines	~150 lines
Handles weights?	No — unweighted only	No	Yes (the whole point)	Yes with min-cost variant
Famous use	OR-Tools matching solver	Teaching	Munkres' transportation	FAISS, OR-Tools max-flow

For pure bipartite matching (no weights), Hopcroft-Karp is the standard. For min-cost or weighted matching, use the Hungarian algorithm or a successive-shortest-paths flow approach. For matching on general (non-bipartite) graphs, use Edmonds' blossom algorithm. See Edmonds-Karp for the max-flow generalization.

Pseudo-code

match_L[u] = vertex of R that u is matched to (or NIL)
match_R[v] = vertex of L that v is matched to (or NIL)
level[u]   = BFS layer of L-vertex u (or +∞)

bfs():
    queue = empty
    for u in L:
        if match_L[u] == NIL:
            level[u] = 0
            queue.enqueue(u)
        else:
            level[u] = +∞
    found = false
    while queue not empty:
        u = queue.dequeue()
        for v in adj[u]:
            partner = match_R[v]
            if partner == NIL:
                found = true
            else if level[partner] == +∞:
                level[partner] = level[u] + 1
                queue.enqueue(partner)
    return found

dfs(u):
    for v in adj[u]:
        partner = match_R[v]
        if partner == NIL or (level[partner] == level[u] + 1 and dfs(partner)):
            match_L[u] = v
            match_R[v] = u
            return true
    level[u] = +∞
    return false

hopcroft_karp():
    initialize match_L, match_R = NIL
    matches = 0
    while bfs():
        for u in L:
            if match_L[u] == NIL and dfs(u):
                matches += 1
    return matches

Python implementation

from collections import deque
import math

class HopcroftKarp:
    INF = float('inf')

    def __init__(self, adj, U, V):
        """adj: dict mapping each L-vertex to a list of R-neighbors.
           U, V: numbers of vertices on left/right (vertices are 0..U-1, 0..V-1)."""
        self.adj = adj
        self.U, self.V = U, V
        self.match_L = [-1] * U
        self.match_R = [-1] * V
        self.level = [self.INF] * U

    def bfs(self):
        queue = deque()
        for u in range(self.U):
            if self.match_L[u] == -1:
                self.level[u] = 0
                queue.append(u)
            else:
                self.level[u] = self.INF
        found = False
        while queue:
            u = queue.popleft()
            for v in self.adj.get(u, ()):
                partner = self.match_R[v]
                if partner == -1:
                    found = True
                elif self.level[partner] == self.INF:
                    self.level[partner] = self.level[u] + 1
                    queue.append(partner)
        return found

    def dfs(self, u):
        for v in self.adj.get(u, ()):
            partner = self.match_R[v]
            if partner == -1 or (
                self.level[partner] == self.level[u] + 1 and self.dfs(partner)
            ):
                self.match_L[u] = v
                self.match_R[v] = u
                return True
        self.level[u] = self.INF
        return False

    def run(self):
        matches = 0
        while self.bfs():
            for u in range(self.U):
                if self.match_L[u] == -1 and self.dfs(u):
                    matches += 1
        return matches

# Example: workers a,b,c,d → jobs 0,1,2,3
adj = {
    0: [0, 1],  # a → 1, 2
    1: [0, 1],  # b → 1, 2
    2: [1, 2],  # c → 2, 3
    3: [2, 3],  # d → 3, 4
}
hk = HopcroftKarp(adj, 4, 4)
print(hk.run())            # 4 (perfect matching)
print(hk.match_L)          # [0, 1, 2, 3]

JavaScript implementation

class HopcroftKarp {
  constructor(adj, U, V) {
    this.adj = adj;
    this.U = U; this.V = V;
    this.matchL = new Array(U).fill(-1);
    this.matchR = new Array(V).fill(-1);
    this.level = new Array(U).fill(Infinity);
  }

  bfs() {
    const queue = [];
    for (let u = 0; u < this.U; u++) {
      if (this.matchL[u] === -1) { this.level[u] = 0; queue.push(u); }
      else                       { this.level[u] = Infinity; }
    }
    let found = false;
    while (queue.length) {
      const u = queue.shift();
      for (const v of this.adj[u] || []) {
        const partner = this.matchR[v];
        if (partner === -1) found = true;
        else if (this.level[partner] === Infinity) {
          this.level[partner] = this.level[u] + 1;
          queue.push(partner);
        }
      }
    }
    return found;
  }

  dfs(u) {
    for (const v of this.adj[u] || []) {
      const partner = this.matchR[v];
      if (partner === -1 || (this.level[partner] === this.level[u] + 1 && this.dfs(partner))) {
        this.matchL[u] = v;
        this.matchR[v] = u;
        return true;
      }
    }
    this.level[u] = Infinity;
    return false;
  }

  run() {
    let matches = 0;
    while (this.bfs()) {
      for (let u = 0; u < this.U; u++) {
        if (this.matchL[u] === -1 && this.dfs(u)) matches++;
      }
    }
    return matches;
  }
}

When to use Hopcroft-Karp

Workers ↔ tasks. Assign people to shifts, drivers to cars, GPUs to inference requests. As soon as the relationship is bipartite (no self-edges) and you want maximum coverage, Hopcroft-Karp solves it in milliseconds for thousands of vertices.
Stable-marriage / school choice. The unweighted "is there a perfect matching?" check uses Hopcroft-Karp before applying Gale-Shapley's stability logic.
Kidney exchange (paired-donor stage). National kidney exchange programs first establish maximum bipartite matching between compatible donor-patient pairs.
Image alignment (feature correspondence). Match SIFT/ORB features between two images using bipartite matching on similarity-thresholded edges.
VLSI routing, FPGA assignment. Maps abstract function blocks to physical cells where each cell can host one block.
Crossword/Sudoku style constraint satisfaction. When constraints reduce to "each row/column needs a unique assignment," bipartite matching detects feasibility.

Common pitfalls

Forgetting to reset levels between phases. The level array must be reinitialized for every BFS — leaking levels from the previous phase silently destroys correctness.
DFS not pruning revisited vertices. When a DFS backtrack fails, set level[u] = ∞ so subsequent DFS calls within the same phase don't retry it. Without this pruning the algorithm degrades to O(VE).
Starting BFS only from one unmatched vertex. The whole point is parallelism: enqueue all unmatched L-vertices at level 0. Single-source BFS misses many disjoint paths.
Off-by-one with vertex indexing. L and R have separate id spaces. Mixing them (treating R-vertices as L-vertices in adj lookup) gives nonsense.
Trying to apply Hopcroft-Karp to weighted matching. It only finds max-cardinality, ignoring weights. For min-cost matching use the Hungarian algorithm or min-cost flow.
Calling on non-bipartite graphs. Hopcroft-Karp's correctness depends on the absence of odd cycles. On general graphs it produces a maximal but not necessarily maximum matching. Use blossom (Edmonds') for those.

Performance analysis

Per phase: BFS is O(V + E); DFS over all unmatched L-vertices is also O(V + E) amortized thanks to the level-pruning trick. Each phase is O(E) for connected graphs.

Number of phases: Hopcroft-Karp's celebrated argument bounds this at O(√V). The reasoning: after √V phases, every remaining augmenting path has length ≥ √V. Since the matching can grow by at most V/(path length) in remaining work, only O(√V) further phases are needed. Total: 2√V = O(√V) phases.

Real-world numbers:

V = 100, E = 1000: ~10 × 1000 = 10⁴ ops — under 1 ms.
V = 10,000, E = 100,000: 100 × 100,000 = 10⁷ ops — ~100 ms in Python, ~3 ms in C++.
V = 1,000,000, E = 10,000,000 (graph-of-graphs): 1000 × 10⁷ = 10¹⁰ ops — minutes in Python; production matching solvers handle this in seconds.

Memory: O(V + E) — adjacency lists, match arrays, level array. Negligible by modern standards.

Frequently asked questions

What is a bipartite matching?

In a bipartite graph, vertices split into two disjoint sets L and R and every edge connects L to R. A matching is a set of edges with no shared endpoints — each vertex appears in at most one edge. A maximum matching has as many edges as possible. The classical use case: assigning workers (L) to jobs (R) where each worker can be assigned at most one job and vice versa.

Why is Hopcroft-Karp O(E√V) instead of O(VE)?

Each phase of Hopcroft-Karp finds a maximal set of vertex-disjoint shortest augmenting paths — not just one. The shortest augmenting path length strictly increases between phases, and the longest path has at most √V hops at the boundary where short and long paths transition. So at most O(√V) phases run; each phase is O(E). Total O(E√V). Karzanov gave a parallel proof in 1973.

What's an augmenting path in matching?

A path that alternates between unmatched and matched edges, starting and ending at unmatched vertices. Flipping the alternation increases the matching by exactly one edge. Berge's theorem (1957): a matching is maximum if and only if it has no augmenting paths. Every matching algorithm — Hopcroft-Karp, Hungarian, blossom — boils down to finding augmenting paths.

How does layered BFS find multiple paths per phase?

BFS from all unmatched L-vertices simultaneously builds level layers — L₀ (unmatched left), R₀ (right neighbors of L₀), L₁ (their matched left partners), and so on. The first layer containing an unmatched R-vertex is the target depth. DFS then finds vertex-disjoint paths from L₀ to that R-layer, walking only between consecutive layers. Each DFS marks visited vertices so subsequent searches start from a different L₀ vertex.

Where is bipartite matching used in practice?

Job assignment (workers ↔ jobs, devices ↔ tasks), kidney exchange (donors ↔ patients), college admissions (students ↔ schools via deferred acceptance), online ad auctions (impressions ↔ advertisers), and scheduling (people ↔ time slots). The stable-marriage variant additionally requires no two unmatched pairs prefer each other to their current partners — Gale-Shapley solves that in O(V²).

How does Hopcroft-Karp relate to max-flow?

Bipartite matching reduces to max-flow on a graph with source s connected to all L-vertices, sink t connected to all R-vertices, and edge capacities of 1 everywhere. Maximum flow equals maximum matching. Running Dinic's algorithm on this unit-capacity flow network gives exactly O(E√V) — Hopcroft-Karp is Dinic specialized to bipartite matching. The phases correspond to Dinic's level-graph blocking flows.

What about non-bipartite matching?

General graphs need Edmonds' blossom algorithm. The complication is odd-length cycles in the augmenting path — these have to be contracted into 'blossoms.' Micali-Vazirani extends Hopcroft-Karp to general graphs at O(E√V), but the constant factor is large and the algorithm is notoriously intricate. For bipartite graphs, Hopcroft-Karp remains the practical winner.