Graph Algorithms
Hopcroft-Karp
Bipartite matching in O(E√V) — many augmenting paths per phase
Hopcroft-Karp finds a maximum bipartite matching in O(E√V) — better than naive O(VE). Layered BFS picks the level structure; DFS finds vertex-disjoint augmenting paths in parallel. The 1973 classic that bipartite assignment problems still hinge on.
- TimeO(E√V) — best known for bipartite
- SpaceO(V + E) — adjacency + level array + match array
- ApproachPhased: BFS layers + DFS vertex-disjoint paths
- Phases run≤ √V — paths grow strictly longer each phase
- Vs naiveO(E√V) instead of O(VE) — √V faster
- PublishedHopcroft & Karp 1973
Interactive visualization
Press play, or step through manually. The visualization is yours to drive — try it before reading on.
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
How Hopcroft-Karp works
A bipartite graph has its vertices split into two disjoint sides, L and R, with every edge connecting a vertex in L to one in R. A matching is a set of edges such that no vertex appears in more than one edge. A maximum matching has as many edges as possible.
The classical algorithm — repeated augmenting paths via DFS — runs in O(VE). Hopcroft and Karp's 1973 insight was that you can augment along many vertex-disjoint shortest paths in each phase, not just one. Each phase:
- BFS layers. Starting from every unmatched
L-vertex simultaneously, BFS down through R-side and back through matched edges. Each vertex gets a level number. Stop at the first layer that contains an unmatchedR-vertex — that's the target depth. - DFS in layers. From each unmatched
L-vertex, DFS along edges that go from layerito layeri+1only. If you reach an unmatchedR-vertex at the target depth, augment along the path (flip matched/unmatched on every edge). Mark all visited vertices as used; the next DFS starts fresh from a different unmatchedL-vertex.
The two-step structure has a beautiful property: each phase strictly increases the shortest-augmenting-path length. After √V phases, paths are at least √V long, but at most V/√V = √V more augmentations fit — so the algorithm completes in at most O(√V) phases. Each phase costs O(E). Total: O(E√V).
Walk-through example
Left side L = {a, b, c, d}, right side R = {1, 2, 3, 4}, edges:
a — 1
a — 2
b — 1
b — 2
c — 2
c — 3
d — 3
d — 4
Phase 1:
All L-vertices unmatched. BFS from {a, b, c, d}.
Level 0: L₀ = {a, b, c, d} (unmatched L)
Level 1: R₁ = {1, 2, 3, 4} (their neighbors, all unmatched)
Target depth: 1
DFS:
a → 1 ✓ (path length 1, augment)
b → 2 ✓ (1 was used by 'a', so try 2)
c → 3 ✓ (2 was used)
d → 4 ✓ (3 was used)
Matching after phase 1: {a-1, b-2, c-3, d-4}. Size = 4 (maximum).
Total phases: 1. Total work: O(E + DFS) ≈ O(E).
This example was too easy. Realistic instances need 2-4 phases on small graphs and ~√V on adversarial ones.
Hopcroft-Karp vs alternatives
| Hopcroft-Karp | Naive augmenting | Hungarian (assignment) | Push-Relabel | |
|---|---|---|---|---|
| Problem | Max bipartite matching | Max bipartite matching | Min-cost bipartite matching | Max bipartite matching (max-flow) |
| Time | O(E√V) | O(VE) | O(V³) | O(V²√E) practically |
| Path strategy | BFS layers + DFS, many paths/phase | One DFS path at a time | Reweighting + Hungarian forest | No paths — local pushes |
| Code complexity | ~80 lines | ~30 lines | ~100 lines | ~150 lines |
| Handles weights? | No — unweighted only | No | Yes (the whole point) | Yes with min-cost variant |
| Famous use | OR-Tools matching solver | Teaching | Munkres' transportation | FAISS, OR-Tools max-flow |
For pure bipartite matching (no weights), Hopcroft-Karp is the standard. For min-cost or weighted matching, use the Hungarian algorithm or a successive-shortest-paths flow approach. For matching on general (non-bipartite) graphs, use Edmonds' blossom algorithm. See Edmonds-Karp for the max-flow generalization.
Pseudo-code
match_L[u] = vertex of R that u is matched to (or NIL)
match_R[v] = vertex of L that v is matched to (or NIL)
level[u] = BFS layer of L-vertex u (or +∞)
bfs():
queue = empty
for u in L:
if match_L[u] == NIL:
level[u] = 0
queue.enqueue(u)
else:
level[u] = +∞
found = false
while queue not empty:
u = queue.dequeue()
for v in adj[u]:
partner = match_R[v]
if partner == NIL:
found = true
else if level[partner] == +∞:
level[partner] = level[u] + 1
queue.enqueue(partner)
return found
dfs(u):
for v in adj[u]:
partner = match_R[v]
if partner == NIL or (level[partner] == level[u] + 1 and dfs(partner)):
match_L[u] = v
match_R[v] = u
return true
level[u] = +∞
return false
hopcroft_karp():
initialize match_L, match_R = NIL
matches = 0
while bfs():
for u in L:
if match_L[u] == NIL and dfs(u):
matches += 1
return matches
Python implementation
from collections import deque
import math
class HopcroftKarp:
INF = float('inf')
def __init__(self, adj, U, V):
"""adj: dict mapping each L-vertex to a list of R-neighbors.
U, V: numbers of vertices on left/right (vertices are 0..U-1, 0..V-1)."""
self.adj = adj
self.U, self.V = U, V
self.match_L = [-1] * U
self.match_R = [-1] * V
self.level = [self.INF] * U
def bfs(self):
queue = deque()
for u in range(self.U):
if self.match_L[u] == -1:
self.level[u] = 0
queue.append(u)
else:
self.level[u] = self.INF
found = False
while queue:
u = queue.popleft()
for v in self.adj.get(u, ()):
partner = self.match_R[v]
if partner == -1:
found = True
elif self.level[partner] == self.INF:
self.level[partner] = self.level[u] + 1
queue.append(partner)
return found
def dfs(self, u):
for v in self.adj.get(u, ()):
partner = self.match_R[v]
if partner == -1 or (
self.level[partner] == self.level[u] + 1 and self.dfs(partner)
):
self.match_L[u] = v
self.match_R[v] = u
return True
self.level[u] = self.INF
return False
def run(self):
matches = 0
while self.bfs():
for u in range(self.U):
if self.match_L[u] == -1 and self.dfs(u):
matches += 1
return matches
# Example: workers a,b,c,d → jobs 0,1,2,3
adj = {
0: [0, 1], # a → 1, 2
1: [0, 1], # b → 1, 2
2: [1, 2], # c → 2, 3
3: [2, 3], # d → 3, 4
}
hk = HopcroftKarp(adj, 4, 4)
print(hk.run()) # 4 (perfect matching)
print(hk.match_L) # [0, 1, 2, 3]
JavaScript implementation
class HopcroftKarp {
constructor(adj, U, V) {
this.adj = adj;
this.U = U; this.V = V;
this.matchL = new Array(U).fill(-1);
this.matchR = new Array(V).fill(-1);
this.level = new Array(U).fill(Infinity);
}
bfs() {
const queue = [];
for (let u = 0; u < this.U; u++) {
if (this.matchL[u] === -1) { this.level[u] = 0; queue.push(u); }
else { this.level[u] = Infinity; }
}
let found = false;
while (queue.length) {
const u = queue.shift();
for (const v of this.adj[u] || []) {
const partner = this.matchR[v];
if (partner === -1) found = true;
else if (this.level[partner] === Infinity) {
this.level[partner] = this.level[u] + 1;
queue.push(partner);
}
}
}
return found;
}
dfs(u) {
for (const v of this.adj[u] || []) {
const partner = this.matchR[v];
if (partner === -1 || (this.level[partner] === this.level[u] + 1 && this.dfs(partner))) {
this.matchL[u] = v;
this.matchR[v] = u;
return true;
}
}
this.level[u] = Infinity;
return false;
}
run() {
let matches = 0;
while (this.bfs()) {
for (let u = 0; u < this.U; u++) {
if (this.matchL[u] === -1 && this.dfs(u)) matches++;
}
}
return matches;
}
}
When to use Hopcroft-Karp
- Workers ↔ tasks. Assign people to shifts, drivers to cars, GPUs to inference requests. As soon as the relationship is bipartite (no self-edges) and you want maximum coverage, Hopcroft-Karp solves it in milliseconds for thousands of vertices.
- Stable-marriage / school choice. The unweighted "is there a perfect matching?" check uses Hopcroft-Karp before applying Gale-Shapley's stability logic.
- Kidney exchange (paired-donor stage). National kidney exchange programs first establish maximum bipartite matching between compatible donor-patient pairs.
- Image alignment (feature correspondence). Match SIFT/ORB features between two images using bipartite matching on similarity-thresholded edges.
- VLSI routing, FPGA assignment. Maps abstract function blocks to physical cells where each cell can host one block.
- Crossword/Sudoku style constraint satisfaction. When constraints reduce to "each row/column needs a unique assignment," bipartite matching detects feasibility.
Common pitfalls
- Forgetting to reset levels between phases. The level array must be reinitialized for every BFS — leaking levels from the previous phase silently destroys correctness.
- DFS not pruning revisited vertices. When a DFS backtrack fails, set
level[u] = ∞so subsequent DFS calls within the same phase don't retry it. Without this pruning the algorithm degrades toO(VE). - Starting BFS only from one unmatched vertex. The whole point is parallelism: enqueue all unmatched
L-vertices at level 0. Single-source BFS misses many disjoint paths. - Off-by-one with vertex indexing. L and R have separate id spaces. Mixing them (treating R-vertices as L-vertices in adj lookup) gives nonsense.
- Trying to apply Hopcroft-Karp to weighted matching. It only finds max-cardinality, ignoring weights. For min-cost matching use the Hungarian algorithm or min-cost flow.
- Calling on non-bipartite graphs. Hopcroft-Karp's correctness depends on the absence of odd cycles. On general graphs it produces a maximal but not necessarily maximum matching. Use blossom (Edmonds') for those.
Performance analysis
Per phase: BFS is O(V + E); DFS over all unmatched L-vertices is also O(V + E) amortized thanks to the level-pruning trick. Each phase is O(E) for connected graphs.
Number of phases: Hopcroft-Karp's celebrated argument bounds this at O(√V). The reasoning: after √V phases, every remaining augmenting path has length ≥ √V. Since the matching can grow by at most V/(path length) in remaining work, only O(√V) further phases are needed. Total: 2√V = O(√V) phases.
Real-world numbers:
- V = 100, E = 1000: ~10 × 1000 = 10⁴ ops — under 1 ms.
- V = 10,000, E = 100,000: 100 × 100,000 = 10⁷ ops — ~100 ms in Python, ~3 ms in C++.
- V = 1,000,000, E = 10,000,000 (graph-of-graphs): 1000 × 10⁷ = 10¹⁰ ops — minutes in Python; production matching solvers handle this in seconds.
Memory: O(V + E) — adjacency lists, match arrays, level array. Negligible by modern standards.
Frequently asked questions
What is a bipartite matching?
In a bipartite graph, vertices split into two disjoint sets L and R and every edge connects L to R. A matching is a set of edges with no shared endpoints — each vertex appears in at most one edge. A maximum matching has as many edges as possible. The classical use case: assigning workers (L) to jobs (R) where each worker can be assigned at most one job and vice versa.
Why is Hopcroft-Karp O(E√V) instead of O(VE)?
Each phase of Hopcroft-Karp finds a maximal set of vertex-disjoint shortest augmenting paths — not just one. The shortest augmenting path length strictly increases between phases, and the longest path has at most √V hops at the boundary where short and long paths transition. So at most O(√V) phases run; each phase is O(E). Total O(E√V). Karzanov gave a parallel proof in 1973.
What's an augmenting path in matching?
A path that alternates between unmatched and matched edges, starting and ending at unmatched vertices. Flipping the alternation increases the matching by exactly one edge. Berge's theorem (1957): a matching is maximum if and only if it has no augmenting paths. Every matching algorithm — Hopcroft-Karp, Hungarian, blossom — boils down to finding augmenting paths.
How does layered BFS find multiple paths per phase?
BFS from all unmatched L-vertices simultaneously builds level layers — L₀ (unmatched left), R₀ (right neighbors of L₀), L₁ (their matched left partners), and so on. The first layer containing an unmatched R-vertex is the target depth. DFS then finds vertex-disjoint paths from L₀ to that R-layer, walking only between consecutive layers. Each DFS marks visited vertices so subsequent searches start from a different L₀ vertex.
Where is bipartite matching used in practice?
Job assignment (workers ↔ jobs, devices ↔ tasks), kidney exchange (donors ↔ patients), college admissions (students ↔ schools via deferred acceptance), online ad auctions (impressions ↔ advertisers), and scheduling (people ↔ time slots). The stable-marriage variant additionally requires no two unmatched pairs prefer each other to their current partners — Gale-Shapley solves that in O(V²).
How does Hopcroft-Karp relate to max-flow?
Bipartite matching reduces to max-flow on a graph with source s connected to all L-vertices, sink t connected to all R-vertices, and edge capacities of 1 everywhere. Maximum flow equals maximum matching. Running Dinic's algorithm on this unit-capacity flow network gives exactly O(E√V) — Hopcroft-Karp is Dinic specialized to bipartite matching. The phases correspond to Dinic's level-graph blocking flows.
What about non-bipartite matching?
General graphs need Edmonds' blossom algorithm. The complication is odd-length cycles in the augmenting path — these have to be contracted into 'blossoms.' Micali-Vazirani extends Hopcroft-Karp to general graphs at O(E√V), but the constant factor is large and the algorithm is notoriously intricate. For bipartite graphs, Hopcroft-Karp remains the practical winner.