Graph Algorithms

Johnson's Algorithm

Q: Why does Johnson's preserve shortest paths after reweighting?

Define w'(u,v) = w(u,v) + h(u) − h(v). For any path P from s to t with edges (v₀,v₁), (v₁,v₂), …, (vₖ₋₁,vₖ), the reweighted total is Σ w'(vᵢ,vᵢ₊₁) = Σ (w(vᵢ,vᵢ₊₁) + h(vᵢ) − h(vᵢ₊₁)) = (Σ w) + h(s) − h(t). The h-difference depends only on endpoints, not the path. So all s→t paths shift by the same constant — the relative ordering is preserved. Shortest path under w is shortest under w'. To recover original distances after Dijkstra: d(u,v) = d'(u,v) − h(u) + h(v).

Q: When is it faster than Floyd-Warshall?

Johnson's runs in O(VE + V² log V) using binary heap (or O(VE + V² log V) with Fibonacci). Floyd-Warshall is O(V³). The crossover: Johnson's wins when E = o(V² / log V) — meaning sparse to medium-density graphs. Concrete: V = 10000, E = 50000 (sparse). Johnson's: 10⁴ · 5·10⁴ + 10⁸ · 14 ≈ 2 · 10⁹. Floyd-Warshall: 10¹². Johnson's is roughly 500× faster. For dense graphs (E ≈ V²), Floyd-Warshall's tighter inner loop and cache-friendly memory access often win in practice despite the asymptotic tie.

Q: Why need Bellman-Ford in the first phase?

Dijkstra requires non-negative edge weights. If your input graph has negative edges (but no negative cycles), Dijkstra cannot run directly. Bellman-Ford handles negative edges in O(VE) and produces shortest distances h(v) from a virtual source s* that has zero-weight edges to every vertex. Those h values satisfy h(v) ≤ h(u) + w(u,v) for every edge (u,v) — equivalently w(u,v) + h(u) − h(v) ≥ 0. So the reweighted graph has all non-negative edges, and Dijkstra applies. Bellman-Ford's overhead is paid once; Dijkstra runs V times on the reweighted graph for the all-pairs result.

Q: How does it detect negative cycles?

Bellman-Ford after V−1 relaxation rounds has correct distances if no negative cycle is reachable from the source. Run one more round (the V-th) and check whether any edge still relaxes — i.e. dist[v] > dist[u] + w(u,v) for some edge. If yes, a negative cycle is reachable; abort with an error. If no, h(v) is well-defined and reweighting is safe. Johnson's algorithm halts before Dijkstra phase whenever a negative cycle exists.

Reweight edges with Bellman-Ford, then run Dijkstra from each vertex — O(V² log V + VE)

Johnson's algorithm computes shortest paths between all pairs of vertices in a sparse, possibly negatively-weighted directed graph. It runs Bellman-Ford once to compute a height function h(v), then re-weights all edges to be non-negative via w'(u,v) = w(u,v) + h(u) − h(v), then runs Dijkstra from every vertex. Total time: O(V² log V + VE) with binary heap (or O(V² log V + VE) using Fibonacci) — much faster than Floyd-Warshall's O(V³) on sparse graphs (V² log V vs V³ for V much larger than log V). Discovered by Donald Johnson in 1977.

TimeO(V² log V + VE)
Negative edgesHandled via reweighting
Negative cyclesDetected by Bellman-Ford
Reweight invariantShortest paths preserved
InventedJohnson 1977
Beats Floyd-Warshall whenE = o(V² / log V)

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How Johnson's algorithm works

Three phases:

Augment with virtual source. Add a vertex s* with zero-weight edges to every other vertex.
Run Bellman-Ford from s*. Get h(v) = shortest distance from s* to v. Detect negative cycles; abort if found. h(v) ≤ 0 for all v (zero-weight edge from s* gives that bound).
Reweight every edge: w'(u, v) = w(u, v) + h(u) − h(v). All w'(u, v) ≥ 0 by Bellman-Ford's slack constraint.
Run Dijkstra from each original vertex u on the reweighted graph. Get d'(u, v).
Recover original distances: d(u, v) = d'(u, v) − h(u) + h(v).

Total cost: Bellman-Ford O(VE), V Dijkstra runs at O(E log V) each = O(VE log V), recovery O(V²). Best case (Fibonacci heap): O(V² log V + VE).

Why reweighting preserves shortest paths

For any path P = v₀ → v₁ → … → vₖ, the reweighted total telescopes:

Σ w'(vᵢ, vᵢ₊₁) = Σ (w(vᵢ, vᵢ₊₁) + h(vᵢ) − h(vᵢ₊₁))
              = (Σ w) + h(v₀) − h(vₖ)

The h-difference depends only on the endpoints v₀, vₖ — every path between them is shifted by the same constant h(v₀) − h(vₖ). Relative ordering of paths is preserved, so the path with minimum w-sum is also the one with minimum w'-sum. Dijkstra finds it correctly under w'.

Non-negativity: by Bellman-Ford, h satisfies the triangle inequality h(v) ≤ h(u) + w(u, v) for every edge — rearranged, w(u, v) + h(u) − h(v) ≥ 0, which is exactly w'(u, v).

JavaScript implementation

// Returns dist[u][v] = shortest distance, or null if negative cycle
function johnson(n, edges) {
  // edges: [[u, v, w], ...] for original directed graph
  // Phase 1: add virtual source n, zero-weight edges to all
  const augmented = [...edges];
  for (let v = 0; v < n; v++) augmented.push([n, v, 0]);

  // Phase 2: Bellman-Ford from n
  const h = new Array(n + 1).fill(Infinity);
  h[n] = 0;
  for (let i = 0; i < n; i++) {
    let changed = false;
    for (const [u, v, w] of augmented) {
      if (h[u] + w < h[v]) {
        h[v] = h[u] + w;
        changed = true;
      }
    }
    if (!changed) break;
  }
  // Negative cycle check
  for (const [u, v, w] of augmented) {
    if (h[u] + w < h[v]) return null;
  }

  // Phase 3: build reweighted adjacency list (drop virtual source)
  const adj = Array.from({ length: n }, () => []);
  for (const [u, v, w] of edges) {
    adj[u].push([v, w + h[u] - h[v]]);
  }

  // Phase 4: Dijkstra from each vertex
  const dist = Array.from({ length: n }, () => new Array(n).fill(Infinity));
  for (let s = 0; s < n; s++) {
    const d = new Array(n).fill(Infinity);
    d[s] = 0;
    const pq = new MinHeap();
    pq.push([0, s]);
    while (pq.size() > 0) {
      const [du, u] = pq.pop();
      if (du > d[u]) continue;
      for (const [v, wp] of adj[u]) {
        const nd = du + wp;
        if (nd < d[v]) { d[v] = nd; pq.push([nd, v]); }
      }
    }
    // Phase 5: recover original distances
    for (let v = 0; v < n; v++) {
      dist[s][v] = d[v] === Infinity ? Infinity : d[v] - h[s] + h[v];
    }
  }
  return dist;
}

Python implementation

import heapq

def johnson(n, edges):
    augmented = list(edges) + [(n, v, 0) for v in range(n)]

    h = [float('inf')] * (n + 1)
    h[n] = 0
    for _ in range(n):
        changed = False
        for u, v, w in augmented:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
                changed = True
        if not changed:
            break
    for u, v, w in augmented:
        if h[u] + w < h[v]:
            return None  # negative cycle

    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))

    dist = [[float('inf')] * n for _ in range(n)]
    for s in range(n):
        d = [float('inf')] * n
        d[s] = 0
        pq = [(0, s)]
        while pq:
            du, u = heapq.heappop(pq)
            if du > d[u]:
                continue
            for v, wp in adj[u]:
                nd = du + wp
                if nd < d[v]:
                    d[v] = nd
                    heapq.heappush(pq, (nd, v))
        for v in range(n):
            if d[v] != float('inf'):
                dist[s][v] = d[v] - h[s] + h[v]
    return dist

Johnson's vs Floyd-Warshall: when each wins

Property	Johnson's	Floyd-Warshall
Time	O(V² log V + VE)	O(V³)
Space	O(V + E) for graph + O(V²) output	O(V²)
Best for	Sparse graphs (E = o(V²/log V))	Dense graphs, simple loop
Negative edges	Yes (handled by Bellman-Ford)	Yes (intrinsic)
Negative cycle detection	Yes (Bellman-Ford phase)	Yes (diagonal becomes negative)
Cache friendliness	Mixed (V Dijkstra runs)	Excellent (3 nested loops)
Implementation lines	~80 lines	~10 lines

Numeric example: V = 1000, E = 5000. Floyd-Warshall: 10⁹ inner-loop steps. Johnson's: 10⁶ + 5·10³·V·log V ≈ 5·10⁷ — about 20× faster. For V = 1000, E = 100000: Floyd-Warshall still 10⁹; Johnson's 10⁹ — break-even. For V = 1000 fully dense (E = 10⁶): Floyd-Warshall ~10⁹; Johnson's ~10¹⁰ — Floyd-Warshall wins.

Where Johnson's appears in practice

Sparse routing problems

Internet routing tables, road networks, and metro systems are sparse: V ≈ 10⁵–10⁶ but average degree is small (5–20). Johnson's gives full all-pairs distances in time linear-ish in V², far below V³.

Difference constraints

A system of constraints x_i − x_j ≤ b_ij can be modeled as a graph with edge (j → i) of weight b_ij. Bellman-Ford from a virtual source produces a feasible solution; the same h-values used by Johnson's give one assignment minimizing maximum slack.

Dependency analysis with negative weights

Build systems where some dependencies "release" credit (negative edges) while others consume time. Johnson's-style reweighting makes all costs non-negative for fast cumulative analysis.

Why Johnson's matters

Sparse APSP standard. Whenever you need all-pairs distances in a sparse directed graph with possibly negative edges, Johnson's is the textbook answer. Library APSP routines in NetworkX, Boost Graph Library, and igraph implement it.
Negative-weight handling. Most APSP problems in the wild are non-negative, but anywhere flows can be reversed (currency arbitrage, refund flows, dependency credits) negative edges show up. Floyd-Warshall handles them too, but loses the sparse-graph advantage.
Modular structure. Johnson's is built from two well-understood pieces (Bellman-Ford + Dijkstra), making it easier to instrument and parallelize. The V Dijkstra runs are independent — embarrassingly parallel for cluster execution.
Constraint satisfaction. Difference-constraint LP problems map directly onto shortest paths; Johnson's reweighting computes feasible offsets for solver preprocessing.
Network reliability. Compute shortest expected-failure paths in a directed network where some edges are subsidies (negative cost). Johnson's gives the full reliability matrix in sub-cubic time when E is small.
Algorithm research. Johnson's was the first algorithm to break the O(V³) barrier for APSP with negative weights (1977). It remains the asymptotic standard for sparse inputs four decades later — recent improvements are by polylog factors only.

Common misconceptions

"Always best for APSP." Only on sparse graphs. For dense graphs (E = Θ(V²)), Floyd-Warshall is asymptotically equivalent and constant-factor faster due to its inner-loop simplicity and cache locality.
"Needs non-negative weights from start." The whole point is to handle negative edges. The Bellman-Ford phase produces h-values that make reweighted edges non-negative, allowing Dijkstra to run.
"Always need a virtual source." If your graph is strongly connected, any real vertex works as the Bellman-Ford source. The virtual-source trick is for the general case where the graph may not be strongly connected.
"Negative cycles are silently broken." Bellman-Ford detects them — the algorithm aborts cleanly. Don't try to "patch" by capping negative edges; the answer is undefined when negative cycles exist.
"Replaces Bellman-Ford for single-source." No — Johnson's is for all-pairs. For single-source on a graph with negative edges, run Bellman-Ford alone (O(VE)).
"Output distances are reweighted." The final phase un-shifts using d(u, v) = d'(u, v) − h(u) + h(v) so reported distances match the original graph weights.
"Slower than V applications of Bellman-Ford." No — V Bellman-Ford runs cost O(V²E). Johnson's is O(VE + V² log V), strictly better when log V < E/V (almost always).

Frequently asked questions

Why does Johnson's preserve shortest paths after reweighting?

Define w'(u,v) = w(u,v) + h(u) − h(v). For any path P from s to t with edges (v₀,v₁), (v₁,v₂), …, (vₖ₋₁,vₖ), the reweighted total is Σ w'(vᵢ,vᵢ₊₁) = Σ (w(vᵢ,vᵢ₊₁) + h(vᵢ) − h(vᵢ₊₁)) = (Σ w) + h(s) − h(t). The h-difference depends only on endpoints, not the path. So all s→t paths shift by the same constant — the relative ordering is preserved. Shortest path under w is shortest under w'. To recover original distances after Dijkstra: d(u,v) = d'(u,v) − h(u) + h(v).

When is it faster than Floyd-Warshall?

Johnson's runs in O(VE + V² log V) using binary heap (or O(VE + V² log V) with Fibonacci). Floyd-Warshall is O(V³). The crossover: Johnson's wins when E = o(V² / log V) — meaning sparse to medium-density graphs. Concrete: V = 10000, E = 50000 (sparse). Johnson's: 10⁴ · 5·10⁴ + 10⁸ · 14 ≈ 2 · 10⁹. Floyd-Warshall: 10¹². Johnson's is roughly 500× faster. For dense graphs (E ≈ V²), Floyd-Warshall's tighter inner loop and cache-friendly memory access often win in practice despite the asymptotic tie.

Why need Bellman-Ford in the first phase?

Dijkstra requires non-negative edge weights. If your input graph has negative edges (but no negative cycles), Dijkstra cannot run directly. Bellman-Ford handles negative edges in O(VE) and produces shortest distances h(v) from a virtual source s* that has zero-weight edges to every vertex. Those h values satisfy h(v) ≤ h(u) + w(u,v) for every edge (u,v) — equivalently w(u,v) + h(u) − h(v) ≥ 0. So the reweighted graph has all non-negative edges, and Dijkstra applies. Bellman-Ford's overhead is paid once; Dijkstra runs V times on the reweighted graph for the all-pairs result.

How does it detect negative cycles?

Bellman-Ford after V−1 relaxation rounds has correct distances if no negative cycle is reachable from the source. Run one more round (the V-th) and check whether any edge still relaxes — i.e. dist[v] > dist[u] + w(u,v) for some edge. If yes, a negative cycle is reachable; abort with an error. If no, h(v) is well-defined and reweighting is safe. Johnson's algorithm halts before Dijkstra phase whenever a negative cycle exists.

Why use a virtual source vertex?

If we ran Bellman-Ford from any real vertex s, h(v) would be the shortest path from s — but vertices unreachable from s would have h(v) = ∞, breaking the reweighting. Adding a virtual vertex s* with zero-weight edges to every original vertex guarantees every v is reachable; h(v) becomes the shortest distance from s* in the augmented graph. The virtual vertex is then discarded before the Dijkstra phase. The augmented graph has V+1 vertices and V+E edges — Bellman-Ford on it costs O((V+1)(V+E)) = O(VE).

Could you use SPFA instead of Bellman-Ford?

Yes — SPFA (Shortest Path Faster Algorithm) is a queue-based optimization of Bellman-Ford that runs in O(VE) worst case but often O(kE) in practice for small k. It produces the same h(v) values, and you can detect negative cycles by counting how many times a vertex is relaxed (more than V times = negative cycle). SPFA is the de-facto choice in competitive programming, but worst-case adversarial graphs hit O(VE) so safety-critical systems stick with Bellman-Ford.

Interactive visualization

Watch the 60-second explainer

How Johnson's algorithm works

Why reweighting preserves shortest paths

JavaScript implementation

Python implementation

Johnson's vs Floyd-Warshall: when each wins

Where Johnson's appears in practice

Sparse routing problems

Difference constraints

Dependency analysis with negative weights

Why Johnson's matters

Common misconceptions

Frequently asked questions

Related concepts