Algorithms

Floyd-Warshall

All-pairs shortest paths in O(V³) — three nested loops, beautifully simple

Floyd-Warshall finds the shortest path between every pair of vertices in a weighted graph. Three nested loops over the vertices and a single relaxation step inside — O(V³) time, O(V²) space, works with negative weights (just no negative cycles). Used in network routing, transitive closure, and as the standard textbook example of dynamic programming on graphs.

  • TimeO(V³)
  • SpaceO(V²) — distance matrix
  • Edge weightsAny (positive, negative, zero)
  • Negative cyclesDetected if any dist[i][i] becomes negative
  • Vs Dijkstra-from-each-vertexSame O(V³) for dense graphs; simpler code
  • Used inNetwork routing, transitive closure, distance summaries

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How Floyd-Warshall works

Maintain a V × V distance matrix dist. Initialize:

  • dist[i][i] = 0 for all i.
  • dist[i][j] = w if there's a direct edge i→j with weight w.
  • dist[i][j] = infinity otherwise.

Then for each intermediate vertex k from 1 to V, for each pair (i, j), relax:

if dist[i][k] + dist[k][j] < dist[i][j]:
    dist[i][j] = dist[i][k] + dist[k][j]

After the outer loop completes, dist[i][j] holds the shortest distance from i to j. The triple-nested loop is O(V³). The whole algorithm fits in 6 lines of code.

The dynamic programming intuition

Define dp[k][i][j] = shortest path from i to j using only intermediate vertices from {1, 2, ..., k}.

  • Base case — dp[0][i][j] = direct edge weight or infinity.
  • Recursive case — dp[k][i][j] = min of two options:
    • Don't use vertex k — dp[k-1][i][j].
    • Use vertex k — dp[k-1][i][k] + dp[k-1][k][j].

This is a 3D DP. But dp[k] only depends on dp[k-1], so we can collapse to a single 2D matrix updated in place — that's the standard implementation. The outer loop on k is essential; without it, the in-place update would use already-modified values.

JavaScript implementation

function floydWarshall(graph) {
  // graph: V × V matrix where graph[i][j] is the edge weight or Infinity
  const V = graph.length;
  // Copy input — don't mutate
  const dist = graph.map(row => [...row]);

  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        if (dist[i][k] + dist[k][j] < dist[i][j]) {
          dist[i][j] = dist[i][k] + dist[k][j];
        }
      }
    }
  }

  // Negative cycle detection
  for (let i = 0; i < V; i++) {
    if (dist[i][i] < 0) {
      return { dist, hasNegativeCycle: true };
    }
  }
  return { dist, hasNegativeCycle: false };
}

// With path reconstruction
function floydWarshallWithPaths(graph) {
  const V = graph.length;
  const dist = graph.map(row => [...row]);
  const next = Array.from({ length: V }, (_, i) =>
    Array.from({ length: V }, (_, j) =>
      i === j || dist[i][j] === Infinity ? null : j
    )
  );

  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        if (dist[i][k] + dist[k][j] < dist[i][j]) {
          dist[i][j] = dist[i][k] + dist[k][j];
          next[i][j] = next[i][k];
        }
      }
    }
  }
  return { dist, next };
}

function reconstructPath(next, i, j) {
  if (next[i][j] === null) return null;
  const path = [i];
  while (i !== j) {
    i = next[i][j];
    path.push(i);
  }
  return path;
}

Python implementation

from math import inf

def floyd_warshall(graph):
    V = len(graph)
    dist = [row[:] for row in graph]  # copy
    for k in range(V):
        for i in range(V):
            for j in range(V):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    has_neg_cycle = any(dist[i][i] < 0 for i in range(V))
    return dist, has_neg_cycle

Floyd-Warshall vs other shortest-path algorithms

Floyd-WarshallDijkstra (per source)Bellman-Ford (per source)
ProblemAll pairsSingle sourceSingle source
Time (single source)O(V³)O((V+E) log V)O(V·E)
Time (all pairs)O(V³)O(V·(V+E) log V)O(V²·E)
Edge weightsAnyNon-negativeAny
Code complexity~6 lines~25 lines + heap~10 lines
Best for dense graphsYesYes (V·V·log V ≈ V³)No
Best for sparse graphsNo (always V³)YesNo (V²·E ≫ V³)
Negative cycle detectionYesNoYes (per source)

For dense graphs, all three are O(V³) for all-pairs. Floyd-Warshall has the simplest code. For sparse graphs with non-negative weights, run Dijkstra V times — faster.

When to use Floyd-Warshall

  • Dense graphs (E ≈ V²). O(V³) is asymptotically optimal for the dense case; the simple code wins.
  • Negative edge weights. Dijkstra silently fails on negative weights. Floyd-Warshall works.
  • All-pairs queries. If you'll query many (i, j) pairs after computing once, the V² lookup table is convenient.
  • Transitive closure. Replace (min, +) with (OR, AND) and you get a reachability matrix in O(V³). Used in compilers, database query optimization, and graph analysis.
  • Network routing protocols on small networks. Internet routing protocols are designed for huge graphs and use distance-vector or link-state. For small internal networks, Floyd-Warshall computes complete routing tables directly.
  • Educational reference. Cleanest non-trivial example of dynamic programming on graphs. The recurrence is direct and the loop nesting is intuitive.

Famous Floyd-Warshall applications

Transitive closure

function transitiveClosure(adjacency) {
  const V = adjacency.length;
  const reach = adjacency.map(row => [...row]);

  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        reach[i][j] = reach[i][j] || (reach[i][k] && reach[k][j]);
      }
    }
  }
  return reach;
}

O(V³) reachability. Useful for "given an edge for each direct dependency, who depends on whom transitively?"

Graph diameter

The diameter of a graph is the longest shortest path. Run Floyd-Warshall, then take the max over all dist[i][j]. O(V³) total.

Closeness centrality

For each vertex v, sum of distances from v to all other vertices. Run Floyd-Warshall once, then sum each row. Vertices with low total are "central" (close to everything else). Used in social network analysis, organizational structure analysis.

Common Floyd-Warshall bugs

  • Wrong loop order. The outer loop MUST be k. Putting i or j outside silently produces wrong answers because the in-place update would read partially-computed values. The fix is always for k; for i; for j.
  • Initializing dist[i][j] = 0 for non-edges. Should be infinity (no path). 0 means "free transit" — the algorithm thinks every vertex pair has a zero-cost direct connection.
  • Forgetting dist[i][i] = 0. If you initialize the diagonal to infinity, the algorithm computes shortest cycles through each vertex (often non-zero), corrupting subsequent relaxations. Self-distance is 0.
  • Integer overflow on large graphs. dist[i][k] + dist[k][j] can overflow if either side is "infinity" represented as INT_MAX. Use a sentinel that won't overflow when added (INT_MAX / 2), or skip relaxation when either operand is infinity.
  • Mutating the input. Floyd-Warshall is conventionally given a "graph as matrix" parameter and produces "distances as matrix." If you mutate the input, the caller's data is corrupted. Copy on entry.
  • Trying to use it on sparse graphs. For sparse graphs (E ≈ V), Dijkstra-from-each is much faster (V² log V vs V³). Profile before defaulting to Floyd-Warshall.

Frequently asked questions

When should I use Floyd-Warshall vs running Dijkstra from each vertex?

For dense graphs (E ≈ V²), they're equivalent — both O(V³). For sparse graphs (E ≈ V), Dijkstra-from-each is O(V × (V+E) log V) = O(V² log V), faster. For negative weights, Dijkstra fails — use Bellman-Ford-from-each (O(V²·E)) or stick with Floyd-Warshall (O(V³)). Choice — dense or negative weights → Floyd. Sparse positive → Dijkstra-from-each.

How does Floyd-Warshall handle negative cycles?

After running, check if any dist[i][i] is negative — that means there's a negative cycle reachable from vertex i. Don't trust any path through that vertex. The algorithm completes regardless; you just can't interpret the results as valid shortest paths.

What's the intuition behind the three nested loops?

dp[k][i][j] = shortest path from i to j using only intermediate vertices in {1..k}. Base case dp[0][i][j] = direct edge weight or infinity. Recurrence — dp[k][i][j] = min(dp[k-1][i][j], dp[k-1][i][k] + dp[k-1][k][j]) — either don't use k, or use k. Outer loop iterates k; inner two iterate i and j. We can collapse to a 2D table because dp[k] only depends on dp[k-1].

Can Floyd-Warshall find the actual paths, not just distances?

Yes, with a secondary "next" matrix. <code>next[i][j]</code> = the next vertex on the shortest path from i to j. Update during relaxation — whenever dist[i][j] improves via k, set <code>next[i][j] = next[i][k]</code>. Reconstruct path by following next pointers from i to j.

How does Floyd-Warshall compute transitive closure?

Run it on a binary matrix where A[i][j] = 1 iff there's an edge i→j. Replace min with OR and + with AND in the relaxation. After O(V³), A[i][j] = 1 iff there's any path from i to j. Used for reachability queries, dependency analysis, and reflexive-transitive closure of relations.

Why is Floyd-Warshall's loop order important?

The outer loop must iterate k. Inner loops can be in either order. The reason — at iteration k, dp[i][k] and dp[k][j] must be the values that already include intermediate vertices 1..k-1, which they will if k is the outer loop. Reversing the order silently gives incorrect results. The standard <code>for k; for i; for j</code> ordering is essential.

How does Floyd-Warshall find a longest path?

Negate edge weights and find shortest path. But careful — this turns positive cycles into negative cycles, breaking the algorithm. So negation works only on DAGs (no cycles). For general longest-path, the problem is NP-hard.