Algorithms

Floyd-Warshall

Name: Floyd-Warshall — 60-second explainer
Uploaded: 2026-05-13T09:57:36Z
Duration: 1 min
Description: Floyd-Warshall finds the shortest path between every pair of vertices in a weighted graph. Three nested loops over the vertices and a single relaxation step inside — O(V³) time, O(V²) space, works with negative weights (just no negative cycles). Used in network routing, transitive closure, and as the standard textbook example of dynamic programming on graphs.

All-pairs shortest paths in O(V³) — three nested loops, beautifully simple

Floyd-Warshall finds the shortest path between every pair of vertices in a weighted graph. Three nested loops over the vertices and a single relaxation step inside — O(V³) time, O(V²) space, works with negative weights (just no negative cycles). Used in network routing, transitive closure, and as the standard textbook example of dynamic programming on graphs.

TimeO(V³)
SpaceO(V²) — distance matrix
Edge weightsAny (positive, negative, zero)
Negative cyclesDetected if any dist[i][i] becomes negative
Vs Dijkstra-from-each-vertexSame O(V³) for dense graphs; simpler code
Used inNetwork routing, transitive closure, distance summaries

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

Watch on YouTube

How Floyd-Warshall works

Maintain a V × V distance matrix dist. Initialize:

dist[i][i] = 0 for all i.
dist[i][j] = w if there's a direct edge i→j with weight w.
dist[i][j] = infinity otherwise.

Then for each intermediate vertex k from 1 to V, for each pair (i, j), relax:

if dist[i][k] + dist[k][j] < dist[i][j]:
    dist[i][j] = dist[i][k] + dist[k][j]

After the outer loop completes, dist[i][j] holds the shortest distance from i to j. The triple-nested loop is O(V³). The whole algorithm fits in 6 lines of code.

The dynamic programming intuition

Define dp[k][i][j] = shortest path from i to j using only intermediate vertices from {1, 2, ..., k}.

Base case — dp[0][i][j] = direct edge weight or infinity.
Recursive case — dp[k][i][j] = min of two options:
- Don't use vertex k — dp[k-1][i][j].
- Use vertex k — dp[k-1][i][k] + dp[k-1][k][j].

This is a 3D DP. But dp[k] only depends on dp[k-1], so we can collapse to a single 2D matrix updated in place — that's the standard implementation. The outer loop on k is essential; without it, the in-place update would use already-modified values.

JavaScript implementation

function floydWarshall(graph) {
  // graph: V × V matrix where graph[i][j] is the edge weight or Infinity
  const V = graph.length;
  // Copy input — don't mutate
  const dist = graph.map(row => [...row]);

  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        if (dist[i][k] + dist[k][j] < dist[i][j]) {
          dist[i][j] = dist[i][k] + dist[k][j];
        }
      }
    }
  }

  // Negative cycle detection
  for (let i = 0; i < V; i++) {
    if (dist[i][i] < 0) {
      return { dist, hasNegativeCycle: true };
    }
  }
  return { dist, hasNegativeCycle: false };
}

// With path reconstruction
function floydWarshallWithPaths(graph) {
  const V = graph.length;
  const dist = graph.map(row => [...row]);
  const next = Array.from({ length: V }, (_, i) =>
    Array.from({ length: V }, (_, j) =>
      i === j || dist[i][j] === Infinity ? null : j
    )
  );

  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        if (dist[i][k] + dist[k][j] < dist[i][j]) {
          dist[i][j] = dist[i][k] + dist[k][j];
          next[i][j] = next[i][k];
        }
      }
    }
  }
  return { dist, next };
}

function reconstructPath(next, i, j) {
  if (next[i][j] === null) return null;
  const path = [i];
  while (i !== j) {
    i = next[i][j];
    path.push(i);
  }
  return path;
}

Python implementation

from math import inf

def floyd_warshall(graph):
    V = len(graph)
    dist = [row[:] for row in graph]  # copy
    for k in range(V):
        for i in range(V):
            for j in range(V):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    has_neg_cycle = any(dist[i][i] < 0 for i in range(V))
    return dist, has_neg_cycle

Floyd-Warshall vs other shortest-path algorithms

	Floyd-Warshall	Dijkstra (per source)	Bellman-Ford (per source)
Problem	All pairs	Single source	Single source
Time (single source)	O(V³)	O((V+E) log V)	O(V·E)
Time (all pairs)	O(V³)	O(V·(V+E) log V)	O(V²·E)
Edge weights	Any	Non-negative	Any
Code complexity	~6 lines	~25 lines + heap	~10 lines
Best for dense graphs	Yes	Yes (V·V·log V ≈ V³)	No
Best for sparse graphs	No (always V³)	Yes	No (V²·E ≫ V³)
Negative cycle detection	Yes	No	Yes (per source)

For dense graphs, all three are O(V³) for all-pairs. Floyd-Warshall has the simplest code. For sparse graphs with non-negative weights, run Dijkstra V times — faster.

When to use Floyd-Warshall

Dense graphs (E ≈ V²). O(V³) is asymptotically optimal for the dense case; the simple code wins.
Negative edge weights. Dijkstra silently fails on negative weights. Floyd-Warshall works.
All-pairs queries. If you'll query many (i, j) pairs after computing once, the V² lookup table is convenient.
Transitive closure. Replace (min, +) with (OR, AND) and you get a reachability matrix in O(V³). Used in compilers, database query optimization, and graph analysis.
Network routing protocols on small networks. Internet routing protocols are designed for huge graphs and use distance-vector or link-state. For small internal networks, Floyd-Warshall computes complete routing tables directly.
Educational reference. Cleanest non-trivial example of dynamic programming on graphs. The recurrence is direct and the loop nesting is intuitive.

Famous Floyd-Warshall applications

Transitive closure

function transitiveClosure(adjacency) {
  const V = adjacency.length;
  const reach = adjacency.map(row => [...row]);

  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        reach[i][j] = reach[i][j] || (reach[i][k] && reach[k][j]);
      }
    }
  }
  return reach;
}

O(V³) reachability. Useful for "given an edge for each direct dependency, who depends on whom transitively?"

Graph diameter

The diameter of a graph is the longest shortest path. Run Floyd-Warshall, then take the max over all dist[i][j]. O(V³) total.

Closeness centrality

For each vertex v, sum of distances from v to all other vertices. Run Floyd-Warshall once, then sum each row. Vertices with low total are "central" (close to everything else). Used in social network analysis, organizational structure analysis.

Common Floyd-Warshall bugs

Wrong loop order. The outer loop MUST be k. Putting i or j outside silently produces wrong answers because the in-place update would read partially-computed values. The fix is always for k; for i; for j.
Initializing dist[i][j] = 0 for non-edges. Should be infinity (no path). 0 means "free transit" — the algorithm thinks every vertex pair has a zero-cost direct connection.
Forgetting dist[i][i] = 0. If you initialize the diagonal to infinity, the algorithm computes shortest cycles through each vertex (often non-zero), corrupting subsequent relaxations. Self-distance is 0.
Integer overflow on large graphs. dist[i][k] + dist[k][j] can overflow if either side is "infinity" represented as INT_MAX. Use a sentinel that won't overflow when added (INT_MAX / 2), or skip relaxation when either operand is infinity.
Mutating the input. Floyd-Warshall is conventionally given a "graph as matrix" parameter and produces "distances as matrix." If you mutate the input, the caller's data is corrupted. Copy on entry.
Trying to use it on sparse graphs. For sparse graphs (E ≈ V), Dijkstra-from-each is much faster (V² log V vs V³). Profile before defaulting to Floyd-Warshall.

Frequently asked questions

When should I use Floyd-Warshall vs running Dijkstra from each vertex?

For dense graphs (E ≈ V²), they're equivalent — both O(V³). For sparse graphs (E ≈ V), Dijkstra-from-each is O(V × (V+E) log V) = O(V² log V), faster. For negative weights, Dijkstra fails — use Bellman-Ford-from-each (O(V²·E)) or stick with Floyd-Warshall (O(V³)). Choice — dense or negative weights → Floyd. Sparse positive → Dijkstra-from-each.

How does Floyd-Warshall handle negative cycles?

After running, check if any dist[i][i] is negative — that means there's a negative cycle reachable from vertex i. Don't trust any path through that vertex. The algorithm completes regardless; you just can't interpret the results as valid shortest paths.

What's the intuition behind the three nested loops?

dp[k][i][j] = shortest path from i to j using only intermediate vertices in {1..k}. Base case dp[0][i][j] = direct edge weight or infinity. Recurrence — dp[k][i][j] = min(dp[k-1][i][j], dp[k-1][i][k] + dp[k-1][k][j]) — either don't use k, or use k. Outer loop iterates k; inner two iterate i and j. We can collapse to a 2D table because dp[k] only depends on dp[k-1].

Can Floyd-Warshall find the actual paths, not just distances?

Yes, with a secondary "next" matrix. <code>next[i][j]</code> = the next vertex on the shortest path from i to j. Update during relaxation — whenever dist[i][j] improves via k, set <code>next[i][j] = next[i][k]</code>. Reconstruct path by following next pointers from i to j.

How does Floyd-Warshall compute transitive closure?

Run it on a binary matrix where A[i][j] = 1 iff there's an edge i→j. Replace min with OR and + with AND in the relaxation. After O(V³), A[i][j] = 1 iff there's any path from i to j. Used for reachability queries, dependency analysis, and reflexive-transitive closure of relations.

Why is Floyd-Warshall's loop order important?

The outer loop must iterate k. Inner loops can be in either order. The reason — at iteration k, dp[i][k] and dp[k][j] must be the values that already include intermediate vertices 1..k-1, which they will if k is the outer loop. Reversing the order silently gives incorrect results. The standard <code>for k; for i; for j</code> ordering is essential.

How does Floyd-Warshall find a longest path?

Negate edge weights and find shortest path. But careful — this turns positive cycles into negative cycles, breaking the algorithm. So negation works only on DAGs (no cycles). For general longest-path, the problem is NP-hard.