Computer Science

Dynamic Programming

Solve once, reuse — turning exponential into polynomial

Dynamic programming solves problems by breaking them into overlapping subproblems, computing each subproblem only once, and reusing the result. It turns exponential-time recursion into polynomial-time iteration. The two preconditions are optimal substructure (the optimal solution composes from optimal subsolutions) and overlapping subproblems (the same subproblem appears many times).

  • Time savingsO(2ⁿ) → O(n²) typical
  • SpaceO(states) — table size
  • ApproachTop-down (memo) or bottom-up (tabulation)
  • Required property 1Overlapping subproblems
  • Required property 2Optimal substructure
  • Famous applicationsKnapsack, LCS, edit distance, shortest paths

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How dynamic programming works

DP applies when a problem has two properties:

  1. Optimal substructure. The optimal solution can be built from optimal solutions to smaller versions of the same problem.
  2. Overlapping subproblems. The same smaller problem appears many times across the recursion.

The technique: define a recurrence in terms of smaller subproblems, then either memoize the recursive function (top-down) or fill a table iteratively from base cases up (bottom-up). Both turn exponential recursion into polynomial work because each subproblem is solved exactly once.

Top-down vs bottom-up

The two flavors give the same time complexity but differ in style and constants:

Top-down (memoization)Bottom-up (tabulation)
DirectionOriginal → subproblemsBase cases → original
Drives which subproblems get computedRecursion does it for youYou decide the iteration order
Code styleRecursive function + cacheLoop + array
Stack usageO(depth) recursion stackO(1) extra (no recursion)
Natural forTree-shaped problemsLinear / grid problems
Speedup over naiveSame Big-O, ~2× constantSame Big-O, faster in practice
Easier to write whenRecurrence is naturalState transitions are simple

Use top-down when you want the code to read like the recurrence; use bottom-up when you want the lowest constant factors and don't mind thinking about iteration order. Many production solutions are bottom-up because the savings on overhead matter at scale.

Canonical example: climbing stairs

You're at the bottom of a staircase with n steps. Each move you can take 1 or 2 steps. How many distinct ways are there to reach the top?

Recurrence: ways(n) = ways(n-1) + ways(n-2) (last step was 1 or 2). Base cases: ways(0) = 1, ways(1) = 1. Notice this is just Fibonacci.

// Naive recursive — O(2^n)
function climbNaive(n) {
  if (n < 2) return 1;
  return climbNaive(n - 1) + climbNaive(n - 2);
}

// Top-down (memoized) — O(n)
function climbMemo(n, memo = {}) {
  if (n < 2) return 1;
  if (memo[n]) return memo[n];
  return memo[n] = climbMemo(n - 1, memo) + climbMemo(n - 2, memo);
}

// Bottom-up (tabulated) — O(n) time, O(1) space
function climbDP(n) {
  let prev = 1, curr = 1;
  for (let i = 2; i <= n; i++) [prev, curr] = [curr, prev + curr];
  return curr;
}

For n = 50, the naive version takes ~2^50 ≈ 10^15 calls (years to run). The memoized version takes 50 calls. The bottom-up version uses 2 variables. Same problem, three orders of magnitude in performance.

When to use DP

Look for these patterns in the problem statement:

  • "Optimal" something. Maximum, minimum, longest, shortest. The problem is asking for the best among many possible solutions.
  • Counting paths or arrangements. "How many ways to..." problems.
  • Recursive structure with repeated subproblems. If your recursion tree is doing the same work multiple times, that's the sign.
  • Sequence problems. Longest increasing subsequence, edit distance, longest common subsequence — anything operating on prefixes or suffixes of strings/arrays.
  • Choice with constraints. Knapsack, coin change, partition problems — choose items subject to a budget or constraint.

If the problem doesn't have overlapping subproblems, DP gives no speedup. Try greedy or divide-and-conquer first.

Famous DP problems

0/1 Knapsack

Items with weights w[i] and values v[i], capacity W. Maximize value without exceeding capacity. Each item can be taken or skipped.

function knapsack(weights, values, W) {
  const n = weights.length;
  // dp[i][w] = max value using first i items with capacity w
  const dp = Array.from({ length: n + 1 }, () => new Array(W + 1).fill(0));

  for (let i = 1; i <= n; i++) {
    for (let w = 0; w <= W; w++) {
      dp[i][w] = dp[i-1][w]; // skip item i
      if (weights[i-1] <= w) {
        dp[i][w] = Math.max(dp[i][w], dp[i-1][w - weights[i-1]] + values[i-1]);
      }
    }
  }
  return dp[n][W];
}

O(n × W) time, O(n × W) space. The space can be reduced to O(W) by noting that row i only depends on row i-1.

Longest Common Subsequence

function lcs(a, b) {
  const m = a.length, n = b.length;
  const dp = Array.from({ length: m + 1 }, () => new Array(n + 1).fill(0));
  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      if (a[i-1] === b[j-1]) dp[i][j] = dp[i-1][j-1] + 1;
      else dp[i][j] = Math.max(dp[i-1][j], dp[i][j-1]);
    }
  }
  return dp[m][n];
}

Runs in O(m × n). Used for diff tools, version control merges, DNA sequence alignment.

Python DP idioms

from functools import lru_cache

# Top-down with @lru_cache — Python's built-in memoization
@lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# Bottom-up — typically more idiomatic in Python
def coin_change(coins, amount):
    """Min coins to make amount; -1 if impossible."""
    INF = float('inf')
    dp = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a:
                dp[a] = min(dp[a], dp[a - c] + 1)
    return dp[amount] if dp[amount] != INF else -1

Space optimization tricks

  • Roll the table. If dp[i] only depends on dp[i-1] (or some bounded number of previous rows), keep only those rows. The fib bottom-up keeps 2 variables instead of an n-element array.
  • In-place updates. For 1D problems where dp[i] depends on dp[i - delta], you can sometimes update the array in place in the right iteration order.
  • Bitmask DP for small subsets. If the state is "which subset of n elements is included," for n ≤ 20 you can encode the subset as a 32-bit integer. dp[mask] is then a 1D array of size 2^n.

Common bugs and edge cases

  • Wrong base cases. The most common DP bug — getting the iteration started incorrectly. Always test with the smallest input where the answer is known by inspection.
  • Wrong iteration order. If dp[i] depends on dp[i-1] and dp[i-2], you must compute lower indices before higher ones. Reversing the order silently gives wrong answers.
  • Off-by-one in the table dimensions. Knapsack's table has dimensions (n + 1) × (W + 1) — extra row and column for the empty / zero-capacity cases. Forgetting either causes index-out-of-bounds at the boundaries.
  • Trying DP when greedy works. If the problem has the greedy choice property, DP is correct but unnecessarily complex. Try greedy first; if it fails on a small adversarial example, fall back to DP.
  • Trying DP when there are no overlapping subproblems. Adding memoization to a function that's already only called once with each input does nothing. Profile first; verify the same subproblem is hit many times before optimizing.

Frequently asked questions

When does dynamic programming apply?

Two properties must hold. (1) Optimal substructure — the optimal answer to the whole problem can be assembled from optimal answers to subproblems. (2) Overlapping subproblems — the same subproblem appears many times during the computation. If both hold, DP turns exponential recursion into polynomial-time iteration. If overlapping subproblems are missing, DP gives no speedup over plain recursion.

What's the difference between top-down and bottom-up DP?

Top-down (memoization) starts from the original problem and recurses, caching each subproblem's answer. The recursion drives which subproblems get computed — only those needed. Bottom-up (tabulation) starts from the smallest subproblems and iterates up, filling a table. Both achieve the same complexity; bottom-up usually has better constants and avoids recursion-stack overhead, top-down is easier to write when the recurrence is natural.

How is memoization different from caching?

Memoization is a specific technique — caching the return value of a pure function keyed on its arguments. Caching is a broader concept covering any kind of result reuse. Memoization is automatic in some languages (Python's `@lru_cache`, JavaScript via a Map wrapper) and requires explicit table management in others.

What does "optimal substructure" mean?

The optimal solution to the whole problem is composed of optimal solutions to its subproblems. Shortest path has it (the shortest path from A to C passes through some intermediate B and is optimal A→B + optimal B→C). Longest path does NOT have it — the longest A→C path doesn't necessarily go through optimal A→B and B→C subpaths because they might share edges. That's why shortest path has DP solutions and longest path is NP-hard.

What's the knapsack problem and why is it the canonical DP?

Given items with weights and values and a capacity limit, choose items to maximize value without exceeding capacity. Naive recursion explores 2^n subsets. The DP exploits that the optimal solution at capacity W using the first k items only depends on the optimal solutions for capacities ≤ W using the first k-1 items — that's optimal substructure plus overlapping subproblems. The 2D table is O(n × W) time and space, polynomial in n and W.

Is greedy the same as DP?

No. Both build a solution piece by piece, but greedy commits to the locally best choice at each step and never revisits. DP considers all subproblem solutions and combines them. Greedy works when the greedy choice property holds (Dijkstra, MST, Huffman). When it doesn't, you need DP. Coin change is the classic example: greedy works for US coins, fails for arbitrary denominations — DP works for both.

Why is naive Fibonacci O(2ⁿ)?

`fib(n)` recursively calls `fib(n-1)` and `fib(n-2)`. The recursion tree has roughly Fibonacci-many nodes — the n-th Fibonacci number itself, which is exponential in n. Memoizing the function caches each `fib(k)` so it's computed once, turning the call tree into a linear chain — O(n) time, O(n) space.