Computer Science
Memoization
Cache function results — turn O(2ⁿ) into O(n) with one decorator
Memoization is caching the return value of a pure function keyed on its arguments. The first call computes the result; subsequent calls with the same arguments return instantly from the cache. It transforms exponential-time recursion into polynomial time when the recursion has overlapping subproblems — the lightest, most reusable form of dynamic programming.
- Time savings (typical)O(2ⁿ) → O(n) on overlapping recursion
- Space costO(unique inputs)
- Required propertyPure function (no side effects, deterministic)
- ImplementationHash map keyed on argument tuple
- Built-in supportPython @lru_cache, JS via wrapper
- Vs full DPTop-down only — recursion-driven
Interactive visualization
Press play, or step through manually. The visualization is yours to drive — try it before reading on.
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
How memoization works
Take any pure function. Wrap it so the first call with each argument tuple computes and stores the result; subsequent calls with the same arguments return the stored value:
function memoize(fn) {
const cache = new Map();
return function (...args) {
const key = JSON.stringify(args); // or a more efficient hash
if (cache.has(key)) return cache.get(key);
const result = fn(...args);
cache.set(key, result);
return result;
};
}
That's it. Five lines turn an exponential-recursion function into a polynomial one — provided the function has overlapping subproblems.
Canonical example: Fibonacci
Naive recursive Fibonacci:
function fib(n) {
if (n < 2) return n;
return fib(n - 1) + fib(n - 2);
}
Time complexity: O(2ⁿ). Computing fib(50) takes ~10¹⁵ recursive calls and runs for tens of minutes. Why? fib(48) is computed twice, fib(47) three times, fib(46) five times, fib(45) eight times — Fibonacci-many duplicates.
Memoize it:
const fibMemo = memoize(function fib(n) {
if (n < 2) return n;
return fibMemo(n - 1) + fibMemo(n - 2);
});
Time complexity: O(n). fib(50) finishes in 50 actual computations plus 99 cache hits — microseconds. The function looks the same; one wrapper changes the asymptotic cost.
Memoization vs tabulation
Both achieve the same complexity but differ in style and constants:
| Memoization (top-down) | Tabulation (bottom-up) | |
|---|---|---|
| Direction | Recursive — start from the top | Iterative — start from base cases |
| Driven by | Recursion — only computes what's needed | Loop — computes all subproblems |
| Storage | Hash map (sparse) | Array (dense) |
| Constant factors | Higher (hash + recursion overhead) | Lower (array + loop) |
| Stack usage | O(depth) recursion stack | O(1) extra (no recursion) |
| Best for | Sparse subproblem space, natural recurrence | Dense subproblem space, performance-critical |
Memoization is faster to write and easier to reason about. Tabulation is faster to run. Many production solutions start as memoized prototypes and switch to tabulation only if profiling shows the cache lookup is hot.
When to use memoization
- Recursive algorithms with overlapping subproblems. Fibonacci, Pascal's triangle, climbing stairs, edit distance, knapsack. Any recursion tree where the same input appears multiple times.
- Expensive pure computations. Cache the result of a function that does heavy I/O or computation but produces the same answer for the same input. Database lookups for static data, regex compilations, parsed configs.
- Backtracking with repeated subproblems. Some backtracking searches revisit equivalent states; memoizing keyed on the state (not the path to it) prunes whole subtrees.
- UI / framework function caching. React's
useMemo, Vue'scomputed— these are memoization at the framework level, recomputing only when inputs change.
Skip memoization when the function isn't pure, when arguments are continuous (floats with infinite uniqueness), or when the function is already iterative.
Python implementation
Python ships memoization in the standard library:
from functools import lru_cache, cache
# Unbounded cache (Python 3.9+)
@cache
def fib(n):
return n if n < 2 else fib(n - 1) + fib(n - 2)
# Bounded cache (LRU eviction past maxsize)
@lru_cache(maxsize=128)
def heavy_compute(x, y):
# ... expensive pure computation ...
return result
# Inspect cache stats
print(fib.cache_info()) # CacheInfo(hits=98, misses=51, maxsize=None, currsize=51)
# Clear when needed (e.g., on test setup)
fib.cache_clear()
@cache is unbounded — equivalent to @lru_cache(maxsize=None). Use it for true memoization where the input space is bounded by the problem. Use @lru_cache(maxsize=N) when memory matters and you can tolerate cache misses.
JavaScript memoization patterns
// Generic memoization wrapper using a Map
function memoize(fn) {
const cache = new Map();
return function (...args) {
const key = args.length === 1 ? args[0] : JSON.stringify(args);
if (cache.has(key)) return cache.get(key);
const result = fn.apply(this, args);
cache.set(key, result);
return result;
};
}
// LRU memoization with bounded size — evicts oldest on overflow
function memoizeLRU(fn, maxSize = 128) {
const cache = new Map();
return function (...args) {
const key = JSON.stringify(args);
if (cache.has(key)) {
const value = cache.get(key);
cache.delete(key); // re-insert to mark as recently used
cache.set(key, value);
return value;
}
const result = fn.apply(this, args);
cache.set(key, result);
if (cache.size > maxSize) cache.delete(cache.keys().next().value);
return result;
};
}
The Map's insertion-order-preserving iteration is what makes the second snippet work as an LRU — moving an entry to the end requires a delete + set; oldest entries sit at the front.
Famous memoization problems
Climbing Stairs
From the bottom of n stairs, each move climbs 1 or 2 steps. How many distinct paths to the top?
@cache
def climb(n):
if n < 2: return 1
return climb(n - 1) + climb(n - 2)
Without memoization, O(2ⁿ). With it, O(n). The cache turns an exponential recursion into a linear one, even though the function body looks the same.
Edit distance (Levenshtein)
@cache
def edit_distance(a, b):
if not a: return len(b)
if not b: return len(a)
if a[0] == b[0]: return edit_distance(a[1:], b[1:])
return 1 + min(
edit_distance(a[1:], b), # delete
edit_distance(a, b[1:]), # insert
edit_distance(a[1:], b[1:]), # substitute
)
Naive recursion is O(3^(m+n)). Memoization brings it to O(m × n). The diff algorithms in git, version control, and spell-check all use this principle (often with a tabulated DP for cache locality).
Common bugs and edge cases
- Memoizing impure functions. Caching a function that depends on global state, current time, or a database produces stale results. The function returns its first call's value forever. Always verify purity before adding the wrapper.
- Mutable arguments aren't hashable. Lists, dicts, sets — can't be cache keys. Convert to tuples, frozensets, or strings. Be careful: changing the original after caching doesn't update the cache.
- JSON.stringify as a key on objects with key-order differences.
{a: 1, b: 2}and{b: 2, a: 1}serialize differently in JavaScript. Either canonicalize the object or use a different keying scheme. - Memory leaks from unbounded caches. A long-running process that memoizes everything accumulates memory until OOM. Use bounded LRU caches or call
cache_clear()periodically when memory is finite. - Recursion that bypasses the cached version. If you memoize
fibbut the originalfibrecurses on itself (not the memoized wrapper), the cache never helps. Use the decorator pattern or rebind the name.
Frequently asked questions
What's the difference between memoization and dynamic programming?
Memoization is a technique — cache results of recursive calls. DP is a class of algorithms that solve problems with overlapping subproblems and optimal substructure. Memoization implements DP top-down (driven by recursion). Tabulation implements DP bottom-up (driven by an explicit loop). Both achieve the same complexity; memoization is easier to write when the recurrence is natural.
When does memoization NOT help?
When the function has no overlapping subproblems — every call has unique arguments. Memoization adds overhead (hash lookup per call) without saving any work. Profile your function: if the same arguments rarely repeat, skip memoization.
What does "pure function" mean?
A function whose return value depends only on its arguments and which has no side effects. Same input → same output, every time. Memoization requires this — caching impure functions causes subtle bugs where cached results don't reflect updated state. If your function reads from a database or random number generator, it's not pure and shouldn't be memoized.
How does Python's @lru_cache work?
Wraps a function in a least-recently-used cache backed by a dict + doubly linked list. Keys are tuples of (args, frozen kwargs). When the cache exceeds maxsize, the oldest unused entry is evicted. Use `@lru_cache(maxsize=None)` for unbounded caching (effectively memoization). Implemented in C in CPython, ~10× faster than a hand-rolled Python wrapper.
Can you memoize a function with mutable arguments?
Not directly — mutable arguments aren't hashable. Convert lists to tuples, dicts to frozensets, or convert the function to take immutable arguments. If the function takes a class instance, the instance must be hashable (define `__hash__` and `__eq__`) and immutable across the cache lifetime.
What about cache invalidation?
For pure functions on immutable inputs, no invalidation is needed — the result will be the same forever. For impure or context-dependent functions, you need an explicit `cache.clear()` or per-key eviction. This is the source of the famous quote: "There are only two hard things in computer science: cache invalidation and naming things."
Should I memoize iterative functions?
Usually not. Memoization shines on recursive functions where the same subproblem appears in multiple recursion branches. Iterative functions process each input once by definition — there's nothing to cache. Exception: an iterative function called repeatedly from outside with the same arguments — that's a different memoization (caching the function's whole input).