Algorithms
Knuth-Morris-Pratt String Search
Find a pattern in a text without ever re-reading a character
The Knuth-Morris-Pratt algorithm finds a pattern of length m inside a text of length n in O(n + m) time, never re-examining a text character. The trick is a precomputed failure function that captures every proper prefix of the pattern that is also a suffix.
- Time (preprocess pattern)O(m)
- Time (search text)O(n)
- Time (total worst)O(n + m)
- SpaceO(m)
- Text re-readsZero
- Streaming-friendlyYes
Interactive visualization
Press play, or step through manually. The visualization is yours to drive — try it before reading on.
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
The problem with naive search
Naive substring search slides the pattern over the text one character at a time. At each position, it compares the pattern to the corresponding text slice; on a mismatch, it shifts by one and starts over. Worst case: O(nm). Concretely:
Text: AAAAAAAAAAAAAAAAAAAAB
Pattern: AAAAAB
At each text position, we compare 5 'A's then mismatch on 'B'.
Over n positions, that's 5n character comparisons. For n=10⁹ this is
~5 seconds — bad. KMP turns this into ~10⁹ comparisons total.
The waste is obvious: after matching AAAAA and failing on B, naive search throws away the partial match and restarts from text position 1 — even though it just verified that text positions 1-4 are AAAA. KMP captures that information in a precomputed table and reuses it.
How KMP works
Two phases:
- Preprocess the pattern to compute the prefix function π. π[i] = length of the longest proper prefix of
pattern[0..i]that is also a suffix. "Proper" means not equal to the whole prefix itself. - Search the text using two pointers:
iover the text,jover the pattern. On mismatch, instead of restarting, jumpjback to π[j-1] — the longest prefix of the pattern that matches the suffix of what we've already verified in the text.
The text pointer i only ever moves forward. The pattern pointer j can jump back, but its net advance is bounded by the number of forward steps — so total work is linear in n.
Building the prefix function
For pattern ABABCABAB, the prefix function is:
i: 0 1 2 3 4 5 6 7 8
char: A B A B C A B A B
π[i]: 0 0 1 2 0 1 2 3 4
Reading π[i]:
- π[2] = 1: pattern[0..2] = "ABA"; longest proper prefix that's
also a suffix is "A" (length 1).
- π[3] = 2: pattern[0..3] = "ABAB"; "AB" matches both ends.
- π[4] = 0: pattern[0..4] = "ABABC"; no proper prefix matches the
trailing C.
- π[8] = 4: "ABABCABAB"; "ABAB" is both prefix and suffix.
The construction is itself a self-search: comparing the pattern with shifted copies of itself, using already-computed π values to skip ahead. The implementation:
function buildPrefixFunction(pattern) {
const m = pattern.length;
const pi = new Array(m).fill(0);
let k = 0; // length of current matching prefix
for (let i = 1; i < m; i++) {
while (k > 0 && pattern[k] !== pattern[i]) {
k = pi[k - 1]; // fall back along previous prefix matches
}
if (pattern[k] === pattern[i]) k++;
pi[i] = k;
}
return pi;
}
buildPrefixFunction("ABABCABAB"); // [0, 0, 1, 2, 0, 1, 2, 3, 4]
buildPrefixFunction("AAACAAAA"); // [0, 1, 2, 0, 1, 2, 3, 3]
buildPrefixFunction("ABCDEF"); // [0, 0, 0, 0, 0, 0]
Total work for buildPrefixFunction is O(m). The inner while looks like it could blow up, but each iteration of the while strictly decreases k, and k can only be increased by 1 per outer loop step — so amortized O(m) total.
Trace: search "ABABDABACDABABCABAB" for "ABABCABAB"
Pattern π = [0, 0, 1, 2, 0, 1, 2, 3, 4] (computed above).
i=0,j=0: Text A vs Pat A → match, i=1, j=1
i=1,j=1: Text B vs Pat B → match, i=2, j=2
i=2,j=2: Text A vs Pat A → match, i=3, j=3
i=3,j=3: Text B vs Pat B → match, i=4, j=4
i=4,j=4: Text D vs Pat C → MISMATCH
j > 0, so jump j = π[3] = 2 (don't move i)
i=4,j=2: Text D vs Pat A → MISMATCH
j > 0, so jump j = π[1] = 0
i=4,j=0: Text D vs Pat A → MISMATCH
j == 0, advance i (no rewind possible). i=5
i=5,j=0: Text A vs Pat A → match, i=6, j=1
i=6,j=1: Text B vs Pat B → match, i=7, j=2
i=7,j=2: Text A vs Pat A → match, i=8, j=3
i=8,j=3: Text C vs Pat B → MISMATCH
j = π[2] = 1
i=8,j=1: Text C vs Pat B → MISMATCH
j = π[0] = 0
i=8,j=0: Text C vs Pat A → MISMATCH
advance i. i=9
i=9..17: match through the full pattern at text position 10
j reaches 9 = m → match found at i=18-9 = 10 ✓
Notice text indices 0-3 and 5-7 are read once each, and the algorithm never goes back. The total text reads = 19. Naive search on the same input would do ~50 reads (lots of restarts after partial matches).
When to use KMP
- Pattern has repeated prefixes. DNA sequences, binary patterns, programming-language tokens. Naive search wastes work re-checking matched prefixes.
- Streaming text. KMP needs only the pattern in memory (O(m) space) and processes text in one pass — useful for log scanners, network packet matchers, file watchers.
- Adversarial input. Naive search has O(nm) worst case, exploitable in DoS attacks. KMP is O(n+m) regardless of input.
- Foundation for Aho-Corasick. If you need to match many patterns simultaneously, you'll build on KMP's failure-function idea.
For one-shot search of natural-language text in a static file, the built-in strstr/String.includes/str.find wins — modern implementations are SIMD-accelerated. KMP is for the cases where they're not enough.
KMP vs other string-search algorithms
| KMP | Naive | Boyer-Moore | Rabin-Karp | Aho-Corasick | |
|---|---|---|---|---|---|
| Preprocess | O(m) | O(0) | O(m + Σ) | O(m) | O(Σmᵢ) |
| Search (avg) | O(n) | O(n) | O(n/m) sublinear | O(n) avg | O(n) |
| Search (worst) | O(n) | O(nm) | O(nm) | O(nm) | O(n) |
| Space | O(m) | O(1) | O(m + Σ) | O(1) | O(Σmᵢ) |
| Pattern count | 1 | 1 | 1 | k (parallel) | k |
| Streaming | Yes | Yes | No (looks ahead) | Yes | Yes |
| Best for | Repeated-prefix patterns, streaming | Tiny patterns | Long patterns, English | Multiple equal-length patterns | Many patterns at once |
Σ is alphabet size. The worst case for naive, Boyer-Moore, and Rabin-Karp is rare in practice — they often beat KMP on natural text. KMP's strength is the worst-case guarantee.
What "linear" means in practice
KMP's inner loop does ~3 operations per text character (read, compare, branch). On modern hardware that's about 2-5 ns:
| n (text) | m (pattern) | KMP wall time | Naive worst |
|---|---|---|---|
| 1,000 | 10 | 5 µs | 50 µs |
| 1,000,000 | 10 | 5 ms | 50 ms |
| 1,000,000 | 1,000 | 5 ms | 5 s |
| 1,000,000,000 | 10 | 5 s | 50 s |
| 1,000,000,000 | 1,000 | 5 s | 5,000 s |
The KMP cell is constant in m for the search phase — preprocessing pays the O(m) cost once. For grep over a 1GB log file with a 1KB pattern, naive search at 5000 seconds is a bug; KMP at 5 seconds is acceptable.
JavaScript implementation
function kmpSearch(text, pattern) {
if (pattern.length === 0) return [0];
const pi = buildPrefixFunction(pattern);
const matches = [];
let j = 0; // pattern index
for (let i = 0; i < text.length; i++) {
while (j > 0 && pattern[j] !== text[i]) j = pi[j - 1];
if (pattern[j] === text[i]) j++;
if (j === pattern.length) {
matches.push(i - j + 1);
j = pi[j - 1]; // continue searching for overlapping matches
}
}
return matches;
}
function buildPrefixFunction(pattern) {
const m = pattern.length;
const pi = new Array(m).fill(0);
let k = 0;
for (let i = 1; i < m; i++) {
while (k > 0 && pattern[k] !== pattern[i]) k = pi[k - 1];
if (pattern[k] === pattern[i]) k++;
pi[i] = k;
}
return pi;
}
// Find all occurrences (overlapping)
kmpSearch("ababababab", "abab"); // [0, 2, 4, 6]
kmpSearch("AAAAA", "AA"); // [0, 1, 2, 3]
kmpSearch("hello world", "world"); // [6]
The j = pi[j - 1] after a match is what enables overlapping detection: instead of jumping past the match, KMP backs up to the longest proper prefix-suffix and continues. Replace it with j = 0 for non-overlapping search.
Python implementation
def kmp_search(text, pattern):
if not pattern:
return [0]
pi = build_prefix_function(pattern)
matches = []
j = 0
for i, ch in enumerate(text):
while j > 0 and pattern[j] != ch:
j = pi[j - 1]
if pattern[j] == ch:
j += 1
if j == len(pattern):
matches.append(i - j + 1)
j = pi[j - 1]
return matches
def build_prefix_function(pattern):
m = len(pattern)
pi = [0] * m
k = 0
for i in range(1, m):
while k > 0 and pattern[k] != pattern[i]:
k = pi[k - 1]
if pattern[k] == pattern[i]:
k += 1
pi[i] = k
return pi
# In production: just use the built-in
"hello world".find("world") # 6
"hello world".count("l") # 3 — but this is character count
import re
[m.start() for m in re.finditer("ab", "ababab")] # [0, 2, 4]
Famous problem: shortest period of a string
The period of a string s is the smallest p such that s[i] = s[i+p] for all valid i. KMP's failure function gives this for free:
// Find the shortest period of s using its prefix function.
// If n - π[n-1] divides n, that's the period; else the period is n.
function shortestPeriod(s) {
if (s.length === 0) return 0;
const pi = buildPrefixFunction(s);
const candidate = s.length - pi[s.length - 1];
return s.length % candidate === 0 ? candidate : s.length;
}
shortestPeriod("abcabcabc"); // 3 ("abc" repeats 3×)
shortestPeriod("ababab"); // 2 ("ab" repeats 3×)
shortestPeriod("aabaabaab"); // 3
shortestPeriod("abc"); // 3 (no repetition)
The intuition: π[n-1] = length of the longest border of s (a border is a string that is both prefix and suffix). If n − π[n-1] divides n, the string is a power of that prefix.
Famous problem: is one string a rotation of another?
// "abcde" is a rotation of "cdeab" (shift by 2). Trick: a is a
// rotation of b iff a appears as a substring of b+b.
function isRotation(a, b) {
if (a.length !== b.length) return false;
return kmpSearch(b + b, a).length > 0;
}
isRotation("abcde", "cdeab"); // true
isRotation("abcde", "abcdf"); // false
isRotation("waterbottle", "erbottlewat"); // true
Why this works: rotating cdeab by 2 gives abcde. Concatenating cdeab + cdeab = cdeabcdeab contains every rotation as a length-n substring. KMP makes the substring check O(n).
Common bugs and edge cases
- Empty pattern. Many implementations return -1 or
[0]on empty input — pick a convention and handle it explicitly. Naively, the prefix-function loop runs zero times and the search loop matches at every position. - Off-by-one in π indexing. Some textbooks index π from 1 (Cormen-style "failure function"); others from 0.
π[j-1]in 0-indexed is the same asπ[j]in 1-indexed. Mixing the two produces silent wrong results. - Prefix vs failure function. The classical "failure function" f from KMP's 1977 paper is f[i] = π[i] − 1 (or sometimes the longest proper prefix-suffix expressed as an ending index). Modern code almost always uses the 0-indexed prefix function π. Don't paste fragments across conventions.
- Forgetting
j = π[j-1]after a match. If you want all (possibly overlapping) matches, you must back up j toπ[m-1]after a successful match, not jump to 0. Settingj = 0misses overlapping occurrences. - Inner while-loop without
k > 0guard. Readingπ[-1]in JS gives undefined; in Python it indexes from the end. Always check the bound before stepping back. - Treating the failure-function jump as O(log n). A single mismatch can trigger a chain of π lookups; the chain isn't bounded per step. The amortized argument bounds the total over the whole search, not per text character.
- Unicode pitfalls. KMP compares characters as code units. Pattern matching across surrogate pairs or grapheme clusters needs Unicode-aware indexing — otherwise "café" (4 chars) might mismatch "café" (5 chars).
Frequently asked questions
What is the difference between the prefix function and the failure function?
Both names refer to the same thing — usually. The prefix function π[i] returns the length of the longest proper prefix of pattern[0..i] that is also a suffix. The failure function is a near-synonym from the original 1977 paper. Some textbooks define the failure function as π[i] − 1 (a 1-indexed shift); always check the convention before copying code between sources.
Why is KMP O(n + m) and not O(nm)?
Naive search backs up the text pointer when a mismatch occurs. KMP never does. After each text character is read, either the pattern pointer advances (positive increment) or the failure function jumps it backward — but the text pointer only moves forward. The pattern pointer's net advance over the algorithm is bounded by 2n, giving total work O(n + m).
When is KMP faster than naive search in practice?
Almost never on natural-language text. Naive search has O(nm) worst case but O(n) on typical English — most mismatches happen on the first character. KMP shines on patterns with many repeated prefixes (DNA, binary, regex literals) and on adversarial inputs. For everyday substring search, Boyer-Moore and the SIMD memmem in glibc are usually faster.
What does the Aho-Corasick algorithm have to do with KMP?
Aho-Corasick generalizes KMP to multiple patterns. Where KMP builds a 1D failure function over one pattern, Aho-Corasick builds a trie of all patterns and a 'fail link' from each node — essentially a multi-pattern KMP. It runs in O(n + total pattern length + matches) and is what tools like fgrep use.
Why does the failure function never grow by more than 1 per step?
If π[i] could equal k+2 when π[i-1] = k, that would mean pattern[0..k+1] is a suffix of pattern[0..i] — which means pattern[0..k] is a suffix of pattern[0..i-1] of length k+1, contradicting π[i-1] = k. The failure function increases by at most 1 per step; it can decrease by any amount.
Should I use KMP or built-in string search?
Built-in. Python's str.find, JavaScript's String.includes, and C's strstr all use highly tuned algorithms (often two-way string matching or SIMD-accelerated naive search) that beat hand-rolled KMP on real input. KMP is essential pedagogy and the foundation for Aho-Corasick, but it is rarely the right production choice.