Algorithms
Karatsuba Multiplication
Three products instead of four — and the quadratic wall comes down
Karatsuba multiplication multiplies two n-digit numbers with three recursive half-size products instead of four, dropping the schoolbook O(n²) cost to O(n^log₂3) ≈ O(n^1.585).
- Time complexityO(n^1.585)
- Recurrence3T(n/2) + O(n)
- Products per level3 (not 4)
- DiscoveredKaratsuba, 1960
- Typical crossover~10–40 limbs
Interactive visualization
Press play, or step through manually. The visualization is yours to drive — try it before reading on.
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
How Karatsuba multiplication works
When you multiply two big numbers by hand, you multiply every digit of one by every digit of the other. For two n-digit numbers that's n² single-digit products plus the carries. Double the length and the work quadruples. In 1956 Andrey Kolmogorov conjectured this Ω(n²) cost was fundamental — that you simply could not multiply faster. Four years later a 23-year-old named Anatoly Karatsuba broke it in about a week.
The trick is divide and conquer. Split each number down the middle. Pick a base B (10 for decimal, 2³² or 2⁶⁴ for machine words) and a split point m = ⌈n/2⌉, then write:
x = a·Bᵐ + b
y = c·Bᵐ + d
where a, c are the high halves and b, d the low halves. Multiplying out:
x·y = ac·B²ᵐ + (ad + bc)·Bᵐ + bd
The naive reading of that line needs four half-size products — ac, ad, bc, bd — and four products on halves recurse into 4T(n/2), which is still O(n²). You've gained nothing.
Karatsuba's observation: you don't need the middle term's two products separately, only their sum ad + bc. And that sum hides inside a single extra product. Compute three things:
z2 = a·c
z0 = b·d
z1 = (a + b)·(c + d) − z2 − z0 // because (a+b)(c+d) = ac + ad + bc + bd
// so (a+b)(c+d) − ac − bd = ad + bc
Now z1 is exactly the middle term, recovered with only one extra multiplication instead of two. Reassemble:
x·y = z2·B²ᵐ + z1·Bᵐ + z0
The two shifts by Bᵐ and B²ᵐ are just digit-position moves — appending zeros, free in the asymptotic sense. The additions and the two subtractions are O(n). Three recursive products plus linear glue gives the recurrence T(n) = 3T(n/2) + O(n).
Why three products changes the exponent
The whole game is the constant in front of T(n/2). By the Master Theorem, a recurrence T(n) = k·T(n/2) + O(n) with k > 2 is dominated by the leaves and solves to O(n^log₂k):
- Schoolbook:
k = 4→O(n^log₂4) = O(n²). - Karatsuba:
k = 3→O(n^log₂3) = O(n^1.5849...).
That single dropped multiplication shaves the exponent from 2.0 to 1.585. It looks small until you scale it. Going from 4 products to 3 cuts the leaf count of the recursion tree by a factor that compounds at every level — there are 3^(log₂n) leaves instead of 4^(log₂n), and 3^(log₂n) = n^log₂3.
When to use Karatsuba
- Big-integer arithmetic — cryptography (RSA key generation, modular exponentiation), arbitrary-precision math libraries, and computer-algebra systems where operands run to thousands of bits.
- Polynomial multiplication — the identity holds in any commutative ring, so the same three-product split multiplies degree-
npolynomials inO(n^1.585)coefficient operations. - The "medium" size band — numbers too large for schoolbook to win on constants, but too small for FFT-based methods to amortize their overhead. Karatsuba owns this middle ground in essentially every production big-num library.
Do not reach for Karatsuba on small numbers. Below the crossover the extra additions, the three recursive calls, and the memory churn make it slower than the cache-friendly quadratic loop. And for truly enormous inputs, Toom-Cook and FFT-based multiplication overtake it. Karatsuba is a band, not a universal answer.
Karatsuba vs other multiplication algorithms
| Schoolbook | Karatsuba | Toom-3 | Schönhage–Strassen | Harvey–van der Hoeven | |
|---|---|---|---|---|---|
| Time complexity | O(n²) | O(n^1.585) | O(n^1.465) | O(n log n log log n) | O(n log n) |
| Recurrence | 4T(n/2)+O(n) | 3T(n/2)+O(n) | 5T(n/3)+O(n) | FFT over a ring | multidim. FFT |
| Products per split | 4 | 3 | 5 | n/a (transform) | n/a (transform) |
| Constant factor | tiny | small | moderate | large | very large |
| Wins from (roughly) | 1 limb | ~10–40 limbs | ~100–10,000 limbs | ~10,000+ limbs | only astronomically large |
| Year | antiquity | 1960 | 1963 | 1971 | 2019 |
| Real-world use | tiny operands, base case | GMP, Java BigInteger, Python ints | GMP large path | GMP huge path, π records | theoretical (galactic) |
The pattern across the row is a ladder: each algorithm trades a smaller asymptotic exponent for a bigger constant factor, so each one wins only above the crossover where its asymptotics finally beat the previous algorithm's constants. A well-tuned library like GMP runs all of them and picks by operand size.
What the numbers actually say
- The exponent gap is 2.0 vs 1.585. For two 1,024-bit numbers (~32 limbs of 32 bits), schoolbook does ~1,024 limb-products; Karatsuba's effective work scales like
32^1.585 ≈ 243— roughly a 4× reduction at that already-modest size, and the gap widens with length. - At 1,000,000 digits, the ratio is enormous.
n²is 10¹², whilen^1.585 ≈ 10^9.5 ≈ 3×10⁹— about a 300× difference in elementary operations. This is why pi-digit and big-prime records never use schoolbook. - Crossover is real and measured. GMP's tuned
MUL_TOOM22_THRESHOLD(its Karatsuba cutoff) sits around 10–40 limbs depending on CPU. Below it the quadratic loop wins on cache locality and zero recursion overhead. - One extra add per level is the cost. Karatsuba replaces one multiplication with several additions/subtractions. Multiplication of
n/2-digit numbers costs more than addition ofn-digit numbers oncenis large — that inequality is exactly why the trade-off pays off above the crossover and loses below it.
JavaScript implementation
Using BigInt for clarity, with a power-of-base split. The shifts are exact multiplications by 10^m; in a real limb-based library you'd shift by whole machine words instead.
function karatsuba(x, y) {
// Base case: small enough that native multiply wins.
if (x < 10n && y < 10n) return x * y;
// Number of decimal digits in the larger operand.
const n = Math.max(x.toString().length, y.toString().length);
const m = BigInt(Math.ceil(n / 2));
const B = 10n ** m; // split base
const a = x / B, b = x % B; // x = a·B + b (high, low)
const c = y / B, d = y % B; // y = c·B + d
const z2 = karatsuba(a, c); // high · high
const z0 = karatsuba(b, d); // low · low
const z1 = karatsuba(a + b, c + d) - z2 - z0; // the saved cross term
return z2 * B * B + z1 * B + z0; // recombine with shifts
}
// karatsuba(31415926n, 27182818n) === 31415926n * 27182818n → true
Two things to flag. First, the base case matters: real implementations switch to schoolbook once operands drop under the crossover, not at a single digit — recursing all the way down is slower. Second, a + b and c + d can each carry one extra digit beyond m; that's fine here because BigInt grows automatically, but in a fixed-width limb implementation you must allocate room for that carry or you'll silently truncate.
Python implementation
def karatsuba(x, y):
# Base case: hand off small operands to the native multiply.
if x < 10 or y < 10:
return x * y
n = max(len(str(x)), len(str(y)))
m = n // 2
base = 10 ** m
a, b = divmod(x, base) # x = a*base + b
c, d = divmod(y, base) # y = c*base + d
z2 = karatsuba(a, c) # high * high
z0 = karatsuba(b, d) # low * low
z1 = karatsuba(a + b, c + d) - z2 - z0 # cross term, one extra product
return z2 * base * base + z1 * base + z0
# Polynomial Karatsuba — the same identity over coefficient lists.
# Multiplies two polynomials given low-to-high coefficient arrays.
def poly_karatsuba(p, q):
if len(p) == 1 or len(q) == 1:
return _schoolbook(p, q)
# Pad both to a common length so the split point m lines up.
n = max(len(p), len(q))
p = p + [0] * (n - len(p))
q = q + [0] * (n - len(q))
m = n // 2
a, b = p[m:], p[:m] # high, low halves of coefficients
c, d = q[m:], q[:m]
z2 = poly_karatsuba(a, c)
z0 = poly_karatsuba(b, d)
z1 = _sub(_sub(poly_karatsuba(_add(a, b), _add(c, d)), z2), z0)
res = [0] * (2 * n - 1)
for i, v in enumerate(z0): res[i] += v
for i, v in enumerate(z1): res[i + m] += v
for i, v in enumerate(z2): res[i + 2 * m] += v
return res
The polynomial version makes the deeper point: Karatsuba isn't really about decimal digits. The split point is arbitrary and the identity (a+b)(c+d) − ac − bd = ad + bc holds in any commutative ring, so the exact same three-product recursion multiplies polynomials, Gaussian integers, or matrices of numbers.
Variants worth knowing
Toom-Cook (Toom-k). Karatsuba is Toom-2: split into 2 parts, do 3 products. Toom-3 splits into 3 parts and does 5 products via polynomial evaluation/interpolation, giving O(n^log₃5) ≈ O(n^1.465). Higher k pushes the exponent toward 1 but inflates the constant and the interpolation bookkeeping, so libraries stop around Toom-4.
Schönhage–Strassen. Treat the digits as a signal and multiply via the Fast Fourier Transform, turning convolution (which is what multiplication is) into pointwise products. Runs in O(n log n log log n) and is the workhorse for million-digit-plus operands.
Harvey–van der Hoeven (2019). An O(n log n) integer multiplication — the long-conjectured optimum. It's a "galactic" algorithm: the crossover where it beats Schönhage–Strassen is so astronomically large it has no practical use yet, but it closes the theoretical question.
Karatsuba for matrices vs Strassen. The same "fewer multiplications via clever combinations" idea drives Strassen's matrix multiplication, which does 7 multiplications instead of 8 on 2×2 block matrices for O(n^2.807). Strassen is the matrix analogue of Karatsuba's integer trick.
Common bugs and edge cases
- Recursing to a single digit. The biggest performance bug: no base-case cutoff, so you pay recursion overhead on tiny operands. Always fall back to schoolbook below a tuned threshold — the entire speedup is asymptotic and reverses on small inputs.
- The carry in
a+bandc+d. Those sums can bem+1digits, notm. In a fixed-width limb implementation you must size the buffer for the overflow bit, or the middle product is silently wrong. - Mismatched split points. Both numbers must be split at the same
m. Splittingxat the midpoint ofxandyat the midpoint ofywhen they differ in length corrupts the reassembly — pad to a common length or use the larger operand'sm. - Forgetting the double shift. The high term is multiplied by
B²ᵐ, notBᵐ. Off-by-one on the exponent is a classic and produces a result that's wrong by orders of magnitude. - Negative
z1. The subtraction(a+b)(c+d) − z2 − z0is always non-negative for the correct cross term, but intermediate sign handling trips up unsigned limb arithmetic — use a signed accumulator or guard the subtraction order. - Assuming it always wins. Benchmarking Karatsuba on 32-bit numbers and concluding it's "slow" — it is, there. It only beats schoolbook above the crossover, which is the whole reason production code keeps both.
Frequently asked questions
Why is Karatsuba O(n^1.585) and not O(n²)?
The recurrence is T(n) = 3·T(n/2) + O(n): three recursive products on half-size numbers plus linear-time additions and shifts. By the Master Theorem that solves to O(n^log₂3) ≈ O(n^1.585). Schoolbook needs four half-size products, giving T(n) = 4·T(n/2) + O(n) = O(n²).
How does Karatsuba compute a product with only three multiplications?
Split each number as x = a·Bᵐ + b and y = c·Bᵐ + d. The product needs ac, bd, and the cross term ad + bc. Karatsuba gets the cross term for free as (a+b)(c+d) − ac − bd, so it reuses ac and bd and only needs one extra product instead of two.
At what size does Karatsuba beat schoolbook multiplication?
The crossover is workload-dependent but usually tens of machine words. GMP switches from schoolbook to Karatsuba at roughly 10–40 limbs (about 600–2,500 bits). Below that the extra additions and recursion overhead outweigh the saved multiplication, so production code keeps a base case that falls back to the quadratic method.
Is Karatsuba the fastest multiplication algorithm?
No. For very large numbers, Toom-Cook (Toom-3 is O(n^1.465)) and FFT-based methods like Schönhage–Strassen (O(n log n log log n)) and the 2019 Harvey–van der Hoeven O(n log n) algorithm are asymptotically faster. Karatsuba wins only in a middle band — bigger than the schoolbook crossover, smaller than the Toom-Cook/FFT crossover.
Why was Karatsuba historically important?
In 1956 Kolmogorov conjectured that multiplication required Ω(n²) operations. In 1960 the 23-year-old Anatoly Karatsuba found the three-product trick in about a week, disproving the conjecture. It was the first multiplication algorithm to break the quadratic barrier and launched the field of fast arithmetic.
Does Karatsuba work in any number base?
Yes. The split point Bᵐ is arbitrary — base 10 for decimal digits, 2³² or 2⁶⁴ for machine-word limbs, or even polynomial coefficients. The identity (a+b)(c+d) − ac − bd holds in any commutative ring, which is why the same trick speeds up polynomial multiplication, not just integers.