Numerical Analysis

Bisection Method

Halve [a,b] where f(a)·f(b)<0 and squeeze a root with guaranteed linear convergence

The bisection method finds a root of f(x) = 0 by repeatedly halving an interval [a,b] where f changes sign. Each iteration adds one bit of precision — roughly 3.3 iterations per decimal digit. It's slower than Newton, but never fails on a continuous function with a sign change.

  • Iteration rulec = (a+b)/2; keep half where sign change persists
  • Convergence order1 (linear), exactly 1 bit per step
  • Iterations per decimal digit~3.3 (log₂ 10)
  • Required inputContinuous f, bracket [a,b] with f(a)·f(b)<0
  • GuaranteeAlways converges (intermediate value theorem)
  • Used asSafety fallback in Brent, brentq, TOMS 748

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

The interval-halving idea

Suppose f is continuous on [a,b] and f(a) and f(b) have opposite signs — say f(a) < 0 and f(b) > 0. The intermediate value theorem guarantees that f takes every value between f(a) and f(b) somewhere on (a,b), including the value 0. So a root r is trapped inside the interval. You don't know where, only that it's there.

To narrow it down, evaluate f at the midpoint c = (a+b)/2. Three cases:

  • If f(c) = 0 exactly, you're done — c is a root.
  • If f(a)·f(c) < 0, a sign change happens in [a,c]. A root lies in that half; throw away [c,b].
  • If f(c)·f(b) < 0, the sign change is in [c,b]. Keep that half; throw away [a,c].

You always keep the half that still brackets a root, and that half is exactly half the width of the original. Repeat. After n iterations the interval width has shrunk from (b-a) to (b-a)/2ⁿ. That's bisection.

Why convergence is linear (and exactly so)

Let e_n = b_n - a_n be the width of the bracket after n iterations. By construction

e_{n+1} = e_n / 2

So e_n = (b - a) · 2⁻ⁿ. To drive the bracket below tolerance ε you need

n ≥ log₂((b - a) / ε)

Bisection's error halves each iteration, which is one bit of precision per step. Since log₂(10) ≈ 3.32, you gain about three decimal digits every ten iterations, or one decimal digit per 3.3 iterations. Going from a starting interval of width 1 to double-precision machine epsilon (≈ 2.22 · 10⁻¹⁶) needs roughly 53 iterations.

Compare to Newton's method, which doubles correct digits per step — quadratic convergence — and typically reaches machine precision in 4 or 5 iterations. The factor of ~10× in iteration count is the price you pay for the guarantee. For an inexpensive f the absolute cost is still small (53 function evaluations is microseconds on modern hardware), but for a costly f you might want a faster method wrapped around a bisection safety net.

Worked example: computing √2

Solve f(x) = x² − 2 = 0 with the initial bracket [a, b] = [1, 2]. Then f(1) = -1 < 0 and f(2) = 2 > 0, so the bracket is valid. Iterate:

n   a              b              c = (a+b)/2    f(c)         interval width
─   ────────────   ────────────   ─────────────  ─────────    ──────────────
0   1.000000000    2.000000000    1.500000000    +0.250         1.000
1   1.000000000    1.500000000    1.250000000    -0.437         0.500
2   1.250000000    1.500000000    1.375000000    -0.109         0.250
3   1.375000000    1.500000000    1.437500000    +0.066         0.125
4   1.375000000    1.437500000    1.406250000    -0.022         0.0625
5   1.406250000    1.437500000    1.421875000    +0.022         0.03125
...
20  1.414213562    1.414213657    1.414213610    +0.0000001     ~10⁻⁷
...
53  1.414213562    1.414213562    1.414213562    ~0             ~10⁻¹⁶

From a starting bracket of width 1, the algorithm needs about 53 iterations to reach double precision. After 20 iterations the bracket is roughly 10⁻⁶ wide — a six-digit answer. Each row halves the previous interval; the digit count grows logarithmically.

Newton-Raphson on the same problem reaches 16 digits in 5 iterations from x₀ = 1. Bisection is 10× slower in iteration count, but it cannot diverge, cannot cycle, and needs no derivative — facts that matter when you don't trust the function.

Bisection vs Newton vs secant vs Brent

BisectionNewton-RaphsonSecantBrent's method
Convergence order1 (linear)2 (quadratic)≈ 1.618Superlinear most steps
Per-step work1 f-eval1 f + 1 f'1 f-eval1 f-eval
Needs derivativeNoYesNoNo
Needs bracketYes (sign change)No (good guess)Two starting guessesYes (sign change)
Guaranteed convergenceYesOnly locallyOnly locallyYes (bracketed)
Iterations to 10⁻¹⁵ from [1,2]~50~5~7~10
Best whenRobustness mandatoryf' cheap, good startf' unknownDefault production choice

In modern numerical libraries, plain bisection is rarely the user-facing choice for production work — Brent's method (or its refinement, TOMS 748 by Alefeld–Potra–Shi) gives both bisection's guarantee and most of secant's speed. SciPy's scipy.optimize.brentq, MATLAB's fzero, and GSL's gsl_root_fsolver_brent all implement bracketed hybrid solvers with bisection backups.

The algorithm in six lines

while (b - a) > tol:
    c = (a + b) / 2
    if f(c) == 0: return c
    if f(a) · f(c) < 0: b = c        # root is in [a, c]
    else:                a = c        # root is in [c, b]
return (a + b) / 2

One subtlety: compute the midpoint as a + (b - a) / 2, not (a + b) / 2. For very large a and b near the floating-point overflow boundary, the second form can produce a midpoint outside [a,b] due to rounding. The first form always lands inside. Another subtlety: test sign agreement by sign(f(a)) != sign(f(c)) rather than the product f(a)·f(c) < 0, since the product can underflow to zero for tiny but nonzero values.

Why the intermediate value theorem matters

The bisection method is an algorithmic shadow of Bolzano's intermediate value theorem (1817): if f is continuous on [a,b] and y is any value between f(a) and f(b), there exists c ∈ (a,b) with f(c) = y. Set y = 0 and you get the bracket-and-root condition. The proof of the theorem is itself a bisection argument — Bolzano constructs the root as the limit of a sequence of halved sub-intervals, which is exactly what the algorithm does at runtime.

The theorem fails for discontinuous f. The function f(x) = sign(x) - 1/2 has f(-1) = -3/2 and f(1) = 1/2 — a sign change — but no root, only a jump. If you ran bisection on it the algorithm would happily converge to an interval around x = 0, but f never reaches zero there. The output is a sign-change point, not a root. Always check that |f(c)| is small at the end, not just that the bracket is.

Where bisection goes wrong

The method's failure modes are mostly about its inputs, not the iteration itself:

  1. No bracket. If f(a)·f(b) ≥ 0, the algorithm refuses to start. You need a sign change. Common in roots of even multiplicity (f(x) = x² touches zero but doesn't cross) and roots of polynomials with conjugate complex roots near the real axis.
  2. Multiple roots in one bracket. If the bracket contains three roots, bisection converges to one of them — usually unpredictably — and silently ignores the others. Scan the input range for sign changes first if you suspect multiple roots.
  3. Discontinuities. If f jumps across zero rather than passing through it, bisection finds the jump point. The bracket shrinks correctly but f(c) doesn't go to zero. Test |f(c)| as a sanity check at termination.
  4. Near-tangent roots. A function that just barely crosses zero (like f(x) = x · 10⁻¹⁵) gives a tiny f(c) at the midpoint, which can be confused with rounding noise. Use a relative tolerance keyed to the typical magnitude of f, not an absolute one.

Where bisection shows up

  • SciPy's brentq and MATLAB's fzero. Production root-finders that combine an inverse-quadratic-interpolation step with bisection. The bisection fallback is what gives them guaranteed convergence.
  • Implicit volatility solvers in finance. Black–Scholes pricing inverts to recover σ given the option premium; the inversion is monotone, so a bracketed bisection (or its Brent variant) is the standard implementation in Bloomberg and quant libraries.
  • Compiler constant folding. Some compilers evaluate transcendental constants like the smallest positive x with sin(x) = 1/2 by running bisection at compile time. The deterministic bit-per-step rate gives reproducible cross-platform results.
  • Game-physics collision response. When two convex shapes are about to overlap, swept-collision detection bisects the time interval to find the first contact. The function being zeroed is the signed distance between the shapes; bisection's robustness is essential because the distance function may not be differentiable.
  • Searching sorted arrays. Binary search on a discrete array is the integer-arithmetic version of bisection. The same halving argument gives O(log n) lookups.

Variants

  • Regula falsi (false position). Same bracket logic, but choose the new test point as the x-intercept of the secant through (a,f(a)) and (b,f(b)) rather than the midpoint. Faster on convex functions but can stall on one side; Illinois and Pegasus algorithms damp this stalling.
  • Ridders' method. Multiplies a regula-falsi step by an exponential factor that's exact for quadratic f, giving order ≈1.84 convergence with retained bracketing.
  • ITP (Interpolate-Truncate-Project). A modern algorithm (2020) that interleaves a regula-falsi prediction, a width-truncation step, and a projection onto a minimum-shrink target. Provably faster than bisection in the worst case, unlike Brent.
  • Multi-section. Instead of halving, divide the bracket into k sub-intervals and test each. Cheaper in parallel because the k function evaluations are independent. Used inside some root-isolation routines for polynomials.

Common pitfalls

  • Tolerance too small. Asking for an interval width below 2 · machine epsilon · |c| is unreachable — at that scale rounding noise dominates the function value and the algorithm hangs. Use a relative tolerance.
  • Wrong sign test. Multiplying f(a)·f(c) can underflow to zero when both are tiny. Use sign-equality tests directly.
  • Midpoint overflow. Computing (a+b)/2 can overflow for huge a and b. Use a + (b-a)/2.
  • Confusing bracket width with absolute error. The true root is within (b-a)/2 of the midpoint, not (b-a) — a factor of 2 sometimes missed in termination tests.
  • Skipping the convergence check on |f(c)|. A discontinuous f gives a tight bracket around a jump, not a root. Verify f(c) is small before returning.

Why bisection endures

Bisection is the simplest non-trivial algorithm in numerical analysis. Its proof of correctness fits on one line — intermediate value theorem plus halving — and its code fits in six. Once continuous f and a bracket [a,b] with sign change exist, nothing on Earth stops bisection from converging. That bound is unique in the family of root-finders: Newton can diverge, secant can stall, regula falsi can stagnate, but bisection cannot. Every robust solver in the wild keeps a bisection step in its back pocket as the algorithm of last resort.

Production codes wrap it in a faster method because the linear rate is slow. Brent's method, TOMS 748, and Alefeld–Potra–Shi all give bisection's guarantee with most of secant's speed. But strip those algorithms back and the same interval-halving lives at the bottom — the floor everyone falls onto when nothing fancier works.

Frequently asked questions

Why is bisection guaranteed to converge?

By the intermediate value theorem: if f is continuous on [a,b] and f(a)·f(b) < 0, then f takes the value 0 somewhere in (a,b). When you halve the interval, the new sub-interval that still has f-of-endpoints with opposite signs still brackets a root. So at every step a root is trapped inside, the bracket width strictly halves, and the limit point must be a root. No starting-guess gymnastics, no derivative needed — just continuity and a sign change.

How fast does bisection converge?

Linearly, gaining exactly one binary digit per iteration. After n iterations the interval width is (b-a)/2ⁿ. To reach a tolerance of ε, you need n = log₂((b-a)/ε) iterations — about 3.3 iterations per decimal digit because log₂(10) ≈ 3.32. Going from a starting interval of width 1 to machine precision 10⁻¹⁶ takes about 53 iterations. Newton can do the same in 5; bisection's price for guaranteed convergence is that linear rate.

How does bisection compare to Newton's method?

Bisection is slow but bombproof. Newton is fast but fragile. Newton doubles correct digits per step (quadratic), while bisection gains only one bit. But Newton needs f'(x), a good starting guess, and can diverge near horizontal tangents or get stuck in cycles. Bisection asks only for continuity and a bracket. Production root-finders like Brent's method combine the two — use a fast secant or inverse-quadratic step when it stays inside the bracket, fall back to bisection when it doesn't.

What is the stopping criterion?

Three common tests, used in combination. First, the interval width: stop when b - a < tol_abs + tol_rel·|c|. Second, the residual: stop when |f(c)| is below a threshold. Third, an iteration cap. Width-based tests are reliable because bisection's interval is a hard error bound — the true root is provably within (b-a)/2 of the midpoint. Residual-based tests can fail near flat zones where |f(c)| is tiny but c is far from a root.

What about multiple roots inside the bracket?

Bisection finds one root. If the bracket contains multiple sign changes — say three roots r1 < r2 < r3 with f(a) < 0 and f(b) < 0 but f crosses zero three times in between — then f(a)·f(b) > 0 and the algorithm refuses to start. If f(a)·f(b) < 0 there's an odd number of sign changes, and bisection converges to one of them, often unpredictably. Pre-search by scanning sign changes on a fine grid if multiple roots are expected.

Why does every robust solver still use bisection as a backup?

Because nothing else is guaranteed. Brent's method, Ridders' method, Illinois, and TOMS 748 are all hybrid algorithms — they try a fast step (secant, inverse quadratic interpolation, or hyperbolic) and accept it only if the new estimate stays inside the current bracket and reduces it sufficiently. If the fast step fails any test, the algorithm falls back to a bisection step. This preserves the guaranteed convergence of bisection while picking up the speed of higher-order methods when the function behaves well.

Who first wrote down the bisection method?

The idea is implicit in Bernard Bolzano's 1817 proof of the intermediate value theorem — he constructed a root by successive halving and bracketing. Cauchy gave a more explicit version in his Cours d'Analyse (1821). As a computational algorithm with stopping criteria, it solidified in the early 20th century with the rise of numerical tables. Today it's the first algorithm in every numerical analysis textbook because its proof is one line and its code is six.