Measure Theory
Fubini's Theorem
When iterated integrals equal the double integral — and when they don't
Fubini's theorem (Guido Fubini, 1907) is the rule that lets you compute a double integral by integrating one variable at a time. If f is measurable on a product X × Y of σ-finite measure spaces and ∫|f| d(μ × ν) is finite, then ∫∫ f d(μ × ν) = ∫(∫ f dy) dx = ∫(∫ f dx) dy: both iterated orders agree with the double integral. Tonelli's companion theorem (1909) covers non-negative f with no integrability assumption, allowing the common value to be +∞. Together they justify most change-of-order-of-integration arguments in modern analysis, probability, and PDE.
- Authors: Fubini 1907 (signed f); Tonelli 1909 (non-negative f)
- Hypothesis: f measurable, ∫|f| d(μ × ν) < ∞
- Conclusion: double integral = either iterated integral
- Counterexample: (x² − y²)/(x² + y²)² on [0, 1]², whose iterated integrals are +π/4 and −π/4
- Setting: σ-finite product measure spaces
- Used in: probability (independence), Fourier analysis, PDE, integral transforms
Statement
Let (X, μ) and (Y, ν) be σ-finite measure spaces and f : X × Y → ℝ measurable on the product. If
∫_{X × Y} |f| d(μ × ν) < ∞
then f is integrable on X × Y, and:
∫_{X × Y} f d(μ × ν) = ∫_X (∫_Y f(x, y) dν(y)) dμ(x)
= ∫_Y (∫_X f(x, y) dμ(x)) dν(y)
For μ-almost every x the inner integral ∫_Y f(x, y) dν(y) exists and defines a μ-integrable function of x (and symmetrically for ν-almost every y), and both iterated integrals equal the double integral. One hypothesis, three equal quantities.
The geometric picture
Imagine a positive function f(x, y) over a rectangle in the xy-plane. The double integral is the volume of the solid below the graph. You can compute this volume two ways:
- Slice along x. For each fixed x, ∫f(x, y) dy is the area of the cross-section at that x. Sum the cross-section areas over x: ∫(∫f dy) dx.
- Slice along y. For each fixed y, ∫f(x, y) dx is the area of the cross-section at that y. Sum over y: ∫(∫f dx) dy.
Fubini says: same volume, whether you slice with vertical or horizontal planes. The volume is intrinsic to the solid; the slicing direction is a computational choice.
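On a finite grid the two slicing directions reduce to adding up a table by rows or by columns. The sketch below is only an illustration of that picture: the `heights` data and the helper names are hypothetical, with each entry read as the height of the solid over one unit square.

```javascript
// Heights of a solid over a 3 x 4 grid of unit squares (illustrative data).
const heights = [
  [1, 2, 0, 3],
  [4, 1, 1, 2],
  [0, 5, 2, 1],
];

// Slice along x: one cross-section area per row, then add the areas.
function volumeByRows(grid) {
  const rowAreas = grid.map(row => row.reduce((a, h) => a + h, 0));
  return rowAreas.reduce((a, s) => a + s, 0);
}

// Slice along y: one cross-section area per column, then add the areas.
function volumeByColumns(grid) {
  const nCols = grid[0].length;
  let total = 0;
  for (let j = 0; j < nCols; j++) {
    let colArea = 0;
    for (let i = 0; i < grid.length; i++) colArea += grid[i][j];
    total += colArea;
  }
  return total;
}

console.log(volumeByRows(heights), volumeByColumns(heights)); // 22 22
```

For finitely many terms the agreement is just commutativity of addition; the content of Fubini's theorem is that it survives the passage to integrals under the stated hypothesis.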
Worked example — a simple double integral
Compute ∫₀¹ ∫₀² (x + y²) dy dx. Fubini lets us choose either order.
Order 1 — integrate y first:
∫₀¹ (∫₀² (x + y²) dy) dx
= ∫₀¹ [xy + y³/3]_{y=0}^{y=2} dx
= ∫₀¹ (2x + 8/3) dx
= [x² + 8x/3]_0^1
= 1 + 8/3 = 11/3
Order 2 — integrate x first:
∫₀² (∫₀¹ (x + y²) dx) dy
= ∫₀² [x²/2 + xy²]_{x=0}^{x=1} dy
= ∫₀² (1/2 + y²) dy
= [y/2 + y³/3]_0^2
= 1 + 8/3 = 11/3 ✓
Same answer. The function is bounded and the domain is bounded, so ∫|f| < ∞ trivially — Fubini applies.
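The hand computation can be cross-checked numerically. This is a minimal sketch using a composite Simpson rule, which is exact for polynomials up to degree three and so should reproduce 11/3 up to floating-point rounding; the helper `simpson` and the subdivision count 10 are choices made for this illustration.

```javascript
// Composite Simpson rule on [a, b] with n subintervals (n must be even).
function simpson(g, a, b, n) {
  const h = (b - a) / n;
  let sum = g(a) + g(b);
  for (let i = 1; i < n; i++) {
    sum += (i % 2 === 1 ? 4 : 2) * g(a + i * h);
  }
  return (sum * h) / 3;
}

const integrand = (x, y) => x + y * y;

// Order 1: integrate over y in [0, 2] first, then over x in [0, 1].
const yFirst = simpson(x => simpson(y => integrand(x, y), 0, 2, 10), 0, 1, 10);

// Order 2: integrate over x in [0, 1] first, then over y in [0, 2].
const xFirst = simpson(y => simpson(x => integrand(x, y), 0, 1, 10), 0, 2, 10);

console.log(yFirst, xFirst, 11 / 3); // all three agree up to rounding
```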
Counterexample — when Fubini fails
The classical counterexample on [0, 1] × [0, 1]:
f(x, y) = (x² − y²) / (x² + y²)² for (x, y) ≠ (0, 0)
Iterated one way:
∫₀¹ (∫₀¹ f dy) dx = ∫₀¹ 1/(1 + x²) dx = π/4
Iterated the other way:
∫₀¹ (∫₀¹ f dx) dy = -π/4
Two different answers — π/2 apart!
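Both values come from explicit antiderivatives. The derivation below fills in that step, using only the definition of f given above:

```latex
\[
\frac{\partial}{\partial y}\,\frac{y}{x^2+y^2} = \frac{x^2-y^2}{(x^2+y^2)^2}
\quad\Longrightarrow\quad
\int_0^1 f(x,y)\,dy = \left[\frac{y}{x^2+y^2}\right]_{y=0}^{y=1} = \frac{1}{1+x^2}
\quad (x > 0),
\qquad
\int_0^1 \frac{dx}{1+x^2} = \arctan 1 = \frac{\pi}{4}.
\]
\[
\frac{\partial}{\partial x}\,\frac{-x}{x^2+y^2} = \frac{x^2-y^2}{(x^2+y^2)^2}
\quad\Longrightarrow\quad
\int_0^1 f(x,y)\,dx = -\frac{1}{1+y^2}
\quad (y > 0),
\qquad
\int_0^1 \frac{-\,dy}{1+y^2} = -\frac{\pi}{4}.
\]
```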
What went wrong? Compute the absolute integral:
∫∫ |f| dA = ∞ (divergent near origin)
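One way to see the divergence is a polar-coordinate estimate on the quarter disc Q = {x ≥ 0, y ≥ 0, x² + y² ≤ 1}, which lies inside the unit square. The set Q is notation introduced for this sketch only.

```latex
\[
f(r\cos\theta,\, r\sin\theta) = \frac{\cos 2\theta}{r^{2}},
\qquad
\iint_{[0,1]^2} |f|\,dA \;\ge\; \iint_{Q} |f|\,dA
= \int_0^{\pi/2} |\cos 2\theta|\,d\theta \,\int_0^1 \frac{dr}{r}
= 1 \cdot \infty = \infty .
\]
```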
Fubini's hypothesis ∫|f| < ∞ fails. The positive and negative parts both have infinite mass, and the order of integration determines how they cancel. This is analogous to a conditionally convergent series: Σ (−1)ᵏ/k can be rearranged to converge to any real value. Absolute integrability is the protection against this.
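The same failure already shows up for double series, which are iterated integrals against counting measure on ℕ × ℕ. The sketch below uses a standard textbook array that is not part of the original text: +1 on the diagonal and −1 immediately to its right. Every row and every column has at most two nonzero entries, so the inner sums are computed exactly.

```javascript
// Hypothetical doubly indexed array: +1 on the diagonal, -1 just right of it.
const a = (i, j) => (j === i ? 1 : j === i + 1 ? -1 : 0);

// Row i is nonzero only at j = i and j = i + 1, so this inner sum is exact.
const rowSum = i => a(i, i) + a(i, i + 1);

// Column j is nonzero only at i = j and i = j - 1 (when j >= 1), also exact.
const colSum = j => a(j, j) + (j >= 1 ? a(j - 1, j) : 0);

// Sum of |a(i, j)| along row i.
const absRowSum = i => Math.abs(a(i, i)) + Math.abs(a(i, i + 1));

// Partial sum of the outer series over indices 0 .. n - 1.
function outerSum(inner, n) {
  let s = 0;
  for (let k = 0; k < n; k++) s += inner(k);
  return s;
}

for (const n of [10, 1000]) {
  console.log(n, outerSum(rowSum, n), outerSum(colSum, n), outerSum(absRowSum, n));
}
// prints "10 0 1 20" then "1000 0 1 2000"
```

Summing rows first gives 0 and columns first gives 1, however far the outer sum runs, while the total absolute mass grows without bound. That is the discrete counterpart of ∫|f| = ∞.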
Proof idea — extension from indicators
The standard proof builds up in stages:
- Indicator of rectangle. For f = 1_{A × B} with A ⊂ X, B ⊂ Y measurable, (μ × ν)(A × B) = μ(A) ν(B). Iterated integrals each give μ(A) ν(B). Trivial agreement.
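- Indicator of a general measurable set. The sets E for which both iterated integrals of 1_E are well-defined and equal (μ × ν)(E) include the rectangles and form a monotone class, so by the monotone class (or π-λ) theorem they make up the whole product σ-algebra. σ-finiteness is used here: it reduces the argument to pieces of finite measure, where the limit steps are valid.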
- Simple non-negative functions. Linear combinations of indicators. Linearity of integration.
- Non-negative measurable f (Tonelli). Monotone limits of simple functions; pass through monotone convergence. Both iterated integrals well-defined in [0, ∞] and equal.
- Signed f (Fubini). Write f = f⁺ − f⁻ where f⁺ = max(f, 0), f⁻ = max(−f, 0). The hypothesis ∫|f| < ∞ means ∫f⁺ < ∞ and ∫f⁻ < ∞ separately. Apply Tonelli to each, subtract.
This "machine" — prove for indicators, extend by linearity to simple, by monotone convergence to non-negative, by decomposition to signed — is the workhorse of measure-theoretic proofs.
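The monotone-limit step relies on approximating a non-negative function from below by simple functions. A common construction, assumed here rather than taken from the text above, rounds the value down to a multiple of 2⁻ⁿ and caps it at n. The sketch applies it pointwise to a sample function; `dyadicApprox`, `sample`, and `xs` are hypothetical names.

```javascript
// n-th dyadic staircase below a non-negative value v:
// round v down to a multiple of 2^-n, then cap the result at n.
function dyadicApprox(v, n) {
  const scale = 2 ** n;
  return Math.min(n, Math.floor(v * scale) / scale);
}

const sample = x => x * x;     // sample non-negative function (assumption)
const xs = [0.3, 0.9, 1.7];    // sample evaluation points

// Each row lists the n-th approximation at the three points; the entries
// never decrease as n grows and never exceed the true value.
for (let n = 1; n <= 5; n++) {
  console.log(n, xs.map(x => dyadicApprox(sample(x), n)));
}
```

Each approximation takes finitely many values, so it is a simple function, and the increasing convergence is what the monotone convergence theorem needs.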
Numerical verification
// Numerical Fubini check on f(x, y) = x · sin(y) on [0, 1] × [0, π]
// Analytical answer: ∫₀^π sin(y) dy = 2, ∫₀^1 x dx = 1/2, product = 1
// Both iterated orders should give 1.
function trapezoidalDouble(f, x0, x1, y0, y1, nx, ny) {
  const dx = (x1 - x0) / nx, dy = (y1 - y0) / ny;
  let sum = 0;
  for (let i = 0; i <= nx; i++) {
    const x = x0 + i * dx;
    const wx = (i === 0 || i === nx) ? 0.5 : 1;
    for (let j = 0; j <= ny; j++) {
      const y = y0 + j * dy;
      const wy = (j === 0 || j === ny) ? 0.5 : 1;
      sum += wx * wy * f(x, y);
    }
  }
  return sum * dx * dy;
}
const f = (x, y) => x * Math.sin(y);
trapezoidalDouble(f, 0, 1, 0, Math.PI, 200, 200); // ~1.0000, matches the analytical value
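// Counterexample where Fubini fails:
const fBad = (x, y) => {
  if (x === 0 && y === 0) return 0;
  return (x * x - y * y) / Math.pow(x * x + y * y, 2);
};
// Iterated integral: inner trapezoidal rule over the second argument,
// outer trapezoidal rule over the first argument.
function iteratedYX(f, x0, x1, y0, y1, nx, ny) {
  const dx = (x1 - x0) / nx;
  const dy = (y1 - y0) / ny;
  let sum = 0;
  for (let i = 0; i <= nx; i++) {
    const x = x0 + i * dx;
    const w = (i === 0 || i === nx) ? 0.5 : 1;
    let innerY = 0;
    for (let j = 0; j <= ny; j++) {
      const y = y0 + j * dy;
      const wy = (j === 0 || j === ny) ? 0.5 : 1;
      innerY += wy * f(x, y);
    }
    sum += w * innerY * dy;
  }
  return sum * dx;
}
// The inner integral must cover all of [0, 1] on a grid much finer than the
// outer cutoff. Cutting both variables off at the same small value on equal
// grids returns 0 instead, because fBad(x, y) = -fBad(y, x) cancels pairwise.
const eps = 1e-3; // outer cutoff; the exact value is ±(π/4 − arctan(eps))
iteratedYX(fBad, eps, 1, 0, 1, 400, 20000); // ≈ 0.784, tends to π/4 as eps → 0
iteratedYX((x, y) => fBad(y, x), eps, 1, 0, 1, 400, 20000); // ≈ -0.784, tends to -π/4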
Variants and comparison
| Theorem | Setting | Hypothesis | Conclusion | Year |
|---|---|---|---|---|
| Fubini | σ-finite product measures, signed f | ∫\|f\| < ∞ | Double = iterated (both orders) | 1907 |
| Tonelli | σ-finite product measures, f ≥ 0 | None (just measurable) | Double = iterated in [0, ∞] | 1909 |
| Fubini-Tonelli | Combined workflow | Use Tonelli on \|f\|; if finite, apply Fubini | Same as Fubini | n/a |
| Classical Fubini (Riemann) | Bounded rectangle, continuous f | Continuity | Same equality | Pre-1907 |
| Fubini for distributions | Tempered distributions, test functions | One factor is a distribution | Pairing factors | 20th c. |
| Non-σ-finite case | General measure spaces | σ-finiteness dropped | Iterated integrals can disagree; counterexamples exist | n/a |
| Sard's theorem (related slice idea) | Smooth maps, slicing | Smoothness | Critical values measure zero | 1942 |
Common pitfalls
- Forgetting the ∫|f| < ∞ hypothesis. The signed Fubini absolutely requires it. Without absolute integrability, the iterated integrals can disagree (counterexample above). Always check ∫|f| first — that's exactly what Tonelli is for.
- Confusing σ-finite with finite. ℝⁿ with Lebesgue measure has infinite measure but is σ-finite (cover by balls). Counting measure on an uncountable set is not σ-finite — Fubini can fail there.
- Reading too much into agreement. Fubini gives one direction only: if ∫|f| < ∞, the iterated integrals agree. The converse is false, since both iterated integrals can exist and agree while ∫|f| = ∞, in which case f is still not integrable on the product. The contrapositive is the useful test: if both iterated integrals exist and disagree, then ∫|f| = ∞.
- Swapping limits and integrals without justification. Limits inside integrals also need a theorem (Dominated Convergence, Monotone Convergence, Fatou); Fubini is for swapping integrals with each other.
- Applying it on the wrong product space. The measure on X × Y must be the product μ × ν, not some other measure on the rectangle. For non-product measures (Borel measures with non-product structure), Fubini doesn't apply directly.
- Treating Riemann iterated and Lebesgue double as automatically equal. Continuous f on a bounded rectangle — yes, trivially. For unbounded domains or singular f, you need the Lebesgue version explicitly.
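The σ-finiteness warning above can be made concrete with a standard example, stated here as a sketch: take X = Y = [0, 1], let μ be Lebesgue measure and ν counting measure (which is not σ-finite on an uncountable set), and let f be the indicator of the diagonal.

```latex
\[
f(x,y) = \mathbf{1}_{\{x = y\}}:
\qquad
\int_X\!\left(\int_Y f\,d\nu(y)\right)d\mu(x) = \int_0^1 1\,d\mu(x) = 1,
\qquad
\int_Y\!\left(\int_X f\,d\mu(x)\right)d\nu(y) = \int_Y 0\,d\nu(y) = 0 .
\]
```

Here f is non-negative, so the disagreement shows that even Tonelli's conclusion depends on σ-finiteness.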
Applications
- Probability (independence). If X, Y are independent random variables, their joint distribution is the product measure. E[g(X)h(Y)] = E[g(X)] E[h(Y)], valid when g(X) and h(Y) are integrable, is Fubini on the product. The density of the sum X + Y is the convolution of the two densities, again by Fubini applied to the joint density.
- Fourier analysis. Plancherel's theorem ‖f‖₂ = ‖f̂‖₂, Parseval, Fourier inversion all require swapping integrals; Fubini-Tonelli is the engine.
- Convolutions. The integral defining (f * g)(x) = ∫f(y)g(x − y) dy and identities like ‖f * g‖₁ ≤ ‖f‖₁ ‖g‖₁ all use Fubini on a double integral.
- Integral transforms. Laplace, Mellin, Hankel transforms all involve inversion formulas that exchange orders of integration — Fubini justifies them when absolute integrability holds.
- PDE — Green's functions. Solving Δu = f via u(x) = ∫G(x, y)f(y) dy and integrating over a domain requires swapping the order of integration when integrating against a test function.
- Statistics — joint distributions and marginals. Marginal density f_X(x) = ∫f_{X,Y}(x, y) dy is a partial Fubini integration; computing E[X] from joint distribution uses Fubini to integrate y out before x.
- Physics — multiple integrals. Volume integrals in cylindrical / spherical coordinates, computations of moments of inertia, mass integrals over 3D solids — all use Fubini to convert triple integrals into iterated single integrals.
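The convolution items above have an exact discrete counterpart that is easy to run. In this minimal sketch, `convolve` and `l1` are hypothetical helper names; the first example builds the distribution of the total of two fair dice, and the second checks the norm inequality on two short signed sequences.

```javascript
// Discrete convolution: (f * g)[k] = sum over i of f[i] * g[k - i].
function convolve(f, g) {
  const out = new Array(f.length + g.length - 1).fill(0);
  for (let i = 0; i < f.length; i++) {
    for (let j = 0; j < g.length; j++) {
      out[i + j] += f[i] * g[j];
    }
  }
  return out;
}

// l1 norm: sum of absolute values.
const l1 = v => v.reduce((s, t) => s + Math.abs(t), 0);

// pmf of one fair die on faces 1..6 (index 0 <-> face 1).
const die = new Array(6).fill(1 / 6);
const sumOfTwo = convolve(die, die);   // index k <-> total k + 2

console.log(sumOfTwo[5]);              // P(total = 7), about 1/6
console.log(l1(sumOfTwo));             // about 1: total probability is preserved

// Signed sequences: the l1 norm of the convolution is at most the product.
const u = [1, -2, 3], v = [4, 1];
console.log(l1(convolve(u, v)), l1(u) * l1(v)); // 24 30
```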
History and significance
Guido Fubini proved the signed-function version in 1907 as part of his work on integration on product spaces. Leonida Tonelli followed in 1909 with the non-negative version that doesn't need absolute integrability. The two are often quoted together as Fubini-Tonelli because the working pattern is unified: use Tonelli on |f|, then Fubini on f.
The theorem is foundational because before Fubini, switching the order of integration was a delicate calculation, justified case by case. After Fubini, it became a one-line check: does ∫|f| converge? The result enabled clean development of probability theory (Kolmogorov 1933), Fourier analysis on Lᵖ spaces, and the modern theory of PDE.
Frequently asked questions
What does Fubini's theorem state?
Let f : X × Y → ℝ be measurable on a product of σ-finite measure spaces. If ∫|f| d(μ × ν) < ∞ (absolutely integrable), then f is integrable on X × Y and ∫_{X×Y} f d(μ × ν) = ∫_X (∫_Y f(x, y) dν(y)) dμ(x) = ∫_Y (∫_X f(x, y) dμ(x)) dν(y). Both iterated integrals exist for almost every fixed variable, are themselves integrable, and equal the double integral. Geometrically: volume under the surface = sum of x-slices = sum of y-slices.
What's the difference between Fubini and Tonelli?
Tonelli's theorem (1909) applies to non-negative measurable functions f ≥ 0 and never requires absolute integrability. The iterated integrals always exist in [0, ∞] and agree with the double integral — even if the value is ∞. Fubini (1907) applies to signed (or complex) f and requires ∫|f| < ∞. The standard workflow: use Tonelli to compute ∫|f|; if finite, apply Fubini to swap orders for the signed f. Together they're often called Fubini-Tonelli.
When does Fubini fail?
When the absolute integral diverges, the iterated integrals can disagree. Classic counterexample: f(x, y) = (x² − y²) / (x² + y²)² on [0, 1] × [0, 1]. The iterated integral ∫₀¹ ∫₀¹ f dx dy = −π/4, but ∫₀¹ ∫₀¹ f dy dx = π/4. The two orders give different answers because ∫|f| = ∞ — Fubini's hypothesis fails, and the conclusion fails too. This is the cleanest demonstration that absolute integrability isn't a technicality.
How is Fubini used in probability theory?
For independent random variables X and Y on a product probability space, the joint distribution is the product of marginals. E[XY] = ∫∫ xy dP_X dP_Y = (∫x dP_X)(∫y dP_Y) = E[X]E[Y] is Fubini applied to the product measure. More generally, expectations of functions of independent variables factor by Fubini. Convolutions of independent random variables (sum distribution) are Fubini-derived. Joint-distribution computations rely on the theorem constantly.
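For finite distributions the factorization can be checked by direct summation. The two distributions below are hypothetical and chosen with dyadic probabilities so that the floating-point arithmetic is exact.

```javascript
// Hypothetical finite distributions for independent X and Y.
const X = [{ value: 1, p: 0.5 }, { value: 2, p: 0.25 }, { value: 4, p: 0.25 }];
const Y = [{ value: -1, p: 0.5 }, { value: 3, p: 0.5 }];

// Expectation of a finite distribution.
const mean = dist => dist.reduce((s, { value, p }) => s + value * p, 0);

// E[XY] as a double sum against the product measure p_X(x) * p_Y(y),
// summing over y inside and over x outside.
let exy = 0;
for (const x of X) {
  for (const y of Y) {
    exy += x.value * y.value * x.p * y.p;
  }
}

console.log(exy, mean(X) * mean(Y)); // 2 2
```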
Does Fubini work for Riemann integrals too?
Yes, with stricter hypotheses. The classical Riemann version: if f is continuous on a closed rectangle [a, b] × [c, d], the double integral equals either iterated integral. For more general Riemann-integrable f, equality holds when both iterated integrals exist. But Riemann's machinery can't handle the convergence issues that arise in unbounded domains or for highly oscillatory f. The full power of the theorem — handling general measurable f on σ-finite product spaces — requires the Lebesgue framework.
What does σ-finite mean and why is it needed?
A measure space (X, μ) is σ-finite if X can be written as a countable union of measurable sets each with finite measure. ℝⁿ with Lebesgue measure is σ-finite (cover it by the balls of radius k = 1, 2, 3, …). σ-finiteness ensures the product measure μ × ν is uniquely determined by its values on rectangles and that the slice integrals are measurable. Without it, a product measure still exists but need not be unique, and the iterated integrals can disagree even for non-negative f. Most measure spaces met in analysis are σ-finite, so this is rarely a working constraint.
What's the proof idea?
Prove it first for indicator functions 1_E of measurable rectangles E = A × B: ∫1_E = μ(A)ν(B), and the iterated integrals trivially agree. Extend to indicators of arbitrary sets in the product σ-algebra by a monotone class argument, then by linearity to simple functions. Take monotone limits, using the monotone convergence theorem, to reach non-negative measurable f (this is Tonelli). Finally, split signed f into positive and negative parts (f = f⁺ − f⁻) and use ∫|f| < ∞ to make the subtraction legitimate. The proof is a textbook example of the standard machine of extending from indicators.