Number Theory
Möbius Function
μ(n) = (-1)^k if n is a product of k distinct primes, 0 if n has a squared factor
The Möbius function μ(n), introduced by August Möbius in 1832, is defined for positive integers as: μ(1) = 1; μ(n) = (-1)^k if n is a product of k distinct primes (square-free); μ(n) = 0 if n has a squared prime factor. Multiplicative — μ(ab) = μ(a)μ(b) for coprime a, b. The Möbius inversion formula is its raison d'être: if g(n) = Σ_{d|n} f(d), then f(n) = Σ_{d|n} μ(n/d) g(d). The Mertens function M(n) = Σ_{k≤n} μ(k) — a famous open problem: M(n) = O(n^(1/2+ε)) is equivalent to the Riemann Hypothesis.
- DefinedMöbius 1832
- Valuesμ(1)=1, μ(p)=-1, μ(p²)=0
- Multiplicativeμ(ab)=μ(a)μ(b) coprime
- Sum identityΣ_(d|n) μ(d) = [n=1]
- Mertens functionM(n) = Σ μ(k)
- Connects to RHM(n) bound
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
Why the Möbius function matters
- Inclusion-exclusion at scale. Möbius inversion is inclusion-exclusion for divisor lattices. Counting square-free numbers ≤ n, primes by sieve, integers coprime to a given n, irreducible polynomials over F_q — every such count uses Möbius inversion to convert "≤ d divides" into "exactly d divides".
- Distribution of primes. Selberg's symmetric formula and the Erdős-Selberg elementary proof of the prime number theorem use μ in critical places. The von Mangoldt function Λ(n) — equal to log p when n = p^k, else 0 — satisfies Λ = -μ * log * 1, expressing prime power counts via Möbius inversion.
- Riemann hypothesis equivalences. RH ↔ M(n) = O(n^(1/2+ε)). The partial-sum behaviour of μ is exquisitely sensitive to ζ's zeros, making μ one of the most direct experimental windows on RH.
- Combinatorics on posets. Möbius functions of partially ordered sets generalise μ — in lattice theory, the Möbius function of a finite lattice gives the inversion formula for arbitrary order-preserving counts. Rota's 1964 paper "On the Foundations of Combinatorial Theory" elevated Möbius inversion to a central tool of modern combinatorics.
- Computer science enumeration. Counting necklaces with k beads of c colours uses Möbius inversion: the number of distinct necklaces is (1/k) Σ_(d|k) μ(d) c^(k/d). Counting Lyndon words, primitive sequences, irreducible polynomials over finite fields all follow this pattern.
- Cyclotomic polynomials. The n-th cyclotomic polynomial Φₙ(x) factors as ∏_{d|n} (x^d − 1)^μ(n/d), giving an explicit closed form via Möbius inversion of x^n − 1 = ∏_{d|n} Φ_d(x).
- Cryptography and number-theoretic sums. Sums of multiplicative functions weighted by μ — Mobius randomness sums — appear in proofs of equidistribution, in Vinogradov's theorem on three primes, and in modern sieve theory.
Common misconceptions
- "μ is just a parity function." It is the parity of ω(n) (number of distinct primes) ONLY for square-free n. For non-square-free n, μ is 0 — not ±1. The Liouville function λ(n) = (-1)^Ω(n), counting with multiplicity, is the parity-everywhere companion: λ(p^2) = 1, μ(p^2) = 0. They behave similarly on average but differ pointwise.
- "Mertens conjecture is true." Mertens conjectured |M(n)| ≤ sqrt(n) for n ≥ 1; this would have implied RH. Odlyzko and te Riele disproved it in 1985. The conjecture's failure does not disprove RH — only the strong bound is too tight; a slightly weaker O(n^(1/2+ε)) bound is still equivalent to RH.
- "Irregular = random." μ has structure: it is multiplicative, deterministic, and computable from prime factorisation. The "Möbius randomness heuristic" says correlations of μ with reasonable arithmetic functions are small — but this is a deep conjectural property, not a literal claim that μ is random.
- "Möbius inversion needs a number-theoretic context." Inversion is an identity in any incidence algebra of a locally finite poset. Number theory is one example (divisor lattice); subset lattice gives standard inclusion-exclusion; subgroup lattice gives Burnside-style counts. Rota showed all these are instances of one phenomenon.
- "μ is multiplicative for all a, b." Multiplicativity requires coprime a, b. μ(2 · 4) = μ(8) = 0 ≠ μ(2) · μ(4) = (-1)(0) = 0 (this happens to agree, but try μ(2)·μ(2) = 1 vs μ(4) = 0). Coprimality matters.
- "Computing μ(n) is hard." Given the prime factorisation of n, μ(n) follows immediately: 0 if any exponent ≥ 2, else (-1)^(number of factors). The hard part is factorisation itself — for cryptographic-size n, computing μ(n) is as hard as factoring.
A worked Möbius inversion
Recover Euler's totient from its summatory: it is known that Σ_{d|n} φ(d) = n. Apply Möbius inversion with g(n) = n and f = φ: φ(n) = Σ_{d|n} μ(d) · (n/d). For n = 12: divisors 1, 2, 3, 4, 6, 12 with μ values 1, -1, -1, 0, 1, 0. Hence φ(12) = 12 - 6 - 4 + 0 + 2 + 0 = 4. Check directly: integers in [1,12] coprime to 12 are {1, 5, 7, 11} — count 4. The closed form φ(n) = n · ∏_(p|n) (1 - 1/p) drops out by expanding the product. One identity, two routes; Möbius inversion is the bridge.
Density of square-free integers
Among integers up to x, what fraction are square-free? The Möbius function provides the answer: the count is Σ_{d ≤ sqrt(x)} μ(d) · floor(x/d²) ≈ x · Σ_d μ(d)/d² = x/ζ(2) = 6x/π² ≈ 0.6079 x. So roughly 60.79% of integers are square-free. The proof: each integer is either square-free or has a unique squareful prefactor d², and inclusion-exclusion via μ peels off the squareful contributions. The constant 6/π² recurs in many density problems and is one of the most direct meeting-points of μ, ζ, and elementary counting.
Frequently asked questions
Why μ(n) = 0 for non-square-free?
Because μ is the unique multiplicative function with the property Σ_{d|n} μ(d) = [n=1]. Setting μ(p2) = 0 (and hence μ(p^k) = 0 for k ≥ 2) is the only consistent choice that makes the convolution identity μ * 1 = δ hold. If you tried any other value, the inversion formula would fail. The 0 is forced by structure, not chosen by convention.
What is Möbius inversion and where is it used?
If g(n) = Σ_{d|n} f(d), then f(n) = Σ_{d|n} μ(n/d) g(d). The pair (μ, 1) are convolution-inverse arithmetic functions, so summing one undoes summing the other. Used to: derive φ(n) = Σ_{d|n} μ(d) (n/d) from φ * 1 = id; count irreducible polynomials over F_q via cyclic-group inclusion-exclusion; recover prime power counts from the von Mangoldt function; and as the prototype for Möbius functions on partially ordered sets.
How does μ count Euler's totient (φ = id*μ)?
φ(n) = Σ_{d|n} μ(d) · (n/d). This compact formula expresses Euler's totient as a Dirichlet convolution of identity and Möbius. It is equivalent to the closed form φ(n) = n · ∏_{p|n} (1 - 1/p): expanding the product term by term reproduces the convolution. Computationally it lets you compute φ(n) once you have the prime factorisation of n in a single pass.
What is Mertens' conjecture and why was it disproved?
Mertens conjectured |M(n)| ≤ sqrt(n) for all n ≥ 1, where M(n) = Σ_{k≤n} μ(k). This would have implied the Riemann hypothesis. In 1985, Andrew Odlyzko and Herman te Riele disproved it: there exist n where |M(n)| > sqrt(n), with the first counterexample known to be at most 10^14. The disproof was non-constructive — they didn't exhibit n explicitly, only showed it must exist. Subsequent work has narrowed the bound but no explicit n is known.
How does μ link to Riemann hypothesis?
Through the Mertens function: M(n) = O(n^(1/2 + ε)) for every ε > 0 if and only if RH is true. The Dirichlet series for the Möbius function is 1/ζ(s) = Σ μ(n)/n^s, valid for Re(s) > 1, with analytic continuation determined by ζ. Zeros of ζ correspond to poles of 1/ζ, and the location of those zeros (RH says all on Re(s) = 1/2) is exactly the analytic information needed to bound the partial sums of μ.
Why is μ(n) "random"-looking?
Because primes appear in n irregularly: whether ω(n) is even or odd flips at each prime factor. The Liouville function λ(n) = (-1)^Ω(n) (counting with multiplicity) and μ are both believed to behave like ±1 random variables on average — a heuristic encoded by the Mobius randomness law and supported by partial-sum bounds. Yet the function is fully deterministic: every value is computable from the prime factorisation, no randomness involved.