Cosmology

Press-Schechter Formalism

Count the fraction of a Gaussian density field above 1.686 — and you have predicted how many dark matter halos of every mass the universe will make

An analytic prediction of the halo mass function from Gaussian initial fluctuations and a spherical-collapse threshold of δ_c = 1.686.

  • PublishedPress & Schechter, 1974
  • Collapse thresholdδ_c = 1.686
  • Core ideafraction above threshold → mass function
  • The famous fudgefactor of 2, fixed by excursion sets
  • High-mass tailexp(−δ_c²/2σ²) cutoff
  • Calibrated heirSheth-Tormen, Tinker (N-body)

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

The trick: count noise, predict galaxies

Imagine you want to know how many galaxy clusters of mass 1015 solar masses exist in a cubic gigaparsec of the universe today, and how many of 1012, and how many of every mass in between. The honest way is to take the measured initial conditions, evolve a billion particles forward under gravity in a supercomputer, and count the halos that condense out. The Press-Schechter formalism does it with a pencil. From two assumptions — that the initial density field is Gaussian noise, and that a patch collapses once its overdensity reaches δc = 1.686 — it writes down a closed-form expression for the abundance of halos at every mass. That expression is the halo mass function, and the leap that produces it is one of the most economical ideas in all of cosmology.

The intuition is almost embarrassingly simple. Take the density field of the early universe, δ(x) = δρ/ρ, and blur it with a filter that averages over a sphere containing a mass M. Now ask: what fraction of the volume sits above the collapse threshold? Those over-threshold patches are exactly the regions that will, by today, have pulled themselves together into a bound halo of at least mass M. Because the field is Gaussian, "the fraction above a threshold" is just the tail of a bell curve — an error function. Differentiate that collapsed fraction with respect to M and you have how many halos form at each mass. That is the whole machine.

How it works, ingredient by ingredient

Ingredient one — a Gaussian random field. Inflation predicts that the primordial density contrast is, to exquisite accuracy, a Gaussian random field: at any point, smoothed on any scale, δ is drawn from a normal distribution with zero mean and a variance σ²(M) you can compute. The variance comes from integrating the matter power spectrum P(k) against the filter:

σ²(M) = (1 / 2π²) ∫ P(k) Ŵ²(k R) k² dk ,   M = (4/3) π ρ̄ R³

Large regions average over more of the field, so σ(M) is a decreasing function of mass. This single curve, σ(M), carries all the cosmology — change σ8 or the shape of P(k) and the whole prediction shifts.

Ingredient two — a collapse threshold. Spherical-collapse theory tells you when a region has gone nonlinear. Take a uniform overdense sphere; it expands, decelerates, turns around, and collapses. At the moment of collapse the real overdensity is formally infinite, but if you take the linear-theory growth of that same perturbation and extrapolate it to the collapse time, you get a finite number: δc = 1.686. Since the Gaussian field Press-Schechter counts is the linear field, this is the right yardstick. Any smoothed patch with linear δ > 1.686 has, in reality, already collapsed.

Putting them together. The fraction of mass that has collapsed into objects of mass ≥ M is the fraction of the field, smoothed on scale M, lying above δc:

F(>M) = 2 × ∫[δ_c → ∞] (1 / √(2π σ²)) exp(−δ² / 2σ²) dδ
      = 2 × (1/2) erfc( δ_c / (√2 σ(M)) )
      = erfc( δ_c / (√2 σ(M)) )

Note the factor of 2 out front — we will return to it, because that single number is the formalism's most famous wart. Differentiating −dF/dM and multiplying by ρ̄/M (to convert collapsed-mass fraction into a number density) gives the mass function:

dn/dM = √(2/π) (ρ̄ / M²) (δ_c / σ) |d ln σ / d ln M| exp(−δ_c² / 2σ²)

Everything interesting lives in the exponential. The dimensionless ratio ν = δc/σ(M) — "peak height" — controls the whole shape. Where σ ≫ δc (small M), ν is tiny, the exponential is ≈ 1, and the mass function is a near power law: low-mass halos are everywhere. Where σ drops below δc (cluster scale), ν exceeds 1 and the exponential slams the abundance down. That exponential cutoff is why clusters are exponentially rare and why their counts are such a sharp ruler.

Where 1.686 comes from

The threshold is not a fit — it is a derivation. In an Einstein-de Sitter universe (matter-dominated, flat), track a spherical top-hat overdensity with the exact parametric solution of the collapse:

collapse turnaround:  linear δ_ta = (3/20)(6π)^(2/3) ≈ 1.062
full collapse:        linear δ_c  = (3/20)(12π)^(2/3) ≈ 1.686
virialization:        nonlinear Δ_vir ≈ 178 (the "200×" of M_200)

The same calculation hands you three iconic numbers at once: the linear threshold 1.686 used in Press-Schechter, and the virial overdensity ~178 (rounded to 200 in the M200 halo-mass convention). In a ΛCDM universe the threshold drifts weakly with redshift because dark energy slows growth — at z = 0 it is closer to 1.674 — but the variation is at the percent level and δc = 1.686 is used almost universally as the canonical value.

The factor of 2 and the excursion set

Here is the wart. Press and Schechter's counting argument tallies only the mass sitting directly above δc after smoothing. But a Gaussian is symmetric about zero, and in the limit of a vanishing threshold it splits the mass exactly 50/50 — so their formula assigns only half of all the mass in the universe to collapsed objects, even as the threshold goes to zero where everything should have collapsed. To force the books to balance, Press and Schechter multiplied by 2 by hand. It worked, it matched the simulations of the day, and it bothered theorists for seventeen years.

The missing half is the "cloud-in-cloud" problem. A point can be underdense (δ < δc) when you smooth on a small scale, yet be embedded inside a much larger region that exceeds δc when smoothed on a big scale. That point is genuinely inside a halo; the naive counting throws it away.

The repair came from Bond, Cole, Efstathiou & Kaiser (1991) — the excursion-set picture. Fix a point in space and watch its smoothed overdensity δ(S) as you shrink the smoothing scale, where S = σ²(M) runs upward from zero. With a sharp filter in k-space, each shell of new k-modes is independent, so δ(S) executes a Brownian random walk against the variable S. A halo of mass M is identified with the first scale at which the walk crosses the barrier δc. The mathematics of first passage is classical: the first-crossing distribution of a Brownian walk against a constant barrier is exactly

f(S) dS = (δ_c / √(2π S^3)) exp(−δ_c² / 2S) dS

and integrating it reproduces the Press-Schechter mass function including the factor of 2 — now derived, not imposed. The factor of 2 is the contribution of trajectories that have already crossed the barrier on a larger scale and wandered back below it; the naive argument silently discarded them. After 1991, the formalism stood on a rigorous footing.

Worked example: putting numbers in

Take a ΛCDM universe with σ8 = 0.81, Ωm = 0.31, and evaluate the cluster-scale abundance today. At a mass scale M = 1015 h−1 M, the smoothed variance is roughly σ(M) ≈ 0.55, so the peak height is

ν = δ_c / σ = 1.686 / 0.55 ≈ 3.07
exponential suppression = exp(−ν²/2) = exp(−4.7) ≈ 9 × 10⁻³

That factor of ~0.009 is what makes 1015-solar-mass clusters rare: a comoving number density of order 10−7 per (h−1 Mpc)3. Now drop to M = 1012 h−1 M, a Milky-Way-scale halo. There σ(M) ≈ 1.8, so

ν = 1.686 / 1.8 ≈ 0.94
exponential suppression = exp(−0.44) ≈ 0.64

The suppression has all but vanished — galaxy-scale halos are common, with number densities a factor ~105 higher than clusters. Three orders of magnitude in mass move you from "barely suppressed power law" to "exponentially crushed tail," entirely because σ(M) crossed δc. And notice the sensitivity: if you nudged σ8 up by 5%, σ(M) at the cluster scale rises to ~0.58, ν falls to 2.91, and exp(−ν²/2) jumps to ~0.014 — a 50% increase in cluster abundance from a 5% change in σ8. This exponential leverage is exactly why cluster counts are a premier σ8 probe.

Variants and refinements

The bare Press-Schechter formula is the trunk of a whole family tree:

  • Extended Press-Schechter (EPS). The conditional first-crossing distribution — the probability that a walk crossing δc on scale M1 at redshift z1 crossed an earlier barrier on scale M2 > M1 at an earlier redshift — gives halo merger rates and progenitor mass functions. EPS merger trees power semi-analytic models of galaxy formation.
  • Sheth-Tormen (1999). Real perturbations collapse ellipsoidally, not spherically; the effective barrier becomes mass-dependent ("moving barrier"). Sheth, Mo & Tormen fit a form with this physics that matches N-body simulations far better — fixing PS's factor-~2 over-prediction at low mass and under-prediction at high mass.
  • Tinker et al. (2008). A purely empirical fitting function calibrated directly against large N-body simulations, accurate to a few percent over the cluster-cosmology mass range, with explicit dependence on the chosen overdensity (Δ = 200, 500, etc.). This is what survey pipelines actually use.
  • Non-Gaussian extensions. If primordial fluctuations have a skewness fNL ≠ 0, the tail of the distribution above δc is enhanced or suppressed, shifting the cluster abundance. The high-mass tail is therefore a sensitive probe of primordial non-Gaussianity.

PS versus its descendants

FormalismYearCollapse modelFactor of 2N-body accuracyMain use
Press-Schechter1974spherical, δ_c = 1.686inserted by handtens of percentanalytic intuition
Bond et al. (excursion set)1991spherical, random walkderived rigorouslysame shape as PSrigorous foundation
Extended PS (EPS)1991–93conditional crossingsderivedtens of percentmerger trees, SAMs
Sheth-Tormen1999ellipsoidal, moving barrierbuilt in~10–20%improved mass function
Jenkins et al.2001simulation fitn/a~10–20%universal fit
Tinker et al.2008simulation fit, Δ-dependentn/a~few percentcluster cosmology
Despali et al.2016simulation, virial Δn/a~few percentuniversal recalibration

Observational status and applications

Press-Schechter — or rather its calibrated descendants — is one of the workhorses of precision cosmology. The high-mass exponential tail maps directly onto the abundance of galaxy clusters, the most massive bound objects in the universe, and counting them as a function of mass and redshift is a clean test of structure growth:

  • Sunyaev-Zel'dovich cluster counts. Planck, the South Pole Telescope (SPT), and the Atacama Cosmology Telescope (ACT) detect clusters by the SZ decrement they imprint on the CMB. Fitting their number counts to a Tinker-class mass function constrains σ8 and Ωm — and produced an early, mild tension with the CMB-inferred σ8.
  • X-ray surveys. eROSITA's all-sky X-ray survey detects ~105 clusters; their mass function is among the strongest current handles on the growth of structure and dark energy.
  • Weak-lensing-calibrated counts. DES, KiDS, and HSC weight cluster counts by lensing masses, sidestepping the messy baryonic mass-observable relations that limited earlier work.
  • Reionization and the first halos. The same formalism, pushed to high redshift, predicts how many ~108-solar-mass halos exist at z ~ 10–20 to host the first stars — the seeds JWST is now probing at the bright end.
  • Galaxy formation. Through the extended (conditional) version, PS supplies the merger trees that semi-analytic models hang baryonic physics on, connecting the dark skeleton to observable galaxies.

Common pitfalls and misconceptions

  • Thinking δc = 1.686 is a real, present-day overdensity. It is the linearly-extrapolated overdensity at collapse. The true nonlinear overdensity at collapse is infinite, and at virialization it is ~178–200. The 1.686 only makes sense as a threshold applied to the linear Gaussian field.
  • Forgetting why the factor of 2 is there. It is not a normalization convenience — it is the cloud-in-cloud mass that the naive counting omits, recovered exactly by the random-walk first-crossing probability.
  • Trusting the low-mass slope. Bare PS over-predicts small halos by roughly a factor of 2. For anything quantitative, use Sheth-Tormen or Tinker; reserve PS for intuition and scaling arguments.
  • Confusing the sharp-k filter with a real-space top hat. The clean random-walk derivation needs a sharp k-space filter so increments are independent; but the mass-radius relation M ∝ R³ is usually quoted for a real-space top hat. Mixing the two introduces an order-unity ambiguity in how σ(M) maps to mass.
  • Assuming universality is exact. The mass function expressed in terms of ν is nearly — but not perfectly — universal across cosmology and redshift. Percent-level work requires explicit recalibration (Tinker, Despali), and the choice of halo-finder and overdensity definition matters.
  • Treating it as a derivation of structure formation. PS does not evolve anything dynamically; it is a statistical shortcut that maps linear initial conditions onto a final abundance via the collapse threshold. It says nothing about halo spatial clustering (that needs the bias / peak-background-split argument, also due to the excursion set).

Frequently asked questions

What is the Press-Schechter formalism?

It is an analytic recipe, published by William Press and Paul Schechter in 1974, for predicting the halo mass function — how many gravitationally bound dark matter halos of each mass exist per unit volume — without running a simulation. It rests on two ideas: the primordial density contrast δ = δρ/ρ is a Gaussian random field whose variance σ²(M) follows from the smoothed power spectrum, and spherical-collapse theory says a region collapses once its linearly-extrapolated overdensity exceeds δ_c = 1.686. The fraction of mass in halos above M equals the fraction of the Gaussian field above δ_c, an error-function complement; differentiating with respect to M gives dn/dM.

Why is the collapse threshold exactly 1.686?

It comes from the spherical-collapse model in an Einstein-de Sitter universe. A uniform overdense sphere expands more slowly than the background, turns around, and collapses. At full collapse the true overdensity is infinite, but the linearly-extrapolated value is finite: δ_c = (3/20)(12π)^(2/3) = 1.686. Press-Schechter uses this linear value because the Gaussian field it counts is the linear field. In ΛCDM the number drifts only to about 1.674 today, so 1.686 is used almost universally.

What is the famous factor of 2?

Press and Schechter's original argument counted only mass directly above δ_c after smoothing — exactly half of all mass, because a Gaussian is symmetric about zero. But a point below threshold on a small scale can be embedded in a larger region above threshold, so it really is inside a halo. Ignoring this lost half the collapsed mass, so they multiplied by 2 to conserve total mass. The fudge stood for 17 years until the excursion-set picture derived it rigorously.

How does the excursion-set approach fix the factor of 2?

Bond, Cole, Efstathiou & Kaiser (1991) reframed it as a random walk. At a fixed point, track the smoothed overdensity δ(S) as the smoothing scale shrinks, where S = σ²(M) grows from zero. With a sharp k-space filter the increments are independent, so δ is a Brownian walk. A halo of mass M forms at the first crossing of the barrier δ_c. The first-crossing distribution of a Brownian walk against a constant barrier is exactly twice the naive 'fraction above the barrier' — because trajectories that crossed and came back still count. The factor of 2 is the rigorous first-passage probability.

Why does the mass function have an exponential cutoff?

Because σ(M) decreases with mass: large regions average over more of the field and fluctuate less. The collapsed fraction depends on ν = δ_c/σ(M), and the Gaussian tail above δ_c falls as exp(−ν²/2) = exp(−δ_c²/2σ²). For low-mass halos σ ≫ δ_c, so almost everything has collapsed and the mass function is a gentle power law. For cluster-mass halos σ drops below δ_c, ν exceeds 1, and the exponential crushes the abundance — which is why clusters above ~10^15 solar masses are exponentially rare and so sensitive to σ_8.

How accurate is Press-Schechter compared to simulations?

Good for an analytic formula, but not precise. It captures the shape, exponential cutoff, and redshift evolution at the tens-of-percent level, over-predicting low-mass halos by roughly a factor of 2 and under-predicting high-mass halos. Relaxing perfect spherical collapse to ellipsoidal (mass-dependent, 'moving' barrier) fixes this: Sheth & Tormen (1999) introduced it, and Tinker et al. (2008) calibrated a fitting form against simulations to a few-percent accuracy. Both reduce to Press-Schechter in the appropriate limit.

What does Press-Schechter actually predict that we can test?

The comoving number density of halos as a function of mass and redshift. Observationally this maps onto cluster abundance: counting clusters via X-ray emission, the Sunyaev-Zel'dovich decrement, or weak-lensing mass as a function of mass and redshift tests the high-mass exponential tail. Planck SZ, SPT, eROSITA, and DES cluster counts use a calibrated (Tinker-class) mass function descended from PS to constrain σ_8 and Ω_m. The extended version also supplies merger trees for semi-analytic galaxy-formation models.