Star Formation

Initial Mass Function

Why a cloud of gas makes thousands of tiny stars and only a handful of giants — and why those giants run everything

The distribution of stellar masses at birth: above ~0.5 M☉ a steep power law (Salpeter α = 2.35), flattening below. It sets a population's light and chemistry.

Salpeter slope: α = 2.35 above ~0.5 M☉ (1955)
Form: dN/dm ∝ m^(−2.35)
Low-mass turnover: Kroupa & Chabrier flatten below 0.5 M☉
Most common star: M dwarf, ~0.2–0.3 M☉
The twist: Rare O stars dominate the light
Sets: Mass-to-light ratio & chemical enrichment

[Interactive visualization: histogram of the IMF, drawn per logarithmic mass interval.]

A cloud collapses, and democracy fails

When a giant molecular cloud collapses and fragments, it does not produce stars of a single size. It produces a whole spectrum of masses, from feeble red dwarfs barely able to ignite hydrogen up to blazing O stars dozens of times the mass of the Sun. The initial mass function — the IMF, written ξ(m) — is the recipe that tells you how many stars of each mass come out of that collapse, at the moment of birth, before any of them have had time to evolve or die. It is one of the most consequential single functions in all of astrophysics, and it is almost shockingly lopsided.

The lopsidedness has a precise form. Above about half a solar mass, the number of stars per unit mass falls as a steep power law,

dN/dm  ∝  m^(−α),     α = 2.35   (Salpeter 1955)

so that for every step up in mass the population thins dramatically. Multiply a star's mass by ten and you cut the number of stars per unit mass by a factor of 10^2.35 ≈ 224. The consequence is that low-mass stars are wildly more numerous than massive ones: a typical cluster makes thousands of red dwarfs, hundreds of Sun-like stars, and only a small handful of stars above 8 M☉. Star formation is not a democracy: most of the votes go to the smallest members.

And yet the rare giants run the place. That apparent contradiction — most stars are tiny, but the few massive ones dominate the light, the energy, and the chemistry — is the heart of why the IMF matters, and it is the surprise the visualization above is built to deliver.

Salpeter's 1955 power law

Edwin Salpeter could not watch stars being born, so he did something cleverer. He counted the stars that were still around in the solar neighbourhood, measured their luminosities, converted those to masses, and corrected for the fact that the most massive stars die quickly and so are under-represented among the survivors. From that present-day mass function he reconstructed the birth distribution and found the now-famous power law with exponent α = 2.35 above roughly 0.5 M☉.

It is worth being careful about notation, because two conventions coexist and a lot of confusion lives in the gap between them. Per linear mass interval the slope is α = 2.35. Per logarithmic mass interval — counting stars per decade of mass, the way the histogram in the visualization is drawn — the slope is

dN/d(log m)  ∝  m^(−Γ),     Γ = α − 1 = 1.35

Both describe the same physical distribution; they differ only by the Jacobian dm/d(log m), which is proportional to m and therefore shifts the exponent by exactly one. When someone quotes "the Salpeter slope" they almost always mean α = 2.35 in the linear convention. The robustness of that number across seventy years of remeasurement is one of the quiet triumphs of stellar astronomy.
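One way to see that the two conventions describe the same distribution is to draw masses from the linear form and then measure the slope of a log-binned histogram. The sketch below does that with inverse-CDF sampling. The mass range (0.5 to 50 M☉), sample size, bin count and seed are illustrative assumptions, not values taken from Salpeter's analysis.

```python
import math
import random

ALPHA = 2.35  # Salpeter slope per linear mass interval (from the text)


def sample_power_law(n, m_lo, m_hi, alpha, rng):
    """Draw n masses from dN/dm ∝ m**(-alpha) on [m_lo, m_hi] by inverse-CDF sampling."""
    k = 1.0 - alpha                      # exponent of the cumulative distribution
    lo, hi = m_lo ** k, m_hi ** k
    return [(lo + rng.random() * (hi - lo)) ** (1.0 / k) for _ in range(n)]


def log_bin_slope(masses, m_lo, m_hi, n_bins):
    """Least-squares slope of log10(count) against log10(mass) in equal-width log bins."""
    x0 = math.log10(m_lo)
    width = (math.log10(m_hi) - x0) / n_bins
    counts = [0] * n_bins
    for m in masses:
        i = int((math.log10(m) - x0) / width)
        counts[min(max(i, 0), n_bins - 1)] += 1   # clamp rounding at the edges
    pts = [(x0 + (i + 0.5) * width, math.log10(c))
           for i, c in enumerate(counts) if c > 0]
    x_mean = sum(x for x, _ in pts) / len(pts)
    y_mean = sum(y for _, y in pts) / len(pts)
    num = sum((x - x_mean) * (y - y_mean) for x, y in pts)
    den = sum((x - x_mean) ** 2 for x, _ in pts)
    return num / den


rng = random.Random(1955)                # fixed seed so the run is repeatable
masses = sample_power_law(100_000, 0.5, 50.0, ALPHA, rng)
slope = log_bin_slope(masses, 0.5, 50.0, 8)
print(f"fitted slope per log-mass bin: {slope:.2f}")
print(f"expected -(alpha - 1):         {-(ALPHA - 1.0):.2f}")
```

The samples are generated with the linear exponent 2.35, yet the histogram in equal logarithmic bins should fall with a slope close to −1.35, up to sampling noise in the sparsely populated high-mass bins.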

The low-mass turnover: Kroupa and Chabrier

A single power law extrapolated all the way down breaks. Push α = 2.35 below half a solar mass and you predict an enormous, ever-growing flood of brown dwarfs and tiny stars that simply is not observed. Real counts show the IMF flattens below ~0.5 M☉ and then turns over, so that the single most common stellar mass is a few tenths of a solar mass — an M dwarf, not a Sun. Two parameterisations capture this turnover and are used almost universally today.

Pavel Kroupa (2001) kept the power-law shape but broke it into segments with different slopes:

Kroupa (2001) broken power law:
   α = 0.3    for   m < 0.08 M☉   (brown dwarfs)
   α = 1.3    for   0.08 ≤ m < 0.5 M☉
   α = 2.3    for   m ≥ 0.5 M☉     (≈ Salpeter)

Gilles Chabrier (2003) instead replaced the low-mass branch with a smooth log-normal distribution peaking near 0.2 M☉, joined onto the Salpeter power law above about 1 M☉:

Chabrier (2003):
   ξ(log m) ∝ exp[ −(log m − log 0.2)² / (2·0.55²) ]   for m ≤ 1 M☉
   ξ(m)     ∝ m^(−2.35)                                  for m > 1 M☉

The two forms are nearly indistinguishable in practice and give almost the same integrated properties. Both share the same high-mass spine — the Salpeter slope α ≈ 2.35 — and both flatten below half a solar mass. That shared structure is what the histogram in the visualization traces: steep on the massive side, rolling over on the low-mass side.
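The two parameterisations can be written as short functions and compared side by side. This is a minimal sketch that uses the slopes, peak mass and width quoted above; the function names are hypothetical, both forms are left unnormalised, and the segments are joined by requiring continuity at each break, which is an assumption about how the pieces are stitched together.

```python
import math


def kroupa_dn_dm(m):
    """Unnormalised Kroupa-style dN/dm, with the three segments joined continuously."""
    if m < 0.08:
        return (0.08 ** -1.3) * (m / 0.08) ** -0.3
    if m < 0.5:
        return m ** -1.3
    return (0.5 ** -1.3) * (m / 0.5) ** -2.3


def kroupa_dn_dlogm(m):
    """Same distribution per log10 mass interval: multiply by dm/d(log10 m) = m ln 10."""
    return m * math.log(10.0) * kroupa_dn_dm(m)


def chabrier_dn_dlogm(m, m_c=0.2, sigma=0.55):
    """Unnormalised Chabrier-style dN/dlog10(m): log-normal up to 1 Msun, power law above."""
    def lognormal(x):
        return math.exp(-(math.log10(x) - math.log10(m_c)) ** 2 / (2.0 * sigma ** 2))
    if m <= 1.0:
        return lognormal(m)
    # dN/dm ∝ m**-2.35 is m**-1.35 per log mass; matched to the log-normal at 1 Msun
    return lognormal(1.0) * m ** -1.35


# Compare the shapes per log-mass interval, each scaled to its own value at 1 Msun
for m in (0.05, 0.1, 0.2, 0.5, 1.0, 5.0, 20.0):
    k = kroupa_dn_dlogm(m) / kroupa_dn_dlogm(1.0)
    c = chabrier_dn_dlogm(m) / chabrier_dn_dlogm(1.0)
    print(f"{m:6.2f} Msun   Kroupa {k:8.3f}   Chabrier {c:8.3f}")
```

Scaling both curves to unity at 1 M☉ is only a convenience for comparing shapes. A real comparison would normalise each form to the same total mass or number.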

Worked example: counting a 10,000 M☉ cluster

Numbers make the lopsidedness vivid. Take a young cluster that has turned 10,000 M☉ of gas into stars following a Kroupa IMF between 0.08 and 120 M☉. We integrate the IMF three times: weighted by number to count stars, weighted by mass to allocate the total, and weighted by luminosity to find where the light comes from.

Cluster total: 10,000 M☉, Kroupa IMF, 0.08–120 M☉

By NUMBER of stars:
   0.08–0.5 M☉  (red dwarfs)   ≈  76%   of all stars
   0.5–1 M☉     (Sun-like)     ≈  14%
   1–8 M☉                       ≈   9%
   above 8 M☉   (will go SN)    ≈  0.6%  →  roughly 110 stars

By MASS:
   below 1 M☉                   ≈  45%   of the cluster mass
   1–8 M☉                       ≈  34%
   above 8 M☉                   ≈  22%

By LUMINOSITY (light emitted today):
   stars above 10 M☉            ≳  80%   of the cluster's light
   the ~10 most massive O stars  emit more than the other ~17,000 combined

So of the roughly 17,000 stars in this cluster, about three out of four are red dwarfs, and only about a hundred are massive enough to ever explode as core-collapse supernovae. But those hundred or so stars, and especially the ten brightest O stars among them, pour out the overwhelming majority of the cluster's light. A single 40 M☉ O star can outshine ten thousand red dwarfs. That is the engine of the IMF's importance: the distribution is bottom-heavy in number, spread mostly across low and intermediate masses by mass, and top-heavy in light.
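The number and mass rows of this census can be recomputed from closed-form integrals of the broken power law. The sketch below assumes the two Kroupa segments that lie inside 0.08–120 M☉ are joined continuously at 0.5 M☉ and normalised to the 10,000 M☉ total. The helper names are hypothetical, and the luminosity row is not attempted because it needs a full mass–luminosity relation. Exact percentages depend on the adopted limits and normalisation, so expect small differences from rounded figures.

```python
# Kroupa (2001) segments quoted above, restricted to 0.08-120 Msun: (m_lo, m_hi, alpha)
SEGMENTS = [(0.08, 0.5, 1.3), (0.5, 120.0, 2.3)]
CLUSTER_MASS = 10_000.0  # Msun, from the worked example


def power_integral(a, b, p):
    """Closed-form integral of m**p dm from a to b (valid for p != -1)."""
    return (b ** (p + 1.0) - a ** (p + 1.0)) / (p + 1.0)


def amplitudes(segments):
    """Per-segment amplitudes that keep dN/dm continuous at each break."""
    amps = [1.0]
    for prev, nxt in zip(segments, segments[1:]):
        brk = nxt[0]
        amps.append(amps[-1] * brk ** (nxt[2] - prev[2]))
    return amps


def moment(m1, m2, weight):
    """Integral of m**weight * dN/dm over [m1, m2]; weight 0 counts stars, 1 sums mass."""
    total = 0.0
    for amp, (lo, hi, alpha) in zip(amplitudes(SEGMENTS), SEGMENTS):
        a, b = max(m1, lo), min(m2, hi)
        if a < b:
            total += amp * power_integral(a, b, weight - alpha)
    return total


scale = CLUSTER_MASS / moment(0.08, 120.0, 1)   # fixes the overall normalisation
n_total = scale * moment(0.08, 120.0, 0)
print(f"total stars: {n_total:,.0f}")
for lo, hi in [(0.08, 0.5), (0.5, 1.0), (1.0, 8.0), (8.0, 120.0)]:
    n_band = scale * moment(lo, hi, 0)
    m_band = scale * moment(lo, hi, 1)
    print(f"{lo:5.2f}-{hi:6.1f} Msun: {n_band:9,.0f} stars "
          f"({100 * n_band / n_total:5.2f}% by number, "
          f"{100 * m_band / CLUSTER_MASS:5.1f}% by mass)")
```

Changing `SEGMENTS` or the integration limits shows how sensitive the count of supernova progenitors is to the assumed high-mass slope and upper cutoff.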

Why the rare giants win: the mass–luminosity lever

The reason the giants dominate is a race between two power laws. The IMF removes massive stars steeply, as m^(−2.35). But the mass–luminosity relation adds luminosity even more steeply: on the main sequence L scales roughly as M^3.5 for intermediate masses (flattening toward L ∝ M near the very top, where stars approach the Eddington limit). Combine the two and ask how much light each decade of mass contributes:

light per log-mass  ∝  ξ(log m) · L(m)  ∝  m^(−1.35) · m^(3.5)  ∝  m^(+2.15)

The exponent is positive. The light-weighted distribution rises with mass even though the number-weighted distribution falls, which is exactly why the brightest few stars carry the population. A 20 M☉ star has L ≈ 20^3.5 ≈ 36,000 L☉; the IMF says there is roughly one star above 20 M☉ for every few hundred below 1 M☉ (about 500 for a Kroupa IMF between 0.08 and 120 M☉), but a rarity of a few hundred cannot defeat a brightness advantage of tens of thousands. The most massive stars also produce essentially all of the ionising ultraviolet photons, so they alone carve the HII regions, drive the stellar winds, and set the feedback that regulates the next generation of star formation.
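The race between the two power laws can be checked by comparing adjacent decades of mass. The sketch below assumes a single Salpeter slope and a single L ∝ M^3.5 relation over the whole range 1–100 M☉. That is a deliberate simplification, since the text notes that the real relation flattens at the top, so the light ratio it gives is an upper bound rather than a measurement.

```python
ALPHA = 2.35  # Salpeter slope per linear mass (from the text)
BETA = 3.5    # rough main-sequence mass-luminosity exponent (from the text)


def decade_integral(m_lo, power):
    """Integral of m**power dm over the decade [m_lo, 10 * m_lo] (power != -1)."""
    m_hi = 10.0 * m_lo
    return (m_hi ** (power + 1.0) - m_lo ** (power + 1.0)) / (power + 1.0)


# Stars and light in the 10-100 Msun decade relative to the 1-10 Msun decade
count_ratio = decade_integral(10.0, -ALPHA) / decade_integral(1.0, -ALPHA)
light_ratio = decade_integral(10.0, BETA - ALPHA) / decade_integral(1.0, BETA - ALPHA)

print(f"stars, 10-100 vs 1-10 Msun: {count_ratio:.4f}")
print(f"light, 10-100 vs 1-10 Msun: {light_ratio:.1f}")
```

Under these assumptions the count ratio is 10^(−1.35) ≈ 0.045 and the light ratio is 10^(2.15) ≈ 141: the upper decade holds about one star for every twenty-two in the lower decade, yet emits over a hundred times the light.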

Setting the chemical yield of a population

The same massive stars that dominate the light dominate the chemistry. Stars above ~8 M☉ end their lives as core-collapse supernovae, returning newly forged oxygen, neon, magnesium and other α-elements to the interstellar medium on timescales of millions to tens of millions of years, quickly enough to seed the next generation of stars. Intermediate-mass stars contribute carbon and nitrogen more slowly, and Type Ia supernovae from white-dwarf binaries add iron over far longer times. Because the IMF fixes how many stars land in each of these channels, it directly controls a population's chemical enrichment history: the rate at which it builds up metals, and the ratio of α-elements to iron that we read off in stellar spectra.

Tilt the IMF and you tilt the chemistry. A top-heavy IMF (relatively more massive stars) enriches faster and produces a higher α/Fe ratio early on; a bottom-heavy IMF locks more baryons into faint, long-lived dwarfs that never return their gas. The abundance patterns of the oldest stars, and the α-enhancement seen in massive elliptical galaxies, are among the few observational handles we have on whether the IMF was ever different in the past.

Universality and its exceptions

The most surprising empirical fact about the IMF is how little it varies. Across the Milky Way disk, in open and globular clusters, in the Magellanic Clouds and in nearby star-forming galaxies, the measured IMF is consistent with a Kroupa- or Chabrier-like form within the uncertainties. Star formation operates over an enormous range of densities, metallicities and turbulent conditions, yet the mass spectrum it produces looks nearly the same everywhere. Why that should be true is still not understood from first principles.

There are, however, claimed exceptions in extreme environments:

Top-heavy. Some intense starbursts, the Arches cluster near the Galactic Centre, and plausibly the very first Population III stars are argued to over-produce massive stars relative to the local norm. Metal-free gas, which cannot cool efficiently, may have fragmented into much larger clumps in the early universe.
Bottom-heavy. Gravity-sensitive spectral features in the cores of giant elliptical galaxies suggest an excess of low-mass dwarfs, implying more mass per unit light than a Milky Way IMF would predict.
Environmental trends. A few studies report a slightly steeper (more bottom-heavy) high-mass slope in low-intensity star formation and a flatter one in vigorous starbursts, hinting that the IMF may depend weakly on the star-formation rate density.

None of these is settled — each rests on subtle modelling of unresolved light, and contradictory results exist. The mainstream working assumption remains a universal IMF, with these variations as live research frontiers rather than established fact.

Comparing the standard IMF forms

IMF	Year	High-mass slope (α)	Low-mass behaviour	Peak mass	Typical use
Salpeter	1955	2.35	Single power law (over-predicts low mass)	none (rises to limit)	Historical, high-mass studies
Miller–Scalo	1979	~2.3 (steepening high)	Log-normal-like turnover	~0.1 M☉	Early Galactic disk models
Kroupa	2001	2.3	Broken power law (α=1.3, then 0.3)	~0.08–0.2 M☉	Default for resolved clusters
Chabrier	2003	2.3	Log-normal below 1 M☉	~0.2 M☉	Default for galaxy SED fitting
Top-heavy	—	~1.5–2.0	Suppressed low-mass end	shifted upward	Starbursts, Galactic Centre
Bottom-heavy	—	~3.0+	Excess of dwarfs	~0.2 M☉ (steeper)	Massive elliptical cores

Where the IMF gets used

Weighing galaxies. Most stellar mass is in faint low-mass stars; most light is in rare bright ones. The IMF is the conversion factor — the stellar mass-to-light ratio — that turns observed luminosity into stellar mass. Switching from Salpeter to Chabrier changes inferred galaxy masses by a factor of ~1.5–2.
Star-formation rates. Indicators like Hα and ultraviolet luminosity measure the output of massive stars only; converting that to a total star-formation rate requires assuming an IMF to account for all the unseen low-mass stars formed alongside them.
Reionisation. The supply of ionising photons that reionised the early universe depends sensitively on how many massive stars early galaxies made, and so on whether their IMF, formed from metal-poor or even metal-free gas, was top-heavy.
Supernova and remnant rates. The fraction of the IMF above 8 M☉ sets how many supernovae a population produces, and therefore its yield of neutron stars and black holes.
Chemical evolution models. Every model that tracks a galaxy's metal build-up integrates yields over the IMF; the assumed slope sets the predicted α/Fe sequence.
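The size of the IMF correction in the first two uses can be estimated with a crude proxy: how much total stellar mass forms per star above 8 M☉. The sketch below compares a single Salpeter power law with a Kroupa broken power law. The 0.1–100 M☉ limits, the continuity matching at 0.5 M☉ and the function names are illustrative assumptions, and the massive-star count only stands in for observed light. A real mass-to-light ratio needs stellar population synthesis.

```python
# (m_lo, m_hi, alpha) segments; the 0.1-100 Msun limits are an assumption of this sketch
SALPETER = [(0.1, 100.0, 2.35)]
KROUPA = [(0.1, 0.5, 1.3), (0.5, 100.0, 2.3)]


def moment(segments, m1, m2, weight):
    """Integral of m**weight * dN/dm over [m1, m2] for a continuous broken power law."""
    total, amp, prev_alpha = 0.0, 1.0, None
    for lo, hi, alpha in segments:
        if prev_alpha is not None:
            amp *= lo ** (alpha - prev_alpha)   # keep dN/dm continuous at the break
        prev_alpha = alpha
        a, b = max(m1, lo), min(m2, hi)
        if a < b:
            p = weight - alpha + 1.0            # non-zero for every case used here
            total += amp * (b ** p - a ** p) / p
    return total


def mass_per_massive_star(segments, m_sn=8.0):
    """Total stellar mass formed (Msun) per star more massive than m_sn."""
    m_min, m_max = segments[0][0], segments[-1][1]
    return moment(segments, m_min, m_max, 1) / moment(segments, m_sn, m_max, 0)


salpeter = mass_per_massive_star(SALPETER)
kroupa = mass_per_massive_star(KROUPA)
print(f"Salpeter: {salpeter:6.1f} Msun formed per star above 8 Msun")
print(f"Kroupa:   {kroupa:6.1f} Msun formed per star above 8 Msun")
print(f"ratio:    {salpeter / kroupa:6.2f}")
```

With these limits the ratio comes out near 1.5: for the same population of massive stars, a Salpeter IMF implies about half again as much total stellar mass, because it assigns far more mass to unseen dwarfs.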

Common pitfalls and misconceptions

Confusing α and Γ. The "Salpeter slope" is α = 2.35 per linear mass, equivalent to Γ = 1.35 per logarithmic mass. Quoting 1.35 as the linear slope, or 2.35 as the log slope, is a frequent and consequential slip.
Extrapolating Salpeter to zero mass. A single α = 2.35 power law diverges and grossly over-predicts brown dwarfs. The real IMF flattens and turns over below ~0.5 M☉; that is the entire reason Kroupa and Chabrier exist.
"Most mass is in massive stars." No. Most of the light is, but for a Kroupa IMF stars above 8 M☉ hold only about a fifth of the mass and well under 1% of the star count; the rest is in low- and intermediate-mass stars. The IMF is top-heavy only in luminosity.
Confusing the IMF with the present-day mass function. The IMF is the birth distribution; the present-day mass function is what survives now, after massive stars have died off. They agree only for stars long-lived enough to still be on the main sequence.
Ignoring unresolved binaries. A large fraction of stars are in binary or multiple systems. Treating an unresolved pair as one brighter star biases the inferred IMF, particularly at the low-mass end.
Assuming a measured slope is the IMF. Differential dynamical effects — low-mass stars are preferentially ejected from clusters over time — can sculpt the present-day distribution away from the true initial one, especially in old dynamically evolved systems.

Quantitative analysis: where the mass and light end up

To make the "most mass below, most light above" claim exact, integrate the IMF directly. Write the number distribution as ξ(m) = dN/dm = A·m^(−α) over a mass range [m_lo, m_hi]. The number of stars in a band is

N(m₁,m₂) = ∫ A m^(−α) dm = A · [ m^(1−α) / (1−α) ]  from m₁ to m₂

and the mass locked into that band is

M(m₁,m₂) = ∫ A m · m^(−α) dm = A · [ m^(2−α) / (2−α) ]  from m₁ to m₂

With α = 2.35 the number integral is dominated by the lower limit (exponent 1 − α = −1.35, strongly bottom-heavy) while the mass integral has exponent 2 − α = −0.35, only mildly bottom-heavy — which is why most stars are tiny but the mass is more evenly spread, with a substantial fraction still above 1 M☉. Now fold in luminosity L ∝ m^β with β ≈ 3.5. The light from a band is

𝓛(m₁,m₂) = ∫ A m^β · m^(−α) dm = A · [ m^(β+1−α) / (β+1−α) ]  from m₁ to m₂

The exponent is β + 1 − α ≈ 3.5 + 1 − 2.35 = +2.15. Because it is positive, the integral is dominated by the upper limit: the light comes overwhelmingly from the most massive stars present. This is the formal statement of the surprise. The same exponent arithmetic underlies every "top-heavy" or "bottom-heavy" IMF debate: lowering α shifts the number, mass and light integrals toward the upper limit, and raising it shifts them toward the lower limit. And it explains a sobering practical fact: the upper mass limit m_hi (around 120–150 M☉, an observed ceiling whose physical origin, often linked to the Eddington limit, is still debated) matters enormously for the integrated light but almost not at all for the integrated star count.
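The sensitivity to the upper limit can be made concrete by evaluating the three integrals for m_hi = 120 and 150 M☉. This is a minimal sketch that assumes a single power law with α = 2.35 above a lower limit of 0.5 M☉ and a single L ∝ m^3.5 relation up to the cutoff. Both are simplifications, so the light figure overstates the real effect.

```python
import math

ALPHA = 2.35  # Salpeter slope (from the text)
BETA = 3.5    # mass-luminosity exponent (from the text)
M_LO = 0.5    # lower limit in Msun; an assumption of this sketch


def band_integral(m1, m2, alpha, weight):
    """Integral of m**weight * m**(-alpha) dm from m1 to m2, with amplitude A = 1."""
    p = weight + 1.0 - alpha
    if abs(p) < 1e-12:                  # logarithmic special case
        return math.log(m2 / m1)
    return (m2 ** p - m1 ** p) / p


changes = {}
for label, weight in (("number", 0.0), ("mass", 1.0), ("light", BETA)):
    base = band_integral(M_LO, 120.0, ALPHA, weight)
    raised = band_integral(M_LO, 150.0, ALPHA, weight)
    changes[label] = raised / base - 1.0
    print(f"{label:>6}: {100.0 * changes[label]:+.3f}% when m_hi goes 120 -> 150 Msun")
```

The star count barely moves, the mass changes at the percent level, and the light changes by tens of percent, which follows directly from the signs of the three exponents.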

Frequently asked questions

What is the initial mass function?

The initial mass function, ξ(m), is the distribution of stellar masses at birth — how many stars a freshly formed population contains in each mass interval, before any have evolved or died. It is usually written as a number density per unit mass, dN/dm ∝ m^(−α). Edwin Salpeter introduced it in 1955 and found that above about 0.5 M☉ the slope is α = 2.35. It is one of the most important empirical inputs in astrophysics because it fixes how a cloud's mass is partitioned among stars, and therefore how much light, energy and heavy-element enrichment a population produces.

What is the Salpeter slope of 2.35?

Salpeter's 1955 power law is dN/dm ∝ m^(−2.35) for masses above roughly 0.5 M☉. The exponent α = 2.35 describes how steeply the number of stars drops as mass rises: increasing the mass tenfold reduces the number of stars per unit mass by 10^2.35 ≈ 224. Expressed per decade of logarithmic mass the slope is Γ = α − 1 = 1.35. Because luminosity scales roughly as L ∝ M^3.5, the rare massive stars outshine the multitude of low-mass stars and dominate the integrated light of any young population.

How do the Kroupa and Chabrier IMFs differ from Salpeter?

Salpeter's single power law is too steep at low masses — extrapolating it down predicts far more brown dwarfs than are observed. Kroupa (2001) used a broken power law: α = 2.3 above 0.5 M☉, α = 1.3 between 0.08 and 0.5 M☉, and α = 0.3 below 0.08 M☉. Chabrier (2003) instead fits the low-mass end with a log-normal peaking near 0.2 M☉, joining the Salpeter power law above 1 M☉. Both flatten below half a solar mass, so the most common stellar mass is a few tenths of a solar mass — an M dwarf, not a Sun.

Why do rare massive stars dominate a population's light?

Because the mass-luminosity relation is far steeper than the IMF. A 20 M☉ O star is only twenty times the Sun's mass, but with L ∝ M^3.5 its luminosity is about 20^3.5 ≈ 36,000 L☉. The IMF makes such stars rare, roughly one above 20 M☉ per few hundred below 1 M☉, but rarity by a factor of a few hundred cannot beat a luminosity advantage of tens of thousands. So the handful of O and B stars emit most of the visible and essentially all of the ultraviolet light, ionise surrounding gas into HII regions, and inject most of the feedback, despite being a vanishing fraction of the star count.

Is the initial mass function universal?

To a remarkable degree, yes. Across the Milky Way disk, open and globular clusters, the Magellanic Clouds and nearby galaxies, the measured IMF matches a Kroupa- or Chabrier-like form within uncertainties — one of the most striking and least understood facts in star formation. Claimed departures exist: top-heavy IMFs in some starbursts, the Galactic Centre Arches cluster, and the high-redshift universe; and bottom-heavy IMFs inferred from gravity-sensitive spectral features in giant ellipticals. These remain actively debated, and the default working assumption is a universal IMF.

How is the IMF actually measured?

The cleanest method is to count stars directly in a young cluster whose members formed at nearly the same time. You measure each star's luminosity, convert to mass with stellar evolution models and the cluster's age, correct for completeness (faint stars are hard to detect) and for unresolved binaries, then bin the masses to recover ξ(m). For the most massive stars, which die quickly, you must correct for those already lost — which is why Salpeter worked with the present-day mass function of long-lived field stars and inferred the birth distribution. Modern surveys with Hubble, JWST and Gaia push counts past the hydrogen-burning limit into the brown-dwarf regime.

Why does the IMF matter for galaxies and cosmology?

The IMF is the conversion factor between starlight and stellar mass. Because most mass hides in faint low-mass stars while most light comes from rare bright ones, the assumed IMF sets the stellar mass-to-light ratio used to weigh galaxies, the calibration of star-formation-rate indicators like Hα and UV luminosity, the rate of supernovae and heavy-element production, and the ionising-photon budget for reionisation. A Salpeter and a Chabrier IMF differ by ~1.5–2× in inferred stellar mass for the same light, so the choice propagates into nearly every extragalactic measurement.