Galaxy Evolution

Stellar Population Synthesis

How astronomers read a galaxy's age, mass, and metallicity straight off its light — by stacking single-age stellar populations

Model a galaxy's light as the sum of single-age stellar populations weighted by its star-formation history, and you recover its age, mass, and metallicity.

Core equation: f_gal(λ) = ∫ SSP(λ,t′,Z)·SFR(t−t′) dt′
Building block: SSP = one isochrone × an IMF
Mass from color: M/L_g ≈ 0.6 (blue) → ≈ 4 (red)
Key degeneracy: Δlog(age)/Δlog(Z) ≈ 3/2
Workhorse codes: BC03 · FSPS · Maraston · pPXF
IMF systematic: Salpeter → Chabrier shifts mass ~0.25 dex

One spectrum, billions of stars

Point a spectrograph at a galaxy a billion light-years away and you collect a single smear of light — one spectrum, one set of broadband colors. But that light is not from one star. It is the blended output of tens of billions of stars, spanning every mass and every age the galaxy has ever produced, all summed into one beam by distance. You cannot resolve a single star in it. And yet astronomers routinely read off that galaxy's age, its total stellar mass, and its chemical enrichment, all from this one blurred fingerprint. The technique that makes this possible is stellar population synthesis (SPS): you model the galaxy's light as the sum of simple single-age stellar populations, weighted by the galaxy's star-formation history, and you fit that model to the observed colors or spectrum.

The logic is bottom-up. Stellar physics tells us what a single star of a given mass, age, and composition looks like. Combine many stars and you get a population. Combine many populations of different ages and you get a galaxy. SPS runs that chain forward to build a model spectrum, then inverts it: given the observed light, what mixture of populations produced it? The answer is the galaxy's history written in starlight.

The building block: a simple stellar population

The atom of the whole subject is the simple stellar population (SSP) — also called a single stellar population. An SSP is an idealisation: a group of stars that all formed at the same instant, from gas of a single chemical composition. A star cluster is the nearest real-world example. To compute an SSP's spectrum, you need three ingredients:

An isochrone. Stellar evolution tracks (Padova/PARSEC, MIST) predict where stars of every mass sit in the Hertzsprung-Russell diagram at a given age. Slicing the tracks at one age gives an isochrone — the line all coeval stars occupy. As the population ages, the main-sequence turnoff slides to lower mass and cooler temperature.
An initial mass function (IMF). The IMF, ξ(m), sets how many stars formed at each mass — many low-mass stars, few high-mass ones. The standard choices are the Chabrier (2003) or Kroupa (2001) IMF; the older Salpeter (1955) power law ξ(m) ∝ m^−2.35 is still used as a reference.
A stellar spectral library. Each point on the isochrone is assigned a spectrum from a library — empirical (MILES, with 985 stars; STELIB) or theoretical (BaSeL, C3K).

Sum the spectra of every star along the IMF-weighted isochrone and you have SSP(λ, t, Z): the spectrum of one burst of star formation, as a function of wavelength λ, age t, and metallicity Z. Young SSPs are blue and luminous, dominated by hot massive stars; as the population ages those stars die and the SSP reddens and dims while red giants take over.
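
To make the IMF weighting concrete, here is a minimal Python sketch that integrates the Salpeter power law quoted above. The mass limits (0.1 to 100 solar masses) and the 8 solar-mass cut for "massive" stars are illustrative assumptions, not values taken from this article.

```python
def powerlaw_integral(lo, hi, exponent):
    """Integrate m**exponent dm from lo to hi (valid for exponent != -1)."""
    p = exponent + 1.0
    return (hi**p - lo**p) / p

ALPHA = 2.35               # Salpeter slope: xi(m) proportional to m**-ALPHA
M_LO, M_HI = 0.1, 100.0    # assumed IMF mass limits, in solar masses
M_CUT = 8.0                # assumed threshold for "massive" stars

# Fraction of stars by number above the cut: integrate xi(m).
number_frac = (powerlaw_integral(M_CUT, M_HI, -ALPHA)
               / powerlaw_integral(M_LO, M_HI, -ALPHA))

# Fraction of the initial stellar mass above the cut: integrate m * xi(m).
mass_frac = (powerlaw_integral(M_CUT, M_HI, 1.0 - ALPHA)
             / powerlaw_integral(M_LO, M_HI, 1.0 - ALPHA))

print(f"by number: {number_frac:.4f}, by mass: {mass_frac:.3f}")
```

Under these assumptions, stars above the cut come out at roughly 0.3% of the population by number and about 14% of the initial mass, which illustrates why a handful of hot massive stars can dominate a young SSP's light while contributing little to its mass.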

From SSPs to a galaxy

A real galaxy did not form all its stars in one instant. It built them up over cosmic time according to a star-formation history, SFR(t) — the mass of stars formed per year as a function of time. SPS models the galaxy's spectrum as the SSP templates convolved with this history:

f_galaxy(λ) = ∫₀^t_obs  SSP(λ, t', Z(t'))  ·  SFR(t_obs − t')  dt'

In words: take every stellar age t′ present in the galaxy today, look up the SSP spectrum for that age and the metallicity the gas had when those stars formed, weight it by how much star formation happened then, and add them all up. In practice the integral becomes a discrete sum over a grid of SSP ages, each with a weight equal to the stellar mass formed in that bin. This is the literal meaning of the headline claim: the galaxy SED is the sum of SSP(age, Z) times the star-formation history.

Run the sum forward and you have a model SED. The young SSPs contribute the blue continuum and any emission lines; the old SSPs contribute the red continuum, the 4000 Å break, and the deep metal absorption features. The observed colors and spectrum are this superposition — nothing more exotic.
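
The discrete form of the integral can be sketched in a few lines of NumPy. The SSP "templates" below are made-up numbers in arbitrary units, purely to show the bookkeeping; a real fit would load templates from one of the SPS codes, and the function name `composite_spectrum` is hypothetical.

```python
import numpy as np

def composite_spectrum(ssp_grid, mass_formed):
    """Sum SSP spectra weighted by the stellar mass formed in each age bin.

    ssp_grid    : array (n_ages, n_wavelengths), flux per unit stellar mass
    mass_formed : array (n_ages,), stellar mass formed in each age bin
    """
    ssp_grid = np.asarray(ssp_grid, dtype=float)
    mass_formed = np.asarray(mass_formed, dtype=float)
    return mass_formed @ ssp_grid

# Toy templates (illustrative only): 3 ages x 4 wavelength pixels, blue to red.
toy_ssp = np.array([
    [8.0, 4.0, 2.0, 1.0],   # young: bright and blue
    [1.0, 1.5, 1.5, 1.0],   # intermediate
    [0.1, 0.3, 0.5, 0.6],   # old: faint and red
])
sfr = np.array([0.5, 1.0, 3.0])           # hypothetical SFR per bin (mass per Gyr)
bin_width = np.array([0.1, 1.0, 10.0])    # age-bin widths in Gyr
mass_formed = sfr * bin_width             # weight of each SSP = mass formed in its bin

f_gal = composite_spectrum(toy_ssp, mass_formed)
print(f_gal)
```

Because the model is linear in the weights, doubling the mass formed in any bin simply doubles that SSP's contribution to the composite.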

Worked example: fitting a composite from two bursts

Take the simplest non-trivial galaxy: one built from just two SSPs. Suppose 90% of its stellar mass formed 11 Gyr ago at solar metallicity (an old, red bulge population) and 10% formed 0.5 Gyr ago (a young, blue disk burst). We want its broadband g−r color and its stellar mass-to-light ratio.

Component        Mass frac.   Age      g−r color   M/L_g (M_sun/L_sun)
old bulge SSP      0.90      11 Gyr     0.78          4.2
young disk SSP     0.10      0.5 Gyr    0.22          0.5

The young burst is only 10% of the mass but it is blue and luminous, so it carries a disproportionate share of the light. Computing the light fractions in the g band: the old population contributes L_old ∝ M_old / (M/L)_old = 0.90/4.2 = 0.214, and the young population L_young ∝ 0.10/0.5 = 0.200. They contribute almost equally to the light despite the 9:1 mass ratio. The composite color is the light-weighted average, landing around g−r ≈ 0.50 — distinctly bluer than the old population alone. The composite mass-to-light ratio is:

(M/L)_total = M_total / L_total
            = (0.90 + 0.10) / (0.214 + 0.200)
            ≈ 1.0 / 0.414
            ≈ 2.4

So a galaxy that is 90% old by mass has an M/L of only ~2.4, roughly half the value of its dominant old population, because the small young burst hijacks the light. This is outshining, and it is the reason mass estimates that ignore a faint old underlying population can be badly wrong: the old stars carry the mass but the young stars carry the light. It is also why M/L from colors is so useful: feed the measured g−r into a color-M/L calibration such as Bell & de Jong's and you get an M/L estimate good to roughly a factor of two, then multiply by the measured g-band luminosity to get the stellar mass, with no full spectral fit required.
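
The arithmetic of this worked example can be checked with a short Python sketch. It uses only the numbers in the table above. The one added assumption is how the composite color is formed: the sketch sums g- and r-band fluxes and converts back to a magnitude difference, rather than averaging the two colors directly.

```python
import math

# Values copied from the two-burst table above.
components = {
    "old bulge SSP":  {"mass_frac": 0.90, "g_r": 0.78, "ml_g": 4.2},
    "young disk SSP": {"mass_frac": 0.10, "g_r": 0.22, "ml_g": 0.5},
}

# g-band light of each component: L = M / (M/L).
lum_g = {name: c["mass_frac"] / c["ml_g"] for name, c in components.items()}

# r-band light relative to g, from each component's color (zero point cancels).
lum_r = {name: lum_g[name] * 10 ** (0.4 * c["g_r"])
         for name, c in components.items()}

total_mass = sum(c["mass_frac"] for c in components.values())
total_g = sum(lum_g.values())
total_r = sum(lum_r.values())

ml_total = total_mass / total_g                      # composite M/L_g
g_r_total = 2.5 * math.log10(total_r / total_g)      # composite g-r
young_light_frac = lum_g["young disk SSP"] / total_g

print(f"M/L_g = {ml_total:.2f}, g-r = {g_r_total:.2f}, "
      f"young g-band light fraction = {young_light_frac:.2f}")
```

Summing fluxes gives a composite g−r of about 0.54, slightly redder than a straight light-weighted average of the two colors (about 0.51); both are consistent with the approximate figure quoted in the text. The young burst supplies close to half of the g-band light from a tenth of the mass.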

Mass-to-light ratio: stellar mass on the cheap

The stellar mass is usually the single most-wanted number, and the cheapest route to it is the mass-to-light ratio from colors. The physics is intuitive: redder galaxies are older and/or more metal-rich, their hot, luminous massive stars are gone, and most of their mass sits in low-mass dwarfs that contribute mass but little luminosity, so M/L climbs. Bell & de Jong (2001) calibrated this as a near-linear relation between log(M/L) and optical color. Concretely, M/L in the g band runs from about 0.6 for a blue star-forming galaxy (g−r ≈ 0.4) to about 4 for a red passive galaxy (g−r ≈ 0.8). Multiply the observed luminosity by the color-inferred M/L and you have a stellar mass good to roughly a factor of two, with the residual uncertainty set mostly by the unknown IMF and dust, not by photon noise.
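
As a rough illustration of the color-to-mass recipe, the sketch below draws a straight line in log(M/L) through the two endpoints quoted above. This is an assumption made for illustration only: it is not the published Bell & de Jong coefficients, and the function names and the example luminosity are hypothetical.

```python
import math

# Endpoints quoted in the text: (g-r, M/L_g).
BLUE = (0.4, 0.6)   # blue star-forming galaxy
RED = (0.8, 4.0)    # red passive galaxy

def ml_g_from_color(g_r):
    """Log-linear interpolation of M/L_g between the two quoted endpoints."""
    slope = (math.log10(RED[1]) - math.log10(BLUE[1])) / (RED[0] - BLUE[0])
    log_ml = math.log10(BLUE[1]) + slope * (g_r - BLUE[0])
    return 10 ** log_ml

def stellar_mass(lum_g_solar, g_r):
    """Stellar mass in solar masses from g-band luminosity (solar units) and color."""
    return ml_g_from_color(g_r) * lum_g_solar

# Hypothetical galaxy: L_g = 2e10 L_sun, g-r = 0.7.
mass = stellar_mass(2.0e10, 0.7)
print(f"M/L_g = {ml_g_from_color(0.7):.2f}, M* = {mass:.2e} M_sun")
```

The returned mass inherits the factor-of-two uncertainty described above, so it is best treated as an order-of-magnitude estimate rather than a measurement.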

The age-metallicity degeneracy

Here is the catch that haunts the entire field. Making a stellar population older moves its main-sequence turnoff to cooler, redder stars. Making it more metal-rich does almost exactly the same thing to the optical colors and to most spectral absorption indices — line-blanketing from metals reddens the integrated light. The two effects are nearly indistinguishable. Worthey (1994) quantified this as the famous 3/2 rule:

a factor-of-3 increase in age  ≈  a factor-of-2 increase in metallicity

         Δ log(age) / Δ log(Z)  ≈  3/2     (in their effect on color)

From a single broadband color, or a low-resolution spectrum, you simply cannot tell an old metal-poor galaxy from a younger metal-rich one — they sit on the same point. Age and metallicity slide against each other along a 3/2-sloped degeneracy line, and a fit that pins down only the combination leaves a long, tilted error ellipse in the age-Z plane. This is the dominant source of confusion in interpreting galaxy colors, and it is the reason "this galaxy is red, therefore old" is a dangerous shortcut.
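
The trade-off can be made concrete with a small Python sketch of the 3/2 rule. The function name and the example ages and metallicities are illustrative assumptions; the rule itself is only approximate, so the output should be read as a rough guide.

```python
def compensating_age(age_gyr, z_old, z_new, slope=1.5):
    """Age that keeps optical color roughly unchanged when metallicity changes.

    Applies delta log(age) = -slope * delta log(Z), with slope = 3/2.
    """
    return age_gyr * (z_new / z_old) ** (-slope)

# Hypothetical case: a 12 Gyr population at Z = 0.01 versus one at Z = 0.02.
younger_age = compensating_age(12.0, z_old=0.01, z_new=0.02)
print(f"{younger_age:.1f} Gyr at twice the metallicity looks about as red")
```

Doubling the metallicity is offset by an age smaller by a factor of 2^1.5 ≈ 2.8, which is the "factor of 3" in the rule of thumb.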

The way out is to use features that respond differently to age and metallicity. The Balmer absorption lines (Hβ, Hδ) are produced by A- and F-type turnoff stars and are strongly age-sensitive but nearly metallicity-blind; iron and magnesium indices are metallicity-sensitive but weakly age-dependent. Plotting an age-sensitive index against a metallicity-sensitive one (classically Hβ on the vertical axis against the combined [MgFe] index on the horizontal axis) opens the degeneracy into a grid in which lines of constant age run roughly horizontal and lines of constant metallicity roughly vertical. Full-spectrum fitting (pPXF, STARLIGHT) uses the entire absorption-line forest at once, and the sheer information content lifts much of the degeneracy. Extending the baseline into the rest-frame near-UV (age-sensitive hot turnoff) and near-IR (metallicity- and giant-sensitive) widens the lever arm further.

Variants and regimes

Parametric vs. non-parametric SFH. Early SPS fits assumed a simple functional form for SFR(t) — an exponential "tau model" SFR ∝ e^−t/τ, or a delayed-tau. Modern codes (Prospector, BAGPIPES) instead fit a flexible non-parametric history in age bins. The choice matters: rigid parametric histories can underestimate stellar mass by up to 0.5 dex because they cannot accommodate an old population hidden under a recent burst.
Index fitting vs. full-spectrum fitting. The Lick/IDS index system measures a handful of absorption-line strengths and compares them to SSP grids; it is robust, but it throws away most of the spectrum. Full-spectral-fitting codes such as pPXF fit every pixel of the continuum and lines simultaneously, extracting both stellar populations and line-of-sight kinematics.
Photometric SED fitting. When only broadband photometry is available — the norm for faint, high-redshift galaxies — codes fit the SSP-built SED to a handful of fluxes (CIGALE, BAGPIPES, Prospector), trading spectral detail for the ability to reach the early universe.
TP-AGB-heavy models. Maraston (2005) emphasises the thermally-pulsing asymptotic giant branch, which dominates the rest-frame near-IR of intermediate-age (0.3-2 Gyr) populations; including it can change near-IR-derived masses by tens of percent relative to BC03.
Resolved vs. integrated populations. For nearby galaxies and clusters you can resolve individual stars and fit their color-magnitude diagram against isochrones directly — a far more powerful constraint than integrated light, but only reachable out to a few Mpc.
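
To show how a parametric history turns into SSP weights, here is a minimal sketch that integrates an exponential tau model over a set of stellar-age bins. The bin edges, onset time, and tau are arbitrary illustrative choices, the function name is hypothetical, and the sketch counts mass formed without accounting for mass returned by stellar evolution.

```python
import numpy as np

def tau_model_mass_fractions(age_edges_gyr, t_start_gyr, tau_gyr):
    """Fraction of formed mass in each stellar-age bin for SFR ~ exp(-t/tau).

    t is the time since star formation began (t_start_gyr ago), so a star
    of age a formed at t = t_start_gyr - a. Edges must increase with age.
    """
    edges = np.clip(np.asarray(age_edges_gyr, dtype=float), 0.0, t_start_gyr)
    t_form = t_start_gyr - edges                 # formation time at each edge
    cumulative = -np.expm1(-t_form / tau_gyr)    # 1 - exp(-t/tau): mass formed by t
    mass = cumulative[:-1] - cumulative[1:]      # mass formed inside each age bin
    return mass / mass.sum()

# Illustrative setup: formation began 12 Gyr ago with tau = 2 Gyr.
age_edges = [0.0, 0.1, 1.0, 5.0, 12.0]
fractions = tau_model_mass_fractions(age_edges, t_start_gyr=12.0, tau_gyr=2.0)
print(np.round(fractions, 4))
```

With a short tau almost all of the mass lands in the oldest bin, which shows how strongly the functional form constrains the result: a tau model has no freedom to place a second, independent burst at late times.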

Common pitfalls and misconceptions

"Red equals old." Because of the age-metallicity degeneracy and dust, a red color can mean old, metal-rich, dusty, or any combination. Never read age off color alone.
Ignoring dust. Dust attenuation reddens a spectrum in much the same way as added age or metallicity, creating a three-way age-dust-metallicity degeneracy. Fits that omit a dust term systematically over-age galaxies.
Forgetting the IMF is assumed, not measured. Switching from a Salpeter to a Chabrier IMF shifts inferred stellar masses by about 0.25 dex (a factor of ~1.8). Comparing masses across papers without checking the IMF is comparing apples to oranges.
Light-weighted ≠ mass-weighted age. SPS naturally returns a light-weighted age, biased young by any recent star formation. The mass-weighted age — what most people mean by "how old is this galaxy" — is older, and the gap can be billions of years.
Trusting the near-IR blindly. The TP-AGB and horizontal-branch phases are poorly calibrated, so different codes disagree at the tens-of-percent level in the rest-frame near-IR where the cool giants live.
Over-interpreting a single best-fit. The likelihood surface is degenerate and often multi-modal; quoting one best-fit age and metallicity without the full posterior hides the real (large, correlated) uncertainties.
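
The gap between light-weighted and mass-weighted age is easy to quantify for the two-burst galaxy from the worked example. The sketch below reuses that table's numbers and assumes simple linear averages weighted by g-band light and by mass; published analyses often average log(age) instead, which would give different values.

```python
# Two-burst galaxy from the worked example: old bulge, young disk.
mass_frac = [0.90, 0.10]
age_gyr = [11.0, 0.5]
ml_g = [4.2, 0.5]

# g-band light of each component: L = M / (M/L).
light = [m / ml for m, ml in zip(mass_frac, ml_g)]

mass_weighted_age = sum(m * a for m, a in zip(mass_frac, age_gyr)) / sum(mass_frac)
light_weighted_age = sum(l * a for l, a in zip(light, age_gyr)) / sum(light)

print(f"mass-weighted: {mass_weighted_age:.2f} Gyr, "
      f"light-weighted: {light_weighted_age:.2f} Gyr")
```

For this galaxy the mass-weighted age is about 9.9 Gyr while the g-band light-weighted age is about 5.9 Gyr, a gap of roughly 4 Gyr produced by a burst holding only 10% of the mass.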

Observational status and applications

Stellar population synthesis is the engine behind nearly every quantitative statement about galaxy evolution. The cosmic stellar-mass function, the star-forming "main sequence" (the tight SFR-mass relation), the build-up of the red sequence, and the cosmic star-formation-rate density — Madau & Dickinson's (2014) census of when the universe made its stars — are all SPS products. Large spectroscopic surveys (SDSS, MaNGA, GAMA) ran full-spectrum SPS on millions of galaxies; JWST now applies the same machinery to galaxies at redshift z > 10, where the surprising abundance of massive, evolved systems in the first few hundred million years is forcing a rethink of early star formation — and stress-testing the SPS models themselves, since exotic early IMFs or strong nebular emission can masquerade as high mass.

The dominant SPS toolkits in use today are Bruzual & Charlot 2003 (BC03/GALAXEV), FSPS (Conroy, Gunn & White 2009), and Maraston 2005, fed by libraries like MILES and tracks like MIST. Inference is done with Bayesian engines — Prospector, BAGPIPES, CIGALE — and with the penalised pixel-fitting code pPXF for full-spectrum work. The systematic floor on any SPS-derived stellar mass is about 0.2-0.3 dex, set by the IMF, the stellar library, the treatment of late evolutionary phases, and the assumed SFH — comfortably larger than the formal statistical error in almost all cases.
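
Since these systematics are quoted in dex, a small conversion helper makes their size easier to read. The sketch below is illustrative: the function names are hypothetical, and the example applies the ~0.25 dex Salpeter-to-Chabrier offset quoted earlier as a simple subtraction in log mass, on the assumption that a Chabrier IMF gives the lower mass.

```python
def dex_to_factor(dex):
    """Convert a logarithmic offset in dex to a multiplicative factor."""
    return 10.0 ** dex

def salpeter_to_chabrier(log_mass_salpeter, offset_dex=0.25):
    """Approximate rescaling of log10(M*) between IMFs (offset from the text)."""
    return log_mass_salpeter - offset_dex

for dex in (0.2, 0.25, 0.3):
    print(f"{dex:.2f} dex = factor of {dex_to_factor(dex):.2f}")

# Hypothetical galaxy with log10(M*/M_sun) = 11.00 under a Salpeter IMF.
log_mass_chabrier = salpeter_to_chabrier(11.00)
print(f"Chabrier-equivalent log mass: {log_mass_chabrier:.2f}")
```

A floor of 0.2-0.3 dex therefore corresponds to a factor of roughly 1.6 to 2 in stellar mass, which is the scale to keep in mind when comparing masses from different papers.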

SPS codes and ingredients at a glance

Code / tool	Type	Stellar library	Notable strength	Typical use
BC03 (GALAXEV)	Spectral synthesis	STELIB / BaSeL	The field standard since 2003	SED & spectral fitting
FSPS	Spectral synthesis	MILES / C3K	Fully flexible, open-source	Prospector backend
Maraston 2005	Spectral synthesis	BaSeL / empirical	Strong TP-AGB near-IR	High-z near-IR masses
pPXF	Full-spectrum fit	MILES (templates)	Kinematics + populations	IFU surveys (MaNGA)
Prospector	Bayesian SED fit	FSPS-based	Non-parametric SFH posteriors	JWST / HST photometry
BAGPIPES	Bayesian SED fit	BC03-based	Fast, flexible SFH priors	Photometric + spectral z
STARLIGHT	Full-spectrum fit	MILES / BC03	Linear SSP decomposition	SDSS stellar populations

Quantitative analysis: why the degeneracy has slope 3/2

The 3/2 rule is not arbitrary; it falls out of how the main-sequence turnoff responds to age and metallicity. The integrated color of an old population is set chiefly by its turnoff temperature T_TO. As a population ages, the turnoff mass decreases roughly as m_TO ∝ t^−0.3, and since cooler turnoffs are redder, the color reddens monotonically with log(age). Independently, raising metallicity adds line-blanketing — millions of metal absorption lines that suppress blue flux and redistribute it redward — which also reddens the color, and additionally cools the turnoff at fixed mass.

Worthey computed the partial derivatives of broadband colors and Lick indices with respect to both log(age) and log(Z) across a grid of SSP models and found, empirically, that for most optical diagnostics the ratio of sensitivities is nearly constant:

   ∂(color)/∂ log(Z)
  ───────────────────  ≈  3/2
  ∂(color)/∂ log(age)

⇒  a line of constant color satisfies  Δ log(age) = −(3/2) Δ log(Z)

So along the locus of constant observed color, a factor-of-2 increase in metallicity (Δlog Z ≈ 0.30) is compensated by a factor-of-3 decrease in age (Δlog t ≈ −0.48), since 0.48/0.30 = 1.6 ≈ 3/2. That is precisely the "factor-of-3 in age ≈ factor-of-2 in metallicity" statement. To break it you need a diagnostic whose sensitivity ratio differs from 3/2: the Balmer lines, which respond almost entirely to the hot turnoff and so are far more sensitive to age, relative to metallicity, than broadband colors are. Crossing a Balmer index with a metal index gives two equations with different slopes in the (log age, log Z) plane, and their intersection pins both parameters. That single geometric fact (two diagnostics, two slopes, one intersection) is the entire strategy behind quantitative stellar population synthesis.
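
The "two diagnostics, two slopes" argument amounts to solving a 2×2 linear system, sketched below with NumPy. The sensitivity coefficients are hypothetical values chosen for illustration, not numbers from Worthey (1994); the color row is set up so that constant color follows Δ log(age) = −(3/2) Δ log(Z).

```python
import numpy as np

# Hypothetical linearized sensitivities (illustrative only).
# Rows: diagnostics. Columns: d/dlog(age), d/dlog(Z).
sens = np.array([
    [0.20, 0.30],     # optical color: reddens with both age and metallicity
    [-1.00, -0.20],   # Balmer index: weakens strongly with age, weakly with Z
])

# A made-up true offset from a reference model: (dlog age, dlog Z).
true_shift = np.array([0.30, -0.20])
observed = sens @ true_shift            # changes seen in the two diagnostics

# Two diagnostics with different slopes: the system is invertible.
recovered = np.linalg.solve(sens, observed)
print("recovered (dlog age, dlog Z):", recovered)

# Color alone cannot see a shift along the degenerate direction.
degenerate = np.array([-1.5, 1.0])      # dlog(age) = -(3/2) dlog(Z)
color_change = sens[0] @ degenerate
print("color change along degeneracy:", color_change)
```

If the second row were proportional to the first, the matrix would be singular and `np.linalg.solve` would raise an error, which is the linear-algebra version of the statement that two diagnostics with the same slope cannot separate age from metallicity.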

Frequently asked questions

What is a simple (single) stellar population?

A simple stellar population, or SSP, is the idealised building block of population synthesis: a group of stars that all formed at the same instant from gas of a single chemical composition. To build its spectrum you take an isochrone — the locus in the HR diagram occupied by stars of that age and metallicity — populate it with stars drawn from an initial mass function (Chabrier or Kroupa), assign each star a model spectrum from a stellar library such as MILES, and sum them. The result, SSP(λ, t, Z), is the spectrum of one burst of star formation as a function of wavelength, age, and metallicity. A real galaxy is modelled as a weighted sum of many SSPs of different ages.

How does adding up SSPs give a galaxy's spectrum?

A galaxy did not form all its stars at once; it built them up over time according to a star-formation history SFR(t). Population synthesis convolves the SSP templates with this history: f_galaxy(λ) = ∫ SSP(λ, t', Z(t')) · SFR(t_obs − t') dt', integrating over all stellar ages present today. In practice the integral becomes a sum over a grid of SSP ages, each weighted by the mass of stars formed in that age bin. Young SSPs contribute hot blue light and emission lines; old SSPs contribute cool red light. The observed composite spectrum is the superposition.

How do you get stellar mass and mass-to-light ratio from colors?

Broadband colors are a cheap, robust proxy for the stellar mass-to-light ratio M/L. Redder galaxies are older and/or more metal-rich, so their light is dominated by low-mass stars that carry mass but little light, and M/L rises. Bell & de Jong (2001) calibrated log(M/L) as a near-linear function of optical color: M/L in the g band runs from about 0.6 for a blue star-forming galaxy (g−r ≈ 0.4) to about 4 for a red passive galaxy (g−r ≈ 0.8). Multiply the measured luminosity by M/L and you get the stellar mass without fitting a full spectrum — accurate to roughly a factor of two.

What is the age-metallicity degeneracy?

It is the central headache of population synthesis. Making a population older shifts its turnoff to cooler, redder stars; making it more metal-rich does almost the same thing to colors and most spectral indices. Worthey (1994) quantified it as the 3/2 rule: a factor-of-3 change in age produces almost exactly the same change in optical colors as a factor-of-2 change in metallicity, so at fixed color age and metallicity are anti-correlated along Δlog(age)/Δlog(Z) ≈ −3/2. From a single color you cannot tell an old metal-poor galaxy from a younger metal-rich one. Breaking the degeneracy needs features that respond differently to age and metallicity, such as Balmer lines versus iron and magnesium indices.

How is the degeneracy broken in practice?

By choosing features that split the age and metallicity directions. The classic lever plots an age-sensitive Balmer index such as Hβ against a metallicity-sensitive index such as [MgFe], forming a grid in which lines of constant age run nearly horizontal and lines of constant metallicity nearly vertical. Full-spectrum fitting codes like pPXF or STARLIGHT exploit the entire absorption-line spectrum, and the high information content lifts much of the degeneracy. Adding rest-frame near-UV light (age-sensitive) or near-IR light (metallicity- and giant-sensitive) widens the lever arm. None of these eliminates the correlation entirely; they tilt and shrink the error ellipse.

Which SPS codes and stellar libraries are standard?

The workhorse codes are Bruzual & Charlot 2003 (BC03/GALAXEV), the Flexible Stellar Population Synthesis code FSPS (Conroy, Gunn & White 2009), and Maraston 2005, which emphasises the TP-AGB contribution to near-IR light. They draw on stellar libraries — empirical (MILES with 985 stars, STELIB) or theoretical (BaSeL, C3K) — and on tracks such as Padova/PARSEC and MIST. Inference engines include Prospector, BAGPIPES, CIGALE, and the full-spectral-fitting code pPXF. The choice of IMF, library, and tracks introduces systematic differences of order 0.1-0.3 dex in derived stellar masses.

What limits the accuracy of SPS-derived parameters?

Several systematics dominate over photon noise. The IMF is assumed, not measured, and switching from Salpeter to Chabrier changes inferred masses by about 0.25 dex. Dust attenuation reddens a spectrum in much the same way as an older or more metal-rich population, creating an age-dust-metallicity degeneracy. Uncertain late stellar phases (the TP-AGB and horizontal branch) change near-IR and blue light at the tens-of-percent level. And the assumed SFH biases mass and age: a rigid parametric history can underestimate stellar mass by up to 0.5 dex because old stars hide under any recent burst, an effect called outshining.