Statistical Mechanics

Partition Function

Z = Σ exp(−βEᵢ) — the Rosetta stone that maps microscopic energy levels to all of thermodynamics

Z sums Boltzmann weights over every microstate. Free energy F = −kT·ln Z; every thermodynamic observable is a derivative of ln Z. Once you have Z, you have the thermodynamics.

Definition: Z = Σᵢ exp(−βEᵢ), β = 1/(kT)
Free energy: F = −kT·ln Z
Mean energy: ⟨E⟩ = −∂ln Z/∂β
Entropy: S = (⟨E⟩ − F)/T
Two-level Z: Z = 2·cosh(βε)
Factorization: independent parts give Z = Z₁·Z₂·…·Zₙ

Definition

In the canonical ensemble (system at fixed T, V, N in contact with a heat bath), the probability of microstate i with energy Eᵢ is:

P(i) = (1/Z) · exp(−βEᵢ),     β ≡ 1/(k·T)

where the partition function Z is the normalization:

Z(T, V, N) = Σᵢ exp(−βEᵢ)

The sum runs over every accessible microstate. Equivalently, group states by energy and sum over energy levels using the density of states g(E):

Z = Σ_E g(E) · exp(−βE) = ∫ g(E) · exp(−βE) dE  (continuous case)

That's it. The whole machine. Everything else is calculus on ln Z.
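
As a minimal sketch of the definition, the snippet below computes ln Z and the probabilities P(i) for a short list of energy levels. The function name `boltzmann` and the sample levels are illustrative assumptions, not part of the original text, and energies are measured in units where k = 1 to keep the numbers readable.

```javascript
// Sketch: canonical probabilities for a discrete spectrum.
// Assumption: energies and kT share the same units (k = 1 here).
function boltzmann(energies, kT) {
  const beta = 1 / kT;
  // Shift by the lowest energy so the exponentials cannot overflow;
  // the shift cancels in P(i) and is restored in ln Z.
  const E0 = Math.min(...energies);
  const weights = energies.map(E => Math.exp(-beta * (E - E0)));
  const Zshifted = weights.reduce((a, b) => a + b, 0);
  return {
    lnZ: Math.log(Zshifted) - beta * E0,
    probs: weights.map(w => w / Zshifted),
  };
}

const levels = [0, 1, 2, 3];          // hypothetical evenly spaced levels
const { lnZ, probs } = boltzmann(levels, 1.0);
console.log(`ln Z = ${lnZ.toFixed(4)}`);
console.log(probs.map(p => p.toFixed(4)).join("  "));
```

Working with ln Z rather than Z is a design choice: Z itself can overflow a double at low temperature, while ln Z stays well behaved.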

From Z to thermodynamics

The Helmholtz free energy is:

F(T, V, N) = −kT · ln Z

Every thermodynamic relation flows from F. A few highlights:

Quantity	From ln Z	From F
Mean energy ⟨E⟩	−∂ln Z/∂β	F − T·∂F/∂T
Entropy S	k·(ln Z + β·⟨E⟩)	−(∂F/∂T)_V
Pressure P	(1/β)·(∂ln Z/∂V)_T	−(∂F/∂V)_T
Heat capacity Cv	k·β²·∂²ln Z/∂β²	−T·(∂²F/∂T²)_V
Variance of energy	∂²ln Z/∂β²	kT²·Cv
Magnetization M (B field)	(1/β)·∂ln Z/∂B	−(∂F/∂B)_T
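
The derivative relations in the table can be checked numerically. The sketch below compares finite-difference derivatives of ln Z with direct ensemble averages for a made-up four-level spectrum; the spectrum, the value of β, and the step size `h` are all assumptions chosen for illustration, with k = 1.

```javascript
// Sketch: check <E> = -d(ln Z)/d(beta) and Var(E) = d²(ln Z)/d(beta)²
// by central finite differences. Units: k = 1, so beta = 1/T.
const spectrum = [0, 0.5, 1.3, 2.0];   // hypothetical energy levels

function lnZ(beta) {
  return Math.log(spectrum.reduce((s, E) => s + Math.exp(-beta * E), 0));
}

// Direct ensemble average of f(E) with Boltzmann weights.
function average(beta, f) {
  const Z = Math.exp(lnZ(beta));
  return spectrum.reduce((s, E) => s + f(E) * Math.exp(-beta * E), 0) / Z;
}

const beta = 1.2;
const h = 1e-4;                        // finite-difference step in beta
const E_fd = -(lnZ(beta + h) - lnZ(beta - h)) / (2 * h);
const var_fd = (lnZ(beta + h) - 2 * lnZ(beta) + lnZ(beta - h)) / (h * h);

const E_direct = average(beta, E => E);
const var_direct = average(beta, E => E * E) - E_direct ** 2;

console.log(`<E>:    derivative ${E_fd.toFixed(6)}, direct ${E_direct.toFixed(6)}`);
console.log(`Var(E): derivative ${var_fd.toFixed(6)}, direct ${var_direct.toFixed(6)}`);
console.log(`Cv/k = beta^2 Var(E) = ${(beta * beta * var_direct).toFixed(6)}`);
```

The two columns should agree to within the finite-difference error, which is the content of the "Variance of energy" row: energy fluctuations and heat capacity are the same second derivative.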

Worked example — two-level system

Consider a single spin-½ in magnetic field B. Two states: aligned (E = −μB) and anti-aligned (E = +μB).

Z = exp(+βμB) + exp(−βμB) = 2·cosh(βμB)

Take the log and differentiate:

ln Z = ln 2 + ln cosh(βμB)
⟨E⟩ = −∂ln Z/∂β = −μB · tanh(βμB)
S/k = ln Z + β·⟨E⟩ = ln[2·cosh(βμB)] − βμB·tanh(βμB)

Limits:

High T (kT ≫ μB): tanh ≈ βμB → ⟨E⟩ ≈ −μ²B²/(kT). Curie's law follows: M ∝ 1/T.
Low T (kT ≪ μB): tanh ≈ 1 → ⟨E⟩ ≈ −μB. System frozen in ground state. S → 0.

The heat capacity Cv = k·(βμB)²·sech²(βμB) peaks near kT ≈ 0.83·μB (where βμB·tanh(βμB) = 1), that is, at kT of order μB: the famous "Schottky anomaly" of two-level systems.
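
The location of the Schottky peak can be found with a few lines of code. This is a rough sketch using a plain grid scan over x = βμB (the grid range and step are arbitrary choices, and `schottkyCv` is a name introduced here); the maximum should land where x·tanh(x) = 1, close to x ≈ 1.2.

```javascript
// Sketch: heat capacity of a two-level system in units of k,
// as a function of x = beta*mu*B = mu*B/(k*T).
function schottkyCv(x) {
  const sech = 1 / Math.cosh(x);
  return x * x * sech * sech;
}

// Coarse scan over x to find the maximum (grid search, not an optimizer).
let best = { x: 0, cv: 0 };
for (let x = 0.01; x <= 5; x += 0.0001) {
  const cv = schottkyCv(x);
  if (cv > best.cv) best = { x, cv };
}
console.log(`peak at beta*mu*B ≈ ${best.x.toFixed(3)}, i.e. kT ≈ ${(1 / best.x).toFixed(3)} mu*B`);
console.log(`peak height Cv/k ≈ ${best.cv.toFixed(3)}`);
```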

Factorization for independent subsystems

If the total energy is a sum of independent contributions E = E₁ + E₂ + … + E_N, then:

Z_total = Z₁ · Z₂ · … · Z_N
ln Z_total = ln Z₁ + ln Z₂ + … + ln Z_N

This is why ln Z is extensive (proportional to system size). For an ideal gas of N indistinguishable particles:

Z(N, V, T) = (1/N!) · Z_1(V, T)^N,   Z_1 = V/λ³
F = −kT·ln Z ≈ −NkT[ln(V/(N·λ³)) + 1]   (using Stirling: ln N! ≈ N·ln N − N)
   where λ = h/√(2π·m·kT) is the thermal de Broglie wavelength

From this single expression you get the ideal-gas law (PV = NkT), Sackur-Tetrode entropy, equipartition (⟨E⟩ = (3/2)NkT), and the heat capacity Cv = (3/2)Nk. Half of intro thermodynamics drops out of one logarithm.
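
To illustrate how the ideal-gas law and the entropy drop out of F, the sketch below differentiates the free energy numerically. The example system (one mole of N₂ in 24.6 L at 300 K), the function names, and the finite-difference steps are assumptions made for this illustration; only translational degrees of freedom are included.

```javascript
// Sketch: ideal-gas free energy F = -N k T [ln(V / (N λ³)) + 1],
// with P = -(∂F/∂V)_T and S = -(∂F/∂T)_V by central finite differences.
const kB = 1.380649e-23;        // J/K
const hPlanck = 6.62607015e-34; // J·s

function thermalWavelength(m, T) {
  return hPlanck / Math.sqrt(2 * Math.PI * m * kB * T);
}

function freeEnergyIdealGas(N, V, T, m) {
  const lam = thermalWavelength(m, T);
  return -N * kB * T * (Math.log(V / (N * lam ** 3)) + 1);
}

// Assumed example: one mole of N₂ (m ≈ 4.65e-26 kg) in 24.6 L at 300 K.
const N = 6.022e23, V = 0.0246, T = 300, m = 4.65e-26;

const dV = V * 1e-6;
const P = -(freeEnergyIdealGas(N, V + dV, T, m) - freeEnergyIdealGas(N, V - dV, T, m)) / (2 * dV);
const dT = T * 1e-6;
const S = -(freeEnergyIdealGas(N, V, T + dT, m) - freeEnergyIdealGas(N, V, T - dT, m)) / (2 * dT);

console.log(`P from F: ${P.toFixed(0)} Pa, NkT/V: ${(N * kB * T / V).toFixed(0)} Pa`);
console.log(`S from F: ${S.toFixed(1)} J/K (translational only)`);
```

The entropy obtained this way should match the Sackur-Tetrode form S = Nk[ln(V/(N·λ³)) + 5/2], since both come from the same F.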

Numerical examples

System	Energy spectrum	Z (one mode)	Heat capacity
Two-level (spin in B)	±μB	2·cosh(βμB)	Schottky anomaly at kT ≈ μB
Harmonic oscillator (quantum)	(n + ½)ħω, n = 0,1,…	1/(2·sinh(βħω/2))	Einstein C: k at high T; ∝ (ħω/kT)²·exp(−ħω/kT) at low T
Particle in 3D box	continuous	V/λ³	(3/2)k per particle (equipartition)
Rotational diatomic (rigid rotor)	ℓ(ℓ+1)·B_rot, deg 2ℓ+1	~ kT/B_rot at high T	+k per molecule above the rotational temperature
Bose gas (mode k)	n·εₖ, n = 0,1,…	1/(1 − exp(−βεₖ))	~T³ phonons at low T
Fermi gas (mode k)	0 or εₖ	1 + exp(−βεₖ)	Sommerfeld: linear in T

Concrete numerical anchor: a spin with μB = 1 meV has its characteristic Schottky temperature at T = μB/k ≈ 11.6 K. At room temperature (kT ≈ 25.7 meV) the same spin is deep in the high-T regime: βμB ≈ 0.04, so it is only about 4% polarized and follows Curie's law.

Worked example — Curie susceptibility

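For a paramagnet of N independent spin-½'s, the total magnetic moment is:

M = N·μ·tanh(βμB)
At kT ≫ μB:  M ≈ N·μ²·B/(kT)
Susceptibility χ = M/(B/μ₀) → χ = μ₀·N·μ²/(kT) = C/T
   (Curie's law; C = μ₀·N·μ²/k)

With N a total count of spins, χ carries units of volume; taking N as a number density (spins per m³) gives the dimensionless SI volume susceptibility.

For 1 mole of Bohr-magneton spins (μ = 9.27 × 10⁻²⁴ J/T), at 300 K:

χ_mol ≈ (4π × 10⁻⁷) · (6.022 × 10²³) · (9.27 × 10⁻²⁴)² / (1.381 × 10⁻²³ · 300)
χ_mol ≈ 1.6 × 10⁻⁸ m³/mol  (SI molar susceptibility)

At a spin density of 1 mol per litre (6.022 × 10²⁶ m⁻³) this corresponds to a dimensionless volume susceptibility χ ≈ 1.6 × 10⁻⁵. This is the order of magnitude of typical paramagnetic salts, derived directly from one cosh.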
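
How good is the high-temperature linearization behind Curie's law? The sketch below compares the exact fractional polarization tanh(μB/kT) with the linear approximation μB/kT for a Bohr-magneton spin. The field of 1 T and the list of temperatures are assumed values picked for illustration.

```javascript
// Sketch: fractional polarization M/(N mu) = tanh(mu B / kT) versus the
// Curie (linear) approximation mu B / kT, for a Bohr-magneton spin.
const kBoltz = 1.380649e-23;    // J/K
const bohrMagneton = 9.274e-24; // J/T

function polarization(B, T) {
  return Math.tanh(bohrMagneton * B / (kBoltz * T));
}
function curieApprox(B, T) {
  return bohrMagneton * B / (kBoltz * T);
}

for (const T of [300, 10, 1, 0.1]) {
  const exact = polarization(1, T);     // field of 1 tesla (assumed)
  const linear = curieApprox(1, T);
  console.log(`T = ${T} K: tanh = ${exact.toFixed(4)}, linear = ${linear.toFixed(4)}`);
}
```

The linear form tracks tanh closely while μB ≪ kT and overshoots once kT drops to around μB, where the exact polarization saturates at 1.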

JavaScript — partition function arithmetic

const k_B = 1.380649e-23;  // J/K

// Two-level (spin-½) partition function
function Z_twoLevel(eps, T) {
  const beta = 1 / (k_B * T);
  return 2 * Math.cosh(beta * eps);
}
function E_twoLevel(eps, T) {
  const beta = 1 / (k_B * T);
  return -eps * Math.tanh(beta * eps);
}

// Energy in J ↔ μ·B in J  (μ_B ≈ 9.274e-24 J/T)
const muB = 9.274e-24;
const B = 1;  // 1 tesla
console.log(`Z(spin, 1 T, 300 K) = ${Z_twoLevel(muB * B, 300).toFixed(8)}`);
console.log(`⟨E⟩ = ${E_twoLevel(muB * B, 300).toExponential(2)} J`);

// Quantum harmonic oscillator
function Z_qho(hw, T) {
  const beta = 1 / (k_B * T);
  return 1 / (2 * Math.sinh(beta * hw / 2));
}
function E_qho(hw, T) {
  const beta = 1 / (k_B * T);
  return (hw / 2) / Math.tanh(beta * hw / 2);
}
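// Vibration of N–N: ħω ≈ 4.69 × 10⁻²⁰ J (2359 cm⁻¹) → T_vib = ħω/k ≈ 3390 K
const hw_N2 = 4.69e-20;
// Einstein heat capacity per mode in units of k: x²·eˣ/(eˣ − 1)², with x = βħω
function Cv_qho(hw, T) {
  const x = hw / (k_B * T);
  return x * x * Math.exp(x) / (Math.exp(x) - 1) ** 2;
}
console.log(`Cv(N₂ vib, 300 K) / k = ${Cv_qho(hw_N2, 300).toExponential(2)}`);
// Effectively frozen out at 300 K: the vibrational Cv contribution is ≪ k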

// Free energy
function F_from_Z(Z, T) { return -k_B * T * Math.log(Z); }

// Ideal gas partition function (per particle) Z₁ = V / λ³
function lambda_thermal(m, T) {
  const h = 6.626e-34;
  return h / Math.sqrt(2 * Math.PI * m * k_B * T);
}
function Z1_idealGas(V, m, T) {
  const lam = lambda_thermal(m, T);
  return V / (lam * lam * lam);
}
// N₂ molecule: m ≈ 4.65e-26 kg. For a macroscopic volume (e.g. 22.4 L) Z₁ = V/λ³ is huge, of order 10³⁰.
console.log(`λ(N₂, 300 K) = ${lambda_thermal(4.65e-26, 300).toExponential(2)} m`); // ~1.9e-11

// Curie susceptibility from spin partition function
function curieSusceptibility(N, mu, T) {
  const mu0 = 4 * Math.PI * 1e-7;
  return mu0 * N * mu * mu / (k_B * T);
}
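// With N a total count (here 1 mol) the result is a molar susceptibility in m³/mol;
// pass a number density (m⁻³) instead to get the dimensionless volume susceptibility.
console.log(`χ_mol(1 mol Bohr magnetons, 300 K) ≈ ${curieSusceptibility(6.022e23, muB, 300).toExponential(2)} m³/mol`);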

Where the partition function shows up

Equilibrium thermodynamics. Every textbook calculation of heat capacity, free energy, equation of state, magnetization, polarization — all reducible to Z and its logarithm.
Phase transitions. Singularities in F (and hence ln Z) signal phase transitions. The Ising model's partition function on a 2D square lattice was solved exactly by Onsager — a tour de force whose critical exponents agree with experiment.
Polymer physics. Partition function for random walks gives the radius of gyration, persistence length, elasticity (entropic spring constant).
Chemistry — reaction rates. Transition state theory writes k_rxn = (kT/h)·(Z‡/Z_reactant)·exp(−ΔE/kT). The partition function of the transition state controls the prefactor.
Astrophysics. Stellar atmospheres use partition functions of atomic species to compute level populations via the Saha equation.
Field theory. The path integral generalizes Z to functional sums; quantum field theory and statistical mechanics in d+1 dimensions are Wick-rotated cousins of each other.
Machine learning. Energy-based models (Boltzmann machines, RBMs) train on gradients of the log-partition function, which algorithms such as contrastive divergence approximate. The Hopfield network's stability analysis borrows the same partition-function thermodynamics.

Common mistakes

Forgetting the N! for indistinguishable particles. For identical particles you divide by N! to avoid Gibbs's paradox. Skipping it gives a non-extensive entropy for ideal gases (mixing two samples of the same gas would appear to raise S).
Using classical Z when quantization matters. Below the Einstein/Debye temperature, vibrational modes freeze out and classical equipartition overestimates Cv: by a modest amount near the characteristic temperature, and by a factor that grows without bound as T → 0. Always check β·ħω vs 1.
Mixing up canonical and microcanonical. Canonical: fixed T, energy fluctuates. Microcanonical: fixed E, no heat bath. They agree at large N but you must use Z (canonical) for T-derivatives and the multiplicity Ω(E) (microcanonical) for E-derivatives.
Truncating the sum too aggressively. States with β·E ≳ 10 each carry a weight below e⁻¹⁰ ≈ 5 × 10⁻⁵ and usually contribute negligibly, but if you stop at β·E ≈ 1 you can miss important contributions, particularly for systems with dense or highly degenerate high-E spectra like rotors.
Confusing partition function with probability. Z is the NORMALIZATION; the probability is exp(−βE)/Z. Don't write P(E) = exp(−βE) without dividing by Z: the bare weight only gives correct ratios of probabilities, never absolute ones.
Differentiating Z when you should differentiate ln Z. Thermodynamic relations use ∂ln Z/∂β, ∂ln Z/∂V, etc., not ∂Z directly. The log is what makes it extensive.
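
The truncation pitfall is easy to see for a rigid rotor. The following sketch sums Z = Σ (2ℓ+1)·exp(−ℓ(ℓ+1)·B_rot/kT) up to different cutoffs; the choice kT = 20·B_rot and the list of cutoffs are assumptions for illustration.

```javascript
// Sketch: partial sums of the rigid-rotor partition function
// Z = Σ (2l+1) exp(-l(l+1)/t), with t = kT/B_rot (dimensionless).
function rotorZ(t, lMax) {
  let Z = 0;
  for (let l = 0; l <= lMax; l++) {
    Z += (2 * l + 1) * Math.exp(-l * (l + 1) / t);
  }
  return Z;
}

const t = 20;                       // assumed: kT = 20 B_rot
const converged = rotorZ(t, 200);   // far beyond any relevant level
for (const lMax of [4, 8, 13, 20]) {
  const betaE = lMax * (lMax + 1) / t;
  const frac = rotorZ(t, lMax) / converged;
  console.log(`lMax = ${lMax} (beta*E = ${betaE.toFixed(1)}): ${(100 * frac).toFixed(2)}% of Z`);
}
```

Stopping at β·E ≈ 1 (ℓ = 4 here) leaves the sum well short of the converged value, because the 2ℓ+1 degeneracy keeps the higher levels important.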

Frequently asked questions

What is the partition function Z?

Z is a sum over every microstate i of the Boltzmann weight exp(−βEᵢ), where β = 1/(k·T) and Eᵢ is the state's energy. Equivalently Z = Σ_E g(E)·exp(−βE), where g(E) is the density of states at energy E. Z normalizes the probability that the system is in any given state — P(i) = exp(−βEᵢ)/Z — and serves as the generating function for every thermodynamic quantity.

How do you get free energy from Z?

The Helmholtz free energy is F = −kT·ln Z. From F all the usual thermodynamic relations follow: S = −(∂F/∂T)_V, P = −(∂F/∂V)_T, and ⟨E⟩ = F + TS. Equivalently ⟨E⟩ = −∂ln Z/∂β. Once you have Z(T, V, N), you have the complete thermodynamics of the system — no extra physics needed.

Why is ln Z extensive?

For non-interacting subsystems, the total partition function factorizes: Z_total = Z₁ · Z₂ · … · Z_N. Taking the log gives ln Z_total = Σ ln Zᵢ, which scales linearly with system size and is therefore extensive, exactly like F = −kT·ln Z. This is why partition functions are usually quoted per particle or per mode, then multiplied together (equivalently, their logarithms are summed) for the full system.
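
Factorization can be verified by brute force on a small system. The sketch below enumerates all 2ⁿ joint states of n independent two-level units and compares the result with the product of single-unit partition functions; the splittings in `epsList` and the value of β are made-up numbers, with k = 1.

```javascript
// Sketch: brute-force Z over all joint states of independent two-level
// units versus the product of single-unit partition functions (k = 1).
function zSingle(eps, beta) {
  return Math.exp(-beta * eps) + Math.exp(beta * eps);   // levels ±eps
}

function zBruteForce(epsList, beta) {
  const n = epsList.length;
  let Z = 0;
  // Each bit of `state` selects the upper (+eps) or lower (-eps) level.
  for (let state = 0; state < (1 << n); state++) {
    let E = 0;
    for (let i = 0; i < n; i++) {
      E += ((state >> i) & 1) ? epsList[i] : -epsList[i];
    }
    Z += Math.exp(-beta * E);
  }
  return Z;
}

const epsList = [0.3, 0.7, 1.1, 0.5];   // hypothetical level splittings
const beta = 0.9;
const product = epsList.reduce((p, eps) => p * zSingle(eps, beta), 1);
console.log(`brute force: ${zBruteForce(epsList, beta).toFixed(6)}`);
console.log(`product:     ${product.toFixed(6)}`);
```

The brute-force loop costs 2ⁿ terms while the product costs n, which is the practical payoff of factorization.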

What's a worked example of a partition function?

Two-level system (spin-½ in a magnetic field B): energies ±μB. Z = exp(βμB) + exp(−βμB) = 2·cosh(βμB). Average energy ⟨E⟩ = −μB·tanh(βμB). At high T (βμB ≪ 1) ⟨E⟩ ≈ 0 — paramagnetic. At low T (βμB ≫ 1) ⟨E⟩ ≈ −μB — fully aligned. Heat capacity peaks around k·T ≈ μB (a Schottky anomaly). The entire physics emerges from differentiating ln Z = ln 2cosh(βμB).

Why is it called the canonical ensemble?

The canonical ensemble describes a system at fixed (T, V, N) in thermal contact with a much larger heat reservoir. The probability of microstate i is exp(−βEᵢ)/Z, the canonical distribution. "Canonical" is meant in the sense of "standard": Gibbs singled out this distribution as the simplest natural choice, and its form is the one that maximizes entropy at fixed mean energy. The grand canonical ensemble adds chemical potential (variable N); the microcanonical ensemble fixes E exactly.

How does temperature change the Boltzmann weights?

Higher T flattens the distribution: exp(−E/kT) becomes more uniform across states. Lower T sharpens it: only the ground state and a few low-lying excited states contribute. At T → 0 the system collapses into its ground state; at T → ∞ all accessible states become equally likely. This is why the partition function captures the freezing-out of degrees of freedom as a function of temperature — high-energy modes drop from the sum exponentially as T falls.
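
A short sketch makes the flattening and sharpening concrete. It tabulates the ground-state probability and the Gibbs entropy of a five-level ladder at several temperatures; the ladder and the temperatures are assumptions chosen for illustration, with k = 1.

```javascript
// Sketch: how the Boltzmann distribution over a fixed spectrum changes
// with temperature (units with k = 1).
const ladder = [0, 1, 2, 3, 4];   // hypothetical evenly spaced levels

function distribution(kT) {
  const w = ladder.map(E => Math.exp(-E / kT));
  const Z = w.reduce((a, b) => a + b, 0);
  return w.map(x => x / Z);
}

// Gibbs entropy in units of k: S/k = -Σ p ln p (terms with p = 0 skipped).
function entropy(p) {
  return -p.reduce((s, x) => (x > 0 ? s + x * Math.log(x) : s), 0);
}

for (const kT of [0.1, 1, 10, 1000]) {
  const p = distribution(kT);
  console.log(`kT = ${kT}: P(ground) = ${p[0].toFixed(4)}, S/k = ${entropy(p).toFixed(4)}`);
}
console.log(`ln(5) = ${Math.log(5).toFixed(4)}  (uniform limit)`);
```

At the lowest temperature nearly all the weight sits in the ground state and S/k is close to 0; at the highest, S/k approaches ln 5, the value for five equally likely states.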

How does Z connect to quantum statistical mechanics?

Quantum mechanically Z = Tr(exp(−βĤ)), the trace of the Boltzmann operator. The density matrix is ρ = exp(−βĤ)/Z. For non-interacting bosons or fermions you build Z from single-mode partition functions: 1/(1 − exp(−βεₖ)) for bosons, 1 + exp(−βεₖ) for fermions, which lead to Bose-Einstein and Fermi-Dirac distributions. Same machinery — sum over states with Boltzmann weight — but the eigenvalues come from quantum mechanics instead of classical phase space.
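
The step from single-mode partition functions to the Bose-Einstein and Fermi-Dirac distributions can be sketched numerically: the mean occupation is ⟨n⟩ = −∂ln Z/∂x with x = βεₖ. The code below assumes zero chemical potential and uses a finite-difference derivative; the helper names and the step size are illustrative choices.

```javascript
// Sketch: mean occupation <n> = -d(ln Z)/dx with x = beta*eps, for the
// single-mode partition functions quoted above (chemical potential = 0).
const lnZBose = x => -Math.log(1 - Math.exp(-x));
const lnZFermi = x => Math.log(1 + Math.exp(-x));

// Central finite difference of -d(ln Z)/dx.
function occupation(lnZ, x, h = 1e-5) {
  return -(lnZ(x + h) - lnZ(x - h)) / (2 * h);
}

for (const x of [0.5, 1, 3]) {
  const nB = occupation(lnZBose, x);
  const nF = occupation(lnZFermi, x);
  console.log(
    `x = ${x}: Bose ${nB.toFixed(5)} vs ${(1 / (Math.exp(x) - 1)).toFixed(5)}, ` +
    `Fermi ${nF.toFixed(5)} vs ${(1 / (Math.exp(x) + 1)).toFixed(5)}`
  );
}
```

Each pair of numbers should agree: differentiating the boson ln Z reproduces 1/(eˣ − 1), and the fermion ln Z gives 1/(eˣ + 1).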