Gravitational-Wave Astrophysics
Matched Filtering
Slide a known waveform through the data, weight by the noise spectrum, and a gravitational-wave chirp buried below the noise floor leaps out as a sharp spike in signal-to-noise — the algorithm that lets LIGO hear black holes collide
Matched filtering is the signal-processing technique that finds a known waveform buried in noisy data by cross-correlating the data against a bank of template waveforms, weighting each frequency by the inverse of the noise power. It is how LIGO and Virgo detect gravitational-wave chirps whose instantaneous amplitude lies below the surrounding noise.
- Optimal for: known signal + Gaussian noise
- Filter weight: h̃*(f) / S_n(f)
- GW150914 SNR: ≈ 24
- O3 template bank: ≈ 400,000 waveforms
- SNR builds as: √N (cycles)
Finding a known shape in noise
Picture a friend whistling a specific tune in a roaring crowd. You cannot hear any single note clearly — every instant is dominated by the crowd. But if you already know the tune, you can mentally line it up against the noise and ride along with it, and after a few bars your brain locks on. That is matched filtering. It is the answer to a sharply posed question: given data that is mostly noise, and given that you know exactly what shape the signal would have if it were there, how do you extract the maximum possible evidence for its presence?
The key insight is that you are not looking for a bump that pokes above the noise at one instant. You are looking for the entire pattern spread across time. A real signal correlates with your template over its whole duration; the noise, being random, correlates with nothing in particular. Slide the template across the data and compute the running cross-correlation, and at the moment the template aligns with a genuine signal, all of that signal's power adds up coherently into a single sharp peak. The noise, adding up randomly, never builds the same way. This is why a gravitational-wave chirp whose instantaneous amplitude never exceeds the noise can still be detected with overwhelming confidence: the algorithm integrates the signal across hundreds of cycles.
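The sliding cross-correlation described above can be sketched in a few lines. This is a minimal toy, assuming unit-variance white noise and a made-up chirp template; the names `template`, `true_offset`, and `amplitude` are illustrative choices, not taken from any search pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy template (hypothetical): a Hann-windowed linear chirp, 2048 samples long,
# sweeping from 0.02 to 0.2 cycles per sample.
n_t = 2048
t = np.arange(n_t)
phase = 2 * np.pi * (0.02 * t + 0.5 * (0.18 / n_t) * t**2)
template = np.sin(phase) * np.hanning(n_t)

# Unit-variance white noise with a faint copy of the template hidden inside.
n_d = 8192
true_offset = 3000
amplitude = 0.5  # peak signal is half the noise standard deviation
data = rng.standard_normal(n_d)
data[true_offset:true_offset + n_t] += amplitude * template

# Slide the template along the data. Dividing by the template norm gives an
# output with unit variance in pure noise, i.e. an SNR time series.
snr = np.correlate(data, template, mode="valid") / np.sqrt(np.sum(template**2))

best = int(np.argmax(np.abs(snr)))
print(f"loudest offset: {best} (injected at {true_offset}), SNR = {snr[best]:.1f}")
```

No single sample of the injected signal exceeds half a noise standard deviation, yet the correlation should peak at or next to the injected offset, because the template's full length contributes to that one output sample.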
The matched filter, precisely
Write the detector output as data = signal + noise, d(t) = h(t) + n(t), where h(t) is the (possibly absent) gravitational-wave strain and n(t) is the noise. The noise is characterised by its one-sided power spectral density (PSD) S_n(f), defined for the Fourier-transformed noise ñ(f) by ⟨ñ(f) ñ*(f′)⟩ = ½ S_n(f) δ(f − f′): S_n(f) is the noise variance per unit frequency, counted over positive frequencies only. With that, define the noise-weighted inner product between two time series a and b:
⟨a | b⟩ = 4 Re ∫₀^∞ [ ã(f) b̃*(f) / S_n(f) ] df
where the tilde denotes the Fourier transform and the star is complex conjugation. Everything in matched filtering is one statement: cross-correlate the data with the template under this inner product. The matched-filter signal-to-noise ratio as a function of a trial arrival time t₀ is
ρ(t₀) = ⟨d | h(t₀)⟩ / √⟨h | h⟩
= 4 Re ∫₀^∞ [ d̃(f) h̃*(f) / S_n(f) ] e^(2πi f t₀) df / σ
The normalisation σ = √⟨h|h⟩ ensures ρ has unit variance in pure noise, so ρ is literally measured in "standard deviations above noise." When the data contain a signal that matches the template exactly, the expected peak value of ρ is the optimal SNR
ρ_opt = √⟨h | h⟩ = [ 4 ∫₀^∞ |h̃(f)|² / S_n(f) df ]^(1/2)
Notice the two roles of S_n(f). It whitens the data — dividing by it suppresses frequencies where the detector is noisy and amplifies frequencies where it is quiet. And it sets the optimal filter shape itself: among all linear filters, the one whose frequency response is h̃*(f)/S_n(f) maximises ρ. That optimality is not an approximation; it is the Cauchy-Schwarz inequality applied to the inner product above, and it makes matched filtering the Neyman-Pearson-optimal detector for a known signal in stationary Gaussian noise.
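The inner product and ρ(t₀) above translate fairly directly into FFTs. The sketch below is one possible discretisation, not the implementation used by any pipeline: the PSD is an invented analytic curve, the template is a toy linear chirp, and `matched_filter_snr`, `inv_psd`, and `target_snr` are hypothetical names.

```python
import numpy as np

def matched_filter_snr(data, template, inv_psd, fs):
    """Return (rho, sigma): rho[j] is the SNR for the template circularly
    shifted by j samples, and sigma is sqrt(<h|h>).

    inv_psd holds 1/S_n(f) on the np.fft.rfftfreq grid and is zero outside
    the analysis band (so the DC and Nyquist bins are excluded).
    """
    n = len(data)
    dt, df = 1.0 / fs, fs / n
    d_f = dt * np.fft.rfft(data)      # discrete stand-in for d~(f)
    h_f = dt * np.fft.rfft(template)  # discrete stand-in for h~(f)
    sigma = np.sqrt(4.0 * df * np.sum(np.abs(h_f) ** 2 * inv_psd))
    # With DC and Nyquist zeroed, irfft gives (2/n) * Re(sum over f > 0),
    # so 4 * df * (n/2) * irfft(...) = 2 * fs * irfft(...).
    corr = 2.0 * fs * np.fft.irfft(d_f * np.conj(h_f) * inv_psd, n=n)
    return corr / sigma, sigma

rng = np.random.default_rng(1)
fs, duration = 2048.0, 8.0
n = int(fs * duration)
freqs = np.fft.rfftfreq(n, 1.0 / fs)

# Toy one-sided PSD (an assumption, not a real detector curve).
band = (freqs >= 20.0) & (freqs <= 900.0)
psd = 1e-46 * ((150.0 / freqs[band]) ** 4 + 2.0 + (freqs[band] / 150.0) ** 2)
inv_psd = np.zeros(freqs.size)
inv_psd[band] = 1.0 / psd

# Coloured Gaussian noise: E|rfft(noise)_k|^2 = S_n(f_k) * n * fs / 2.
noise_f = np.zeros(freqs.size, dtype=complex)
draws = rng.standard_normal(band.sum()) + 1j * rng.standard_normal(band.sum())
noise_f[band] = np.sqrt(psd * n * fs / 4.0) * draws
noise = np.fft.irfft(noise_f, n=n)

# Toy template: one-second Hann-windowed chirp, 30 -> 300 Hz, zero-padded.
t = np.arange(int(fs)) / fs
template = np.zeros(n)
template[: t.size] = np.sin(2 * np.pi * (30.0 * t + 135.0 * t**2)) * np.hanning(t.size)

# Inject a copy 3 s into the data, scaled to an optimal SNR of 12.
_, sigma = matched_filter_snr(template, template, inv_psd, fs)
shift, target_snr = 3 * int(fs), 12.0
signal = np.roll(template, shift) * target_snr / sigma

rho, _ = matched_filter_snr(noise + signal, template, inv_psd, fs)
rho_noise, _ = matched_filter_snr(noise, template, inv_psd, fs)
peak = int(np.argmax(rho))
print(f"peak at sample {peak} (injected at {shift}), rho = {rho[peak]:.1f}")
print(f"std of rho in pure noise: {rho_noise.std():.2f}")
```

Filtering the noise alone is a useful sanity check on the normalisation: if the factors of 4, df, and σ are right, the standard deviation of `rho_noise` should sit close to 1.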
Why integrating cycles beats the noise
The reason a buried signal is recoverable is a square-root law. A coherent signal that persists for N cycles contributes amplitude that adds linearly — the correlation accumulates as N. Random noise, with no phase coherence, accumulates as a random walk, growing only as √N. Their ratio, the SNR, therefore improves as
ρ ∝ N / √N = √N
A compact-binary inspiral can sweep through hundreds of cycles in the LIGO band. If each cycle on its own carries an SNR of only 0.4, integrating 400 cycles boosts it by √400 = 20× to ρ ≈ 8, pulling a signal that is invisible instant-by-instant up to a typical detection threshold. This is the same principle that lets a radio receiver lock onto a carrier far below the noise, or a pulsar timer fold thousands of rotations into one clean profile. The catch, and the reason a template bank is needed, is that the cycles only add coherently if your template's phase tracks the signal's phase across the whole observation. A phase error that grows to even half a cycle over those hundreds of cycles destroys the coherent build-up.
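The square-root law is easy to check numerically. The sketch below assumes a constant-amplitude sinusoid in unit-variance white noise, for which the optimal SNR is just the template norm divided by the noise standard deviation; the amplitude of 0.1 and the 32 samples per cycle are illustrative choices that give a per-cycle SNR of 0.4.

```python
import numpy as np

def optimal_snr(n_cycles, amplitude, samples_per_cycle=32, noise_sigma=1.0):
    """Optimal matched-filter SNR of a sinusoid in white Gaussian noise."""
    t = np.arange(n_cycles * samples_per_cycle)
    s = amplitude * np.sin(2 * np.pi * t / samples_per_cycle)
    return np.sqrt(np.sum(s**2)) / noise_sigma

cycles = np.array([1, 4, 16, 64, 256, 400])
snrs = np.array([optimal_snr(n, amplitude=0.1) for n in cycles])
gain = snrs / snrs[0]  # improvement over a single cycle

for n, rho, g in zip(cycles, snrs, gain):
    print(f"{n:4d} cycles: SNR = {rho:5.2f}, gain = {g:5.2f}, sqrt(N) = {np.sqrt(n):5.2f}")
```

The gain column tracks √N exactly here because the toy signal has constant amplitude and the template is perfectly phase-matched; a real chirp has a varying amplitude and coloured noise, so √N is a scaling argument rather than an exact formula.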
The template bank and the chirp
For a binary inspiral, the waveform is set by the source's parameters: the two component masses, the spins, and extrinsic factors like distance and orientation that scale or shift but do not change the morphology. The dominant parameter is the chirp mass
ℳ = (m₁ m₂)^(3/5) / (m₁ + m₂)^(1/5)
which controls how fast the frequency sweeps. To leading (Newtonian) order the inspiral frequency evolves as
df/dt = (96/5) π^(8/3) (Gℳ/c³)^(5/3) f^(11/3)
This is the "chirp": frequency and amplitude both rise steeply as the two bodies spiral together, ending in a final merger and ringdown. Because the true parameters are unknown, the search filters the data against a template bank: a grid of waveforms spaced finely enough that no real signal loses more than a few percent of its SNR when recovered by the nearest template (a "minimal match" of typically 0.97). LIGO's third observing run (O3) used banks of roughly 400,000 aligned-spin templates spanning total masses from about 2 M☉ up to several hundred M☉. Each template is filtered against every detector's whitened data stream continuously, in real time, by pipelines such as PyCBC and GstLAL.
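To get a feel for the chirp formulas, the sketch below evaluates the chirp mass, the Newtonian df/dt, and the time left before coalescence, which follows from integrating dt = df / (df/dt) from f to infinity. The function names and the constant `T_SUN` (one solar mass expressed in seconds, G M☉ / c³) are choices made for this example.

```python
import numpy as np

T_SUN = 4.925491e-6  # G * M_sun / c^3 in seconds

def chirp_mass(m1, m2):
    """Chirp mass in the same units as m1 and m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

def fdot(f, mc):
    """Leading-order (Newtonian) df/dt in Hz/s for chirp mass mc in solar masses."""
    return 96.0 / 5.0 * np.pi ** (8.0 / 3.0) * (mc * T_SUN) ** (5.0 / 3.0) * f ** (11.0 / 3.0)

def time_to_coalescence(f, mc):
    """Seconds remaining once the signal reaches frequency f (Newtonian order)."""
    return 5.0 / 256.0 * (mc * T_SUN) ** (-5.0 / 3.0) * (np.pi * f) ** (-8.0 / 3.0)

mc_bns = chirp_mass(1.4, 1.4)
mc_bbh = chirp_mass(36.0, 29.0)
tau_bns = time_to_coalescence(30.0, mc_bns)
tau_bbh = time_to_coalescence(35.0, mc_bbh)

print(f"1.4 + 1.4 Msun: Mc = {mc_bns:.3f} Msun, {tau_bns:.1f} s left at 30 Hz")
print(f"36 + 29 Msun:   Mc = {mc_bbh:.1f} Msun, {tau_bbh:.2f} s left at 35 Hz")
```

The steep f^(11/3) dependence is why a light neutron-star binary lingers in band for the better part of a minute from 30 Hz, while a heavy black-hole binary entering at 35 Hz should merge within a fraction of a second.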
The numbers that make it real
| Quantity | Value | Note |
|---|---|---|
| GW150914 peak strain | ~1.0 × 10⁻²¹ | Below the instantaneous noise floor |
| GW150914 matched-filter SNR | ≈ 24 (network) | ≈ 13 and ≈ 20 in the two detectors |
| Detection significance | > 5.1σ (false alarm < 1 / 200,000 yr) | From time-shifted background |
| Signal duration in band | ~0.2 s (from 35 Hz) | ~8 cycles visible, >10 in band |
| Component masses | 36 M☉ and 29 M☉ | Merged into ~62 M☉ |
| Energy radiated | ~3 M☉ c² ≈ 5 × 10⁴⁷ J | Peak luminosity ~3.6 × 10⁴⁹ W |
| Most sensitive band | ~100–300 Hz | Where S_n(f) is lowest |
| O3 template-bank size | ~4 × 10⁵ | Aligned-spin compact binaries |
| Detections (through O3) | ~90 confident events | GWTC-3 catalogue, 2021 |
The single most striking fact is the first two rows together: a strain of 10⁻²¹, which changes the length of the 4 km arms by about 4 × 10⁻¹⁸ m (a few thousandths of the width of a proton), was recovered at SNR 24, a significance of more than five sigma. None of that is possible by staring at the raw time series. It is matched filtering adding up every coherent cycle of the chirp into a peak that no plausible noise fluctuation could fake.
Worked example: how loud is a binary at 410 Mpc?
Let us estimate the matched-filter SNR for a GW150914-like binary and check that it lands near the observed value. The optimal SNR for an inspiral, in the stationary-phase approximation, scales as
ρ_opt ∝ ℳ^(5/6) / D_L × [ ∫ f^(−7/3) / S_n(f) df ]^(1/2)
where D_L is the luminosity distance. The chirp mass of a 36 + 29 M☉ binary is
ℳ = (36 × 29)^(3/5) / (36 + 29)^(1/5)
= (1044)^0.6 / (65)^0.2
≈ 64.7 / 2.30
≈ 28 M☉
This is the source-frame chirp mass; redshifted to the detector frame at z ≈ 0.09 it is closer to 30 M☉, consistent with the published value. Now the scaling: the standard yardstick is a fiducial 1.4 + 1.4 M☉ neutron-star binary (ℳ ≈ 1.22 M☉), which Advanced LIGO at design sensitivity would see at an SNR of 8 out to a reference distance of roughly 200 Mpc. Scaling to our heavier, more distant source at D_L ≈ 410 Mpc:
ρ ≈ 8 × (ℳ / 1.22)^(5/6) × (200 Mpc / 410 Mpc)
≈ 8 × (28 / 1.22)^0.833 × 0.49
≈ 8 × 13.6 × 0.49
≈ 53 (order-of-magnitude, design-sensitivity estimate)
The crude estimate overshoots for two reasons. September 2015 was the very start of O1, when the detectors ran well below design sensitivity, and the inspiral-only scaling overstates a heavy binary, whose merger leaves the band by ~250 Hz rather than chirping up through the most sensitive region. Folding in the real O1 PSD and combining the two detectors brings the prediction down to the observed network SNR ≈ 24. The exercise still makes the central point: SNR scales as ℳ^(5/6)/D_L, so a heavier or nearer binary is dramatically louder, which helps explain why the first detection was a pair of unexpectedly massive black holes rather than the long-anticipated neutron-star binary.
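The arithmetic of this worked example is short enough to script. The sketch below reproduces the scaling estimate and also combines the per-detector SNRs quoted in the table in quadrature; `scaled_snr` and its reference values (SNR 8 for ℳ = 1.22 M☉ at 200 Mpc) are simply the yardstick assumed in the text, not a calibrated sensitivity model.

```python
def chirp_mass(m1, m2):
    """Chirp mass in the same units as m1 and m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

def scaled_snr(mc, d_mpc, ref_snr=8.0, ref_mc=1.22, ref_d_mpc=200.0):
    """Rescale a reference SNR using rho ~ Mc^(5/6) / D_L (inspiral only)."""
    return ref_snr * (mc / ref_mc) ** (5.0 / 6.0) * (ref_d_mpc / d_mpc)

mc_source = chirp_mass(36.0, 29.0)       # source-frame chirp mass
mc_detector = mc_source * (1.0 + 0.09)   # redshifted to the detector frame
rho_estimate = scaled_snr(mc_source, 410.0)

# Independent detectors combine in quadrature to give the network SNR.
network_snr = (20.0**2 + 13.0**2) ** 0.5

print(f"source-frame chirp mass:   {mc_source:.1f} Msun")
print(f"detector-frame chirp mass: {mc_detector:.1f} Msun")
print(f"design-sensitivity estimate: rho ~ {rho_estimate:.0f}")
print(f"network SNR from 20 and 13:  {network_snr:.1f}")
```

Changing `d_mpc` or the component masses shows how quickly the estimate moves: halving the distance doubles the SNR, while the chirp mass enters with the gentler 5/6 power.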
History: from Wiener to GW150914
The mathematics predates gravitational-wave astronomy by decades. Norbert Wiener and, independently, Andrey Kolmogorov developed optimal linear filtering theory in the 1940s; the radar engineer Dwight North formulated the "matched filter" in a classified 1943 RCA report (published openly in 1963), proving that the SNR-maximising receiver for a known pulse in white noise is one matched to the pulse's time-reversed shape. The technique became the backbone of radar and digital communications. Kip Thorne and collaborators recognised in the 1980s and 1990s that compact-binary inspirals were exquisitely modelled waveforms — perfect matched-filter targets — and post-Newtonian theory plus numerical relativity (the 2005 breakthrough by Frans Pretorius, and the Caltech-Cornell and NASA-Goddard groups) eventually produced templates accurate enough to filter against.
LIGO's two 4 km interferometers in Hanford, Washington and Livingston, Louisiana — designed largely by Rainer Weiss, Kip Thorne, and Ronald Drever, and led to detection by Barry Barish's organisational reforms — began their first Advanced LIGO observing run in September 2015. On September 14, 2015, at 09:50:45 UTC, the matched-filter pipelines flagged a coincident trigger: GW150914, the merger of two black holes 1.3 billion light-years away, recovered at network SNR ≈ 24. Weiss, Thorne, and Barish received the 2017 Nobel Prize in Physics. By the end of the third observing run (GWTC-3, 2021), matched filtering across the LIGO–Virgo network had pulled roughly 90 confident compact-binary signals out of the noise, including the August 2017 neutron-star merger GW170817 that launched multi-messenger astronomy.
Variants and limits
- Chi-squared discriminator. Allen's (2005) χ² test divides the template into frequency bands and checks whether SNR accrued band-by-band as a real chirp must. A glitch that dumps all its power at once produces a large χ² and is vetoed. Triggers are re-ranked by a re-weighted ("new SNR") statistic that penalises bad χ².
- Multi-detector coincidence and coherent statistics. Requiring a trigger in two or more detectors within the ~10 ms light-travel time, with consistent masses, slashes the false-alarm rate. Coherent searches combine the strain across detectors before computing SNR, gaining sky-localisation as a by-product.
- Continuous-wave searches. Spinning neutron stars emit nearly monochromatic signals of unknown frequency and sky position. A fully coherent year-long matched filter would need an impossibly dense template bank, so searches use semi-coherent stacking (e.g. the Hough transform, FrequencyHough, Einstein@Home).
- Unmodelled bursts. Supernovae and cosmic strings have poorly known waveforms; with no template to match, searches use excess-power and wavelet methods (cWB) that look for coherent energy bursts across detectors rather than a specific shape.
- Stochastic background. A diffuse gravitational-wave background has no single waveform; it is sought by cross-correlating two detectors' data, which is matched filtering's logic applied between instruments rather than against a template.
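The chi-squared discriminator in the first item above can be illustrated with a simplified, white-noise version. This sketch is not the PyCBC or GstLAL implementation: it evaluates only the zero-lag, single-phase SNR, uses p − 1 degrees of freedom accordingly, and invents a toy "glitch" by scaling up a short slice of the template. The re-weighting formula in `new_snr` follows the commonly quoted form ρ / [(1 + χ_r⁶)/2]^(1/6) for χ_r² > 1.

```python
import numpy as np

def chisq_test(data, template, n_bands=16):
    """Zero-lag SNR and reduced chi-squared, assuming unit-variance white noise."""
    n = len(data)
    d_f, h_f = np.fft.rfft(data), np.fft.rfft(template)
    weight = np.full(h_f.size, 2.0)  # interior bins stand in for +f and -f
    weight[0] = 1.0
    if n % 2 == 0:
        weight[-1] = 1.0
    power = weight * np.abs(h_f) ** 2 / n            # sums to sum(template**2)
    corr = weight * np.real(d_f * np.conj(h_f)) / n  # sums to sum(data * template)
    sigma = np.sqrt(power.sum())
    # Assign bins to bands so each band holds about 1/n_bands of the template power.
    cum = np.cumsum(power) / power.sum()
    band = np.minimum((cum * n_bands).astype(int), n_bands - 1)
    rho_j = np.bincount(band, weights=corr, minlength=n_bands) / sigma
    q_j = np.bincount(band, weights=power, minlength=n_bands) / sigma**2
    rho = rho_j.sum()
    chisq = np.sum((rho_j - q_j * rho) ** 2 / q_j)
    return rho, chisq / (n_bands - 1)

def new_snr(rho, chisq_r):
    """Down-weight the SNR when the reduced chi-squared exceeds 1."""
    if chisq_r <= 1.0:
        return rho
    return rho / ((1.0 + chisq_r**3) / 2.0) ** (1.0 / 6.0)

rng = np.random.default_rng(2)
fs, n = 1024, 4096
t = np.arange(fs) / fs
template = np.zeros(n)
template[:fs] = np.sin(2 * np.pi * (30.0 * t + 135.0 * t**2)) * np.hanning(fs)
sigma = np.sqrt(np.sum(template**2))

# Case 1: a genuine chirp injected at SNR 10.
chirp_data = rng.standard_normal(n) + 10.0 * template / sigma

# Case 2: a toy glitch, a 50 ms slice of the template scaled to give SNR 10.
mask = np.zeros(n, dtype=bool)
mask[int(0.60 * fs):int(0.65 * fs)] = True
glitch = np.where(mask, template, 0.0)
glitch *= 10.0 * sigma / np.sum(glitch * template)
glitch_data = rng.standard_normal(n) + glitch

rho_c, chi_c = chisq_test(chirp_data, template)
rho_g, chi_g = chisq_test(glitch_data, template)
print(f"chirp:  rho = {rho_c:.1f}, chisq_r = {chi_c:.1f}, new SNR = {new_snr(rho_c, chi_c):.1f}")
print(f"glitch: rho = {rho_g:.1f}, chisq_r = {chi_g:.1f}, new SNR = {new_snr(rho_g, chi_g):.1f}")
```

Both cases are built to give roughly the same raw SNR, but the glitch delivers it in a narrow range of frequencies, so its reduced chi-squared should be far above 1 and its re-weighted SNR correspondingly lower.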
Where matched filtering shows up
- Gravitational-wave detection. The flagship application: PyCBC and GstLAL filter LIGO/Virgo/KAGRA strain against hundreds of thousands of compact-binary templates in low latency, issuing public alerts within seconds for electromagnetic follow-up.
- Radar and sonar. The original use case — a transmitted pulse of known shape returns buried in clutter and thermal noise; the matched receiver maximises detection range. Pulse compression with chirped radar waveforms is matched filtering by another name.
- Digital communications. Receivers use matched filters (root-raised-cosine pulse shaping) and correlators to recover symbols below the noise; the same √N integration gain underlies spread-spectrum and GPS, where the signal is decoded from below the noise floor by correlating against a known pseudorandom code.
- Pulsar searches and timing arrays. Folding many rotations of a pulsar against a timing model is matched filtering in the time domain; pulsar timing arrays cross-correlate timing residuals across many pulsars to hunt the nanohertz stochastic background.
- Seismology and medical imaging. Cross-correlation of seismograms against known source-time functions, and template-matching reconstruction in MRI and ultrasound, all rest on the same optimal-detection principle.
Common misconceptions and subtleties
- "Matched filtering proves the signal is real." A high ρ from a single template in a single detector proves only that something correlated with that template. Non-Gaussian glitches routinely produce SNR > 8. Significance comes from the χ² veto plus multi-detector coincidence plus an empirically measured background (built by time-shifting one detector relative to another so no real coincidence survives).
- "The signal must rise above the noise to be seen." No — the whole point is detecting a signal whose instantaneous amplitude is well below the noise. GW150914 was about 10⁻²¹ in a noise floor that, sample by sample, was larger. Coherent integration, not amplitude, drives detection.
- "You can match-filter without knowing the noise." The PSD S_n(f) is essential: it whitens the data and defines the optimal filter. Using a wrong or non-stationary PSD biases ρ and inflates false alarms. Pipelines re-estimate S_n(f) continuously from the surrounding data.
- "It works for any signal." Matched filtering is optimal only when you know the waveform's shape in advance. Poorly-modelled sources (core-collapse supernovae, cosmic strings) defeat it; that is why burst and stochastic searches exist as separate pipelines.
- "A bigger template bank is always better." Beyond covering the parameter space at the chosen minimal match, extra templates only add computational cost and more trials — and more trials raise the noise background you must beat. Bank design is a trade-off between coverage and the trials factor that controls false-alarm rate.
Frequently asked questions
Why is matched filtering called the "optimal" filter?
For a signal of known shape buried in additive stationary Gaussian noise, matched filtering provably maximises the signal-to-noise ratio of the output among all linear filters — this is the Neyman-Pearson optimal detector for that problem. The proof follows from the Cauchy-Schwarz inequality: the SNR is maximised when the filter's frequency response is the complex conjugate of the template divided by the noise power spectral density, h̃*(f)/S_n(f). No linear filter can do better when the noise is Gaussian and you know the signal's morphology in advance.
How can LIGO detect a signal weaker than the noise?
The instantaneous strain of GW150914 (about 10⁻²¹) was below the detector's broadband noise at any single moment, but the chirp lasted about 0.2 seconds and swept through roughly ten cycles in the sensitive band. Matched filtering coherently adds the signal power across all those cycles, weighting each frequency by 1/S_n(f), while the random noise adds incoherently: signal grows as the number of cycles N while noise grows only as √N, so the effective SNR builds as √N. For GW150914 the result was a matched-filter SNR of about 24, far above any plausible noise fluctuation. Longer signals such as neutron-star inspirals gain even more, because they spend thousands of cycles in band.
What is a template bank?
A template bank is a discrete grid of pre-computed waveforms that covers the physical parameter space of expected signals — for compact binaries, the two component masses and (in modern banks) their aligned spins. Because the true source parameters are unknown, the data must be filtered against thousands to hundreds of thousands of templates. The grid is laid out so that no real signal loses more than a few percent of its SNR to the nearest template; LIGO's O3 banks contained roughly 400,000 templates spanning total masses from about 2 to several hundred solar masses.
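The "loses no more than a few percent" criterion is usually expressed through the match: the normalised overlap between two waveforms, maximised over relative time shift and phase. The sketch below computes it for toy Newtonian chirps in white noise; a real bank would weight by 1/S_n(f) and use full waveform models. `newtonian_chirp` and `match` are names made up for this example.

```python
import numpy as np

T_SUN = 4.925491e-6  # G * M_sun / c^3 in seconds

def newtonian_chirp(mc, n=2048, fs=2048.0, f_low=30.0, f_high=400.0):
    """Leading-order inspiral with amplitude ~ f^(2/3), cut off at f_high."""
    tau0 = 5.0 / 256.0 * (mc * T_SUN) ** (-5.0 / 3.0) * (np.pi * f_low) ** (-8.0 / 3.0)
    t = np.arange(n) / fs
    valid = t < tau0 * (1.0 - (f_low / f_high) ** (8.0 / 3.0))
    x = 1.0 - t[valid] / tau0  # fraction of inspiral time remaining
    freq = f_low * x ** (-3.0 / 8.0)
    phase = 2.0 * np.pi * f_low * tau0 * (8.0 / 5.0) * (1.0 - x ** (5.0 / 8.0))
    wave = np.zeros(n)
    wave[valid] = freq ** (2.0 / 3.0) * np.cos(phase)
    return wave

def match(a, b):
    """Normalised overlap maximised over circular time shift and phase (white noise)."""
    n = len(a)
    a_f, b_f = np.fft.rfft(a), np.fft.rfft(b)
    a_f[0] = b_f[0] = 0.0  # drop DC so only positive frequencies remain
    full = np.zeros(n, dtype=complex)
    full[: a_f.size] = a_f * np.conj(b_f)
    z = np.fft.ifft(full)  # complex correlation for every circular shift
    norm = np.sqrt(np.sum(np.abs(a_f) ** 2) * np.sum(np.abs(b_f) ** 2)) / n
    return np.max(np.abs(z)) / norm

h_ref = newtonian_chirp(30.0)
match_self = match(h_ref, h_ref)
match_near = match(h_ref, newtonian_chirp(30.3))  # 1% offset in chirp mass
match_far = match(h_ref, newtonian_chirp(33.0))   # 10% offset in chirp mass

print(f"self: {match_self:.4f}, 1% offset: {match_near:.4f}, 10% offset: {match_far:.4f}")
```

A bank-placement algorithm effectively runs this comparison in reverse: it spaces templates so that any point in parameter space has a match of at least the chosen minimal match with some template.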
Why does the noise power spectral density appear in the filter?
Real detector noise is coloured — far louder at low frequency (seismic, suspension thermal) and near narrow lines (mains harmonics, violin modes) than in the most sensitive band around 100–300 Hz. The matched filter weights each frequency bin by 1/S_n(f), so frequencies where the detector is quiet count heavily and frequencies where it is noisy are down-weighted. This is the "whitening" step. Ignoring S_n(f) and cross-correlating raw data would let the loud low-frequency noise swamp the real signal.
What is the chi-squared test in a gravitational-wave search?
A loud non-Gaussian glitch can produce a high matched-filter SNR even though it does not look like a real chirp. The chi-squared test splits the template into frequency bands and checks whether the SNR accumulated band by band, as a true chirp should, or arrived in one burst, as a glitch would. Triggers are re-ranked by a re-weighted SNR that penalises a bad chi-squared. This veto, combined with requiring a coincident trigger in two or more detectors, is what separates a 10⁻²¹ astrophysical chirp from terrestrial noise artefacts.
Does matched filtering work for continuous waves or stochastic backgrounds?
Pure matched filtering is ideal for short, well-modelled transients like compact-binary chirps. Continuous waves from spinning neutron stars are nearly monochromatic but unknown in frequency and sky position, so searches use semi-coherent methods (stacking power over many segments) because a fully coherent matched filter over a year of data would need an astronomically large template bank. Stochastic backgrounds and unmodelled bursts use cross-correlation between detectors or excess-power / wavelet methods instead, since there is no single known template to match against.