
FIR Digital Filter

A tapped delay line of weighted samples — exactly linear-phase, always stable

A FIR (finite impulse response) digital filter computes each output as a weighted sum of the most recent input samples through a tapped delay line. Symmetric taps give exactly linear phase, and the absence of feedback makes it unconditionally stable. It is found in audio crossovers, modem pulse shaping, image processing, and biomedical signal conditioning.

Structure: Tapped delay line + multiply-add
Impulse response: Finite, N+1 samples long
Phase: Exactly linear (symmetric taps)
Stability: Unconditional (all poles at z=0)
Cost per sample: N+1 multiply-accumulates
Group delay: N/2 samples, constant

How a FIR filter works

A FIR filter has only one idea in it, and you can see the whole idea in a moving average. Keep the last five numbers you received, add them up, divide by five, and call the result your output. When the next number arrives, throw away the oldest of the five, slide the rest along, drop the new one in, and average again. That sliding window of recent samples — each one multiplied by a fixed weight before it joins the sum — is a finite impulse response filter.

The physical structure is a tapped delay line: a chain of memory registers, each holding one past sample. Every clock tick, the contents shift one register to the right (older), the newest input lands in the first register, and the oldest falls off the end. We "tap" off the value held in each register, multiply it by that register's coefficient, and sum all the products. That sum is the output for this instant. Because the line is finite — say N+1 registers — the influence of any one input sample lasts exactly N+1 samples and then is gone forever. That finiteness is the "FIR" in the name.

The defining equation is a discrete convolution of the input with the coefficient set:

           N
  y[n]  =  Σ   b[k] · x[n − k]
          k=0

  where  x[n]   = current input sample
         x[n−k] = input delayed by k samples (held in register k)
         b[k]   = the k-th tap weight (filter coefficient)
         N+1    = number of taps (N is the filter "order")

The coefficient set {b[0], b[1], …, b[N]} IS the filter's
impulse response h[k]:  feed in a single 1 followed by zeros,
and the output is exactly b[0], b[1], …, b[N], then silence.

For a 5-tap moving average, every b[k] = 1/5 = 0.2, and the output is just the running mean of the last five inputs. Change those five numbers and you change the filter: a low-pass, a high-pass, a band-pass, or a differentiator all come from the same delay line with different tap weights. The hardware never changes; only the coefficients do.
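The convolution sum above can be sketched directly in a few lines of Python. This is a minimal illustration, not an optimized implementation: the function name `fir_filter` and the example tap values are assumptions chosen for the demonstration, and samples before the start of the input are assumed to be zero.

```python
def fir_filter(b, x):
    """Direct-form FIR: y[n] = sum over k of b[k] * x[n - k].

    Samples before the start of x are assumed to be zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, weight in enumerate(b):
            if n - k >= 0:
                acc += weight * x[n - k]
        y.append(acc)
    return y


# Arbitrary illustrative taps (not from any particular design).
taps = [0.1, 0.25, 0.3, 0.25, 0.1]

# A single 1 followed by zeros: the output should replay the taps, then go silent.
impulse = [1.0] + [0.0] * 7
response = fir_filter(taps, impulse)
print(response)
```

Feeding in the unit impulse reproduces the coefficient set and then zeros, which is the "coefficients are the impulse response" property stated above.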

The math: from taps to frequency response

To see what frequencies a given coefficient set passes or blocks, take the z-transform of the difference equation. Each one-sample delay becomes a factor of z⁻¹:

  H(z) = b[0] + b[1]·z⁻¹ + b[2]·z⁻² + … + b[N]·z⁻ᴺ

  Frequency response (evaluate on the unit circle, z = e^{jω}):

  H(e^{jω}) = Σ b[k] · e^{−jωk}          ω = 2π·f / f_s

This is a polynomial in z⁻¹. Its only poles are at z = 0
(N of them), all at the center of the unit circle —
which is why a FIR filter can never be unstable.

The magnitude |H(e^{jω})| tells you the gain at each frequency; the angle tells you the phase shift. For the 5-tap moving average the magnitude is a sinc-like curve: unity gain at DC (ω = 0), a first null where the sum cancels, and small sidelobes after that. That is why a moving average smooths — it is a crude low-pass that strongly attenuates the fast wiggles while passing the slow trend.
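To make the shape of that curve concrete, the sum H(e^{jω}) = Σ b[k]·e^{−jωk} can be evaluated numerically at a few frequencies. The sketch below uses only the standard library; the helper name `freq_response` is an assumption for illustration. For a 5-tap average the first null falls at ω = 2π/5, where the five phasors are spaced evenly around the circle and cancel.

```python
import cmath
import math


def freq_response(b, omega):
    """Evaluate H(e^{j*omega}) = sum of b[k] * e^{-j*omega*k}."""
    return sum(bk * cmath.exp(-1j * omega * k) for k, bk in enumerate(b))


b = [0.2] * 5  # 5-tap moving average

dc_gain = abs(freq_response(b, 0.0))                 # omega = 0
null_gain = abs(freq_response(b, 2 * math.pi / 5))   # first null of a 5-tap average
nyquist_gain = abs(freq_response(b, math.pi))        # omega = pi, i.e. f = f_s / 2

print(f"DC: {dc_gain:.3f}  null: {null_gain:.3f}  Nyquist: {nyquist_gain:.3f}")
```

The gain at ω = π is 0.2 (about −14 dB), which shows how modest the attenuation of a short moving average is even at the highest representable frequency.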

Linear phase is the headline FIR property. If the coefficients are symmetric, b[k] = b[N−k], the response factors as a real amplitude times e^{−jω(N/2)}, a phase that is a straight line through frequency. A straight-line phase means a constant group delay: every frequency is delayed by the same N/2 samples, so the waveform's shape survives intact. A causal IIR filter can only approximate this, which makes FIR the one standard filter family that delivers exactly linear phase by construction.
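One way to check the factorization numerically is to multiply H(e^{jω}) by e^{+jωN/2}, which cancels the linear-phase term, and confirm that what remains has no imaginary part. The tap values below are arbitrary symmetric numbers picked for illustration, not a designed filter.

```python
import cmath
import math

b = [0.1, 0.25, 0.3, 0.25, 0.1]   # symmetric: b[k] == b[N - k]
N = len(b) - 1                    # filter order

max_imag = 0.0
for i in range(1, 50):
    omega = math.pi * i / 50
    h = sum(bk * cmath.exp(-1j * omega * k) for k, bk in enumerate(b))
    # Undo the linear-phase factor e^{-j*omega*N/2}; the remainder should be real.
    amplitude = h * cmath.exp(1j * omega * N / 2)
    max_imag = max(max_imag, abs(amplitude.imag))

print(f"largest imaginary residue: {max_imag:.2e}")
```

The residue is at the level of floating-point rounding. Breaking the symmetry (for example, changing only the last tap) makes it clearly nonzero.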

Worked example: a 5-tap moving-average smoother

Take a noisy temperature sensor reading once per second. We want to knock down the per-sample jitter without losing the trend. Use the 5-tap average, b = [0.2, 0.2, 0.2, 0.2, 0.2], and run a step buried in noise through it:

Input x[n] (°C):  20.3  19.8  20.4  19.9  31.2  30.6  31.4  30.7  31.1
                  ←──── about 20 ────→   ←──── true step to ~31 ────→

y[4] = 0.2·(20.3+19.8+20.4+19.9+31.2) = 0.2·111.6 = 22.32
y[5] = 0.2·(19.8+20.4+19.9+31.2+30.6) = 0.2·121.9 = 24.38
y[6] = 0.2·(20.4+19.9+31.2+30.6+31.4) = 0.2·133.5 = 26.70
y[7] = 0.2·(19.9+31.2+30.6+31.4+30.7) = 0.2·143.8 = 28.76
y[8] = 0.2·(31.2+30.6+31.4+30.7+31.1) = 0.2·155.0 = 31.00

Two things to notice. First, the ±0.5 °C jitter on each level is strongly reduced: once the window covers a single level, the averaged output sits close to 20 and then close to 31. Second, the step does not snap up instantly; it ramps over five samples. That ramp is the group delay at work: the output's mid-point lands two samples (= N/2 = 4/2) after the input's edge. Smoothing and delay are two sides of the same coin: the more samples you average to kill noise, the more the output lags reality.

Quantifying the noise win: averaging M independent samples of equal-variance white noise reduces the noise standard deviation by √M. Five taps cut the RMS noise by √5 ≈ 2.24×, turning ±0.5 °C of jitter into roughly ±0.22 °C — at the cost of a 2-sample lag and a gently rounded edge.
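The hand arithmetic in this worked example can be reproduced with a short script. It uses the same nine readings and the same five equal taps; only outputs with a full five-sample window behind them (n ≥ 4) are computed.

```python
x = [20.3, 19.8, 20.4, 19.9, 31.2, 30.6, 31.4, 30.7, 31.1]
b = [0.2] * 5

# y[n] = sum of b[k] * x[n - k] for k = 0..4, defined here only for n >= 4.
y = {n: sum(b[k] * x[n - k] for k in range(5)) for n in range(4, len(x))}
for n, value in y.items():
    print(f"y[{n}] = {value:.2f}")

# Noise standard deviation shrinks by sqrt(M) for M independent samples.
residual_jitter = 0.5 / 5 ** 0.5
print(f"residual jitter: about +/-{residual_jitter:.2f} degC")
```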

Designing the coefficients

A moving average is the trivial case. Real filters need coefficients chosen to hit a magnitude spec — a passband, a stopband, and tolerances on each. Three methods dominate:

Window method. Start from the ideal brick-wall low-pass, whose impulse response is an infinitely long sinc. Truncate it to N+1 taps and multiply by a window (Hamming, Blackman, Kaiser) to suppress the Gibbs-phenomenon ripple that abrupt truncation causes. Fast, intuitive, and the basis of scipy.signal.firwin and MATLAB's fir1. The Kaiser window adds a single β knob that trades ripple against transition width.
Frequency-sampling method. Specify the desired response at N+1 equally spaced frequencies, then inverse-DFT to get the taps. Handy when you have an arbitrary target shape rather than a standard low/high/band spec.
Parks–McClellan (Remez exchange). The professional default for tight specs. For a given order it finds the equiripple filter (ripple spread evenly across each band rather than piled up near the edge) that minimizes the worst-case weighted error; raising the order until the passband and stopband tolerances are just met yields the minimum-order design. Exposed as scipy.signal.remez and MATLAB's firpm. For a fixed order and fixed ripple tolerances it gives the narrowest transition band of any linear-phase FIR design method.
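As an illustration of the first of these methods, here is a minimal window-method sketch in plain Python: truncate the ideal sinc, apply a Hamming window, and normalize for unity gain at DC. It is a teaching sketch under stated assumptions, not a substitute for a library design routine. The function names `windowed_sinc_lowpass` and `gain`, the 31-tap length, and the 0.1 cycles-per-sample cutoff are all choices made for this example.

```python
import cmath
import math


def windowed_sinc_lowpass(num_taps, cutoff):
    """Hamming-windowed sinc low-pass.

    cutoff is in cycles per sample (0 < cutoff < 0.5), i.e. f_c / f_s.
    """
    N = num_taps - 1
    taps = []
    for k in range(num_taps):
        m = k - N / 2                      # distance from the centre of symmetry
        if m == 0:
            ideal = 2 * cutoff             # limit of the sinc at its peak
        else:
            ideal = math.sin(2 * math.pi * cutoff * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / N)   # Hamming
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]       # normalize to unity gain at DC


def gain(b, f):
    """Magnitude response at f cycles per sample."""
    return abs(sum(bk * cmath.exp(-2j * math.pi * f * k) for k, bk in enumerate(b)))


taps = windowed_sinc_lowpass(31, 0.1)
passband_gain = gain(taps, 0.02)
stopband_gain = gain(taps, 0.3)
print(f"passband: {passband_gain:.4f}  stopband: {stopband_gain:.5f}")
```

The taps come out symmetric about the centre, so the result is linear-phase without any extra effort.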

The order you need follows roughly from Kaiser's estimate:

  N ≈ (A_stop − 7.95) / (14.36 · Δf / f_s)

  A_stop = stopband attenuation, dB
  Δf     = transition-band width, Hz
  f_s    = sample rate, Hz

Example: 60 dB stopband, transition 1% of f_s:
  N ≈ (60 − 7.95) / (14.36 · 0.01) ≈ 362   (order), so N+1 ≈ 363 taps

That is 363 multiply-accumulates per output sample,
the cost of a steep linear-phase FIR.
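The estimate is easy to wrap in a helper for quick what-if sizing. The function name `estimate_fir_order` and the 48 kHz sample rate are assumptions for illustration; the formula is the one given above. It returns the raw estimate, so rounding is left to the caller (rounding up is the conservative choice).

```python
def estimate_fir_order(a_stop_db, transition_width_hz, sample_rate_hz):
    """N ~ (A_stop - 7.95) / (14.36 * delta_f / f_s). Returns the unrounded estimate."""
    normalized_width = transition_width_hz / sample_rate_hz
    return (a_stop_db - 7.95) / (14.36 * normalized_width)


# 60 dB stopband, transition band equal to 1% of a 48 kHz sample rate.
estimate = estimate_fir_order(60.0, 480.0, 48_000.0)
order = round(estimate)
print(f"estimate: {estimate:.1f}  order: {order}  taps: {order + 1}")

# Halving the transition width roughly doubles the order.
narrower = estimate_fir_order(60.0, 240.0, 48_000.0)
print(f"half the transition width: {narrower:.1f}")
```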

FIR vs IIR: the central trade-off

Property	FIR (finite impulse response)	IIR (infinite impulse response)
Feedback	None — output from inputs only	Yes — past outputs fed back
Impulse response	Finite (N+1 samples, then zero)	Infinite (rings/decays forever)
Stability	Unconditional (all poles at z=0)	Conditional (poles must stay inside unit circle)
Linear phase	Exact, by symmetric taps	Only approximate (needs all-pass correction)
Order for a given magnitude spec	High (often 4–50× more taps)	Low (e.g. 4th–8th order Butterworth/elliptic)
Arithmetic per sample	N+1 multiply-accumulates	~2× (order) — far fewer for steep specs
Finite-word-length sensitivity	Low — quantizing a tap just nudges the response	High — pole near unit circle can go unstable or limit-cycle
Analog counterpart	None — purely digital construct	Butterworth, Chebyshev, elliptic, Bessel
Typical home	Audio mastering, comms pulse shaping, image kernels	Cheap real-time control, low-power sensor filtering

The one-line summary: choose FIR when phase linearity or guaranteed stability matters and you can afford the arithmetic; choose IIR when you need a steep response at the lowest possible compute and you can tolerate phase distortion. An elliptic IIR might meet a spec in 8th order that costs a FIR 200 taps.
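The "finite versus infinite" distinction in the table can be seen by pushing a unit impulse through one filter of each kind. The IIR used here is a one-pole smoother, y[n] = a·y[n−1] + (1−a)·x[n], picked purely as the simplest feedback filter; it and the value a = 0.8 are assumptions for the demonstration, not filters discussed above.

```python
def moving_average_fir(x, length=5):
    """FIR: each output uses only the last `length` inputs (zeros before the start)."""
    return [sum(x[max(0, n - length + 1): n + 1]) / length for n in range(len(x))]


def one_pole_iir(x, a=0.8):
    """IIR: the previous output is fed back, so an impulse decays but never ends."""
    y, prev = [], 0.0
    for sample in x:
        prev = a * prev + (1 - a) * sample
        y.append(prev)
    return y


impulse = [1.0] + [0.0] * 29
fir_out = moving_average_fir(impulse)
iir_out = one_pole_iir(impulse)

print("FIR, samples 3..7:", fir_out[3:8])
print("IIR, samples 3..7:", [round(v, 4) for v in iir_out[3:8]])
```

The FIR output is exactly zero from the sixth sample onward, while the IIR output keeps shrinking geometrically without ever reaching zero.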

Implementation and efficient structures

Structure / technique	What it does	When to use
Direct-form (transversal)	The literal tapped delay line + MAC sum	Default; maps perfectly to a DSP's MAC unit and circular buffer
Symmetric (folded) form	Adds the mirror-pair samples first, then one multiply per pair	Linear-phase filters — halves the multiplies, free with symmetric taps
Polyphase decomposition	Splits taps into sub-filters for decimation/interpolation	Sample-rate conversion — only computes outputs you keep
FFT / overlap-add & overlap-save	Convolves via the FFT instead of direct MAC	Long filters (hundreds+ taps) — turns O(N) per sample into O(log N)
Cascaded integrator-comb (CIC)	Multiplier-free moving-average chain for huge rate changes	Front-end decimation in radios & sigma-delta ADCs
Distributed arithmetic (FPGA)	Precomputes partial sums in a LUT, no hardware multiplier	Fixed-coefficient FIRs on FPGAs short on DSP slices
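The folded form in the table can be sketched next to the direct form to show that both produce the same output while the folded version performs roughly half the multiplies. The function names and sample values are assumptions for illustration; `window[k]` stands for the delay-line contents x[n−k].

```python
def fir_direct(b, window):
    """Direct form: one multiply per tap. window[k] holds x[n - k]."""
    return sum(bk * xk for bk, xk in zip(b, window))


def fir_folded(b, window):
    """Folded form for symmetric taps: add each mirror pair, then multiply once."""
    n_taps = len(b)
    half = n_taps // 2
    acc = 0.0
    for k in range(half):
        acc += b[k] * (window[k] + window[n_taps - 1 - k])
    if n_taps % 2:                 # odd length: the centre tap has no partner
        acc += b[half] * window[half]
    return acc


b = [0.1, 0.25, 0.3, 0.25, 0.1]          # symmetric taps (illustrative values)
window = [1.0, 2.0, -1.0, 0.5, 4.0]      # arbitrary delay-line contents

direct = fir_direct(b, window)
folded = fir_folded(b, window)
print(direct, folded)
```

With five taps the folded form uses three multiplies instead of five; for long filters the saving approaches one half.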

On a DSP chip the direct form runs at one tap per clock thanks to a single-cycle multiply-accumulate (MAC) and a modulo address-generation unit that maintains the delay line as a circular buffer — no data actually shifts in memory, the read pointer just walks. A 256-tap filter on a 1 GHz DSP with one MAC therefore burns ~256 cycles per output, capping the throughput near 3.9 MSamples/s on one core; double-MAC or SIMD cores multiply that. Above a few hundred taps, FFT-based fast convolution (overlap-save) wins because its cost grows like log of the length rather than linearly.
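The circular-buffer idea carries over directly to software. The class below is a minimal sketch (the name `CircularFir` is an assumption): each new sample overwrites the oldest slot and an index walks backwards through the buffer, so no data is moved. Note that only the delay-line update becomes O(1); the multiply-accumulate loop is still one operation per tap.

```python
class CircularFir:
    """Direct-form FIR whose delay line is a circular buffer."""

    def __init__(self, b):
        self.b = list(b)
        self.buf = [0.0] * len(self.b)
        self.pos = 0                      # slot that the next sample will overwrite

    def step(self, sample):
        n = len(self.buf)
        self.buf[self.pos] = sample       # overwrite the oldest sample in place
        acc = 0.0
        idx = self.pos
        for weight in self.b:             # walk back in time: x[n], x[n-1], ...
            acc += weight * self.buf[idx]
            idx = idx - 1 if idx else n - 1
        self.pos = (self.pos + 1) % n
        return acc


# Distinct taps make ordering mistakes visible: an impulse must replay 1, 2, 3.
fir = CircularFir([1.0, 2.0, 3.0])
outputs = [fir.step(s) for s in [1.0, 0.0, 0.0, 0.0]]
print(outputs)

# Throughput figure from the paragraph above: 1 GHz clock, 256 cycles per output.
print(f"{1e9 / 256 / 1e6:.2f} MSamples/s")
```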

Where FIR filters are used

Application	Why FIR	Typical taps
Audio mastering EQ & crossovers	Linear phase keeps transients and stereo image intact	1,000 – 65,000 (long convolution)
Modem / Wi-Fi pulse shaping (root-raised-cosine)	Controls inter-symbol interference; exact symmetry needed	20 – 100 per symbol span
Sigma-delta ADC/DAC decimation (CIC + FIR)	Multiplier-free CIC then FIR cleanup of huge rate change	Tens (after CIC)
Image blur / sharpen / edge kernels	2-D FIR; separable kernels are two 1-D FIRs	3×3 to 15×15
ECG / EEG biomedical conditioning	Linear phase preserves waveform morphology clinicians read	100 – 500
Software-defined radio channelizers	Polyphase FIR banks split a wideband stream into channels	Hundreds, polyphase
5G / LTE baseband filtering	Sharp, phase-flat band limiting before/after the FFT	Hundreds, hardware FIR

The thread running through these is "phase matters." A linear-phase audio crossover lets the woofer and tweeter outputs sum back into the original waveform instead of smearing the attack of a snare drum. A root-raised-cosine FIR at both ends of a radio link is what keeps adjacent data symbols from bleeding into each other. Where only the magnitude response matters and compute is scarce — a thermostat, a battery-gauge sensor — engineers reach for a cheap IIR instead.

Common misconceptions and pitfalls

"FIR filters have no delay." They do, and it is constant. A linear-phase FIR delays the whole signal by N/2 samples. That latency is fine offline but can wreck a real-time control or active-noise-cancellation loop, where the IIR's lower delay sometimes wins despite its phase distortion.
"More taps is always better." Every added tap is another multiply-accumulate per sample and another half-sample of group delay. Doubling taps for a 1 dB ripple improvement may double your compute and latency for no audible or measurable gain. Size the filter to the spec, not to a round number.
"A moving average is a good low-pass." It is the cheapest one, but its stopband is terrible: the sinc sidelobes only roll off at 6 dB/octave and the first sidelobe is only about 13 dB below the passband. For real rejection you need a windowed or Parks–McClellan design; the moving average is a smoother, not a precision filter.
"Symmetric coefficients are optional." They are not: symmetry (b[k] = b[N−k]) or antisymmetry (b[k] = −b[N−k]) is the only way to get exact linear phase. If you truncate an asymmetric impulse response or hand-tune taps without preserving that mirror relationship, you reintroduce phase distortion and lose the main reason to pick FIR over IIR.
"Coefficient quantization can destabilize it." It cannot — there is no feedback, so rounding the taps only perturbs the magnitude/phase response slightly. This robustness is exactly why FIRs are favored in fixed-point hardware where an IIR's poles could drift outside the unit circle and the filter could limit-cycle or blow up.
"You must shift the whole delay line every sample." In software you don't — use a circular buffer and a moving write index. Physically shifting N samples each tick is O(N) wasted memory traffic; the circular buffer makes the update O(1).

Frequently asked questions

What is the difference between FIR and IIR filters?

A FIR (finite impulse response) filter computes each output only from past and present inputs, with no feedback, so its impulse response ends after N+1 samples and it is always stable. An IIR (infinite impulse response) filter feeds past outputs back into the sum, so its impulse response can ring on forever and it can become unstable if a pole moves outside the unit circle. The trade: an IIR meets a given magnitude spec with far fewer coefficients (often 4 to 50× fewer), but FIR offers exactly linear phase and guaranteed stability that IIR cannot match.

Why are FIR filters always stable?

Stability of a linear time-invariant filter requires that all poles of its transfer function lie strictly inside the unit circle. A FIR filter's transfer function H(z) = b0 + b1·z⁻¹ + … + bN·z⁻ᴺ is a polynomial in z⁻¹, which means every pole sits at z = 0 — the center of the unit circle — regardless of the coefficient values. There is no feedback term to create a pole anywhere else. So no choice of tap weights, and no amount of finite-precision rounding, can ever make a FIR filter unstable.

How does a FIR filter achieve linear phase?

If the tap weights are symmetric (b[k] = b[N−k]) or antisymmetric (b[k] = −b[N−k]), the frequency response factors into a real-valued amplitude term times a pure linear-phase term e^(−jωM), where M = N/2; in the antisymmetric case there is an additional constant 90° phase offset, which does not affect the delay. A linear phase means the phase shift is proportional to frequency, so every frequency component is delayed by the same constant number of samples. The waveform shape is preserved with no phase distortion, which is why FIR filters dominate applications like audio mastering and data communications where waveshape matters.

What is the group delay of a linear-phase FIR filter?

For a linear-phase FIR filter with N+1 taps (order N), the group delay is exactly N/2 samples, constant at every frequency. A 5-tap moving average (order 4) delays the signal by 2 samples; a 101-tap audio filter (order 100) delays by 50 samples. At a 48 kHz sample rate, 50 samples is about 1.04 ms. This fixed latency is the price of linear phase: it matters for live monitoring and feedback-loop control but is irrelevant for offline processing.
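Converting tap count to latency is a one-line calculation, sketched here with a hypothetical helper named `group_delay_ms`. It assumes a linear-phase filter, so the delay is (taps − 1)/2 samples.

```python
def group_delay_ms(num_taps, sample_rate_hz):
    """Latency of a linear-phase FIR: (order / 2) samples, converted to milliseconds."""
    order = num_taps - 1
    return (order / 2) / sample_rate_hz * 1000.0


audio_latency = group_delay_ms(101, 48_000)    # the 101-tap audio filter above
long_latency = group_delay_ms(4097, 48_000)    # a much longer filter, for comparison
print(f"101 taps:  {audio_latency:.2f} ms")
print(f"4097 taps: {long_latency:.2f} ms")
```

The second figure, a little over 42 ms, gives a feel for why very long linear-phase filters are usually reserved for offline or non-interactive processing.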

How do you design FIR filter coefficients?

Three common methods. The window method takes the ideal (infinite) impulse response, a sinc for a low-pass, truncates it to N+1 taps, and multiplies by a window (Hamming, Blackman, Kaiser) to tame the Gibbs ripple. The frequency-sampling method specifies the response at evenly spaced frequencies and inverse-transforms. The Parks–McClellan (Remez) algorithm is the workhorse for tight specs: it produces the equiripple filter that meets given passband and stopband tolerances with the lowest order. In practice you call scipy.signal.firwin, firls, or remez, or MATLAB's fir1 / firpm.

How many taps does a FIR filter need?

Tap count grows roughly in proportion to stopband attenuation and in inverse proportion to transition-band width. A common estimate (Kaiser) is N ≈ (A_stop − 7.95) / (14.36 · Δf / f_s), where A_stop is stopband attenuation in dB and Δf is the transition width. A filter needing 60 dB of stopband rejection over a transition band that is 1% of the sample rate needs roughly 360 taps, which means about 360 multiply-accumulates per output sample. This is why a steep FIR can cost 10 to 50× the arithmetic of an IIR doing the same job.