Is the Dirac delta really a function?

No. There is no function from ℝ to ℝ that is zero almost everywhere yet has integral one. The delta is a distribution — a continuous linear functional on test functions. Physicists treat it like a function for calculation, but the rigorous definition assigns ⟨δ, φ⟩ = φ(0) directly, no integration involved.

What does the sifting property mean?

It is the identity ∫ f(x) δ(x − a) dx = f(a). The delta sifts out the value of f at the single point x = a; everywhere else it contributes nothing. This identity is the definition in disguise: it is what 'evaluation at a point' looks like inside an integral.

How is δ(x) approximated by ordinary functions?

Take any sequence of functions whose integral is one and whose mass concentrates at zero, such as narrow Gaussians, narrow rectangles, or sinc kernels. As the width shrinks to zero, their integrals against any smooth, compactly supported test function converge to that function's value at zero. The delta is the limit of such sequences, but only in the distributional sense.

What is the derivative of the unit step?

The Heaviside step u(x) is zero for x < 0 and one for x > 0. As a distribution, du/dx = δ(x). The derivative concentrates at the jump and equals zero everywhere else, with total integral one, which is exactly the delta. This identity is fundamental to physics, where delta sources represent point charges, point masses, and instantaneous impulses.

Why does the delta keep showing up in physics?

Anywhere a quantity is concentrated at a point (a point charge in electrostatics, a point source for a Green's function, an instantaneous impulse in mechanics), δ provides the mathematical idealisation. It is also the simplest input you can give a linear time-invariant system, and the response to it (the impulse response) characterises the entire system.

What's the Fourier transform of δ?

F{δ(x)} = 1 — the constant function. Equivalently, the inverse transform of 1 is δ. The delta has uniform spectrum: it is the impulse that contains every frequency in equal amount. That's exactly why feeding δ to a linear system reveals the system's frequency response.

Dirac Delta Function — Distribution, Sifting & Examples

The honest definition

Paul Dirac introduced δ(x) in 1927 as a function with two properties:

δ(x) = 0 for x ≠ 0
∫_{−∞}^∞ δ(x) dx = 1

Read literally, this is impossible. A function that is zero almost everywhere has Lebesgue integral zero, not one. Mathematicians filed Dirac's notation under "useful but informal" until Laurent Schwartz in 1945 built the theory of distributions that made δ rigorous.

The clean definition: δ is a continuous linear functional on the space of smooth, compactly-supported test functions φ. Its action is:

⟨δ, φ⟩ = φ(0)

That's it — δ takes a test function and returns its value at zero. There is no pointwise function δ(x); the integral notation ∫ f(x) δ(x) dx is shorthand for ⟨δ, f⟩, computed by the rule above. Similarly ∫ f(x) δ(x − a) dx = f(a) is the action of the shifted delta on f.
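
The functional point of view can be sketched directly in code. The following is a minimal illustration, not a rigorous model of distributions: the names `delta`, `shifted_delta`, `phi` and `psi` are hypothetical, and the sample functions are smooth but not compactly supported, which is harmless here because the delta only looks at a single point.

```python
import math

def delta(phi):
    """Action of the Dirac delta on a function: <delta, phi> = phi(0)."""
    return phi(0.0)

def shifted_delta(a):
    """Return the functional phi -> phi(a), i.e. the delta shifted to x = a."""
    return lambda phi: phi(a)

# Illustrative smooth functions (hypothetical choices).
def phi(x):
    return math.exp(-x * x) * math.cos(x)

def psi(x):
    return 1.0 / (1.0 + x * x)

value_at_zero = delta(phi)                              # phi(0) = 1.0
value_at_three = shifted_delta(3.0)(lambda x: x * x)    # 3^2 = 9.0

# Linearity: <delta, 2*phi + 5*psi> equals 2*<delta, phi> + 5*<delta, psi>.
lhs = delta(lambda x: 2.0 * phi(x) + 5.0 * psi(x))
rhs = 2.0 * delta(phi) + 5.0 * delta(psi)

print(value_at_zero, value_at_three, lhs, rhs)
```

No integral is computed anywhere: the "integral" notation is only a way of writing these evaluations.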

The sifting property — the only formula you need

Every fact about δ flows from the sifting property:

∫_{−∞}^∞ f(x) · δ(x − a) dx = f(a)

The delta selects the value of f at x = a and discards the rest. With a = 0 you recover ∫ f δ = f(0); the total-integral property comes from f ≡ 1, giving ∫ δ = 1.

Two immediate consequences worth noting. First, δ is the identity for convolution:

(f ∗ δ)(x) = ∫ f(y) δ(x − y) dy = f(x)

Convolving with δ does nothing — it returns the function unchanged. Second, the shifted delta δ(x − a) acts as a translation operator: f ∗ δ(· − a) = f(· − a).
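
A discrete analogue makes both consequences easy to see. In the sketch below a finite sequence stands in for f and a unit impulse (a single 1) stands in for δ; this is an assumption made for illustration, since the continuous delta itself cannot be stored in an array. The helper `convolve` is a hypothetical name.

```python
def convolve(f, g):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

f = [1.0, 4.0, 9.0, 16.0]

unit_impulse = [1.0]                # discrete stand-in for delta
shifted_impulse = [0.0, 0.0, 1.0]   # discrete stand-in for delta shifted by 2

same = convolve(f, unit_impulse)        # [1.0, 4.0, 9.0, 16.0]
shifted = convolve(f, shifted_impulse)  # [0.0, 0.0, 1.0, 4.0, 9.0, 16.0]

print(same)
print(shifted)
```

Convolving with the unit impulse returns the sequence unchanged, and convolving with the shifted impulse delays it by two samples, mirroring f ∗ δ = f and f ∗ δ(· − a) = f(· − a).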

Approximating δ by ordinary functions

To build intuition, take any family of functions whose integral is one and whose mass concentrates at zero. Four canonical choices:

Family	Formula	Limit (ε → 0)
Narrow rectangle	(1/ε) for |x| < ε/2, 0 otherwise	δ(x)
Gaussian	(1/(ε √(2π))) e^(−x²/(2ε²))	δ(x)
Lorentzian	(ε/π) / (x² + ε²)	δ(x)
Sinc kernel	sin(x/ε) / (πx)	δ(x)

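For the rectangle, Gaussian and Lorentzian families and any bounded continuous f, ∫ f(x) gε(x) dx → f(0) as ε → 0. The sinc kernel oscillates and is not absolutely integrable, so for it the limit is only guaranteed for smoother f, such as the test functions above. That convergence, in the weak (distributional) sense, is the precise content of "gε approaches δ". No single pointwise limit function exists; only the action on test functions has a sensible limit.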
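
The convergence can be watched numerically. The sketch below is an illustration under stated assumptions: it uses a plain midpoint rule (the helper `midpoint` is a hypothetical name), the test function f = cos with f(0) = 1, and only the rectangle and Gaussian families, each integrated over a window that contains essentially all of its mass.

```python
import math

def midpoint(g, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def rect(x, eps):
    """Rectangle of width eps and height 1/eps."""
    return 1.0 / eps if abs(x) < eps / 2 else 0.0

def gauss(x, eps):
    """Gaussian with standard deviation eps and unit integral."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

f = math.cos  # continuous, with f(0) = 1

def smeared(kernel, eps, half_width):
    """Integral of f(x) * kernel(x, eps) over [-half_width, half_width]."""
    return midpoint(lambda x: f(x) * kernel(x, eps), -half_width, half_width)

results = {}
for eps in (1.0, 0.1, 0.01):
    r = smeared(rect, eps, eps / 2)     # the rectangle lives on [-eps/2, eps/2]
    g = smeared(gauss, eps, 10 * eps)   # the Gaussian is negligible beyond 10 widths
    results[eps] = (r, g)
    print(eps, r, g)
```

Both columns approach f(0) = 1 as ε shrinks, even though the kernels themselves grow without bound at the origin.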

Useful identities

Identity	Meaning
δ(−x) = δ(x)	Even symmetry
δ(ax) = δ(x) / |a|	Scaling: counterintuitive but follows from change of variables
x · δ(x) = 0	Multiplication by anything vanishing at 0 kills δ
f(x) δ(x − a) = f(a) δ(x − a)	Pointwise product replaces f by its value at a
∫ f(x) δ(g(x)) dx = ∑ₖ f(xₖ) / |g'(xₖ)|	Sum over zeros xₖ of g where g'(xₖ) ≠ 0
δ'(x) — distributional derivative	⟨δ', φ⟩ = −φ'(0)
du/dx = δ(x), where u is the Heaviside step	The step jumps from 0 to 1 at x = 0; its derivative is δ
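
The composition rule in the table can be checked numerically by replacing δ with a narrow Gaussian. This is a sketch under assumptions chosen for illustration: g(x) = x² − 1 (zeros at ±1, with |g'| = 2 at both), f = exp, and a hypothetical midpoint-rule helper.

```python
import math

def gauss(x, eps):
    """Narrow Gaussian of width eps, used as a stand-in for delta."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def midpoint(g, lo, hi, n=60_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

f = math.exp                     # illustrative choice of f
g = lambda x: x * x - 1.0        # zeros at x = -1 and x = +1

eps = 0.01
numeric = midpoint(lambda x: f(x) * gauss(g(x), eps), -3.0, 3.0)

# Table formula: sum over zeros of f(x_k) / |g'(x_k)|, with |g'(+-1)| = 2.
predicted = (f(1.0) + f(-1.0)) / 2.0

print(numeric, predicted)
```

The two numbers agree to within the smoothing error of the Gaussian, which shrinks as eps is reduced (at the cost of needing a finer grid).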

The scaling identity δ(ax) = δ(x)/|a| catches almost everyone. Substitute u = ax in ∫ f(x) δ(ax) dx:

∫ f(x) δ(ax) dx = ∫ f(u/a) δ(u) du / |a| = f(0) / |a| = ∫ f(x) · δ(x)/|a| dx

so the two distributions agree as functionals.
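
The same conclusion can be reached numerically. In the sketch below δ is replaced by a narrow Gaussian and the integral is approximated with a midpoint rule; the helper names and the choice f(x) = cos(x) + 2 are assumptions made for illustration.

```python
import math

def gauss(x, eps):
    """Narrow Gaussian of width eps, used as a stand-in for delta."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def midpoint(g, lo, hi, n=40_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

f = lambda x: math.cos(x) + 2.0   # f(0) = 3

eps = 0.01
results = {}
for a in (2.0, -2.0, 0.5):
    lhs = midpoint(lambda x: f(x) * gauss(a * x, eps), -1.0, 1.0)
    rhs = f(0.0) / abs(a)         # prediction from delta(ax) = delta(x) / |a|
    results[a] = (lhs, rhs)
    print(a, lhs, rhs)
```

Note that a = 2 and a = −2 give the same result, which is the reason the formula carries |a| rather than a.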

Worked example — sifting in action

Compute ∫_{−∞}^∞ x² · δ(x − 3) dx. Use the sifting property with f(x) = x², a = 3:

∫ x² · δ(x − 3) dx = (3)² = 9

That's the entire calculation: no integration, just evaluation. Likewise, ∫ cos(x) δ(x − π) dx = cos(π) = −1 and ∫ e^x δ(x) dx = e^0 = 1. The delta turns integrals into evaluations.

A more interesting example: solve x² · y(x) = 1 for the distribution y. Multiply both sides by a test function φ and integrate; you find that y must equal 1/x² wherever x ≠ 0. Because 1/x² is not locally integrable at the origin, it has to be read there as a regularised (finite-part) distribution, and because x² · δ = 0 and x² · δ' = 0, arbitrary constant multiples of δ and δ' can be added. The answer is a sum of an ordinary function and singular delta-supported pieces, a typical distribution in the wild.
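
The reason δ and δ' can be added freely is that multiplying either of them by x² gives the zero distribution. A small sketch of that check follows, under stated assumptions: distributions are modelled as Python callables acting on functions, φ'(0) is estimated by a central difference, and all names are hypothetical.

```python
import math

def delta(phi):
    """<delta, phi> = phi(0)."""
    return phi(0.0)

def delta_prime(phi, h=1e-5):
    """<delta', phi> = -phi'(0), with phi'(0) estimated by a central difference."""
    return -(phi(h) - phi(-h)) / (2.0 * h)

def times_x_squared(T):
    """The distribution x^2 * T, defined by <x^2 T, phi> = <T, x^2 phi>."""
    return lambda phi: T(lambda x: x * x * phi(x))

phi = math.exp   # illustrative function with phi(0) = 1 and phi'(0) = 1

print(delta(phi), delta_prime(phi))        # close to 1 and -1: neither vanishes
print(times_x_squared(delta)(phi))         # exactly 0
print(times_x_squared(delta_prime)(phi))   # 0 up to finite-difference error
```

Since x² annihilates both pieces, adding c₀ δ + c₁ δ' to any solution of x² · y = 1 leaves the equation satisfied.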

Distribution-zoo comparison

	Ordinary function f(x)	Heaviside step u(x)	Dirac delta δ(x)
Pointwise value	Defined for all x	0 for x < 0, 1 for x > 0	Undefined (no pointwise meaning)
Continuous	Sometimes	No (jump at 0)	Not even a function
Integrable on ℝ	If decaying	No (grows linearly in cumulative integral)	Yes (∫ δ = 1)
Distributional derivative	f'(x) (when f is smooth)	δ(x)	δ'(x), supported at 0
Support	Wherever f ≠ 0	[0, ∞)	{0}
Fourier transform	Decays if f is smooth	1/(iω) + π δ(ω)	1 (constant)
Convolution identity	No	Integration: (f ∗ u)(x) = ∫_{−∞}^x f	Yes: f ∗ δ = f

Notice that u and δ form a derivative-integral pair in the distributional sense, much as a function and its antiderivative do in classical calculus. Heaviside engineered an "operational calculus" exploiting precisely this relationship in the 1890s, decades before Dirac and Schwartz made it rigorous.
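
The pairing between u and δ can be verified from the definition of the distributional derivative, ⟨u', φ⟩ = −⟨u, φ'⟩ = −∫₀^∞ φ'(x) dx, which should equal φ(0). The sketch below assumes an illustrative φ(x) = (2 + x) e^(−x²), a hand-computed derivative, and a hypothetical midpoint-rule helper; the upper limit is cut at 10, where φ is negligible.

```python
import math

def phi(x):
    """Illustrative rapidly decaying function with phi(0) = 2."""
    return (2.0 + x) * math.exp(-x * x)

def dphi(x):
    """Derivative of phi, computed by hand: (1 - 4x - 2x^2) * exp(-x^2)."""
    return (1.0 - 4.0 * x - 2.0 * x * x) * math.exp(-x * x)

def midpoint(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

# <u', phi> = -<u, phi'> = -(integral of phi' over [0, infinity))
pairing = -midpoint(dphi, 0.0, 10.0)

print(pairing, phi(0.0))
```

The pairing reproduces φ(0), which is exactly how δ acts, so u' and δ agree as functionals on this example.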

In transforms

The Laplace transform of δ is one of the simplest:

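L{δ(t)} = ∫_{0⁻}^∞ δ(t) e^(−st) dt = e^(−s · 0) = 1

Here the lower limit is taken as 0⁻ so that the impulse at t = 0 lies inside the range of integration. For a ≥ 0, L{δ(t − a)} = e^(−as). The Fourier transform mirrors this: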

F{δ(x)} = ∫_{−∞}^∞ δ(x) e^(−iωx) dx = 1

Likewise F{δ(x − a)} = e^(−iωa). Conversely F{1} = 2π δ(ω): the constant function's spectrum is concentrated at zero frequency. Since convolution becomes multiplication under these transforms and the transform of δ is 1, δ is the identity for convolution.
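
Both Fourier identities can be checked numerically by transforming a very narrow Gaussian in place of δ. The sketch below uses the same convention as the formulas here, e^(−iωx) with no prefactor; the helper `fourier`, the width eps and the shift a = 0.7 are assumptions chosen for illustration.

```python
import cmath
import math

def gauss(x, eps):
    """Narrow Gaussian of width eps, used as a stand-in for delta."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def fourier(g, omega, lo, hi, n=4000):
    """Midpoint approximation of the integral of g(x) * exp(-i*omega*x) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0j
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += g(x) * cmath.exp(-1j * omega * x)
    return h * total

eps = 1e-3
a = 0.7
omegas = (0.0, 1.0, 5.0, 20.0)

# Transform of the centred spike: should be close to 1 at every frequency.
centred = [fourier(lambda x: gauss(x, eps), w, -10 * eps, 10 * eps) for w in omegas]

# Transform of the spike shifted to x = a: should be close to exp(-i*omega*a).
shifted = [fourier(lambda x: gauss(x - a, eps), w, a - 10 * eps, a + 10 * eps)
           for w in omegas]

for w, c, s in zip(omegas, centred, shifted):
    print(w, abs(c - 1.0), abs(s - cmath.exp(-1j * w * a)))
```

The printed deviations are small for these frequencies and grow only when ω becomes comparable to 1/eps, where the Gaussian stops looking like an ideal impulse.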

Where δ shows up

Impulse response of a linear system. Hit a system with δ(t) at time zero and record the output h(t). For any other input x(t), the output is the convolution y(t) = (x ∗ h)(t). The entire behaviour of a linear time-invariant system is encoded in its impulse response, which is why δ is the single most important "input" in engineering analysis.
Point sources in physics. A point charge q at position r₀ has charge density ρ(r) = q · δ³(r − r₀); a point mass produces gravitational potential satisfying ∇²Φ = 4πG · m · δ³(r − r₀). Green's functions — solutions to PDEs with a delta source — are how partial differential equations are solved by superposition.
Sampling and the Dirac comb. The infinite sum ∑ₙ δ(t − nT) (a "Dirac comb") models ideal sampling at intervals T. The Fourier transform of a Dirac comb is another Dirac comb in frequency, with spacing 1/T in ordinary frequency (2π/T in the angular frequency ω used above), which is the fact underlying the sampling theorem.
Probability — point masses. A discrete random variable concentrated at a, with no continuous part, has density f(x) = δ(x − a). Mixed distributions (a coin that lands heads with probability ½ and otherwise gives a uniform random number) combine continuous densities with delta spikes — a routine occurrence in financial models and queueing theory.
Quantum mechanics. Position eigenstates |x₀⟩ have the wavefunction ψ(x) = δ(x − x₀); momentum eigenstates have plane-wave wavefunctions whose Fourier transforms are deltas. The whole ket-bra calculus is shot through with deltas — formal, but rigorously underwritten by the theory of rigged Hilbert spaces.
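
The impulse-response idea from the first item above can be demonstrated in discrete time. The sketch below is an illustration under assumptions: the system is a hypothetical first-order recursive filter y[n] = a·y[n−1] + x[n] starting at rest, the unit impulse plays the role of δ(t), and the input sequence is arbitrary.

```python
def system(x, a=0.5):
    """Hypothetical LTI system: y[n] = a * y[n-1] + x[n], starting at rest."""
    y, prev = [], 0.0
    for xn in x:
        prev = a * prev + xn
        y.append(prev)
    return y

def convolve(x, h):
    """Discrete convolution of x with h, truncated to len(x) samples."""
    return [sum(x[k] * h[i - k] for k in range(i + 1)) for i in range(len(x))]

N = 16
impulse = [1.0] + [0.0] * (N - 1)   # discrete stand-in for delta(t)
h = system(impulse)                 # impulse response: 1, 0.5, 0.25, ...

x = [float((3 * n) % 7) - 2.0 for n in range(N)]   # arbitrary input signal

direct = system(x)        # run the system on x directly
via_h = convolve(x, h)    # predict the output from the impulse response alone

max_error = max(abs(p - q) for p, q in zip(direct, via_h))
print(max_error)
```

The two outputs agree up to rounding, so for this system the single recording h is enough to predict the response to any input of the same length.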

Common mistakes

Treating δ as a function with value ∞ at zero. No real number is δ(0). The delta has no pointwise values; only its integrals against test functions are defined. Any computation that needs δ(0) by itself is wrong.
Multiplying two deltas, δ(x) · δ(x). Generally undefined. Distributions form a vector space, not an algebra — you can't multiply two of them in general. δ(x − a) · δ(x − b) is fine when a ≠ b (it's zero) but not when a = b.
Forgetting the |a| in δ(ax). δ(2x) is δ(x)/2, not δ(x). Drop the |a| and your scaling factors are off, often invisibly until something blows up.
Using δ outside an integral or pairing. A standalone formula like δ(x) = 5 is meaningless. Always remember that δ derives its meaning from how it acts on test functions; equations involving δ are equations of distributions.
Confusing the Kronecker delta δᵢⱼ with the Dirac delta δ(x). The Kronecker delta is a discrete object — 1 if i = j, else 0 — for sums. The Dirac delta is a continuous-distribution object, for integrals. Same symbol, different worlds; mixing them is a classic exam-marker giveaway.
