Microeconomic Theory

Envelope Theorem

Why dV*/dθ = ∂V/∂θ at the optimum — and why every comparative-static result depends on it

The envelope theorem says that at an optimum, the total derivative of the value function with respect to a parameter equals the partial — the optimal choice's reaction can be ignored. It anchors comparative statics.

Statement: dV*/dθ = ∂L/∂θ at the optimum
Why it works: FOC ∂L/∂x = 0 kills the choice-reaction term
Hotelling 1932: ∂π*/∂p = q* (supply from profit)
Roy 1947: x*ᵢ = −(∂V/∂pᵢ) / (∂V/∂m)
Shephard 1953: ∂c*/∂wᵢ = xᵢʰ (demand from cost)
Milgrom-Segal 2002: Non-smooth generalization

How the envelope theorem works

Start with an unconstrained maximization problem. The objective f(x, θ) depends on a choice variable x and a parameter θ. The decision-maker picks x*(θ) to maximize f, giving the value function V*(θ) = f(x*(θ), θ). The question: how does V* change when θ moves?

The chain rule gives two terms: dV*/dθ = ∂f/∂x · dx*/dθ + ∂f/∂θ. The first term, the choice-reaction term, is the source of the trouble. Computing dx*/dθ usually requires implicit differentiation through the first-order condition, then careful bookkeeping. For complicated problems this can become intractable.

The envelope theorem rescues the calculation. At an interior optimum, the first-order condition ∂f/∂x = 0 already holds. So the choice-reaction term vanishes identically. Whatever dx*/dθ equals, it's being multiplied by zero. The total derivative simplifies to the partial:

dV*/dθ = ∂f/∂θ evaluated at (x*(θ), θ).

You compute the partial as if x* were frozen at its current value, ignore the response of the optimal choice, and get the right answer. Every comparative-statics result in microeconomics rests on this. Without it, you would need to differentiate every demand function, every input choice, every policy response — through implicit definitions — just to learn how a value moves.
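
The identity can be checked numerically. The sketch below is an illustration under assumptions, not part of the original derivation: the objective f(x, θ) = θ·ln(x) − x, the search bounds, and the helper names are all chosen here for the example. It finds x*(θ) by golden-section search, differentiates V* by central differences, and compares the result with ∂f/∂θ = ln(x) evaluated at the frozen optimizer.

```python
import math

def f(x, theta):
    """Illustrative objective (an assumption): strictly concave in x for theta > 0."""
    return theta * math.log(x) - x

def argmax_x(theta, lo=1e-6, hi=50.0, iters=200):
    """Golden-section search for the maximizer of f(., theta) on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c, theta) < f(d, theta):
            a = c   # the maximum lies to the right of c
        else:
            b = d   # the maximum lies to the left of d
    return (a + b) / 2

def value(theta):
    """Value function V*(theta) = f(x*(theta), theta)."""
    return f(argmax_x(theta), theta)

theta, h = 3.0, 1e-4
x_star = argmax_x(theta)

# Total derivative: re-optimize x at theta + h and theta - h.
total_derivative = (value(theta + h) - value(theta - h)) / (2 * h)

# Envelope shortcut: partial of f with respect to theta, x frozen at x*.
partial_at_optimum = math.log(x_star)

print(x_star, total_derivative, partial_at_optimum)
```

For this objective the optimizer is x* = θ, so both derivatives should come out close to ln(3). The re-optimization inside value() changes nothing at first order.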

Constrained version: the Lagrangian envelope

Most economic problems involve constraints — budget, technology, resource. The constrained envelope theorem says: for max f(x, θ) s.t. g(x, θ) = 0, define the Lagrangian L(x, λ, θ) = f(x, θ) − λ g(x, θ). Then at the optimum:

dV*/dθ = ∂L/∂θ evaluated at (x*(θ), λ*(θ), θ).

Both the choice reaction dx*/dθ and the multiplier reaction dλ*/dθ drop out, killed by the FOCs ∂L/∂x = 0 and ∂L/∂λ = 0. The shadow-value interpretation of λ* is itself a corollary: if the constraint is relaxed to g(x, θ) = ε, then dV*/dε = λ* at ε = 0, so λ* is the marginal value of shifting the constraint's right-hand side.
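
A small numerical check of the constrained version, as a sketch under assumptions: the problem max x·y subject to x + θ·y = ε is invented here for illustration, and the closed-form solution comes from its FOCs (y = λ, x = λθ). With L = x·y − λ(x + θ·y − ε), the envelope predictions are ∂L/∂θ = −λ*·y* and ∂L/∂ε = λ*.

```python
def solve(theta, eps):
    """Closed-form optimum of max x*y s.t. x + theta*y = eps (illustrative problem)."""
    x = eps / 2
    y = eps / (2 * theta)
    lam = eps / (2 * theta)   # multiplier from the FOC y = lam
    return x, y, lam

def value(theta, eps):
    """Value function V*(theta, eps) = x* times y*."""
    x, y, _ = solve(theta, eps)
    return x * y

theta, eps, h = 2.0, 4.0, 1e-6
x_opt, y_opt, lam_opt = solve(theta, eps)

# Brute force: re-solve the problem at perturbed parameters.
dV_dtheta = (value(theta + h, eps) - value(theta - h, eps)) / (2 * h)
dV_deps = (value(theta, eps + h) - value(theta, eps - h)) / (2 * h)

# Envelope: partials of the Lagrangian with x, y, lam frozen at the optimum.
envelope_theta = -lam_opt * y_opt
envelope_eps = lam_opt

print(dV_dtheta, envelope_theta)
print(dV_deps, envelope_eps)
```

The second comparison is the shadow-value result: the derivative of V* with respect to the constraint level equals the multiplier.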

Envelope-theorem corollaries across economics

	Hotelling's lemma	Roy's identity	Shephard's lemma	Generalized envelope (Milgrom-Segal)	Dynamic-programming envelope	Constraint shadow value
Year / authors	Hotelling 1932	Roy 1947	Shephard 1953	Milgrom & Segal 2002	Bellman 1957; Benveniste-Scheinkman 1979	Lagrange / Kuhn-Tucker
Value function	Profit π*(p, w)	Indirect utility V*(p, m)	Cost c*(w, y)	V*(θ) — any optimum	V*(s) — state value	V*(ε) — slack
Parameter	Output price p	Price pᵢ or income m	Input price wᵢ	Any θ	State s	Constraint shift ε
Identity	∂π*/∂p = q*	x*ᵢ = −(∂V/∂pᵢ)/(∂V/∂m)	∂c*/∂wᵢ = xᵢʰ	V*(θ) − V*(θ₀) = ∫ ∂f/∂θ dθ	V'(s) = ∂F/∂s	dV*/dε = λ*
What it saves you	Re-solve for q*	Re-solve Marshallian demand	Re-solve cost-minimizing input	Smooth-objective assumption	Tracking infinite-horizon dynamics	Recomputing the whole problem
Used in	Producer theory, supply estimation	Welfare measurement, consumer surplus	Production theory, factor demands	Auction theory, mechanism design	Macro, Ramsey models	Tariff design, capacity planning
Domain of failure	Boundary supply	Corner solutions	Boundary inputs	—	Non-smooth value functions	Binding kinks

Worked example: profit-maximizing firm

A firm produces output q at cost c(q) = q² and sells at price p. Profit: π(q, p) = pq − q². The FOC p − 2q = 0 gives the optimal supply q*(p) = p/2, and the value function π*(p) = p·(p/2) − (p/2)² = p²/4.

Question: what is dπ*/dp? The brute-force route is to differentiate π*(p) = p²/4 directly: dπ*/dp = p/2. Notice that p/2 is exactly q*(p). Coincidence?

The envelope theorem says no. Apply it: dπ*/dp = ∂π/∂p evaluated at q*. The partial ∂π/∂p = q, evaluated at q* = p/2, equals p/2, which is the supply. That's Hotelling's lemma. You never needed the supply response dq*/dp to read off the price-sensitivity of profit; the supply level itself is the price-sensitivity.

Concrete numbers: at p = 10, the firm produces q* = 5 and earns π* = 25. If price ticks up to p = 10.1, optimal output shifts to q* = 5.05 and profit becomes (10.1)²/4 = 25.5025. The change is 0.5025. The envelope prediction: q* · Δp = 5 · 0.1 = 0.5, accurate to first order. The remaining 0.0025 is the second-order curvature you'd capture by also computing dq*/dp = 1/2 and the second derivative — but for a marginal-effect calculation, the envelope answer is exact.
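
The arithmetic above can be reproduced in a few lines. This is a direct transcription of the worked example; only the function and variable names are introduced here.

```python
def q_star(p):
    """Optimal supply from the FOC p - 2q = 0."""
    return p / 2

def profit(q, p):
    """Profit pi(q, p) = p*q - q**2."""
    return p * q - q ** 2

def pi_star(p):
    """Value function: profit at the optimal quantity."""
    return profit(q_star(p), p)

p0, p1 = 10.0, 10.1

actual_change = pi_star(p1) - pi_star(p0)    # re-optimized profit change, about 0.5025
envelope_change = q_star(p0) * (p1 - p0)     # Hotelling prediction q* times dp, about 0.5
gap = actual_change - envelope_change        # second-order term, about 0.0025

# Profit change if the firm does not adjust output at all (q stays at 5).
frozen_change = profit(q_star(p0), p1) - pi_star(p0)

print(actual_change, envelope_change, gap, frozen_change)
```

Because profit is linear in p for fixed q, the frozen-output change coincides with the envelope prediction; the whole gap of about 0.0025 is what re-optimizing output adds.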

Why every comparative-statics result depends on it

Hotelling's lemma. Supply equals price-derivative of profit. Read q* directly off π*(p), no separate maximization.
Roy's identity. Marshallian demand from indirect utility: x*ᵢ = −(∂V/∂pᵢ)/(∂V/∂m). Backs out behavior from welfare measures.
Shephard's lemma. Conditional factor demand from cost function: ∂c*/∂wᵢ = xᵢʰ. Workhorse of production economics.
Bellman / Benveniste-Scheinkman. Dynamic-programming envelope: V'(s) = ∂F(s, c*(s))/∂s. Used in every macro Euler-equation derivation.
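Mechanism design. Myerson's revenue equivalence theorem and the standard characterization of incentive-compatible mechanisms rest on envelope-style arguments.
Welfare measurement. Compensating and equivalent variation are integrals of Hicksian demand, which is the price derivative of the expenditure function by the same envelope logic as Shephard's lemma. Roy's identity supplies the Marshallian demands used for consumer-surplus calculations.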
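
As one concrete instance of the corollaries listed above, Roy's identity can be checked numerically. The sketch assumes Cobb-Douglas utility u = x₁^a · x₂^(1−a) with a = 0.3, a choice made here for illustration, whose indirect utility has a known closed form. It differentiates V by central differences and compares the Roy ratio with the known Marshallian demand a·m/p₁.

```python
def indirect_utility(p1, p2, m, a=0.3):
    """V(p, m) for Cobb-Douglas utility x1**a * x2**(1 - a) (illustrative assumption)."""
    x1 = a * m / p1
    x2 = (1 - a) * m / p2
    return x1 ** a * x2 ** (1 - a)

def marshallian_x1(p1, p2, m, a=0.3):
    """Known Cobb-Douglas demand for good 1, used only as the benchmark."""
    return a * m / p1

p1, p2, m, h = 2.0, 5.0, 100.0, 1e-5

# Central-difference partials of the indirect utility function.
dV_dp1 = (indirect_utility(p1 + h, p2, m) - indirect_utility(p1 - h, p2, m)) / (2 * h)
dV_dm = (indirect_utility(p1, p2, m + h) - indirect_utility(p1, p2, m - h)) / (2 * h)

# Roy's identity: demand is minus the ratio of the two partials.
roy_x1 = -dV_dp1 / dV_dm

print(roy_x1, marshallian_x1(p1, p2, m))
```

Both numbers should be close to 15 at these prices and income. The point of the identity is that the ratio uses only derivatives of V.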

Variants and refinements

Smooth envelope (textbook version). Requires interior optimum, differentiable objective, unique maximizer. Mas-Colell-Whinston-Green, Chapter 3 / Mathematical Appendix M.L.
Constrained envelope. Replace f with Lagrangian L; multiplier λ* is itself the shadow value of the constraint.
Milgrom-Segal (2002) generalization. Extends to arbitrary choice sets (discrete, non-convex, infinite-dimensional) and to optimizers that jump with θ. Critical for auction theory and matching: V*(θ) − V*(θ₀) = ∫_{θ₀}^θ ∂f/∂θ(x*(s), s) ds, where the integrand is evaluated at an optimizer x*(s) for each s along the path.
Sub-gradient envelope. When V* is convex but non-differentiable (multiple optima), the sub-gradient set equals the convex hull of the partial derivatives ∂f/∂θ across the optimizers. This is Danskin's theorem.
Stochastic envelope. In dynamic programming with random shocks, the envelope condition becomes an expectation: V'(s) = E[∂F/∂s]. Used throughout modern macro.
Multivariate envelope. For vector parameters θ ∈ ℝⁿ, the gradient ∇V*(θ) = ∇_θ L evaluated at the optimum — each component is its own envelope identity.
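
The non-smooth variants are easiest to see in a tiny discrete example. The sketch below assumes a two-point choice set X = {−1, +1} and f(x, θ) = θ·x, both chosen here for illustration, so V*(θ) = |θ| has a kink at θ = 0. It shows the two one-sided derivatives at the kink (the partials ∂f/∂θ = x at the two optimizers) and checks the integral form across the kink with a midpoint rule.

```python
CHOICES = (-1.0, 1.0)   # discrete choice set (illustrative assumption)

def f(x, theta):
    """Objective f(x, theta) = theta * x, so the partial in theta is x."""
    return theta * x

def value(theta):
    """V*(theta) = max over the two choices, which equals abs(theta)."""
    return max(f(x, theta) for x in CHOICES)

def selection(theta):
    """One optimizer x*(theta); the tie at theta = 0 is broken arbitrarily."""
    return max(CHOICES, key=lambda x: f(x, theta))

# One-sided derivatives at the kink theta = 0.
h = 1e-6
left_derivative = (value(0.0) - value(-h)) / h
right_derivative = (value(h) - value(0.0)) / h

def integral_form(theta0, theta1, n=100_000):
    """Midpoint-rule integral of the partial x*(s) from theta0 to theta1."""
    step = (theta1 - theta0) / n
    return sum(selection(theta0 + (i + 0.5) * step) for i in range(n)) * step

lhs = value(2.0) - value(-1.0)    # V*(theta) - V*(theta0)
rhs = integral_form(-1.0, 2.0)    # integral of the partial at the optimizer

print(left_derivative, right_derivative, lhs, rhs)
```

At θ = 0 both choices are optimal, the one-sided slopes are −1 and +1, and the sub-gradient set is the interval between them. The integral form still holds across the kink even though dV*/dθ does not exist there.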

A brief history

Harold Hotelling proved his lemma in 1932 in a Journal of Political Economy paper on demand functions. René Roy stated his demand identity in De l'Utilité (1942) and refined it in 1947. Ronald Shephard's 1953 book Cost and Production Functions introduced the cost-function approach that bears his name. Paul Samuelson's 1947 Foundations of Economic Analysis unified these results under the envelope framework, making the general theorem the central piece of comparative-statics machinery.

The modern non-smooth generalization comes from Paul Milgrom and Ilya Segal's 2002 Econometrica paper "Envelope Theorems for Arbitrary Choice Sets," which extended the result to mechanism-design settings where the choice space is discrete, the action space is infinite, or the objective is non-differentiable in the choice variable. Their version is the standard reference in contemporary auction theory.

Common pitfalls

Forgetting the optimum. The envelope identity holds only at x*(θ). If you evaluate the partial at the wrong x, you get garbage.
Boundary solutions. If x*(θ) sits on a corner (e.g., zero output), the FOC may not hold and the choice term doesn't vanish.
Multiple optima. When the argmax is a set, V* may be non-differentiable. Use sub-gradients (Danskin) or the integral form (Milgrom-Segal).
Constraint switching. If θ crosses a threshold where binding constraints change, V* has a kink. The envelope still works locally on each side.
Confusing with chain rule. The envelope theorem uses the chain rule plus the FOC to simplify; it's not a separate calculus identity.
Reading too much into "envelope". The name comes from the geometric picture: V*(θ) is the upper envelope of the family {f(x, θ) : x ∈ X}, tangent to each member at its argmax. Useful intuition but not part of the theorem itself.
Assuming uniqueness. Standard versions require a single optimizer; with multiple, you get a correspondence and the envelope becomes set-valued.
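
The boundary pitfall can be made concrete. The sketch below uses an example invented here for illustration: maximize f(x) = −(x − 2)² over x in [0, θ] with θ < 2, so the optimum sits on the corner x* = θ. The naive partial ∂f/∂θ is zero because θ does not appear in f, yet V* clearly changes with θ. Putting the binding constraint x ≤ θ into a Lagrangian L = f(x) − λ(x − θ) restores the right answer, ∂L/∂θ = λ*.

```python
def f(x):
    """Objective with its unconstrained peak at x = 2 (illustrative assumption)."""
    return -(x - 2.0) ** 2

def x_star(theta):
    """Optimizer on the feasible set [0, theta]; a corner solution when theta < 2."""
    return min(2.0, max(theta, 0.0))

def value(theta):
    """Value function V*(theta) = f(x*(theta))."""
    return f(x_star(theta))

theta, h = 1.0, 1e-6

# True derivative of the value function by central differences.
dV_dtheta = (value(theta + h) - value(theta - h)) / (2 * h)

# Naive envelope: theta does not appear in f, so the partial is zero.
naive_envelope = 0.0

# Lagrangian envelope: the FOC in x gives lam = f'(x*) = -2 * (x* - 2).
lam_star = -2.0 * (x_star(theta) - 2.0)
lagrangian_envelope = lam_star

print(dV_dtheta, naive_envelope, lagrangian_envelope)
```

Here ∂f/∂x = 2 at the corner and dx*/dθ = 1, so the choice term that the unconstrained theorem discards is exactly the missing 2.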

When the envelope theorem matters in practice

Welfare economics. Computing consumer-surplus changes from price moves uses Roy's identity directly.
Cost-benefit analysis. Shadow prices from constrained envelopes value scarce resources without re-solving the planner's problem.
Public finance. Marginal cost of public funds: dW/dτ can be read off the envelope without re-deriving optimal labor supply.
Mechanism design. Revenue equivalence theorems rely on envelope identities for the bidder's interim utility.
Macroeconomic dynamics. The Benveniste-Scheinkman envelope condition replaces brute-force differentiation in every modern Euler-equation derivation.
Empirical demand estimation. Recovering preferences from observed choices uses envelope identities to back out unobserved parameters.
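
For the macroeconomic-dynamics item above, the envelope condition can be checked on a textbook growth model with a known closed form. Everything specific in this sketch is an assumption made for illustration: log utility, output k^α, full depreciation, next-period capital k' as the choice, and the parameter values. Under those assumptions the period return is F(k, k') = ln(k^α − k'), the optimal policy is c*(k) = (1 − αβ)·k^α, and the value function is a constant plus B·ln(k) with B = α / (1 − αβ).

```python
import math

ALPHA, BETA = 0.3, 0.95              # illustrative parameter values
B = ALPHA / (1 - ALPHA * BETA)       # slope coefficient of the value function in ln(k)

def consumption(k):
    """Optimal consumption policy for this log-utility growth model."""
    return (1 - ALPHA * BETA) * k ** ALPHA

def value_up_to_constant(k):
    """V(k) minus its additive constant, which drops out of derivatives."""
    return B * math.log(k)

def envelope_rhs(k):
    """Partial of F(k, k') = ln(k**ALPHA - k') in k, at the optimal k'."""
    return ALPHA * k ** (ALPHA - 1) / consumption(k)

k, h = 2.0, 1e-6
V_prime = (value_up_to_constant(k + h) - value_up_to_constant(k - h)) / (2 * h)

print(V_prime, envelope_rhs(k))
```

The right-hand side is marginal utility times the marginal product of capital, u'(c*)·αk^(α−1), which is the term that appears in the Euler equation.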

Frequently asked questions

What does the envelope theorem say?

If V*(θ) = max_x f(x, θ) and x*(θ) is the optimizer, then dV*/dθ = ∂f/∂θ evaluated at (x*(θ), θ). The total derivative equals the partial. The term ∂f/∂x · dx*/dθ that you would expect from the chain rule drops out because the first-order condition ∂f/∂x = 0 holds at the optimum. Constrained version: dV*/dθ = ∂L/∂θ at the optimum, where L is the Lagrangian.

Why does the choice reaction drop out?

At an interior optimum, the objective is flat in x — that's what first-order conditions mean. Any tiny change in the choice x produces a second-order-small change in the objective. So when θ moves and the optimum responds (dx*/dθ), the value moves by approximately zero from the choice change. Only the direct effect — ∂f/∂θ holding x* fixed — survives at first order.
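
The "second-order-small" claim can be made explicit with a Taylor expansion. This is a sketch assuming f is twice differentiable in x and x* is differentiable in θ; the notation Δx = x*(θ + Δθ) − x*(θ) is introduced here. Expanding around the new optimizer, where the first-order term vanishes:

```latex
\begin{aligned}
f\bigl(x^*(\theta),\,\theta+\Delta\theta\bigr)
  &= V^*(\theta+\Delta\theta)
   \;-\; \underbrace{f_x\bigl(x^*(\theta+\Delta\theta),\,\theta+\Delta\theta\bigr)}_{=\,0\ \text{(FOC)}}\,\Delta x
   \;+\; \tfrac{1}{2}\,f_{xx}\,(\Delta x)^2 \;+\; o\bigl((\Delta x)^2\bigr), \\[4pt]
V^*(\theta+\Delta\theta) - f\bigl(x^*(\theta),\,\theta+\Delta\theta\bigr)
  &\approx -\tfrac{1}{2}\,f_{xx}\,\Bigl(\frac{dx^*}{d\theta}\Bigr)^{2}(\Delta\theta)^2 \;\ge\; 0 .
\end{aligned}
```

The gain from re-optimizing is therefore proportional to (Δθ)². In the profit example on this page, f_xx = −2, dq*/dp = 1/2 and Δp = 0.1 give 0.0025, which is the gap between the re-optimized profit change 0.5025 and the envelope prediction 0.5.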

What's Hotelling's lemma?

Hotelling's lemma (1932): the supply function equals the partial derivative of the profit function with respect to price. ∂π*(p, w)/∂p = q*(p, w). Direct application of the envelope theorem to π(q, p, w) = pq − c(q, w) at the optimal q. The firm's optimal quantity adjustment doesn't contribute to profit changes when prices move — only the direct revenue effect does.

What's Roy's identity?

Roy's identity (1947): Marshallian demand equals the negative ratio of partial derivatives of indirect utility. x*ᵢ(p, m) = −(∂V/∂pᵢ) / (∂V/∂m). Derived by applying the envelope theorem to V*(p, m) = max u(x) s.t. p·x = m. Lets you recover demand from indirect utility without solving the optimization again — useful in welfare measurement and consumer-surplus calculations.
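
The derivation is two applications of the constrained envelope theorem followed by a ratio. A sketch, assuming an interior optimum and a strictly positive multiplier λ* (non-satiation), with the Lagrangian written in the same sign convention L = f − λg used above:

```latex
\begin{aligned}
\mathcal{L}(x,\lambda;\,p,m) &= u(x) - \lambda\,(p\cdot x - m), \\[2pt]
\frac{\partial V}{\partial p_i}
  &= \frac{\partial \mathcal{L}}{\partial p_i}\bigg|_{(x^*,\lambda^*)} = -\lambda^*\,x_i^*, \\[2pt]
\frac{\partial V}{\partial m}
  &= \frac{\partial \mathcal{L}}{\partial m}\bigg|_{(x^*,\lambda^*)} = \lambda^*, \\[2pt]
-\frac{\partial V/\partial p_i}{\partial V/\partial m}
  &= \frac{\lambda^*\,x_i^*}{\lambda^*} = x_i^*(p,m).
\end{aligned}
```

The multiplier λ* is the marginal utility of income, and dividing by it converts the utility effect of a price change into units of the good.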

What's Shephard's lemma?

Shephard's lemma (1953): the conditional input demand equals the partial derivative of the cost function with respect to the input price. ∂c*(w, y)/∂wᵢ = xᵢʰ(w, y). Envelope theorem applied to the cost-minimization problem min w·x subject to producing output y, evaluated at the cost-minimizing x. Workhorse of production theory: you can read demand off the cost function by differentiation rather than re-solving.
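
A quick numerical check, as a sketch under assumptions: the technology y = √(x₁·x₂) and the input prices are chosen here for illustration. Its conditional demands have the closed form x₁ʰ = y·√(w₂/w₁) and x₂ʰ = y·√(w₁/w₂), and the cost function is w·x evaluated at those demands.

```python
import math

def conditional_demand(w1, w2, y):
    """Cost-minimizing inputs for the technology y = sqrt(x1 * x2) (illustrative)."""
    x1 = y * math.sqrt(w2 / w1)
    x2 = y * math.sqrt(w1 / w2)
    return x1, x2

def cost(w1, w2, y):
    """Cost function c*(w, y): input expenditure at the cost-minimizing bundle."""
    x1, x2 = conditional_demand(w1, w2, y)
    return w1 * x1 + w2 * x2

w1, w2, y, h = 4.0, 9.0, 10.0, 1e-6

# Shephard's lemma: the price derivative of cost equals the conditional demand.
dc_dw1 = (cost(w1 + h, w2, y) - cost(w1 - h, w2, y)) / (2 * h)
x1_h, x2_h = conditional_demand(w1, w2, y)

print(dc_dw1, x1_h)
```

Both numbers should be close to 15. The re-optimization of both inputs inside cost() leaves the first-order derivative equal to the quantity of input 1 alone.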

Does it apply to constrained problems?

Yes — with a small modification. For max f(x, θ) s.t. g(x, θ) = 0, the constrained envelope theorem gives dV*/dθ = ∂L/∂θ at the optimum, where L = f − λg is the Lagrangian. The multiplier λ captures the shadow value of relaxing the constraint. Most economic applications (utility, profit, cost) use the constrained form.

When does the envelope theorem fail?

Three failure modes. (1) Boundary optima: if x*(θ) sits on a corner, the FOC may not hold and the choice term doesn't vanish. (2) Non-differentiable value functions: if multiple optima exist or the objective has kinks, dV*/dθ may not exist. (3) Constraint switching: if θ crosses a threshold where the active constraint set changes, V* can have a kink there, so dV*/dθ jumps even though the envelope formula holds on each side. Milgrom-Segal (2002) generalizes to non-smooth cases.