Microeconomic Theory
Envelope Theorem
Why dV*/dθ = ∂V/∂θ at the optimum — and why every comparative-static result depends on it
The envelope theorem says that at an optimum, the total derivative of the value function with respect to a parameter equals the partial — the optimal choice's reaction can be ignored. It anchors comparative statics.
- Statement: dV*/dθ = ∂L/∂θ at the optimum
- Why it works: the FOC ∂L/∂x = 0 kills the choice-reaction term
- Hotelling 1932: ∂π*/∂p = q* (supply from profit)
- Roy 1947: x*ᵢ = −(∂V/∂pᵢ) / (∂V/∂m)
- Shephard 1953: ∂c*/∂wᵢ = xᵢʰ (demand from cost)
- Milgrom-Segal 2002: non-smooth generalization
How the envelope theorem works
Start with an unconstrained maximization problem. The objective f(x, θ) depends on a choice variable x and a parameter θ. The decision-maker picks x*(θ) to maximize f, giving the value function V*(θ) = f(x*(θ), θ). The question: how does V* change when θ moves?
The chain rule gives two terms: dV*/dθ = ∂f/∂x · dx*/dθ + ∂f/∂θ. The first term, the choice-reaction term, is the source of all the trouble. Computing dx*/dθ usually requires implicit differentiation through the first-order condition, then careful bookkeeping. For complicated problems this can be intractable.
The envelope theorem rescues the calculation. At an interior optimum, the first-order condition ∂f/∂x = 0 already holds. So the choice-reaction term vanishes identically. Whatever dx*/dθ equals, it's being multiplied by zero. The total derivative simplifies to the partial:
dV*/dθ = ∂f/∂θ evaluated at (x*(θ), θ).
You compute the partial as if x* were frozen at its current value, ignore the response of the optimal choice, and get the right answer. Every comparative-statics result in microeconomics rests on this. Without it, you would need to differentiate every demand function, every input choice, every policy response — through implicit definitions — just to learn how a value moves.
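The claim can be checked numerically. The sketch below uses a hypothetical objective f(x, θ) = θ·ln(x) − x, which is not from the text above and was picked only because it is concave in x. The helper names `argmax_x` and `value` are likewise illustrative. The code finds x*(θ) by golden-section search, then compares the total derivative of V* (re-optimizing at each θ) with the partial of f taken with x frozen at x*(θ).

```python
import math

def f(x, theta):
    # Hypothetical objective for illustration; concave in x when theta > 0.
    return theta * math.log(x) - x

def argmax_x(theta, lo=1e-6, hi=50.0, iters=200):
    # Golden-section search for the maximizer of f(., theta) on [lo, hi].
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c, theta) < f(d, theta):
            a = c  # the maximum lies to the right of c
        else:
            b = d  # the maximum lies to the left of d
    return (a + b) / 2

def value(theta):
    # Value function V*(theta) = f(x*(theta), theta).
    return f(argmax_x(theta), theta)

theta, h = 2.0, 1e-4
x_star = argmax_x(theta)

# Total derivative of V*: x is re-optimized at theta + h and theta - h.
total = (value(theta + h) - value(theta - h)) / (2 * h)

# Partial derivative of f in theta: x stays frozen at x*(theta).
partial = (f(x_star, theta + h) - f(x_star, theta - h)) / (2 * h)

print(total, partial, math.log(theta))
```

For this objective x*(θ) = θ and V*(θ) = θ·ln θ − θ, so both finite differences should land close to ln θ. The search never uses dx*/dθ, which is the point of the theorem.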
Constrained version: the Lagrangian envelope
Most economic problems involve constraints — budget, technology, resource. The constrained envelope theorem says: for max f(x, θ) s.t. g(x, θ) = 0, define the Lagrangian L(x, λ, θ) = f(x, θ) − λ g(x, θ). Then at the optimum:
dV*/dθ = ∂L/∂θ evaluated at (x*(θ), λ*(θ), θ).
Both the choice reaction dx*/dθ and the multiplier reaction dλ*/dθ drop out, killed by the FOCs ∂L/∂x = 0 and ∂L/∂λ = 0. The shadow-value interpretation of λ* is itself a corollary: write the constraint as g(x, θ) = ε, so that L = f − λ(g − ε); then dV*/dε = ∂L/∂ε = λ*, the marginal value of shifting the right-hand side of the constraint.
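A small consumer problem makes the Lagrangian version concrete. The sketch assumes a utility u = ln(x) + ln(y) with budget p·x + y = m. This specification is an illustration, not something taken from the text, and its closed-form solution is x* = m/(2p), y* = m/2, λ* = 2/m. The code compares finite-difference derivatives of V* with partials of L taken at frozen (x*, y*, λ*).

```python
import math

def solve(p, m):
    # Closed-form optimum of max ln(x) + ln(y) s.t. p*x + y = m
    # (illustrative utility; the FOCs give 1/x = lam*p and 1/y = lam).
    x = m / (2 * p)
    y = m / 2
    lam = 2 / m
    return x, y, lam

def V(p, m):
    # Value function: utility at the optimal bundle.
    x, y, _ = solve(p, m)
    return math.log(x) + math.log(y)

def lagrangian(x, y, lam, p, m):
    # L = f - lam * g with g = p*x + y - m.
    return math.log(x) + math.log(y) - lam * (p * x + y - m)

p, m, h = 2.0, 10.0, 1e-5
x, y, lam = solve(p, m)

# Total derivatives of V*: the bundle is re-optimized at each parameter value.
dV_dp = (V(p + h, m) - V(p - h, m)) / (2 * h)
dV_dm = (V(p, m + h) - V(p, m - h)) / (2 * h)

# Partials of L: x, y and lam stay frozen at their optimal values.
dL_dp = (lagrangian(x, y, lam, p + h, m) - lagrangian(x, y, lam, p - h, m)) / (2 * h)
dL_dm = (lagrangian(x, y, lam, p, m + h) - lagrangian(x, y, lam, p, m - h)) / (2 * h)

print(dV_dp, dL_dp)        # both approximate -lam * x
print(dV_dm, dL_dm, lam)   # both approximate lam, the shadow value of income
```

Here income m plays the role of the constraint shift, so dV*/dm = λ* is the shadow-value corollary in this example.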
Envelope-theorem corollaries across economics
| | Hotelling's lemma | Roy's identity | Shephard's lemma | Generalized envelope (Milgrom-Segal) | Dynamic-programming envelope | Constraint shadow value |
|---|---|---|---|---|---|---|
| Year / authors | Hotelling 1932 | Roy 1947 | Shephard 1953 | Milgrom & Segal 2002 | Bellman 1957; Benveniste-Scheinkman 1979 | Lagrange / Kuhn-Tucker |
| Value function | Profit π*(p, w) | Indirect utility V*(p, m) | Cost c*(w, y) | V*(θ) — any optimum | V*(s) — state value | V*(ε) — slack |
| Parameter | Output price p | Price pᵢ or income m | Input price wᵢ | Any θ | State s | Constraint shift ε |
| Identity | ∂π*/∂p = q* | x*ᵢ = −(∂V/∂pᵢ)/(∂V/∂m) | ∂c*/∂wᵢ = xᵢʰ | V*(θ) − V*(θ₀) = ∫∂f/∂θ dθ | V'(s) = ∂F/∂s | dV*/dε = λ* |
| What it saves you | Re-solve for q* | Re-solve Marshallian demand | Re-solve cost-minimizing input | Smooth-objective assumption | Tracking infinite-horizon dynamics | Recomputing the whole problem |
| Used in | Producer theory, supply estimation | Welfare measurement, consumer surplus | Production theory, factor demands | Auction theory, mechanism design | Macro, Ramsey models | Tariff design, capacity planning |
| Domain of failure | Boundary supply | Corner solutions | Boundary inputs | — | Non-smooth value functions | Binding kinks |
Worked example: profit-maximizing firm
A firm produces output q at cost c(q) = q² and sells at price p. Profit: π(q, p) = pq − q². The FOC p − 2q = 0 gives the optimal supply q*(p) = p/2, and the value function π*(p) = p·(p/2) − (p/2)² = p²/4.
Question: what is dπ*/dp? The brute-force route is to differentiate π*(p) = p²/4 directly: dπ*/dp = p/2. Notice that p/2 is exactly q*(p). Coincidence?
The envelope theorem says no. Apply it: dπ*/dp = ∂π/∂p evaluated at q*. The partial ∂π/∂p = q, evaluated at q* = p/2, equals p/2, which is the supply. That's Hotelling's lemma. You never needed dq*/dp to read off the price-sensitivity of profit; the supply is the price-sensitivity.
Concrete numbers: at p = 10, the firm produces q* = 5 and earns π* = 25. If price ticks up to p = 10.1, optimal output shifts to q* = 5.05 and profit becomes (10.1)²/4 = 25.5025. The change is 0.5025. The envelope prediction: q* · Δp = 5 · 0.1 = 0.5, accurate to first order. The remaining 0.0025 is the second-order curvature you'd capture by also computing dq*/dp = 1/2 and the second derivative — but for a marginal-effect calculation, the envelope answer is exact.
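The arithmetic above can be reproduced in a few lines. The function names are illustrative; the cost function and prices are the ones from the example.

```python
def profit(q, p):
    # pi(q, p) = p*q - q^2, the firm from the worked example.
    return p * q - q ** 2

def supply(p):
    # Optimal output from the FOC p - 2q = 0.
    return p / 2

def max_profit(p):
    # Value function pi*(p) = p^2 / 4.
    return profit(supply(p), p)

p0, dp = 10.0, 0.1

actual_change = max_profit(p0 + dp) - max_profit(p0)
envelope_change = supply(p0) * dp   # q* times the price change
second_order = 0.5 * 0.5 * dp ** 2  # (1/2) * pi*''(p) * dp^2, with pi*''(p) = 1/2

print(round(actual_change, 4))                    # 0.5025
print(round(envelope_change, 4))                  # 0.5
print(round(actual_change - envelope_change, 4))  # 0.0025
```

Because π*(p) is quadratic, the second-order term accounts for the entire gap here. For a general cost function there would be higher-order terms as well.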
Why every comparative-statics result depends on it
- Hotelling's lemma. Supply equals the price derivative of profit. Read q* directly off π*(p), with no separate maximization.
- Roy's identity. Marshallian demand from indirect utility: x*ᵢ = −(∂V/∂pᵢ)/(∂V/∂m). Backs out behavior from welfare measures.
- Shephard's lemma. Conditional factor demand from the cost function: ∂c*/∂wᵢ = xᵢʰ. Workhorse of production economics.
- Bellman / Benveniste-Scheinkman. Dynamic-programming envelope: V'(s) = ∂F(s, c*(s))/∂s. Used in standard macro Euler-equation derivations.
- Mechanism design. Myerson's revenue equivalence theorem and the standard characterization of incentive-compatible mechanisms depend on envelope-style arguments.
- Welfare measurement. Compensating and equivalent variation are integrals of Hicksian demand, which Shephard's lemma delivers as the price derivative of the expenditure function; Roy's identity does the same job for Marshallian demand and consumer surplus.
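Roy's identity and Shephard's lemma from the list above can be checked on textbook functional forms. The sketch assumes a utility ln(x₁) + ln(x₂) and a technology y = √(x₁x₂). Both are illustrative choices, not taken from the text, and their closed-form value functions are hard-coded. Each identity is then verified by finite differences against the known demand.

```python
import math

def diff(fn, z, h=1e-6):
    # Central finite difference of a one-argument function.
    return (fn(z + h) - fn(z - h)) / (2 * h)

def indirect_utility(p1, p2, m):
    # V(p, m) for u = ln(x1) + ln(x2): demands are x_i = m / (2 * p_i).
    return math.log(m / (2 * p1)) + math.log(m / (2 * p2))

def cost(w1, w2, y):
    # c(w, y) for the technology y = sqrt(x1 * x2).
    return 2 * y * math.sqrt(w1 * w2)

# Roy's identity: x1 = -(dV/dp1) / (dV/dm).
p1, p2, m = 2.0, 5.0, 40.0
dV_dp1 = diff(lambda p: indirect_utility(p, p2, m), p1)
dV_dm = diff(lambda income: indirect_utility(p1, p2, income), m)
roy_x1 = -dV_dp1 / dV_dm
marshallian_x1 = m / (2 * p1)

# Shephard's lemma: x1^h = dc/dw1.
w1, w2, y = 4.0, 9.0, 3.0
shephard_x1 = diff(lambda w: cost(w, w2, y), w1)
hicksian_x1 = y * math.sqrt(w2 / w1)

print(roy_x1, marshallian_x1)
print(shephard_x1, hicksian_x1)
```

In both cases the demand on the right was obtained by solving the optimization problem, while the one on the left came from differentiating the value function alone.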
Variants and refinements
- Smooth envelope (textbook version). Requires interior optimum, differentiable objective, unique maximizer. Mas-Colell-Whinston-Green, Chapter 3 / Appendix M.K.
- Constrained envelope. Replace f with the Lagrangian L; the multiplier λ* is itself the shadow value of the constraint.
- Milgrom-Segal (2002) generalization. Extends to arbitrary choice sets and discontinuous reactions. Critical for auction theory and matching: V*(θ) − V*(θ₀) = ∫_{θ₀}^θ ∂f/∂θ dθ, where the integrand is evaluated at the current optimizer.
- Sub-gradient envelope. When V* is convex but non-differentiable (multiple optima), the sub-gradient set is the convex hull of the partial derivatives ∂f/∂θ across the optimizers (Danskin's theorem).
- Stochastic envelope. In dynamic programming with random shocks, the envelope condition becomes an expectation: V'(s) = E[∂F/∂s]. Used throughout modern macro.
- Multivariate envelope. For vector parameters θ ∈ ℝⁿ, the gradient ∇V*(θ) = ∇_θ L evaluated at the optimum; each component is its own envelope identity.
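The integral form is easiest to appreciate when no first-order condition exists at all. The sketch below assumes a three-point choice set and the objective f(x, θ) = θx − x². Both are hypothetical, chosen so that the optimizer jumps at θ = 1 and θ = 3 and V* has kinks there. It checks V*(θ) − V*(θ₀) against a midpoint-rule approximation of the integral of ∂f/∂θ = x evaluated at the current optimizer.

```python
CHOICES = [0, 1, 2]  # discrete choice set: no derivative in x is available

def f(x, theta):
    # Illustrative objective; its partial in theta is simply x.
    return theta * x - x ** 2

def argmax_x(theta):
    # Brute-force maximizer over the finite choice set.
    return max(CHOICES, key=lambda x: f(x, theta))

def value(theta):
    return f(argmax_x(theta), theta)

def envelope_integral(theta0, theta1, n=40_000):
    # Midpoint rule for the integral of df/dtheta evaluated at x*(s).
    step = (theta1 - theta0) / n
    total = sum(argmax_x(theta0 + (i + 0.5) * step) for i in range(n))
    return total * step

lhs = value(4.0) - value(0.0)
rhs = envelope_integral(0.0, 4.0)
print(lhs, rhs)
```

The optimizer is 0 below θ = 1, 1 between 1 and 3, and 2 above 3, so the integral is 0·1 + 1·2 + 2·1 = 4, matching V*(4) − V*(0) = 4 even though V* is not differentiable at the two switch points.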
A brief history
Harold Hotelling proved his lemma in 1932 in a Journal of Political Economy paper on demand functions. René Roy stated his demand identity in De l'Utilité (1942) and refined it in 1947. Ronald Shephard's 1953 book Cost and Production Functions introduced the cost-function approach that bears his name. Paul Samuelson's 1947 Foundations of Economic Analysis unified these results under the envelope framework, making the general theorem the central piece of comparative-statics machinery.
The modern non-smooth generalization comes from Paul Milgrom and Ilya Segal's 2002 Econometrica paper "Envelope Theorems for Arbitrary Choice Sets," which extended the result to mechanism-design settings where the choice space is discrete, the action space is infinite, or the objective is non-differentiable. Their version is the standard reference in contemporary auction theory.
Common pitfalls
- Forgetting the optimum. The envelope identity holds only at x*(θ). If you evaluate the partial at the wrong x, you get garbage.
- Boundary solutions. If x*(θ) sits on a corner (e.g., zero output), the FOC need not hold with equality, so the argument above does not apply as stated. Use the Kuhn-Tucker (Lagrangian) form or the Milgrom-Segal integral form instead.
- Multiple optima. When the argmax is a set, V* may be non-differentiable. Use sub-gradients (Danskin) or the integral form (Milgrom-Segal).
- Constraint switching. If θ crosses a threshold where the binding constraints change, V* can have a kink. The envelope still works locally on each side.
- Confusing with chain rule. The envelope theorem uses the chain rule plus the FOC to simplify; it's not a separate calculus identity.
- Reading too much into "envelope". The name comes from the geometric picture: V*(θ) is the upper envelope of the family {f(x, θ) : x ∈ X}, tangent to each member at its argmax. Useful intuition but not part of the theorem itself.
- Assuming uniqueness. Standard versions require a single optimizer; with multiple, you get a correspondence and the envelope becomes set-valued.
When the envelope theorem matters in practice
- Welfare economics. Computing consumer-surplus changes from price moves uses Roy's identity directly.
- Cost-benefit analysis. Shadow prices from constrained envelopes value scarce resources without re-solving the planner's problem.
- Public finance. Marginal cost of public funds: dW/dτ can be read off the envelope without re-deriving optimal labor supply.
- Mechanism design. Revenue equivalence theorems rely on envelope identities for the bidder's interim utility.
- Macroeconomic dynamics. The Benveniste-Scheinkman envelope condition replaces brute-force differentiation in every modern Euler-equation derivation.
- Empirical demand estimation. Recovering preferences from observed choices uses envelope identities to back out unobserved parameters.
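The dynamic-programming case in the list above can be illustrated with the log-utility growth model, which has a known closed form. The sketch assumes u(c) = ln(c) and k' = k^α − c, giving V(k) = A + B·ln(k) and c*(k) = (1 − αβ)k^α. It reads F(k, c) as the right-hand side of the Bellman equation, which is an interpretation of the notation rather than something the text defines. The parameter values are arbitrary.

```python
import math

ALPHA, BETA = 0.3, 0.95  # arbitrary illustrative parameters

# Closed-form coefficients of V(k) = A + B * ln(k) for this model.
B = ALPHA / (1 - ALPHA * BETA)
A = (math.log(1 - ALPHA * BETA)
     + ALPHA * BETA / (1 - ALPHA * BETA) * math.log(ALPHA * BETA)) / (1 - BETA)

def V(k):
    return A + B * math.log(k)

def policy(k):
    # Optimal consumption c*(k).
    return (1 - ALPHA * BETA) * k ** ALPHA

def F(k, c):
    # Bellman right-hand side: current utility plus discounted continuation value.
    return math.log(c) + BETA * V(k ** ALPHA - c)

k, h = 0.5, 1e-6
c_star = policy(k)

# Derivative of the value function in the state.
dV = (V(k + h) - V(k - h)) / (2 * h)

# Partial of F in the state with consumption frozen at c*(k).
dF = (F(k + h, c_star) - F(k - h, c_star)) / (2 * h)

print(F(k, c_star), V(k))  # the Bellman equation holds at the optimum
print(dV, dF, B / k)       # envelope condition: all three should agree
```

The consumption response dc*/dk never enters the second finite difference, yet it matches V'(k); that is the step Euler-equation derivations rely on.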
Frequently asked questions
What does the envelope theorem say?
If V*(θ) = max_x f(x, θ) and x*(θ) is the optimizer, then dV*/dθ = ∂f/∂θ evaluated at (x*(θ), θ). The total derivative equals the partial. The term ∂f/∂x · dx*/dθ that you would expect from the chain rule drops out because the first-order condition ∂f/∂x = 0 holds at the optimum. Constrained version: dV*/dθ = ∂L/∂θ at the optimum, where L is the Lagrangian.
Why does the choice reaction drop out?
At an interior optimum, the objective is flat in x — that's what first-order conditions mean. Any tiny change in the choice x produces a second-order-small change in the objective. So when θ moves and the optimum responds (dx*/dθ), the value moves by approximately zero from the choice change. Only the direct effect — ∂f/∂θ holding x* fixed — survives at first order.
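The "second-order-small" claim can be seen with the quadratic-cost firm from the worked example (π = pq − q²). The sketch holds output at the old optimum after a price change and measures how much profit is lost by not re-optimizing; the function names are illustrative.

```python
def profit(q, p):
    return p * q - q ** 2

def max_profit(p):
    # Profit at the re-optimized output q* = p / 2.
    return profit(p / 2, p)

def reoptimization_gain(p0, dp):
    # Extra profit from adjusting q after the price moves from p0 to p0 + dp,
    # compared with keeping q at the old optimum p0 / 2.
    q_old = p0 / 2
    return max_profit(p0 + dp) - profit(q_old, p0 + dp)

for dp in (1.0, 0.1, 0.01):
    print(dp, reoptimization_gain(10.0, dp))
```

For this example the gain equals dp²/4, so shrinking the price change by a factor of 10 shrinks the gain by a factor of 100. The direct effect q*·dp shrinks only by a factor of 10, which is why it alone survives in the derivative.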
What's Hotelling's lemma?
Hotelling's lemma (1932): the supply function equals the partial derivative of the profit function with respect to price. ∂π*(p, w)/∂p = q*(p, w). Direct application of the envelope theorem to π(q, p, w) = pq − c(q, w) at the optimal q. The firm's optimal quantity adjustment doesn't contribute to profit changes when prices move — only the direct revenue effect does.
What's Roy's identity?
Roy's identity (1947): Marshallian demand equals the negative ratio of partial derivatives of indirect utility. x*ᵢ(p, m) = −(∂V/∂pᵢ) / (∂V/∂m). Derived by applying the envelope theorem to V*(p, m) = max u(x) s.t. p·x = m. Lets you recover demand from indirect utility without solving the optimization again — useful in welfare measurement and consumer-surplus calculations.
What's Shephard's lemma?
Shephard's lemma (1953): the conditional input demand equals the partial derivative of the cost function with respect to the input price. ∂c*(w, y)/∂wᵢ = xᵢʰ(w, y). Envelope theorem applied to the cost-minimization problem, min w·x subject to producing output y, evaluated at the cost-minimizing x. Workhorse of production theory: you can read demand off the cost function by differentiation rather than re-solving.
Does it apply to constrained problems?
Yes — with a small modification. For max f(x, θ) s.t. g(x, θ) = 0, the constrained envelope theorem gives dV*/dθ = ∂L/∂θ at the optimum, where L = f − λg is the Lagrangian. The multiplier λ captures the shadow value of relaxing the constraint. Most economic applications (utility, profit, cost) use the constrained form.
When does the envelope theorem fail?
Three failure modes. (1) Boundary optima: if x*(θ) sits on a corner, the FOC need not hold with equality, so the smooth argument does not apply as stated. (2) Non-differentiable value functions: if multiple optima exist or the objective has kinks, dV*/dθ may not exist. (3) Discontinuous changes: if θ crosses a threshold where the active constraint set switches, the derivative given by the envelope formula can jump. Milgrom-Segal (2002) generalizes to non-smooth cases.