Calculus

Derivative Definition

The instantaneous rate of change — limit of slope as the interval shrinks

The derivative of a function at a point is the limit of (f(x+h) − f(x))/h as h approaches zero: the slope of the tangent line at that point. It captures instantaneous rate of change, the central idea of differential calculus. Velocity, acceleration, marginal cost, optimization, and the other concepts built on differential calculus all trace back to this single limit definition.

  • Definition: f'(x) = lim_{h→0} (f(x+h) − f(x))/h
  • Geometric meaning: slope of the tangent line at x
  • Notation: f'(x), df/dx, dy/dx, D_x f
  • Differentiability requires: the difference quotient has the same finite limit from both sides (this implies continuity at the point)
  • Originators (independently): Newton (1666) and Leibniz (1675)
  • Modern rigorous definition: Cauchy (1820s) via limits, later formalized with ε-δ arguments by Weierstrass

The limit definition

The derivative of a function f at a point x is defined as:

f'(x) = lim_{h→0}  (f(x + h) − f(x)) / h

If this limit exists, f is differentiable at x and f'(x) is its derivative. The function f' (taking x to f'(x)) is the derivative function.

Geometrically — (f(x+h) − f(x))/h is the slope of the line through (x, f(x)) and (x+h, f(x+h)) — the secant line. As h shrinks, this secant rotates toward the tangent line at x. The limit IS the slope of the tangent.
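
The secant-to-tangent picture can be checked numerically. The sketch below is only an illustration: the function f(x) = x³, the point x = 2, and the helper name secantSlope are assumptions chosen for the example, not part of the definition.

```javascript
// Slope of the secant line through (x, f(x)) and (x + h, f(x + h)).
function secantSlope(f, x, h) {
  return (f(x + h) - f(x)) / h;
}

// Illustrative choice: f(x) = x^3, whose derivative at x = 2 is 3 * 2^2 = 12.
const cube = (x) => x * x * x;
const steps = [1, 0.1, 0.01, 0.001];
const slopes = steps.map((h) => secantSlope(cube, 2, h));

// Each secant slope equals 12 + 6h + h^2 (up to rounding), so it approaches 12 as h shrinks.
steps.forEach((h, i) => console.log(`h = ${h}: secant slope = ${slopes[i]}`));
```

The printed slopes move toward 12 as h decreases, which is the tangent slope the limit picks out.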

Computing a derivative from first principles

Let's compute d/dx(x²):

f(x) = x²
f(x+h) − f(x) = (x+h)² − x² = x² + 2xh + h² − x² = 2xh + h²

(f(x+h) − f(x))/h = (2xh + h²)/h = 2x + h     (assuming h ≠ 0)

f'(x) = lim_{h→0} (2x + h) = 2x

So the derivative of x² is 2x. The limit can be evaluated by direct substitution at the end because 2x + h has no singularity at h = 0 — but we couldn't substitute h = 0 in the original (f(x+h) − f(x))/h, which would give 0/0.
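
As a small sanity check of the algebra above, the following sketch compares the raw difference quotient with the simplified form 2x + h, and shows what direct substitution of h = 0 produces in floating-point arithmetic. The helper name differenceQuotient and the sample point x = 3 are illustrative assumptions.

```javascript
const square = (x) => x * x;

// Raw difference quotient, exactly as written in the limit definition.
function differenceQuotient(f, x, h) {
  return (f(x + h) - f(x)) / h;
}

const x0 = 3;

// For f(x) = x^2 the quotient simplifies to 2x + h whenever h is nonzero.
for (const h of [0.5, 0.25, 0.125]) {
  console.log(h, differenceQuotient(square, x0, h), 2 * x0 + h);
}

// Substituting h = 0 directly evaluates 0 / 0, which JavaScript reports as NaN.
const atZero = differenceQuotient(square, x0, 0);
console.log(atZero);
```

The step sizes are powers of two so that the arithmetic is exact in binary floating point; with other step sizes the two columns would agree only up to rounding.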

Standard derivatives — proven once, applied forever

Function f(x)   Derivative f'(x)   Notes
c (constant)    0                  Constants don't change
x^n             n · x^(n−1)        The power rule
e^x             e^x                Its own derivative (unique up to a constant multiple)
ln(x)           1/x                For x > 0
sin(x)          cos(x)             Radians required
cos(x)          −sin(x)            Radians required
a^x             a^x · ln(a)        For a > 0; reduces to e^x when a = e
log_a(x)        1/(x ln a)         For a > 0, a ≠ 1; reduces to 1/x when a = e

Each of these can be derived from the limit definition. Once derived, they're applied mechanically — never re-derived.
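
One way to build confidence in the table is to compare each entry against a numerical estimate. This is a rough sketch, not a proof: the helper numericDerivative, the step h = 1e-5, the test point x = 1.3, and the choices n = 3 and a = 2 are all assumptions made for illustration.

```javascript
// Central-difference estimate of f'(x); the step size is an illustrative choice.
function numericDerivative(f, x, h = 1e-5) {
  return (f(x + h) - f(x - h)) / (2 * h);
}

const x = 1.3; // arbitrary test point with x > 0 so the logarithms are defined

// Each row pairs a function with the derivative claimed in the table.
const rows = [
  { name: "x^3", f: (t) => t ** 3, df: (t) => 3 * t ** 2 },
  { name: "e^x", f: Math.exp, df: Math.exp },
  { name: "ln(x)", f: Math.log, df: (t) => 1 / t },
  { name: "sin(x)", f: Math.sin, df: Math.cos },
  { name: "cos(x)", f: Math.cos, df: (t) => -Math.sin(t) },
  { name: "2^x", f: (t) => 2 ** t, df: (t) => 2 ** t * Math.log(2) },
  { name: "log_2(x)", f: Math.log2, df: (t) => 1 / (t * Math.log(2)) },
];

// Largest gap between the numerical estimate and the table formula.
const maxError = Math.max(
  ...rows.map((r) => Math.abs(numericDerivative(r.f, x) - r.df(x)))
);

for (const r of rows) {
  console.log(r.name, numericDerivative(r.f, x), r.df(x));
}
console.log("largest discrepancy:", maxError);
```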

The four arithmetic rules and the chain rule

Rule                Formula
Sum                 (f + g)' = f' + g'
Constant multiple   (c·f)' = c·f'
Product             (f·g)' = f'·g + f·g'
Quotient            (f/g)' = (f'·g − f·g') / g²
Chain rule          (f(g(x)))' = f'(g(x)) · g'(x)

Combined with the standard derivatives, these rules cover almost every function you'll meet. The quotient rule applies wherever g(x) ≠ 0. The derivation of each rule (from the limit definition) is a one-page exercise; once proven, you apply them mechanically.
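
The product, quotient, and chain rules can be spot-checked the same way. In this sketch the functions f(x) = sin(x) and g(x) = x² + 1, the point x = 0.7, and the helper numericDerivative are illustrative assumptions; each entry pairs a numerical estimate with the value the rule predicts.

```javascript
// Central-difference estimate of f'(x); the step size is an illustrative choice.
function numericDerivative(f, x, h = 1e-5) {
  return (f(x + h) - f(x - h)) / (2 * h);
}

// Two sample functions with known derivatives.
const f = Math.sin;
const df = Math.cos;
const g = (x) => x * x + 1; // never zero, so the quotient rule applies everywhere
const dg = (x) => 2 * x;

const x = 0.7;

// [numerical estimate, value predicted by the rule]
const checks = {
  product: [
    numericDerivative((t) => f(t) * g(t), x),
    df(x) * g(x) + f(x) * dg(x),
  ],
  quotient: [
    numericDerivative((t) => f(t) / g(t), x),
    (df(x) * g(x) - f(x) * dg(x)) / g(x) ** 2,
  ],
  chain: [
    numericDerivative((t) => f(g(t)), x),
    df(g(x)) * dg(x),
  ],
};

for (const [rule, [estimate, predicted]] of Object.entries(checks)) {
  console.log(rule, estimate, predicted);
}
```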

When derivatives don't exist

A function can fail to be differentiable at a point in several classic ways:

  • Sharp corners. f(x) = |x| has slope +1 from the right, −1 from the left at x = 0. The two-sided limit doesn't agree; not differentiable at 0.
  • Vertical tangent. f(x) = x^(1/3) has infinite slope at x = 0. Limit is +∞, not finite; not differentiable.
  • Discontinuity. Any jump or removable discontinuity destroys differentiability automatically (continuity is required).
  • Pathological cases. Functions like Weierstrass's W(x) = Σ_{n≥0} a^n cos(b^n πx), with 0 < a < 1 and b a sufficiently large odd integer, are continuous everywhere but differentiable nowhere; they are "fractal", with self-similar wiggling at all scales.
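
The first two failure modes show up clearly in one-sided difference quotients. The sketch below is illustrative only; the helper name oneSidedSlopes and the step sizes are assumptions.

```javascript
// Difference quotients taken separately from the right and from the left of x.
function oneSidedSlopes(f, x, h) {
  return {
    right: (f(x + h) - f(x)) / h,
    left: (f(x) - f(x - h)) / h,
  };
}

// Sharp corner: |x| at 0 has right slope +1 and left slope -1, so no two-sided limit.
const corner = oneSidedSlopes(Math.abs, 0, 1e-6);
console.log(corner);

// Vertical tangent: the quotient for the cube root at 0 grows without bound as h shrinks.
const vertical = [1e-3, 1e-6, 1e-9].map(
  (h) => oneSidedSlopes(Math.cbrt, 0, h).right
);
console.log(vertical);

// Caution: a symmetric (central) difference at the corner returns 0 and hides the problem.
const centralAtCorner = (Math.abs(1e-6) - Math.abs(-1e-6)) / (2 * 1e-6);
console.log(centralAtCorner);
```

The last line is a reminder that a numerical estimate can look perfectly reasonable at a point where the derivative does not exist, so comparing one-sided quotients is a useful check.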

JavaScript: numerical differentiation

// Central difference — better than forward difference
function derivative(f, x, h = 1e-6) {
  return (f(x + h) - f(x - h)) / (2 * h);
}

derivative(x => x*x, 3);          // ≈ 6 (analytical: 2·3 = 6)
derivative(Math.sin, Math.PI/4);  // ≈ 0.707 (analytical: cos(π/4) = √2/2)
derivative(Math.exp, 1);          // ≈ 2.718 (analytical: e^1 = e)

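// Larger h: more truncation error (the secant is a coarser stand-in for the tangent).
// Smaller h: less truncation error, but subtracting nearly equal values loses digits
// (catastrophic cancellation). For central differences in double precision the
// total error is typically smallest around h ≈ 1e-5 to 1e-6.

// For symbolic derivatives, you'd need a CAS like math.js or sympy.
// For real production code, autodiff libraries (TensorFlow, JAX, PyTorch)
// compute derivatives that are exact up to floating-point rounding by tracing operations.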
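
The trade-off in step size is easiest to see by sweeping h and recording the error. This is a sketch under assumptions: the test function sin at x = 1, the list of step sizes, and the helper names forward and central are chosen for illustration, and the exact numbers printed will depend on the platform's floating-point behavior.

```javascript
// Forward difference: error shrinks proportionally to h.
const forward = (f, x, h) => (f(x + h) - f(x)) / h;
// Central difference: error shrinks proportionally to h squared.
const central = (f, x, h) => (f(x + h) - f(x - h)) / (2 * h);

const exact = Math.cos(1); // analytical derivative of sin at x = 1

// Record the absolute error of both schemes for a range of step sizes.
const sweep = [1e-2, 1e-4, 1e-6, 1e-8, 1e-10, 1e-12].map((h) => ({
  h,
  forwardError: Math.abs(forward(Math.sin, 1, h) - exact),
  centralError: Math.abs(central(Math.sin, 1, h) - exact),
}));

console.table(sweep);
```

In double precision one would expect both error columns to fall at first and then rise again as round-off takes over, with the central difference reaching its best accuracy at a larger h than the forward difference.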

Why the derivative matters

  • Velocity is the derivative of position. v(t) = ds/dt. Acceleration is the derivative of velocity. Newton's second law uses derivatives — F = m · d²s/dt².
  • Optimization. Interior maxima and minima of differentiable functions occur where f'(x) = 0 (critical points). Solve for them, classify them, and compare with any endpoint values to find the extrema.
  • Marginal economics. Marginal cost is the derivative of total cost. Marginal revenue is the derivative of total revenue. Microeconomics uses derivatives constantly.
  • Approximation. Linearizing — f(x) ≈ f(a) + f'(a)(x − a) — is the basis of Newton's method, Taylor series, and most numerical algorithms.
  • Differential equations. Models in physics, biology, engineering all express relationships between quantities and their rates of change. The derivative is the language.
  • Machine learning — gradient descent. Compute ∂Loss/∂weights for every weight in a neural network; update weights against the gradient. Modern AI runs on derivatives.
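
Two of the uses listed above, linearization and gradient descent, fit in a few lines. The following is a minimal sketch, assuming a numerical derivative is good enough; the function names newtonRoot and gradientDescent, the learning rate, and the iteration counts are illustrative choices, and production code would normally use analytical or automatic derivatives plus a convergence test.

```javascript
// Central-difference estimate of f'(x); the step size is an illustrative choice.
function numericDerivative(f, x, h = 1e-5) {
  return (f(x + h) - f(x - h)) / (2 * h);
}

// Newton's method: replace f by its linearization f(a) + f'(a)(x - a)
// at the current guess and jump to the root of that line.
function newtonRoot(f, x0, iterations = 20) {
  let x = x0;
  for (let i = 0; i < iterations; i++) {
    x = x - f(x) / numericDerivative(f, x);
  }
  return x;
}

// Gradient descent in one variable: step against the derivative to reduce f.
function gradientDescent(f, x0, learningRate = 0.1, iterations = 200) {
  let x = x0;
  for (let i = 0; i < iterations; i++) {
    x = x - learningRate * numericDerivative(f, x);
  }
  return x;
}

// Root of x^2 - 2 starting from 1: approaches the square root of 2.
const root = newtonRoot((x) => x * x - 2, 1);
// Minimizer of (x - 3)^2 starting from 0: approaches 3.
const minimizer = gradientDescent((x) => (x - 3) ** 2, 0);

console.log(root, minimizer);
```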

Newton vs Leibniz

Both invented calculus independently in the 1660s-1670s. Newton's approach was geometric and physical — instantaneous velocity, motion under gravity. Leibniz's was symbolic and algebraic — the dy/dx and ∫ notation we still use today.

The bitter priority dispute (a Royal Society committee concluded in 1712 that Leibniz had plagiarized; modern historians disagree) split mathematical traditions for a century. English mathematicians used Newton's awkward "fluxion" notation; Continental mathematicians used Leibniz's elegant differentials and made faster progress in the 18th century. By 1820, England had switched too. The notational war was decided in Leibniz's favor: dy/dx and ∫ are standard everywhere, although Newton's dot notation survives for time derivatives in physics.

Common mistakes

  • Trying to evaluate the difference quotient at h = 0. That gives 0/0, undefined. The limit operation is what extracts the meaningful value as h approaches 0.
  • Treating dy/dx as a fraction. Sometimes it works (the chain rule looks like cancelling fractions); sometimes it fails (higher-order derivatives such as d²y/dx² do not invert or cancel like fractions). Use the rules; treat dy/dx as one symbol unless you've studied differential forms.
  • Forgetting that radians are required. d/dx(sin x) = cos x ONLY when x is in radians. In degrees, the derivative gets a factor of π/180. This is why pure mathematics always uses radians.
  • Not checking continuity before differentiating. A function not continuous at a point isn't differentiable there. Sharp jumps, removable discontinuities, and asymptotes all break differentiability.
  • Numerical differentiation with too-small h. Floating-point round-off dominates when h is too small: below roughly 10^−8 for a forward difference in double precision, and already below roughly 10^−5 to 10^−6 for a central difference. The best h depends on f and machine precision; at a well-chosen h, the central difference (f(x+h) − f(x−h))/(2h) is more accurate than the forward difference.
  • Confusing derivative and antiderivative. The derivative is one direction; integration is the inverse. d/dx(x²) = 2x. ∫2x dx = x² + C. Differentiation discards any added constant, so the antiderivative is determined only up to the constant C.
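
The radians-versus-degrees point above can be verified directly. This sketch assumes a hypothetical helper sinDegrees that takes its argument in degrees, and an illustrative test angle of 60°; the numerical slope should match (π/180)·cos(60°) rather than cos(60°).

```javascript
// Central-difference estimate of f'(x); the step size is an illustrative choice.
function numericDerivative(f, x, h = 1e-5) {
  return (f(x + h) - f(x - h)) / (2 * h);
}

// Sine of an angle given in degrees (hypothetical helper for this example).
const sinDegrees = (deg) => Math.sin((deg * Math.PI) / 180);

const angle = 60;
const radians = (angle * Math.PI) / 180;

// Rate of change of sin per degree, estimated numerically.
const slopePerDegree = numericDerivative(sinDegrees, angle);
// Chain rule prediction: the inner conversion contributes the factor pi / 180.
const predicted = (Math.PI / 180) * Math.cos(radians);
// What you would get by wrongly applying d/dx sin(x) = cos(x) to degrees.
const naive = Math.cos(radians);

console.log({ slopePerDegree, predicted, naive });
```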

Frequently asked questions

Why divide by h then take the limit as h goes to 0?

(f(x+h) − f(x))/h is the slope of the secant line through (x, f(x)) and (x+h, f(x+h)). As h shrinks, the secant rotates toward the tangent line. The limit gives the slope of the tangent — the instantaneous rate of change at x. We can't just plug in h = 0 because that gives 0/0 (indeterminate); the limit captures the meaningful value as h approaches but never reaches zero.

What does it mean for a function to be differentiable?

The limit f'(x) = lim (f(x+h) − f(x))/h must exist and be finite. Practically — the function is continuous AND the left-hand and right-hand limits of the difference quotient agree. Sharp corners (like |x| at x = 0) and vertical tangents (like ∛x at x = 0) are not differentiable. Smooth-looking functions almost always are.

Is every continuous function differentiable?

No. Continuity is necessary but not sufficient. The Weierstrass function is continuous everywhere but differentiable nowhere: it is built from infinitely many superimposed cosine waves whose oscillations make the function jagged at every scale. Counter-intuitive but rigorous; it took about two centuries after Newton for such an example to appear (Weierstrass presented his in 1872).

How do you compute derivatives "from first principles"?

Apply the limit definition directly. For f(x) = x² — f(x+h) − f(x) = (x+h)² − x² = 2xh + h². Divide by h — 2x + h. Take limit as h → 0 — 2x. So d/dx(x²) = 2x. The point of derivative rules (power rule, chain rule, product rule) is to skip this calculation by abstracting it once and applying mechanically thereafter.

Why is dy/dx not really a fraction?

Historically, Leibniz wrote dy/dx as if it were a ratio of "infinitesimal" changes — treated literally as fractions. Modern analysis defines it as a limit, not a quotient. dy/dx is one symbol for the derivative; treating it as a fraction sometimes works (chain rule looks like fraction multiplication) but can mislead in subtle cases. Differential forms in advanced math give dy/dx fraction-like properties rigorously.

What's the difference between dy/dx and ∂y/∂x?

dy/dx is the ordinary derivative — y depends only on x. ∂y/∂x is the partial derivative — y depends on multiple variables, and we differentiate with respect to one while holding others constant. Used when functions have multiple inputs (e.g., temperature as a function of position and time). Same limit definition, applied per-variable.
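
A partial derivative can be estimated with the same difference quotient, varying one input at a time. In this sketch the helper partialDerivative and the sample function T(x, t) = x²·e^(−t) are hypothetical, chosen only to illustrate holding the other variable constant.

```javascript
// Central-difference estimate of the partial derivative of f with respect to
// the argument at position `index`; all other arguments are held fixed.
function partialDerivative(f, point, index, h = 1e-5) {
  const plus = [...point];
  const minus = [...point];
  plus[index] += h;
  minus[index] -= h;
  return (f(...plus) - f(...minus)) / (2 * h);
}

// Hypothetical example: a quantity depending on position x and time t.
const T = (x, t) => x * x * Math.exp(-t);

const point = [2, 0.5];
const dTdx = partialDerivative(T, point, 0); // analytical: 2x * e^(-t)
const dTdt = partialDerivative(T, point, 1); // analytical: -x^2 * e^(-t)

console.log(dTdx, 2 * 2 * Math.exp(-0.5));
console.log(dTdt, -(2 * 2) * Math.exp(-0.5));
```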

Who invented calculus first — Newton or Leibniz?

Newton invented it in 1666 (his "annus mirabilis") but didn't publish until 1693. Leibniz independently developed it 1675-1684 and published first. The bitter priority dispute that followed split English mathematics from Continental for a century. Modern view — both invented it independently. Leibniz's notation (dy/dx, ∫) won; Newton's geometric approach is forgotten outside historical interest.