Occam's Razor
Among rival explanations, prefer the one that buys less ontology
Occam's Razor, also called the principle of parsimony or lex parsimoniae, says: when two hypotheses explain the same evidence, prefer the one that posits fewer entities or assumptions. It is named for the 14th-century English Franciscan William of Ockham (c. 1287–1347), though the idea predates him in Aristotle and the medieval scholastics. It is a heuristic for hypothesis selection, not a logical proof: simpler theories are easier to test and easier to refute, and they have often proved correct, but reality is sometimes irreducibly complex.
- Named for: William of Ockham (c. 1287–1347)
- Latin tag: Lex parsimoniae
- Standard form: Don't multiply entities without necessity
- Type: Heuristic, defeasible
- Modern formalization: Bayesian Occam factor, MDL, AIC/BIC
- Common abuse: "Simpler" used to mean "more familiar"
What the Razor actually says
Most popular renderings — "the simplest explanation is usually the right one" — are paraphrases that introduce a claim Ockham never made. The genuine principle is more cautious. It applies only when two or more hypotheses are evidentially indistinguishable: they predict the same observations, fit the same data, and cannot be told apart by any test currently available. In that situation, and only in that situation, prefer the one that requires fewer kinds of thing to exist.
This caveat matters. The Razor is not a method for adjudicating between a well-tested theory and a poorly-tested one. It is not an argument from "I find this implausible" to "therefore it is false". It is a tiebreaker, deployed when ordinary evidential reasoning has run out of leverage.
Ockham's own examples were theological and metaphysical. He wished to dispense with the rich Platonic ontology of universals — abstract objects like Redness or Triangleness floating somewhere outside any particular red triangle — that had been imported into Christian thought through the church fathers. If everything we want to say about red triangles can be said using only particular red triangles, then universals are doing no explanatory work and should not be admitted into the catalogue of what exists.
History of the idea
The intuition behind parsimony is older than its medieval namesake. Aristotle wrote in the Posterior Analytics that "we may assume the superiority ceteris paribus of the demonstration which derives from fewer postulates or hypotheses". Ptolemy stated something similar in the Almagest. Maimonides invoked it in the 12th century. Ockham's contribution was to wield the principle systematically against the proliferating ontological commitments of late scholastic theology — angels, essences, real qualities, intelligible species — in service of his nominalist program.
The blade-shaped metaphor came later. The phrase "Ockham's Razor" appears in print in 1852 in the work of Sir William Hamilton; the catchy Latin formula entia non sunt multiplicanda praeter necessitatem was probably coined around 1639 by the Irish Franciscan John Punch in his commentary on Duns Scotus. Ockham himself wrote, more soberly, frustra fit per plura quod potest fieri per pauciora — "it is futile to do with more what can be done with fewer".
Worked example: Neptune vs Vulcan
In 1846, anomalies in the orbit of Uranus suggested two candidate hypotheses. Hypothesis A: there is an unseen planet beyond Uranus tugging on it. Hypothesis B: Newton's law of gravitation needs to be modified at large distances. Both could fit the data. The Razor recommends A because A introduces one new entity (a planet) into a system already known to contain planets, while B requires modifying a fundamental law that has worked everywhere else. Le Verrier and Adams predicted the new planet's position; Galle pointed his telescope and found Neptune within a degree of the prediction.
Thirteen years later the same trick failed. Mercury's perihelion also precessed anomalously, and in 1859 the same Le Verrier proposed an inner planet, Vulcan, to explain it. Astronomers searched for more than fifty years and found nothing. The correct answer turned out to be hypothesis B after all: Einstein's general relativity in 1915 modified the law of gravitation in strong fields and accounted for the unexplained part of Mercury's precession, about 43 arcseconds per century. Same situation, opposite verdict, because what counts as "necessary" changed when a deeper theory became available. The Razor shaved Vulcan, but only after relativity gave us a competing simpler picture.
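The size of the relativistic correction can be checked with a few lines of arithmetic. The sketch below uses the standard general-relativistic formula for perihelion advance per orbit, 6πGM / (c²a(1 − e²)). The orbital constants are rounded textbook values supplied here for illustration, not figures taken from this article.

```python
import math

# Rounded reference values (assumptions for this sketch, not from the article).
GM_SUN = 1.32712440018e20   # Sun's gravitational parameter, m^3/s^2
C = 299_792_458.0           # speed of light, m/s
A_MERCURY = 5.7909e10       # Mercury's semi-major axis, m
E_MERCURY = 0.2056          # Mercury's orbital eccentricity
PERIOD_DAYS = 87.969        # Mercury's orbital period, days


def gr_precession_arcsec_per_century(gm, a, e, period_days):
    """Relativistic perihelion advance, in arcseconds per Julian century."""
    per_orbit_rad = 6 * math.pi * gm / (C ** 2 * a * (1 - e ** 2))
    orbits_per_century = 36525.0 / period_days
    return math.degrees(per_orbit_rad * orbits_per_century) * 3600


precession = gr_precession_arcsec_per_century(
    GM_SUN, A_MERCURY, E_MERCURY, PERIOD_DAYS
)
print(round(precession, 1))  # close to 43 arcseconds per century
```

With these inputs the result comes out at roughly 43 arcseconds per century, the part of Mercury's precession that Newtonian perturbations from the known planets left unexplained. No extra planet is needed to make up the difference.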
Razor vs other explanatory virtues
| Virtue | What it prizes | Failure mode |
|---|---|---|
| Parsimony (Occam) | Fewer entities, fewer free parameters | May shave away genuine but unfamiliar structure |
| Predictive accuracy | Fit to observation, especially novel predictions | Overfitting to existing data |
| Falsifiability (Popper) | Risky, refutable claims | Some true theories are hard to falsify directly |
| Explanatory scope | Unifies many disparate phenomena | Can be satisfied by vacuous "explains everything" |
| Coherence | Consistency with established theory | Conservative — punishes genuine paradigm shifts |
| Fruitfulness | Generates new research questions | Hindsight-only; cannot be assessed in advance |
| Inference to the best explanation | Whichever hypothesis would, if true, most likely produce the evidence | "Best" is comparative; the truth may not be on the list |
Working scientists trade these virtues against each other. The Razor is one weight in a multi-criterion decision, never the whole balance.
Formalizations: how to count "simplicity"
"Simpler" is suggestive but vague — and twentieth-century philosophy of science put real work into making it precise.
- Free-parameter counting. A polynomial of degree 2 has three coefficients; degree 5 has six. Among curves that fit the data equally well, the lower-degree one is simpler.
- Akaike Information Criterion (Akaike 1973). Penalize log-likelihood by twice the number of parameters: AIC = 2k − 2 ln L, where k is the number of free parameters and L is the maximized likelihood. Minimum AIC wins. The Bayesian Information Criterion (BIC) and the Minimum Description Length (MDL) principle are kindred frameworks.
- Kolmogorov complexity. Define the complexity of a hypothesis as the length of the shortest computer program that generates it; shorter means simpler. Solomonoff induction makes this rigorous (and uncomputable).
- Bayesian Occam factor. A model with more free parameters must spread its prior probability over a larger possibility space. When data arrives, the marginal likelihood automatically penalizes diffuse, over-flexible models in favour of tight, restrictive ones — even with a flat prior over models. This was developed by Harold Jeffreys, popularized by Edwin Jaynes, and given its modern form by David MacKay in Information Theory, Inference, and Learning Algorithms (2003).
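The parameter-counting and AIC items above can be made concrete with a small sketch. It fits polynomials of degree 1 to 5 to data generated from a quadratic and scores each with AIC = 2k − 2 ln L under Gaussian errors. Everything in it is an illustrative assumption: the data set, the helper names `solve`, `fit_rss` and `aic`, and the fixed alternating ±0.5 perturbation that stands in for noise so the run is reproducible. The least-squares fit uses plain normal equations, which is adequate for low degrees on [-1, 1] but not a recommended numerical method in general.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_rss(xs, ys, degree):
    """Least-squares polynomial fit; returns the residual sum of squares."""
    k = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    coef = solve(ata, aty)
    return sum(
        (y - sum(c * x ** i for i, c in enumerate(coef))) ** 2
        for x, y in zip(xs, ys)
    )


def aic(rss, n, degree):
    """AIC = 2k - 2 ln L for Gaussian errors with fitted variance."""
    k = degree + 2  # polynomial coefficients plus the noise variance
    log_l = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    return 2 * k - 2 * log_l


n = 40
xs = [-1 + 2 * i / (n - 1) for i in range(n)]
# Quadratic truth plus a fixed alternating perturbation standing in for noise.
ys = [1 + 2 * x + 3 * x * x + 0.5 * (-1) ** i for i, x in enumerate(xs)]

rss_by_degree = {d: fit_rss(xs, ys, d) for d in range(1, 6)}
aic_by_degree = {d: aic(rss, n, d) for d, rss in rss_by_degree.items()}
best_degree = min(aic_by_degree, key=aic_by_degree.get)
print(best_degree)
```

On this data the residual error keeps shrinking slightly as the degree rises, yet the minimum AIC lands on degree 2. The straight line pays heavily for missing the curvature, and the higher degrees do not improve the fit enough to cover their 2-per-parameter penalty.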
Counterargument: the Anti-Razor and Newton's caveat
Walter of Chatton, a 14th-century rival of Ockham, proposed an explicit Anti-Razor: "if three things are not enough to verify an affirmative proposition about things, a fourth must be added, and so on". The point was that the Razor cannot tell you when you have too few entities. Reality has the resolution it has, and the universe will not subsidise our preference for short explanations.
Newton himself, in the first of his Rules of Reasoning in Philosophy in Book III of the Principia, gave the Razor in a hedged form: "we are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances". The "sufficient" clause is the safety catch. A saying commonly attributed to Einstein puts it more crisply: "everything should be made as simple as possible, but no simpler".
The deeper anti-Razor argument comes from Nelson Goodman's grue puzzle and from underdetermination of theory by data. Goodman showed that any finite body of observations is compatible with infinitely many "simple" generalizations using ad-hoc predicates ("grue" = green-before-time-T-and-blue-after). Choosing among them already requires a prior judgement about which predicates are natural — a judgement Occam's Razor cannot supply on its own.
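The grue point can be shown in miniature. The sketch below uses the simplified definition given above (green before time T, blue afterwards). The cutoff value and function names are hypothetical, chosen only for illustration.

```python
T = 100  # hypothetical cutoff time, chosen only for illustration


def always_green(t):
    """Hypothesis 1: emeralds are green at every time."""
    return "green"


def grue(t):
    """Hypothesis 2 (simplified grue): green before T, blue from T onward."""
    return "green" if t < T else "blue"


# Every emerald examined so far (all before T) was green.
observations = [(t, "green") for t in range(T)]


def fits(hypothesis):
    """True if the hypothesis agrees with every recorded observation."""
    return all(hypothesis(t) == colour for t, colour in observations)


both_fit = fits(always_green) and fits(grue)
diverge = always_green(T) != grue(T)
print(both_fit, diverge)  # True True
```

Both hypotheses match every observation, and each is a one-line rule, yet they disagree about the very next emerald. Nothing in the data or in a count of lines picks between them; the preference for "green" has to come from elsewhere.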
Variants and rivals
- Hanlon's Razor — never attribute to malice that which can be adequately explained by stupidity. A social-reasoning cousin.
- Hitchens's Razor — what can be asserted without evidence can be dismissed without evidence.
- Hickam's Dictum (medical) — patients can have as many diseases as they please, an explicit anti-parsimony principle for differential diagnosis where multiple comorbidities are common.
- Sagan Standard — extraordinary claims require extraordinary evidence. Treats the Razor's parsimony cost as an evidential threshold.
- Crabtree's Bludgeon — no set of mutually inconsistent observations is so complex that the human mind cannot construct from it a coherent theory. A satirical inversion warning that simplicity is too easy to fake.
Common confusions
- "Simplest" ≠ "easiest to imagine". The geocentric model felt simpler because we are on Earth and see the sky move. It was not actually simpler in equations — Ptolemy needed dozens of epicycles. Familiarity is not parsimony.
- The Razor does not select among unequal hypotheses. If theory A makes sharper predictions than theory B, the Razor is irrelevant — predictive accuracy already decides.
- It is not anti-pluralism. Ockham was a nominalist, not a monist. He happily admitted as many entities as the evidence demanded — what he refused was entities the evidence did not demand.
- Counting matters. "One creator who made everything" is not automatically simpler than "many particles obeying laws" — the creator-hypothesis hides enormous complexity inside a single name. Parsimony tracks ontological commitment, not vocabulary length.
Frequently asked questions
Did William of Ockham actually say "entities should not be multiplied beyond necessity"?
Not in those exact words. The Latin formulation entia non sunt multiplicanda praeter necessitatem was coined later, most likely by the Irish Franciscan John Punch around 1639. Ockham's own phrasings include pluralitas non est ponenda sine necessitate (plurality is not to be posited without necessity) and frustra fit per plura quod potest fieri per pauciora (it is futile to do with more what can be done with fewer).
Is Occam's Razor a law of logic?
No. It cannot be derived from pure logic and reality offers no guarantee that simpler is truer. It is a defeasible heuristic — useful when two theories make the same predictions, but easily overturned by new evidence. Quantum mechanics, plate tectonics, and the chemical periodic table all replaced "simpler" predecessors that turned out to be wrong.
What does "simpler" mean — fewer assumptions or fewer entities?
Both interpretations exist. Ockham himself targeted entities, particularly the Platonic universals he wished to nominalize away. Modern usage focuses on assumptions, free parameters, or Kolmogorov complexity. In practice these measures roughly agree: a theory positing fewer kinds of thing usually also makes fewer independent claims.
How does Occam's Razor relate to Bayesian reasoning?
Bayesian model comparison contains an automatic Occam factor: a model with more free parameters spreads its predictive probability over a wider range of possible data sets, so when the data actually arrive, the simpler model has concentrated more of its probability on what was observed and wins the posterior comparison, provided both models fit. This is sometimes called the Bayesian Occam's Razor.
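A coin-flip comparison shows the Occam factor at work. In this minimal sketch the "simple" model says the coin is fair, with no free parameter. The "flexible" model lets the bias p take any value, with a uniform prior on [0, 1]. Both models and the function names are assumptions made for illustration. The flexible model's evidence for a particular sequence with h heads in n flips is the integral of p^h (1 − p)^(n − h) over p, which equals 1 / ((n + 1) · C(n, h)).

```python
from fractions import Fraction
from math import comb


def evidence_fair(n, h):
    """Probability of one specific sequence of n flips under a fair coin."""
    return Fraction(1, 2 ** n)


def evidence_flexible(n, h):
    """Marginal likelihood of the same sequence with p uniform on [0, 1].

    The integral of p^h (1-p)^(n-h) dp equals 1 / ((n + 1) * C(n, h)).
    """
    return Fraction(1, (n + 1) * comb(n, h))


def bayes_factor(n, h):
    """Evidence ratio, fair model over flexible model."""
    return evidence_fair(n, h) / evidence_flexible(n, h)


balanced = bayes_factor(20, 11)   # 11 heads in 20 flips
lopsided = bayes_factor(20, 18)   # 18 heads in 20 flips
print(float(balanced), float(lopsided))
```

With 11 heads in 20 flips both models fit, and the fair coin is favoured by roughly 3.4 to 1, because the flexible model spent prior probability on biases the data never called for. With 18 heads the fit itself differs, and the flexible model wins by about 260 to 1. The penalty for flexibility is automatic but easily outweighed by evidence.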
When should I distrust Occam's Razor?
When the simpler theory only looks simpler because you have ignored data, when "simple" is being used to mean "familiar", or when a more complex theory makes sharper predictions. The history of science is full of cases: Tycho's geocentric system was technically simpler than Kepler's ellipses but lost on predictive power; phlogiston was conceptually simpler than oxygen chemistry but wrong.
Is the Razor anti-religious?
Ockham was a Franciscan friar; the Razor was originally a tool of theology, used to argue that God could achieve His ends with the smallest possible ontology. Modern atheists sometimes invoke it against deities, but the Razor itself takes no stand on what counts as a "necessary" entity — that judgment is prior to the heuristic.