Combinatorics

Pigeonhole Principle

Put 11 pigeons in 10 holes — some hole has 2 — the simplest counting truth

The pigeonhole principle says — if you put n+1 objects into n boxes, at least one box has at least 2 objects. Trivial-sounding, profound in implications. Used to prove that any group of 367 people must include two with the same birthday, that any sequence of n²+1 distinct integers contains a monotonic subsequence of length n+1, and many "must exist" results in combinatorics, number theory, and computer science.

  • Statementn+1 objects in n boxes → some box has ≥ 2 objects
  • Generalized formkn+1 objects in n boxes → some box has ≥ k+1 objects
  • Famous useErdős–Szekeres theorem (monotonic subsequences)
  • Hash collision implicationsWith more keys than slots, collisions are guaranteed
  • Birthday paradoxPigeonhole guarantees collision; probability is much earlier
  • First explicit statementDirichlet (1834) — also called Dirichlet's box principle

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

The principle

If you place n+1 objects into n boxes, at least one box must contain at least 2 objects.

Or more generally — if you place kn+1 objects into n boxes, at least one box must contain at least k+1 objects.

"Pigeonhole" comes from the metaphor — n+1 pigeons returning to n nesting holes; some hole must have two birds.

Why it's true

Suppose, for contradiction, that every box has at most one object. Then the total number of objects is at most n (one per box). But we have n+1 objects — contradiction. So at least one box has at least 2 objects.

The generalized form follows similarly — if every box had at most k objects, the total would be at most kn. With kn+1 objects, some box must have at least k+1.

The proof is trivial. The applications are non-trivial.

Famous applications

Birthday matching — large group

Among any 367 people, at least two share a birthday. Why — only 366 possible birthdays (including February 29). By pigeonhole, two people fall in the same "birthday box."

This is a cleaner version of the famous birthday paradox. The paradox itself shows that with just 23 people, the probability of a shared birthday exceeds 50% — a separate (probabilistic) argument. Pigeonhole gives a guarantee at 367; probability gives a quick rise much earlier.

Hair count in NYC

NYC population > 8,000,000. Maximum hair count on a human head ~ 150,000. By pigeonhole, at least two people in NYC have exactly the same number of hairs.

Five points in a unit square

Among any 5 points in a unit square, at least two are within distance √2/2 ≈ 0.707 of each other.

Proof — divide the square into 4 quadrants of side 0.5. By pigeonhole, two points fall in the same quadrant. The diagonal of a 0.5-side square is √(0.5² + 0.5²) = √0.5 ≈ 0.707. So those two points are at most that far apart.

Erdős–Szekeres theorem

Any sequence of n²+1 distinct real numbers contains a monotonic (increasing or decreasing) subsequence of length n+1.

Proof sketch — for each position i, label it with (a, b) where a is the longest increasing subsequence ending at i, and b is the longest decreasing. If neither subsequence reached length n+1, then a, b ∈ {1, 2, ..., n} — only n² possible labels. With n²+1 positions, two share a label. Show this leads to contradiction with the strict subsequence requirements.

Subset sums

Any 10 distinct positive integers below 100 contain two disjoint subsets with the same sum.

Proof sketch — there are 2¹⁰ = 1024 subsets. The maximum subset sum is at most 91 + 92 + ... + 99 + 100 = 955 (well, not exactly, since they're distinct from 0 to 99). The point — there are more subsets (1024) than possible sums (under 1000). By pigeonhole, two subsets have the same sum. Some adjustments give two DISJOINT subsets with the same sum.

In computer science — hashing

A hash function maps potentially infinite keys to finite slots. Mathematically, this is mapping ∞ pigeons to a finite number of holes — collisions are guaranteed (and infinitely many).

For a hash table with m slots and n keys (n > m), pigeonhole guarantees collision. The birthday-paradox argument shows that with a "good" hash function, expected first collision is around √m keys. So:

  • Hash table with 1 million slots — guaranteed collision after 1,000,001 keys (pigeonhole), expected first collision after ~1,000 keys (birthday).
  • Cryptographic hash with 256 bits — pigeonhole at 2²⁵⁶ + 1, expected first collision at 2¹²⁸ (birthday). The latter is the practical security bound.

JavaScript — pigeonhole demonstrations

// Demonstrate the principle — for n+1 random items in n boxes, some box has duplicates
function pigeonholeDemo(n) {
  const boxes = new Array(n).fill(0);
  for (let i = 0; i < n + 1; i++) {
    const box = Math.floor(Math.random() * n);
    boxes[box]++;
  }
  // By pigeonhole, at least one box has >= 2
  const maxCount = Math.max(...boxes);
  console.log(`Boxes: ${JSON.stringify(boxes)}, max count: ${maxCount}`);
  return maxCount;
}

pigeonholeDemo(5);  // 6 items in 5 boxes — some box has >= 2

// Birthday matching guarantee — 367 people guarantees match
function birthdayMatchExists(people) {
  if (people > 366) return 'Pigeonhole guarantees match';
  // Could still have match by chance; pigeonhole guarantees only at 367+
  return 'Match possible but not guaranteed';
}

// Detect hash collision via pigeonhole
function detectCollision(keys, hashSlots) {
  if (keys > hashSlots) {
    return `Pigeonhole guarantees collision: ${keys} keys, ${hashSlots} slots`;
  }
  return `${keys} keys, ${hashSlots} slots — no guaranteed collision`;
}

console.log(detectCollision(1001, 1000));  // guarantee
console.log(detectCollision(999, 1000));   // no guarantee

Where pigeonhole shows up

  • Combinatorics — existence proofs. "Some configuration must exist" arguments. Erdős-Szekeres, Ramsey theory, and many others use pigeonhole as the core argument.
  • Computer science — hash collisions. Why hash functions need collision handling. Why cryptographic hashes need 256+ bits for security.
  • Number theory — Diophantine approximation. Existence of good rational approximations to irrationals. Continued fraction theory builds on pigeonhole-style arguments.
  • Probability — birthday paradox bounds. Pigeonhole guarantees collision at n+1 people for n possible birthdays. Probability gives the much earlier expected first collision.
  • Geometry — point distribution arguments. "5 points in unit square have two close together," etc.
  • Graph theory — Ramsey-style results. "Any graph with R(s, t) vertices has either a clique of size s or an independent set of size t" — proven via repeated pigeonhole arguments.
  • Real-world non-mathematical examples. "Two people in NYC have the same number of hairs." "In a group of 13, two share a birth-month."

Common mistakes

  • Confusing pigeonhole with probability. Pigeonhole guarantees collision when n+1 > n boxes. Probability talks about expected collisions much earlier (birthday paradox at √n). Don't conflate guarantees with expectations.
  • Forgetting to count the holes correctly. "366 possible birthdays" includes Feb 29 (in leap years). "23 hairs on a head" is wrong; the upper bound is around 150,000. Always state the n carefully.
  • Trying to specify which hole has the duplicate. Pigeonhole says some hole does; it doesn't say which. For uniqueness arguments, use an additional principle.
  • Applying when n+1 ≤ n. Trivially, can't conclude anything if items don't exceed boxes. The principle requires strict majority.
  • Generalized form mis-stated. kn+1 items in n boxes → at least one with k+1, NOT k items. Off-by-one is easy.
  • Mixing up "boxes" and "balls." Always identify what's the pigeon and what's the hole. Items go into boxes; pigeonhole is about boxes (the hole) needing multiple items.

Frequently asked questions

Why is the pigeonhole principle considered profound?

Because despite being obvious, it powers existence proofs that look very deep. "Any 5 points in a unit square have two within distance √2/2" — proved by dividing the square into 4 quadrants (4 holes for 5 pigeons). "Among any 367 people, two share a birthday" — by pigeonhole over 366 possible birthdays. The principle's strength is its generality; it forces an outcome without specifying which.

How is the generalized form proven?

If every box had ≤ k objects, then total objects ≤ kn. But we have kn+1 objects — contradiction. So some box must have at least k+1.

What's the connection to hashing?

A hash function maps n keys to m slots. If n &gt; m, by pigeonhole at least two keys map to the same slot — a "collision." This is why hash tables need collision handling (chaining or open addressing). With m = n bits, you can hash at most n distinct items without guaranteed collision. With m = 2^k bits and a "well-spread" hash, expected first collision is around √(2^k) keys (birthday paradox).

How is this used in number theory?

Diophantine approximation. Among any N integer multiples of an irrational α (modulo 1), at least two differ by less than 1/N. This is proved by pigeonhole — consider the N+1 fractional parts of 0, α, 2α, ..., Nα; they fall in N intervals of length 1/N, so two share an interval. Bound used to prove rationals are dense, irrationals are well-approximable.

What's the Erdős–Szekeres theorem?

Any sequence of (m−1)(k−1)+1 distinct real numbers contains either an increasing subsequence of length m or a decreasing subsequence of length k. Proved by pigeonhole — for each position i, label it (a, b) where a = length of longest increasing subsequence ending at i and b = length of longest decreasing. There are (m−1)(k−1) possible labels for length restrictions; by pigeonhole, with one more position than labels, two share a label, leading to contradiction with strict subsequence requirements.

When does a strong pigeonhole bound apply?

When the distribution is very uniform. The principle says "≥ 1 box has ≥ k+1 objects." If you want to argue that a specific number of boxes have many objects, or about average behavior, the principle alone isn't enough — you need finer combinatorial arguments. Pigeonhole gives existence; quantitative claims need more.

What's a non-obvious pigeonhole application?

The "exists two people in NYC with the same number of hairs on their head." NYC population &gt; 8 million. Maximum hairs on a human head ~ 150,000. By pigeonhole, at least two people share an exact hair count. The number isn't important; just that 8M &gt; 150K.