Portfolio Theory

Kelly Criterion

The exact bet fraction f* = (bp − q)/b that maximizes long-run wealth — and the geometric reason why betting more makes you broke

John Kelly Jr.'s 1956 Bell Labs formula gives the bet fraction that maximizes long-run log-wealth growth: f* = (bp − q)/b. For a 60% win probability at even money, that means betting 20% of bankroll each round.

  • Author: John L. Kelly Jr. (Bell Labs)
  • Published: Bell System Technical Journal, 1956
  • Formula: f* = (bp − q)/b
  • Even-money case: f* = p − q = 2p − 1
  • Famous user: Ed Thorp (blackjack, Princeton-Newport)
  • In practice: Fractional Kelly (0.25× – 0.5×)

The formula and its derivation

Suppose you have a repeated, independent bet. Each round, you stake a fraction f of your current bankroll. On a win (probability p), you receive net odds b — so a $1 stake returns $b net. On a loss (probability q = 1 − p), you lose the stake. After one round, wealth multiplies by (1 + bf) with probability p or (1 − f) with probability q. After N rounds:

W_N / W_0 = ∏_{i=1}^N M_i ,  where M_i ∈ {1 + bf, 1 − f}

The long-run growth rate is (1/N)·log(W_N/W_0) = (1/N)·Σ log(M_i), which by the law of large numbers converges to E[log(M)] = p·log(1 + bf) + q·log(1 − f). This is the Kelly objective (log is the natural logarithm throughout). Differentiate with respect to f and set to zero:
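The convergence claimed above can be checked numerically. The following is a minimal sketch, assuming the 60/40 even-money bet used later in the article; the function names and the fixed seed are illustrative choices, not part of Kelly's paper. It tracks log-wealth along one simulated path and compares the realized per-round growth with E[log M].

```python
import math
import random


def expected_log_growth(p, b, f):
    """The Kelly objective E[log M] = p*log(1 + b*f) + q*log(1 - f)."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)


def simulate_growth_rate(p, b, f, n_rounds, seed=0):
    """Realized (1/N) * log(W_N / W_0) along one simulated path."""
    rng = random.Random(seed)
    log_wealth = 0.0  # work in logs so long paths neither overflow nor underflow
    for _ in range(n_rounds):
        if rng.random() < p:
            log_wealth += math.log(1 + b * f)  # win: wealth times (1 + b*f)
        else:
            log_wealth += math.log(1 - f)      # loss: wealth times (1 - f)
    return log_wealth / n_rounds


if __name__ == "__main__":
    p, b, f = 0.60, 1.0, 0.20
    target = expected_log_growth(p, b, f)
    for n in (100, 10_000, 100_000):
        realized = simulate_growth_rate(p, b, f, n)
        print(f"N={n:>7}  realized={realized:+.4f}  E[log M]={target:+.4f}")
```

Short paths can land far from the limit; the gap typically narrows like 1/√N, which is why the Kelly argument is an asymptotic one.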

d/df [p·log(1+bf) + q·log(1−f)] = pb/(1+bf) − q/(1−f) = 0

Solving gives f* = (bp − q)/b. Equivalently, f* = (p(b+1) − 1)/b: the expected net gain per dollar staked (the edge, bp − q) divided by the net odds b.
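The algebra between the first-order condition and the closed form is short. Written out with only the symbols already defined (and p + q = 1), it goes roughly as follows:

```latex
\begin{aligned}
\frac{pb}{1+bf} &= \frac{q}{1-f} \\
pb\,(1-f) &= q\,(1+bf) \\
pb - q &= bf\,(p+q) = bf \\
f^{*} &= \frac{bp-q}{b}
\end{aligned}
```

The second derivative, −pb²/(1+bf)² − q/(1−f)², is negative for every f in [0, 1), so the stationary point is a maximum.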

A worked example: 60/40 even-money

You have a bet where p = 0.60 (win probability), q = 0.40, and b = 1 (even-money payoff). Kelly says:

f* = (1 × 0.60 − 0.40) / 1 = 0.20

Bet 20% of bankroll each round. Starting at $1,000, after one round you're at $1,200 (win) or $800 (loss). Expected log-growth per round:

g(0.20) = 0.6·log(1.2) + 0.4·log(0.8) = 0.6·0.182 + 0.4·(−0.223) = 0.020

About two percent per round, compounding. Over 100 rounds the typical (median) bankroll grows by a factor of e^(100 × 0.0201) = e^2.01 ≈ 7.5×. Now compare with betting less (under-Kelly) or more (over-Kelly), including 40%, which is 2× full Kelly:

f               | Round outcome (win) | Round outcome (loss) | E[log-growth]/round | 100-round factor
0.05 (¼ Kelly)  | +5.0%               | −5.0%                | 0.0088              | ≈ 2.40×
0.10 (½ Kelly)  | +10.0%              | −10.0%               | 0.0150              | ≈ 4.50×
0.20 (Kelly)    | +20.0%              | −20.0%               | 0.0201              | ≈ 7.49×
0.30            | +30.0%              | −30.0%               | 0.0147              | ≈ 4.37×
0.40 (2× Kelly) | +40.0%              | −40.0%               | −0.0024             | ≈ 0.78× (roughly no growth)
0.50 (over)     | +50.0%              | −50.0%               | −0.0340             | ≈ 0.033× (loses money)
0.80            | +80.0%              | −80.0%               | −0.2911             | ≈ 2.3 × 10^−13 (ruin)
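The growth column can be recomputed directly from g(f) = p·log(1 + bf) + q·log(1 − f), and the 100-round factor is exp(100·g). The sketch below assumes the 60/40 even-money bet; `log_growth` and `growth_table` are illustrative names.

```python
import math


def log_growth(f, p=0.60, b=1.0):
    """Expected log-growth per round: p*log(1 + b*f) + (1 - p)*log(1 - f)."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)


def growth_table(fractions, p=0.60, b=1.0, rounds=100):
    """Return (f, g(f), exp(rounds * g(f))) for each bet fraction f."""
    rows = []
    for f in fractions:
        g = log_growth(f, p, b)
        rows.append((f, g, math.exp(rounds * g)))
    return rows


if __name__ == "__main__":
    for f, g, factor in growth_table([0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.80]):
        print(f"f={f:.2f}  g={g:+.4f}  100-round factor={factor:.3g}")
```

The factor here is exp(rounds × expected log-growth), which describes the typical path and is not the same thing as the expected bankroll.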

Two things to notice. First, the growth curve is concave with a peak at f = 0.20 exactly. Second, at about 2× Kelly expected log-growth has fallen back to roughly zero, and beyond it growth is negative: the typical bankroll shrinks over time despite a 60% win rate and a positive expected value on every bet. The 80% bet leads to near-certain ruin.
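The gap between "positive expected value" and "typical path shrinks" can be made concrete by comparing the mean and the median wealth multiple. This is a sketch under stated assumptions: independent rounds, the 60/40 even-money bet as defaults, and rounds × p being a whole number so that the median number of wins is exactly rounds × p. The function names are illustrative.

```python
def mean_wealth_factor(f, p=0.60, b=1.0, rounds=100):
    """Expected wealth multiple: each round multiplies E[W] by 1 + f*(b*p - q)."""
    q = 1 - p
    return (1 + f * (b * p - q)) ** rounds


def median_wealth_factor(f, p=0.60, b=1.0, rounds=100):
    """Wealth multiple on the path with the median number of wins.

    Assumes rounds * p is a whole number, in which case the binomial
    median number of wins equals rounds * p.
    """
    wins = round(rounds * p)
    return (1 + b * f) ** wins * (1 - f) ** (rounds - wins)


if __name__ == "__main__":
    for f in (0.20, 0.50, 0.80):
        print(f"f={f:.2f}  mean={mean_wealth_factor(f):.3g}  "
              f"median={median_wealth_factor(f):.3g}")
```

At f = 0.50 the mean multiple is large while the median is far below 1: the average is carried by a handful of very lucky paths, which is the sense in which over-betting "makes you broke" even with an edge.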

Kelly vs other position-sizing rules

Property | Fixed dollar | Fixed fraction (non-Kelly) | Kelly | Half-Kelly | Mean-variance / Merton
Sizing rule | Same $ each bet | Same % each bet | (bp−q)/b each bet | 0.5 × Kelly | μ/(γσ²) for power utility γ
Account for edge? | No | Implicit | Yes (edge in numerator) | Yes (half-strength) | Yes
Long-run growth | Linear (eventually) | Below Kelly | Maximum | ~75% of Kelly | Maximum at γ=1
Variance / drawdown | Low | Moderate | High | Moderate | Adjustable via γ
Sensitivity to edge mis-estimation | Low | Low | High (exponential damage) | Moderate | Scales with 1/γ
Used by | Conservative retail | Index funds (rebalance to fixed weight) | Pure quant (theoretical) | Thorp, professionals | Pension funds, robo-advisors
First-principles foundation | None | Heuristic | E[log W] max (Kelly 1956) | Same, dampened | E[u(W)] for general u

Bell Labs to Las Vegas to Wall Street

John Larry Kelly Jr. was a 32-year-old Bell Labs physicist working on information theory when he derived the formula. The 1956 paper "A New Interpretation of Information Rate" was written as an information-theoretic exercise: Kelly observed that, at fair odds, the growth rate of a horse-racing bankroll under optimal sizing equals the mutual information, in Shannon's sense, between the bettor's inside information and the race outcomes. Claude Shannon, his Bell Labs colleague and the founder of information theory, suggested the gambling framing. The paper has 1,800+ citations and seeded both quantitative finance and information theory's gambling-theoretic interpretation.

Kelly died of a stroke in 1965 at age 41 without seeing his formula go mainstream. Edward O. Thorp, an MIT mathematician working on blackjack card-counting, found Kelly's paper and applied it to casino play in his 1962 best-seller Beat the Dealer. Thorp later co-founded Princeton-Newport Partners (1969–1988), the first true statistical-arbitrage hedge fund, which returned roughly 19% a year before fees for nearly two decades using Kelly-sized positions on small-edge trades.

Kelly's formula re-entered academic finance through Mossin (1968), Markowitz (1976), and the consumption-CAPM literature. In continuous time it becomes Merton's optimal-portfolio share (1969): f* = μ/(γσ²) for power utility u(W) = W^(1−γ)/(1−γ), reducing to Kelly when γ = 1 (log utility). Modern quant funds such as Renaissance, Two Sigma, and D.E. Shaw implicitly or explicitly use Kelly-derived sizing on each independent signal, then aggregate.

Continuous-time Kelly (the Merton share)

Suppose returns follow a Brownian motion with drift μ and volatility σ. The continuous-time analogue of Kelly is the Merton share:

f* = μ / σ²    =  Sharpe ratio / σ

where μ is the drift in excess of the risk-free rate r, so the Sharpe ratio is μ/σ. A strategy with Sharpe 1.0 and 20% vol calls for 5× leverage (1.0/0.20); Sharpe 2.0 at the same vol calls for 10×. The Kelly objective in continuous time becomes the excess geometric growth rate μ·f − (σ·f)²/2, maximized at f = μ/σ². The half-variance term is why over-betting hurts: volatility drag eats arithmetic returns.
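As a quick numerical check of the continuous-time formulas, the sketch below evaluates the share and the growth rate for a few Sharpe ratios at 20% volatility. It assumes the drift is quoted in excess of the risk-free rate; `merton_share` and `excess_growth_rate` are illustrative names, and the `gamma` parameter is the power-utility risk aversion, with gamma = 1 giving Kelly.

```python
def merton_share(excess_drift, sigma, gamma=1.0):
    """Optimal leverage excess_drift / (gamma * sigma**2); gamma = 1 is Kelly."""
    return excess_drift / (gamma * sigma ** 2)


def excess_growth_rate(f, excess_drift, sigma):
    """Geometric growth above the risk-free rate: f*mu - (f*sigma)**2 / 2."""
    return f * excess_drift - 0.5 * (f * sigma) ** 2


if __name__ == "__main__":
    sigma = 0.20
    for sharpe in (0.5, 1.0, 2.0):
        mu = sharpe * sigma  # excess drift implied by this Sharpe ratio
        f_star = merton_share(mu, sigma)
        print(f"Sharpe={sharpe:.1f}  f*={f_star:.2f}  "
              f"growth at f*={excess_growth_rate(f_star, mu, sigma):.3f}  "
              f"growth at 2f*={excess_growth_rate(2 * f_star, mu, sigma):.3f}")
```

At f* the excess growth rate equals (Sharpe ratio)²/2, and at 2f* it is exactly zero, which is the continuous-time version of the 2× Kelly no-growth point.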

Variants and extensions

  • Fractional Kelly (Thorp 1969). Bet kf* for k ∈ (0,1). Half-Kelly (k=0.5) captures ~75% of growth with half the volatility (a quarter of the variance) and is far more robust to errors in estimating p and b. The professional default.
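  • Multi-bet Kelly. When N bets are available at once, solve for the joint allocation that maximizes E[log(1 + Σ f_i·R_i)]. In the continuous-time (Gaussian) approximation the solution is closed-form: each independent bet gets its own Kelly fraction μ_i/σ_i², and with correlated bets the vector of fractions is Σ⁻¹μ, where Σ is the covariance matrix.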
  • Continuous Kelly with stochastic volatility. When σ is itself random, the Merton share becomes f* = E[μ]/Var[μ·dt + σ·dW], picking up additional precautionary terms.
  • Tail-risk Kelly (Vince 1990, MacLean-Thorp-Ziemba 2010). Adjustments for non-Gaussian return distributions; explicit drawdown constraints add risk-off triggers.
  • Generalized Kelly under uncertainty. Robust Kelly bets the lower bound of a confidence interval for p — protects against edge mis-estimation explicitly.
  • Pure information-theoretic Kelly (Algoet-Cover 1988). The fundamental limit of any betting strategy is the mutual information between side information and outcomes — Kelly bets achieve it.
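Two of the variants above, fractional Kelly and robust Kelly, are simple enough to sketch in a few lines. The code below is illustrative only: the function names are hypothetical, and the robust version uses a plain normal-approximation (Wald) lower bound on p with a one-sided 95% z-value, which is an assumption made here for brevity and not a method the sources above prescribe.

```python
import math


def kelly_fraction(p, b):
    """Full-Kelly fraction (b*p - q)/b, floored at zero (no edge, no bet)."""
    return max(0.0, (b * p - (1 - p)) / b)


def log_growth(f, p, b):
    """Expected log-growth per round at bet fraction f."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)


def fractional_kelly(p, b, k=0.5):
    """Bet k times the full-Kelly fraction, with 0 < k <= 1."""
    return k * kelly_fraction(p, b)


def robust_kelly(wins, trials, b, z=1.645):
    """Kelly sized on a lower confidence bound for p.

    Assumption: a Wald (normal-approximation) bound, chosen for simplicity.
    """
    p_hat = wins / trials
    p_low = p_hat - z * math.sqrt(p_hat * (1 - p_hat) / trials)
    return kelly_fraction(p_low, b)


if __name__ == "__main__":
    p, b = 0.60, 1.0
    full, half = kelly_fraction(p, b), fractional_kelly(p, b, 0.5)
    share = log_growth(half, p, b) / log_growth(full, p, b)
    print(f"full={full:.3f}  half={half:.3f}  growth retained={share:.1%}")
    print(f"robust Kelly after 60 wins in 100 trials: {robust_kelly(60, 100, b):.3f}")
```

With only 100 observations the lower bound on p sits well below the point estimate, so the robust bet comes out much smaller than full Kelly; it approaches full Kelly as the sample grows.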

Real-world applications

  • Quant trading. Renaissance Medallion is widely believed to size positions via Kelly-like rules on each independent alpha signal. Reported gross returns of ~66% annually since 1988.
  • Sports betting. Bill Benter's Hong Kong horse-racing fund used fractional Kelly to extract an estimated $1 billion over 1984–2006. Haralabos Voulgaris built one of NBA gambling's biggest bankrolls in the 2000s using similar logic.
  • Blackjack card-counting. Ed Thorp's original 1962 application. Counters scale bet size with the true count (the running count adjusted for decks remaining), approximating Kelly on each hand's edge.
  • Venture capital portfolio sizing. Some VCs explicitly use Kelly-derived allocation across deals; concentrated portfolios at high-conviction stages reflect the formula's tendency to load up when edge is large.
  • Insurance underwriting. Lloyd's syndicates implicitly Kelly-size individual underwriting decisions — when the loss ratio implies an edge, larger exposure is acceptable.
  • Cryptocurrency leverage. Perpetual futures traders on Binance, dYdX, GMX use Kelly-style sizing on signal strength; over-leveraging is what kills retail traders during volatility spikes.
  • Personal finance heuristic. Stock-allocation rules of thumb ("100 minus age") implicitly approximate fractional Kelly given long-run equity Sharpe ratios and life-cycle risk aversion.

Common pitfalls and critiques

  • Estimating p too aggressively. Kelly is brutally sensitive to overestimating edge: at even money, a perceived 60% win probability that is really 55% means betting 20% when the true Kelly fraction is 10%, which is 2× Kelly and roughly zero long-run growth. Use fractional Kelly, always.
  • Treating one big bet as repeated. Kelly's optimality is asymptotic; in finite samples with one bet, expected-utility maximization (with a non-log u) gives different sizing.
  • Forgetting correlations across bets. If your N "independent" trades are actually correlated, summed Kelly allocations exceed the joint optimum. De-correlate before summing.
  • Ignoring transaction costs. Continuous rebalancing to maintain a fixed Kelly fraction incurs costs that eat the edge for small-edge / high-volatility strategies.
  • Confusing Kelly with risk-neutral profit maximization. Kelly grows wealth fastest in the long run, but in any short window expected dollar return is higher under more aggressive sizing — at the cost of much higher variance and ruin probability.
  • Ignoring the drawdown distribution. Full Kelly has expected maximum drawdown of roughly 50% over a 100-bet horizon; many funds cannot survive that. Half-Kelly cuts the expected drawdown to ~25% with only ~25% loss of growth.
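The first pitfall is easy to quantify: size the bet from a believed win probability, then evaluate growth under the true one. A minimal sketch, assuming an even-money bet by default; `growth_when_wrong` is an illustrative name and `k` is the fractional-Kelly multiplier.

```python
import math


def growth_when_wrong(p_true, p_believed, b=1.0, k=1.0):
    """True expected log-growth when betting k * Kelly on a believed p."""
    f = k * max(0.0, (b * p_believed - (1 - p_believed)) / b)
    return p_true * math.log(1 + b * f) + (1 - p_true) * math.log(1 - f)


if __name__ == "__main__":
    for k in (1.0, 0.5, 0.25):
        g = growth_when_wrong(p_true=0.55, p_believed=0.60, k=k)
        print(f"k={k:.2f}  true growth per round={g:+.5f}")
```

In this example half-Kelly on the inflated estimate happens to land on the true Kelly fraction, which is one way to see why fractional sizing acts as insurance against overestimating the edge.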

Frequently asked questions

What is the Kelly formula?

For a repeated bet that pays net odds b for a $1 stake on a win, with win probability p and loss probability q = 1 − p, the optimal fraction of bankroll to bet is f* = (bp − q)/b = (p(b+1) − 1)/b. For an even-money bet (b = 1), this simplifies to f* = 2p − 1 = p − q, which is the edge. Example: 60% win / 40% loss, even money → f* = 0.60 − 0.40 = 0.20, or 20% of bankroll. Bet more and long-run growth falls. Bet 2× Kelly (40% here) and the expected log-return falls to roughly zero: the typical bankroll does no better than break even.

Why log utility?

Kelly's derivation maximizes the expected logarithm of wealth after each bet. That's equivalent to maximizing the long-run geometric growth rate, by the law of large numbers: lim (1/N) log(W_N / W_0) = E[log(1 + f·R)] almost surely. So Kelly is the unique strategy that asymptotically maximizes wealth almost surely — even more compelling than any one-shot expected-utility argument. It's also identical to Bernoulli's 1738 St. Petersburg solution applied repeatedly, which is no coincidence — Kelly was unaware of Bernoulli's paper but rederived the same logic.

Should I always use full Kelly?

Almost never in practice. Full Kelly assumes you know p and b exactly. Real estimates have error, and overestimating your edge leads to over-betting, which destroys growth faster than under-betting does. Most practitioners use fractional Kelly: bet 0.25× to 0.5× of Kelly. Half-Kelly captures about 75% of the growth rate with about half the volatility (a quarter of the variance). Ed Thorp, who applied Kelly to blackjack card-counting in the 1960s, recommended 'less than half' for typical errors in edge estimation. Variable Kelly approaches (Vince 1990, MacLean-Thorp-Ziemba 2010) add adaptive guards.

What's the downside of over-betting?

The expected log-growth function g(f) = p·log(1 + bf) + q·log(1 − f) is concave in f, peaks at f* = (bp−q)/b, and crosses zero close to f = 2f* (the no-growth fraction; exactly 2f* in the continuous-time limit, and slightly below it for the even-money bet used here). Beyond that point the expected log-growth is negative: you lose money in the long run despite a positive-EV edge. Worse, drawdowns deepen rapidly as f rises. For a 60/40 even-money bet, betting 40% of bankroll each round gives long-run growth of roughly zero (about −0.2% per round); betting 50% gives clearly negative expected log-growth (about −3.4% per round); betting 80% is near-certain ruin within a few hundred rounds even though each bet has positive expected value.
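The no-growth fraction has no tidy closed form for discrete bets, but it is easy to locate numerically. The sketch below uses bisection on the interval between f* and 1, assuming a positive edge (so that g is positive at f* and negative near 1); the function names and tolerance are illustrative choices.

```python
import math


def log_growth(f, p=0.60, b=1.0):
    """Expected log-growth per round at bet fraction f."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)


def zero_growth_fraction(p=0.60, b=1.0, tol=1e-10):
    """Bisection for the f above f* at which g(f) = 0. Requires b*p > 1 - p."""
    f_star = (b * p - (1 - p)) / b
    lo, hi = f_star, 1.0 - 1e-12  # g(lo) > 0, and g(f) -> -inf as f -> 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_growth(mid, p, b) > 0:
            lo = mid  # still growing: the root lies to the right
        else:
            hi = mid  # already shrinking: the root lies to the left
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    f0 = zero_growth_fraction()
    print(f"zero-growth fraction: {f0:.4f}  (2 * f* = 0.4000)")
```

For the 60/40 even-money bet the root comes out at roughly 0.39, a little short of 2f* = 0.40, so "2× Kelly" is best read as a close rule of thumb for where growth disappears.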

Who has used Kelly in practice?

Ed Thorp pioneered it in casino blackjack (Beat the Dealer, 1962) and later in Princeton-Newport Partners, which compounded at roughly 19% a year before fees from 1969 to 1988 using Kelly-sized statistical-arbitrage trades. Warren Buffett's portfolio sizing is qualitatively Kelly-like: concentrated positions sized to expected edge. Jim Simons's Renaissance Technologies is widely believed to use Kelly-derived sizing on its short-horizon signals. In sports betting, Bill Benter's horse-racing fund used fractional Kelly to extract an estimated $1B from Hong Kong racing markets from 1984 to 2006. Haralabos Voulgaris built one of NBA betting's largest bankrolls in the 2000s using a similar approach.

How does Kelly relate to mean-variance and Sharpe ratio?

For continuous trading with Brownian returns of excess drift μ (drift above the risk-free rate) and volatility σ, Kelly's continuous-time leverage is f* = μ/σ², also called the Merton share. Equivalently f* = Sharpe ratio / σ. A Sharpe-1 strategy with 20% vol calls for about 5× leverage; a Sharpe-2 strategy at the same vol calls for about 10×. Mean-variance optimization gives an identical answer for Gaussian returns and log utility. Kelly differs from mean-variance only when distributions are fat-tailed (Kelly's geometric growth penalizes drawdowns more) or when utility is non-log. At the optimal leverage the excess growth rate is (Sharpe ratio)²/2, so a higher Sharpe ratio means a higher achievable Kelly growth rate.