Sampling Algorithms

Fisher-Yates Shuffle

The only correct way to randomly permute an array

Fisher-Yates produces every one of n! permutations with equal probability in O(n) time using one swap per position. Knuth's algorithm 3.4.2 — the only correct way to shuffle an array.

  • TimeO(n)
  • SpaceO(1) (in place)
  • DistributionUniform over n! permutations
  • Random drawsn − 1 ints from shrinking range
  • StableN/A (every output is a permutation)
  • OriginalFisher & Yates 1938; Durstenfeld 1964 in-place

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How Fisher-Yates works

The algorithm walks the array from the last index down to index 1. At each position i, it picks a random index j ∈ [0, i] and swaps arr[i] with arr[j]. After the swap, position i is locked in — never touched again. The "unshuffled" prefix shrinks by one per step.

The pseudocode:

for i from n − 1 down to 1:
    j ← uniform random integer in [0, i]   (inclusive on both ends)
    swap arr[i] and arr[j]

Exactly n − 1 swaps, exactly n − 1 random draws. The number of possible random sequences is n × (n − 1) × (n − 2) × ... × 2 = n! — the same as the number of permutations. The map from sequences to permutations is a bijection: every permutation corresponds to exactly one sequence of j choices.

Trace: shuffle [A, B, C, D]

The array has length 4, so we make 3 random draws.

Initial: [A, B, C, D]

i = 3, range [0, 3], suppose j = 1
  Swap arr[3]=D with arr[1]=B
  → [A, D, C, B]
  Position 3 locked: B

i = 2, range [0, 2], suppose j = 0
  Swap arr[2]=C with arr[0]=A
  → [C, D, A, B]
  Position 2 locked: A

i = 1, range [0, 1], suppose j = 1 (no actual swap)
  Swap arr[1]=D with arr[1]=D
  → [C, D, A, B]
  Position 1 locked: D
  Position 0 locked by default: C

Final: [C, D, A, B]

Three random draws produced one of 4! = 24 permutations. Every permutation is reachable by exactly one sequence of (j₃, j₂, j₁) choices, and there are 4 × 3 × 2 = 24 such sequences. Bijection confirmed.

Why is it uniform?

The cleanest proof is by induction. Claim: after iteration i, the elements at positions i, i+1, ..., n−1 are a uniform random size-(n−i) sample from the original array.

Base case i = n − 1: position n − 1 holds arr[j] for j uniform in [0, n−1]. So each element appears with probability 1/n — uniform.

Inductive step: assume positions i+1, ..., n−1 are a uniform sample. After swapping arr[i] with arr[j] for j uniform in [0, i], position i now holds a uniform sample from the i + 1 elements still in the prefix, which together with the locked suffix forms a uniform sample of size n − i + 1. Done.

Equivalently: P(specific permutation) = (1/n) × (1/(n−1)) × ... × (1/2) = 1/n! — independent of the input order.

When to use Fisher-Yates

  • Card shuffling. Online poker, blackjack, and casino simulators run Fisher-Yates on a 52-element deck before every hand. Anything else (random sort, partial swaps) is biased and detectable by audits.
  • Randomized algorithms. Quickselect, randomized quicksort, and randomized incremental algorithms (e.g., 2D convex hull) start by shuffling the input to guarantee expected complexity regardless of adversarial ordering.
  • A/B testing bucket assignment. Random ordering of users to control vs. treatment when the assignment is precomputed. For streaming assignment, hash the user ID instead.
  • Cross-validation splits. ML pipelines shuffle the dataset before partitioning into k folds to avoid order bias.
  • Random sample without replacement. Shuffle and take the first k elements — uniform random subset of size k in O(n) time and O(1) extra space.

If you need k uniformly random samples from n with k « n, partial Fisher-Yates is faster: stop after k iterations instead of going all the way. The first k positions are a uniform random size-k sample.

Fisher-Yates vs alternatives

Fisher-YatesRandom sortNaive swap (uniform range)Reservoir sampleSattolo's algorithmRandom projection
TimeO(n)O(n log n)O(n)O(n)O(n)O(n)
SpaceO(1)O(1) or O(n)O(1)O(k)O(1)O(n)
UnbiasedYesNoNoYesYes (but only n-cycles)Yes
StreamingNoNoNoYesNoNo
In placeYesYesYesNoYesNo
Best forFull shuffle, array in memoryNeverNever (biased)Stream of unknown lengthRandom cyclic permutationTagging + sorting tags

The "naive swap" row refers to the common mistake of writing j = random(0, n − 1) instead of j = random(0, i). The fix is one character; the bias is severe and uniformly skews toward "near-identity" permutations.

What "O(n)" means in nanoseconds

Each iteration is one random integer draw plus one swap. On modern hardware with a fast PRNG (xoshiro256, PCG, or V8's Math.random):

nFisher-YatesRandom sort (TimSort + Math.random)Speedup
100~1 µs~30 µs30×
1,000~10 µs~500 µs50×
10,000~100 µs~9 ms90×
100,000~1 ms~150 ms150×
1,000,000~12 ms~2.5 s200×

Random sort is slower for two reasons: it's O(n log n) algorithmically, and the random comparator forces the sort algorithm to do more work because it can't take comparison results at face value (most sort algorithms assume consistent ordering, which a random comparator violates). The gap grows with n in proportion to log n.

JavaScript implementation

// Durstenfeld in-place Fisher-Yates.
function shuffle(arr) {
  for (let i = arr.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));   // [0, i] inclusive
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}

// Crypto-secure variant (cards, gambling, anywhere bias is exploitable)
function secureShuffle(arr) {
  const buf = new Uint32Array(1);
  for (let i = arr.length - 1; i > 0; i--) {
    let j;
    // Rejection sampling to avoid modulo bias
    const max = Math.floor(2**32 / (i + 1)) * (i + 1);
    do { crypto.getRandomValues(buf); } while (buf[0] >= max);
    j = buf[0] % (i + 1);
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}

// Functional variant returning a fresh array
const shuffled = arr => { const c = [...arr]; shuffle(c); return c; };

// Partial shuffle — first k elements are a uniform random sample
function sample(arr, k) {
  const c = [...arr], n = c.length;
  for (let i = 0; i < k; i++) {
    const j = i + Math.floor(Math.random() * (n - i));
    [c[i], c[j]] = [c[j], c[i]];
  }
  return c.slice(0, k);
}

The Math.random() * (i + 1) trick floors to a uniform int in [0, i]. For tiny i this is fine; for large arrays in cryptographic settings, use the rejection-sampling variant to eliminate modulo bias.

Famous problem: shuffle a 52-card deck

function newDeck() {
  const suits = ['♠', '♥', '♦', '♣'];
  const ranks = ['A','2','3','4','5','6','7','8','9','10','J','Q','K'];
  return ranks.flatMap(r => suits.map(s => r + s));
}

const deck = newDeck();
shuffle(deck);
console.log(deck.slice(0, 5));  // → a random 5-card hand

52! ≈ 8 × 10⁶⁷ permutations. Even at 10¹⁵ shuffles per second, the universe wouldn't see half of them before the heat death of the sun. Every Fisher-Yates shuffle of a real deck is almost certainly the first time that exact ordering has ever existed.

A common bug in gambling code: using Math.random() with insufficient PRNG state. V8's Math.random() uses xorshift128+ with 128 bits of internal state, supporting about 2¹²⁸ distinct sequences — far less than 52!. To truly cover the permutation space you need a CSPRNG seeded with at least 226 bits (since ⌈log₂(52!)⌉ = 226).

Python implementation

import random

def fisher_yates(arr):
    for i in range(len(arr) - 1, 0, -1):
        j = random.randint(0, i)        # inclusive on both ends
        arr[i], arr[j] = arr[j], arr[i]
    return arr

# Standard library — uses Fisher-Yates internally
random.shuffle(arr)                     # in-place
random.sample(arr, k)                   # uniform random k elements, no mutation

# Crypto-secure (for cards, lotteries, etc.)
import secrets
def secure_shuffle(arr):
    for i in range(len(arr) - 1, 0, -1):
        j = secrets.randbelow(i + 1)
        arr[i], arr[j] = arr[j], arr[i]
    return arr

# NumPy — fastest for very large arrays
import numpy as np
rng = np.random.default_rng()
rng.shuffle(arr)                        # in-place, Permutation-Congruential
rng.permutation(arr)                    # returns a fresh array

The classic biased shuffle bug

The most common shuffle bug, repeated across StackOverflow answers and even some textbooks:

// WRONG — biased shuffle
function badShuffle(arr) {
  for (let i = 0; i < arr.length; i++) {
    const j = Math.floor(Math.random() * arr.length);   // [0, n) instead of [0, i]
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}

This produces n^n equally likely random sequences. There are only n! permutations. For n = 3, the algorithm has 27 sequences mapping to 6 permutations. 27 is not divisible by 6, so some permutations are reached by more sequences than others. Test by shuffling [A, B, C] one million times:

Bad shuffle results (1,000,000 trials, n=3):
  A B C: 185,300  (expected 166,667)
  A C B: 148,200  (expected 166,667)
  B A C: 148,100  (expected 166,667)
  B C A: 185,500  (expected 166,667)
  C A B: 185,400  (expected 166,667)
  C B A: 147,500  (expected 166,667)

Bias: ~25% over-representation of some permutations.

Even worse is the arr.sort(() => Math.random() - 0.5) pattern. The bias here depends on the sort algorithm's comparison pattern; Chrome's TimSort produces dramatic skew on small inputs (some permutations 6× more likely than others on n = 6).

Sattolo's variant: random cyclic permutation

One-character change to Fisher-Yates produces all and only the n-cycles (permutations with no fixed points), useful for derangements:

function sattolo(arr) {
  for (let i = arr.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * i);   // [0, i) NOT [0, i]
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}

The range is [0, i) instead of [0, i] — so arr[i] never swaps with itself. The result is a uniform random cyclic permutation: 1 → 2 → 3 → ... → n → 1, where each "→" is "moves to the position formerly held by." Useful for Secret Santa drawings (everyone gets exactly one different person).

Common bugs and edge cases

  • Off-by-one in the range. random(0, i) must be inclusive of both endpoints. JavaScript's Math.floor(Math.random() * (i + 1)) handles this; Python's random.randint(0, i) handles it; but random.randrange(0, i) excludes i and yields Sattolo, not Fisher-Yates.
  • Modulo bias. rand() % (i + 1) in C is biased when RAND_MAX isn't a multiple of i + 1. For large i this matters. Use rejection sampling.
  • PRNG state too small for the permutation space. A 32-bit PRNG covers only 2³² ≈ 4 × 10⁹ distinct shuffles, so 13! = 6 × 10⁹ already exceeds what's reachable. For 52-card decks, you need at least 226 bits of seed entropy.
  • Reusing a shuffled deck. After dealing, the remaining cards are still shuffled — but if you re-shuffle only the discards and append them, the result is no longer uniform. Always shuffle the whole deck.
  • Shuffling references vs values. Shuffling an array of objects rearranges references; the objects themselves are unchanged. Don't be surprised if mutations via the original references show through.

Frequently asked questions

Why does the random range have to shrink with i?

If you pick j from [0, n) at every step, the algorithm has n^n equally likely random sequences. There are only n! permutations, and n^n is not divisible by n! for n ≥ 3. By the pigeonhole principle, some permutations get more random sequences than others, so the shuffle is biased. Picking j from [0, i] makes the count exactly n × (n-1) × ... × 1 = n!, a bijection.

Is Math.random shuffle in JavaScript safe?

Math.random() is fine for games, A/B tests, and visual shuffles, but it's not cryptographically secure — its 56-bit internal state means after about 2^28 outputs you can predict the next value with some effort. For card-game RNG, cryptocurrency, or anywhere bias is exploitable, use crypto.getRandomValues() to draw j.

What's the difference between Fisher-Yates and Durstenfeld's variant?

Fisher and Yates's original 1938 algorithm built a new list by removing elements one at a time from the source — O(n²) because each removal is O(n). Durstenfeld's 1964 in-place variant swaps within the original array in O(n). Modern usage of 'Fisher-Yates' usually means Durstenfeld's algorithm. Knuth published the same approach in TAOCP Vol. 2 as Algorithm P.

Why is sort with a random comparator wrong?

Array.prototype.sort((a, b) => Math.random() - 0.5) is biased and unstable. The exact bias depends on the sort implementation (V8 uses TimSort), but in every case, some permutations appear more often than 1/n!. A widely-reported test ran the random-comparator shuffle on [A, B, C] 600,000 times and found permutation A,B,C appearing 50% more often than C,B,A in some browsers.

How do I shuffle without modifying the original array?

Copy first, then shuffle the copy: const shuffled = arr.slice(); fisherYates(shuffled). Don't be tempted to shuffle inline with .slice().sort(() => Math.random() - 0.5) — the comparator approach is biased. The slice + Fisher-Yates pattern costs O(n) time and O(n) space.

Can I shuffle a stream of unknown length?

Yes — reservoir sampling generalizes Fisher-Yates to streams. Keep a reservoir of size k. For each new element at position i (0-indexed), pick j ∈ [0, i]; if j < k, replace reservoir[j] with the new element. After the stream ends, the reservoir is a uniform random sample of size k from the n elements seen. Setting k = n gives a full streamed shuffle.