Error Correction

Hamming Code

Single-error-correcting linear codes — the first family that lets you find and fix a flipped bit

Hamming codes add k parity bits to 2^k - k - 1 data bits. The XOR syndrome of three parity checks points directly at the flipped bit. Hamming(7,4) corrects one error in seven bits.

  • Hamming(7,4)4 data + 3 parity = 7 bits
  • Correction1 error per block (t = 1)
  • Parity positionsPowers of 2: 1, 2, 4, 8…
  • Family sizes(7,4), (15,11), (31,26), (63,57), (127,120)
  • Decode complexityO(n) XOR — nanoseconds in HW
  • Used byECC RAM (SECDED), flash inner codes, radio links

Interactive visualization

Encode four data bits as a seven-bit Hamming codeword, flip one bit during transit, and watch the syndrome point exactly at the error.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How Hamming codes work

Richard Hamming worked at Bell Labs in the late 1940s. Bell ran batch jobs on a relay computer that would halt and refuse to continue if it detected a parity error. Hamming, frustrated by losing weekends to single-bit errors that the machine could detect but not fix, asked the question every engineer eventually asks: if the machine knows something is wrong, why can't it figure out what's wrong? The answer he published in 1950 is the foundation of every modern error-correcting code.

The construction. Take a block of n bit positions numbered 1, 2, 3, …, n. Reserve every position whose number is a power of two — positions 1, 2, 4, 8, … — for parity bits. All other positions carry data bits. Each parity bit covers every position whose binary representation has the corresponding bit set:

  • Parity at position 1 covers positions 1, 3, 5, 7, 9, 11, … (binary index ends in …1).
  • Parity at position 2 covers positions 2, 3, 6, 7, 10, 11, … (binary index has bit-2 set).
  • Parity at position 4 covers positions 4, 5, 6, 7, 12, 13, … (binary index has bit-4 set).
  • Parity at position 8 covers positions 8-15, 24-31, … (binary index has bit-8 set).

Each parity bit is set so the XOR of all bits in its coverage set is zero. That's the encoding rule, applied k times for k parity bits.

Worked example — Hamming(7,4)

Encode the data nibble 1011. The seven transmitted positions are:

position:  1    2    3    4    5    6    7
role:      p1   p2   d1   p4   d2   d3   d4
value:     ?    ?    1    ?    0    1    1

Compute parity p1 (covers 1, 3, 5, 7): p1 ⊕ 1 ⊕ 0 ⊕ 1 = 0 → p1 = 0.
Compute p2 (covers 2, 3, 6, 7): p2 ⊕ 1 ⊕ 1 ⊕ 1 = 0 → p2 = 1.
Compute p4 (covers 4, 5, 6, 7): p4 ⊕ 0 ⊕ 1 ⊕ 1 = 0 → p4 = 0.

Transmitted codeword: 0 1 1 0 0 1 1 (positions 1-7).

Now suppose noise flips position 5: the received word is 0 1 1 0 1 1 1. The decoder computes three syndrome bits:

s1 = pos1 ⊕ pos3 ⊕ pos5 ⊕ pos7 = 0 ⊕ 1 ⊕ 1 ⊕ 1 = 1
s2 = pos2 ⊕ pos3 ⊕ pos6 ⊕ pos7 = 1 ⊕ 1 ⊕ 1 ⊕ 1 = 0
s4 = pos4 ⊕ pos5 ⊕ pos6 ⊕ pos7 = 0 ⊕ 1 ⊕ 1 ⊕ 1 = 1

Read the syndrome bits as a binary number with s4 as the high bit: s4 s2 s1 = 1 0 1 = 5. The error is at position 5. Flip it back to get the original codeword 0 1 1 0 0 1 1, extract the data bits at positions 3, 5, 6, 7 — 1 0 1 1 — and you've recovered the original. The syndrome was a tiny computer that pointed straight at the corruption.

Why the syndrome is the error location

The magic is structural. Every data position's binary index is the XOR of some subset of {1, 2, 4} — because positions 1, 2, 4 are exactly the basis. Position 5 = 4 + 1, so a flip at position 5 affects parities p1 and p4, but not p2. Position 3 = 2 + 1 affects p1 and p2, not p4. Position 7 = 4 + 2 + 1 affects all three.

Reading the syndrome (s4, s2, s1) as a binary number tells you which parities the flipped bit affected — and by construction, that pattern is exactly the binary representation of the flipped bit's position. The single-error correction is built into the choice of parity coverage; the decoder is doing nothing more than reverse arithmetic.

Hamming distance and code rate

CodeData bitsParity bitsTotal bitsRateCorrects
Hamming(3,1)12333%1 error
Hamming(7,4)43757%1 error
Hamming(15,11)1141573%1 error
Hamming(31,26)2653184%1 error
Hamming(63,57)5766390%1 error
Hamming(127,120)120712794%1 error

Code rate climbs toward 100% as blocks grow, but the correction capability stays at one error per block. That's the fundamental Hamming trade: longer blocks are more efficient but also more vulnerable to multi-bit corruption. ECC RAM uses Hamming(72,64) — slightly extended to detect double errors — because in practice a single 64-bit cache line rarely sees more than one error within its lifetime.

Hamming vs other error-correcting codes

CodeErrors correctedBurst toleranceDecoding costUsed by
Parity bit0 (detect only)1 bit1 XORUART parity, naive checks
Hamming(7,4)1 per 7 bits1 bit3 XOR + lookupTextbook, embedded radio
SECDED Hamming(72,64)1 + detect 21 bit~8 XOR per 64-bit wordECC RAM
Reed-Solomon RS(255,223)16 byte errors per 255 bytes~125 contiguous bitsPolynomial divisionCDs, QR codes, deep space
BCH(15,7)2 per 15 bits2 bitsBerlekamp-Massey decodeFlash NAND, satellite
LDPCApproaches Shannon limitBurst-tolerant via interleavingIterative belief propagationWi-Fi 6, 5G, modern SSDs
Convolutional + ViterbiContinuous correctionVariableTrellis tracebackCellular voice, GPS

Hamming sits at the cheap-and-cheerful end: minimal overhead, one-error correction, two-error detection with the SECDED variant. For anything more demanding — burst errors, multiple flips per block, or near-Shannon-limit performance — you move to Reed-Solomon, BCH, or LDPC. Those families are direct intellectual descendants of Hamming, just generalizing the parity-check idea to larger field sizes and more powerful syndrome decoders.

Pseudo-code — Hamming(7,4)

// Hamming(7,4) encode.
function encode(d1, d2, d3, d4):
    // Positions: 1=p1, 2=p2, 3=d1, 4=p4, 5=d2, 6=d3, 7=d4
    p1 = d1 ⊕ d2 ⊕ d4         // covers pos 3, 5, 7
    p2 = d1 ⊕ d3 ⊕ d4         // covers pos 3, 6, 7
    p4 = d2 ⊕ d3 ⊕ d4         // covers pos 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

// Hamming(7,4) decode.
function decode(c):  // c indexed 1..7
    s1 = c[1] ⊕ c[3] ⊕ c[5] ⊕ c[7]
    s2 = c[2] ⊕ c[3] ⊕ c[6] ⊕ c[7]
    s4 = c[4] ⊕ c[5] ⊕ c[6] ⊕ c[7]
    syndrome = s4 * 4 + s2 * 2 + s1
    if syndrome != 0:
        c[syndrome] = 1 - c[syndrome]   // flip the error
    return [c[3], c[5], c[6], c[7]]      // extract data bits

JavaScript implementation

// Hamming(7,4): encode 4 data bits as 7-bit codeword, correct 1 error.
function hammingEncode(d) {
  // d is a 4-element array of 0/1 bits: [d1, d2, d3, d4].
  const [d1, d2, d3, d4] = d;
  const p1 = d1 ^ d2 ^ d4;
  const p2 = d1 ^ d3 ^ d4;
  const p4 = d2 ^ d3 ^ d4;
  // 1-indexed positions: [_, p1, p2, d1, p4, d2, d3, d4]
  return [p1, p2, d1, p4, d2, d3, d4];
}

function hammingDecode(c) {
  // c is a 7-element 1-indexed-ish array starting at index 0 = pos1.
  // Compute syndrome.
  const s1 = c[0] ^ c[2] ^ c[4] ^ c[6];  // pos 1,3,5,7
  const s2 = c[1] ^ c[2] ^ c[5] ^ c[6];  // pos 2,3,6,7
  const s4 = c[3] ^ c[4] ^ c[5] ^ c[6];  // pos 4,5,6,7
  const syndrome = (s4 << 2) | (s2 << 1) | s1;
  if (syndrome !== 0) c[syndrome - 1] ^= 1;
  return { data: [c[2], c[4], c[5], c[6]], errorAt: syndrome };
}

const original = [1, 0, 1, 1];
const codeword = hammingEncode(original);   // [0,1,1,0,0,1,1]
codeword[4] ^= 1;                            // flip position 5
const recovered = hammingDecode(codeword.slice());
console.log(recovered.errorAt);              // 5
console.log(recovered.data);                 // [1, 0, 1, 1]

Python implementation

def hamming_encode(data):
    """Encode 4 data bits as a 7-bit Hamming(7,4) codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    # positions 1..7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming_decode(codeword):
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = (s4 << 2) | (s2 << 1) | s1
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome

original = [1, 0, 1, 1]
codeword = hamming_encode(original)            # [0, 1, 1, 0, 0, 1, 1]
codeword[4] ^= 1                                # flip position 5
data, err_pos = hamming_decode(codeword)
print(f"error at position {err_pos}, data = {data}")
# error at position 5, data = [1, 0, 1, 1]

Common Hamming bugs and edge cases

  • Off-by-one indexing. Hamming numbers positions 1-based — position 1 is the first bit. Translating to 0-based arrays is the most common bug source. Always state in comments whether your c[i] is position i or position i+1.
  • Forgetting to extract data after correction. Don't return the corrected 7-bit word — return only the data bits at positions 3, 5, 6, 7. Otherwise downstream code thinks the parity bits are data.
  • Double-error masquerade. Two flipped bits produce a syndrome that points at a third innocent bit. The decoder happily "corrects" it, silently corrupting the data. Use SECDED if double errors are possible.
  • Syndrome bit ordering. Some textbooks order the syndrome as (s1, s2, s4), others as (s4, s2, s1). The position calculation flips entirely between the two. Pick a convention and stick to it.
  • Trusting Hamming on a burst channel. A burst error that flips 5 consecutive bits within one block annihilates Hamming(15,11) silently. Always interleave or use a burst-tolerant code on channels where bursts dominate.
  • Confusing rate with strength. Hamming(127,120) has 94% rate but still only corrects one error per 127 bits. The total number of errors a long block tolerates does not grow with block length — only with the family generalization to BCH or LDPC.

Performance in real systems

  • ECC DDR4/5 RAM: Uses Hamming(72,64) SECDED — corrects 1 bit per 64-bit word, detects 2 bits, with ~12.5% memory overhead (8 ECC bits per 64 data bits). Bit-error rate is ~1 error per GB-month; ECC catches essentially all of them.
  • SLC flash: Some controllers use Hamming-style inner codes (4 ECC bits per 512-byte sector) for fast per-page parity; outer BCH catches multi-bit failures.
  • Hardware decode latency: One XOR-tree depth for parity computation plus a 7-input mux for the syndrome lookup — about 3-5 logic levels, sub-nanosecond on modern silicon.
  • Software decode: ~10 ns per 7-bit block in scalar code; SIMD batches 32+ blocks per cycle for software ECC.
  • SECDED Hamming(72,64) overhead: 12.5% memory; ~0.5% performance hit on memory-bound workloads from the ECC bit logic.

The takeaway: Hamming codes solved an open problem in 1950 — finding and fixing errors with provably minimum overhead — and remain useful 75 years later because they're nearly free to implement in hardware. Every more advanced ECC scheme can be understood as a generalization of "parity bits whose check pattern uniquely identifies the failed location." Hamming was the first to realize that the index of an error is a number you can compute.

Frequently asked questions

Why are parity bits placed at positions 1, 2, 4, 8…?

The trick that makes Hamming codes elegant: place parity bits at positions whose indices are powers of two (1, 2, 4, 8, 16…), and place data bits at all other positions. Each parity bit then 'covers' the set of positions whose binary index has that bit set. Parity 1 covers positions 1, 3, 5, 7, 9… (binary index ends in 1). Parity 2 covers 2, 3, 6, 7, 10, 11… (binary index has bit-2 set). Decoding becomes trivial: compute each parity's XOR; the failing parities, read as a binary number, directly point at the error's position.

How does Hamming(7,4) correct one error?

Encode 4 data bits as 7 transmitted bits with 3 parity bits at positions 1, 2, 4. On receive, compute s1 = parity over positions 1,3,5,7; s2 = parity over 2,3,6,7; s4 = parity over 4,5,6,7. If all three are zero, no error occurred. If any are non-zero, treat (s4 s2 s1) as a 3-bit binary number giving the error position — flip that bit and you have the original codeword. Correctness rests on the fact that every single bit flip produces a unique non-zero syndrome.

What does Hamming distance mean?

Hamming distance between two bit strings is the number of positions where they differ. A code's minimum Hamming distance d is the smallest distance between any two valid codewords. To correct t errors, you need d ≥ 2t+1 — every received word lies within Hamming radius t of exactly one codeword. Hamming(7,4) has d=3, so it can correct t=1 error or detect up to d-1=2 errors (but not both at the same time).

Can Hamming codes detect two errors?

Plain Hamming(7,4) cannot reliably handle two errors: a double flip may produce a syndrome that points at a third innocent bit, and 'correcting' that bit silently corrupts the data. SECDED Hamming (Single Error Correction, Double Error Detection) adds one extra overall parity bit — Hamming(8,4) or Hamming(72,64) — which lets you detect a double flip even when single-error correction is also performed. ECC RAM uses SECDED Hamming-style codes.

Where is Hamming used today?

ECC RAM (server-grade memory) uses an extended Hamming SECDED code — typically Hamming(72,64), correcting one error per 64 data bits and detecting two. Some flash controllers use a Hamming-style page-protection layer alongside a stronger BCH or LDPC outer code. Small embedded systems sometimes hand-roll Hamming(7,4) for low-rate radio links. Modern high-bandwidth channels prefer Reed-Solomon or LDPC for their stronger burst-error handling, but Hamming remains the textbook example.

What sizes do Hamming codes come in?

For k parity bits, a Hamming code has 2^k - 1 total bits, of which 2^k - k - 1 are data bits. So k=3 gives (7,4), k=4 gives (15,11), k=5 gives (31,26), k=6 gives (63,57), k=7 gives (127,120). Code rate (data/total) approaches 1 as k grows — Hamming(127,120) is 94.5% efficient. The single-error-correction guarantee holds regardless of size; only the block length and code rate change.

What's the complexity of Hamming encode/decode?

O(n) for both encode and decode where n is the block length, using bitwise operations and XOR. The encoder computes k parity bits with k XOR chains over masked subsets; the decoder computes the syndrome with k XOR chains and indexes a single bit-flip in constant time. Hamming(7,4) decode runs in nanoseconds on any CPU — fast enough that ECC RAM imposes near-zero performance penalty. Hardware encoders/decoders are a handful of XOR gates.