Error Correction
CRC (Cyclic Redundancy Check)
Polynomial-division checksum — the error-detection workhorse of Ethernet, ZIP, and PNG
CRC treats the message as a polynomial over GF(2) and emits the remainder of dividing by a fixed generator. CRC-32 detects all single bit errors and all bursts up to 32 bits in any frame.
- CRC-32 polynomial0x04C11DB7 (Ethernet, ZIP, PNG)
- Single-bit detection100% — guaranteed
- Burst detectionAll bursts ≤ 32 bits in CRC-32
- Random-error miss rate~1 in 2³² (≈4.3 billion)
- Compute speed~8 bytes/cycle with CRC32 instruction
- Used byEthernet, ZIP, PNG, gzip, MPEG-2, ISO 3309
Interactive visualization
Watch CRC-8 with generator polynomial 0x07 grind through a small message via repeated XOR — the remainder at the end is the checksum.
Watch the 60-second explainer
A condensed visual walkthrough — narrated, captioned, under a minute.
How CRC works
CRC's central trick is to treat a sequence of bits as the coefficients of a polynomial over the field GF(2) — bits as values 0 or 1, addition as XOR, no carries. Polynomial multiplication and division work the way you'd expect from school algebra, except every operation is XOR. In this world, division has a clean remainder, and the remainder is what the CRC is.
The algorithm:
- Pick a generator polynomial
g(x)of degreen. Standards specify these (CRC-32 uses 0x04C11DB7, an order-32 polynomial). - Take the message
m(x). Appendnzero bits at the end — this ism(x) · x^n. - Divide the shifted message by
g(x)using XOR-based long division. - The remainder
r(x)— ann-bit value — is the CRC. - Transmit the message followed by the CRC. The complete codeword
m(x)·x^n + r(x)is, by construction, divisible byg(x).
The receiver does the same division on the received codeword. If the result is zero, no detectable error occurred. If non-zero, an error is present — though CRC says nothing about where or what, only that something is wrong.
Worked example — CRC-8 with poly 0x07
The generator polynomial 0x07 in binary is 00000111, but since CRC-8 has degree 8, we conceptually prepend a 1 and treat the divisor as the 9-bit value 100000111. The message: a single byte 0xAB = 10101011.
Step 1: append 8 zeros to make room for the CRC:
message + 8 zeros: 1010 1011 0000 0000
Step 2: long-divide by 1 0000 0111 (the 9-bit divisor). At each step, align the divisor's high bit with the next 1 bit in the working register, XOR, and slide forward.
10101011 00000000 (dividend, 16 bits)
100000111 (divisor aligned at bit 15)
---------
00101101 11000000 (XOR result; high bit cleared)
100000 111 (slide to next 1, position 13)
---------
001101 00100000 (XOR; bit 13 cleared)
100 000111 (next 1 at position 10)
---------
001 00010100 (XOR; bit 10 cleared)
1 00000111 (next 1 at position 8)
---------
0 00010011 (8-bit remainder)
The CRC-8 of 0xAB with generator 0x07 is 0x13. Transmit 0xAB 0x13 together; the receiver divides the whole 16-bit codeword by 0x107 and gets remainder 0, confirming no detectable error.
Modern implementations don't actually do bit-by-bit XOR. They use a 256-entry precomputed table that processes 8 bits per memory access:
// Pseudo: precomputed table-driven CRC-8 (Sarwate's algorithm).
function crc8(bytes):
crc = 0
for b in bytes:
crc = TABLE[crc XOR b]
return crc
Modern x86 CPUs have a CRC32 instruction that processes 8 bytes per cycle in hardware for CRC-32C. Standard CRC-32 (Ethernet/ZIP polynomial) lacks a dedicated instruction but achieves 20+ GB/s with PCLMULQDQ folding tricks.
Common CRC standards
| Name | Width | Polynomial | Used by | Notes |
|---|---|---|---|---|
| CRC-8 (CCITT) | 8 bits | 0x07 | I²C, SMBus, ATM cell header | Cheap, sufficient for short frames |
| CRC-16 (ARC) | 16 bits | 0x8005 | Modbus, USB data packets | Common in industrial protocols |
| CRC-16-CCITT | 16 bits | 0x1021 | HDLC, Bluetooth, XMODEM | Telecom standard |
| CRC-32 (Ethernet) | 32 bits | 0x04C11DB7 | Ethernet, ZIP, gzip, PNG, MPEG-2 | The default 32-bit CRC |
| CRC-32C (Castagnoli) | 32 bits | 0x1EDC6F41 | iSCSI, SCTP, ext4, Btrfs | Better Hamming distance; Intel hardware support |
| CRC-64-ECMA | 64 bits | 0x42F0E1EBA9EA3693 | XZ archive format, large files | Used when 32 bits feels too small |
| CRC-64-ISO | 64 bits | 0x000000000000001B | ISO 3309 HDLC alternate | Lower-weight variant |
The width-32 family dominates. CRC-32 has been Ethernet's frame check sequence since 1980 and has been the most-deployed error-detection code in history. CRC-32C exists because Castagnoli proved it has slightly better Hamming distance for messages up to 2,094 bytes — better burst protection for typical IP packets — and Intel added a hardware instruction in SSE 4.2 that computes it natively at billions of bytes per second.
CRC vs other integrity checks
| Check | Detection rate | Speed | Tamper-evident? | Use case |
|---|---|---|---|---|
| Single parity bit | All odd-count flips | Trivial | No | UART, naive checks |
| Sum / Fletcher checksum | ~99.6% random errors | Fast | No | TCP/UDP (16-bit), Adler-32 (zlib) |
| CRC-16 | ~99.9985% random | Fast (table lookup) | No | USB, Modbus, HDLC |
| CRC-32 | ~99.99999998% | Very fast (HW instruction) | No | Ethernet, ZIP, PNG |
| MD5 | ~100% but cryptographically broken | ~600 MB/s | No (collisions feasible) | Legacy file hashes |
| SHA-256 | ~100% | ~200-1500 MB/s | Yes (against random) | Content addressing, Git |
| HMAC-SHA256 | ~100% | ~200 MB/s | Yes — against adversary | Authenticated integrity |
The key distinction is random errors vs adversarial tampering. CRC is excellent against random bit flips — wire noise, cosmic rays, scratched media — and useless against an attacker who deliberately wants to forge a packet with a matching CRC (trivially computable). For tamper detection, use HMAC; for content addressing, use SHA-256; for "did the wire flip a bit", CRC is the right tool.
When to use CRC
- Wire-level frame check. Every Ethernet frame has a 32-bit CRC trailer. The NIC computes and verifies it in hardware — software never sees a frame that fails.
- Archive integrity. ZIP, gzip, PNG, and many other container formats embed CRC-32 to detect storage corruption. The cost is 4 bytes per file or chunk; the benefit is a hard guarantee of single-byte detection.
- Storage layer. ZFS, Btrfs, ext4 metadata, and modern SSDs use CRC-32C (with hardware acceleration) for per-block integrity.
- Industrial protocols. Modbus, CAN bus, USB, MIL-STD-1553 all use CRC-16 or CRC-32 for frame check on the bus.
- UDP/TCP segment validation. Different algorithm (one's-complement sum), but the same role: catch wire errors.
Skip CRC when you need cryptographic integrity (use HMAC instead), when you need error correction (use Hamming or Reed-Solomon), or when you need a content-addressable hash for deduplication (use SHA-256 or BLAKE3).
Pseudo-code
// Bit-by-bit CRC computation (slow but instructive).
function crcBitwise(data, polynomial, width):
crc = 0
topbit = 1 << (width - 1)
for each byte b in data:
crc = crc XOR (b << (width - 8))
for i in 1..8:
if crc AND topbit:
crc = (crc << 1) XOR polynomial
else:
crc = crc << 1
crc = crc AND ((1 << width) - 1) // mask to width bits
return crc
// Table-driven CRC (Sarwate's algorithm) — 8 bits per iteration.
function crcTable(data):
crc = 0
for each byte b in data:
crc = TABLE[crc XOR b]
return crc
// Build the 256-entry table once at startup.
function buildTable(polynomial):
for byte = 0..255:
c = byte
for i in 1..8:
c = (c >> 1) XOR (polynomial if (c AND 1) else 0)
TABLE[byte] = c
JavaScript implementation
// CRC-32 (Ethernet/ZIP polynomial, reflected form 0xEDB88320).
const CRC32_TABLE = new Uint32Array(256);
(function buildTable() {
for (let i = 0; i < 256; i++) {
let c = i;
for (let j = 0; j < 8; j++) {
c = c & 1 ? 0xEDB88320 ^ (c >>> 1) : c >>> 1;
}
CRC32_TABLE[i] = c >>> 0;
}
})();
function crc32(bytes) {
let crc = 0xFFFFFFFF;
for (let i = 0; i < bytes.length; i++) {
crc = (crc >>> 8) ^ CRC32_TABLE[(crc ^ bytes[i]) & 0xFF];
}
return (crc ^ 0xFFFFFFFF) >>> 0;
}
const msg = new TextEncoder().encode('hello');
console.log(crc32(msg).toString(16)); // 3610a686
// CRC-8 with polynomial 0x07 (no precomputed table for clarity).
function crc8(bytes, poly = 0x07) {
let crc = 0;
for (const b of bytes) {
crc ^= b;
for (let i = 0; i < 8; i++) {
crc = crc & 0x80 ? ((crc << 1) ^ poly) & 0xFF : (crc << 1) & 0xFF;
}
}
return crc;
}
console.log(crc8(new Uint8Array([0xAB])).toString(16)); // 13
Python implementation
import zlib
# CRC-32 via the standard library (Ethernet/ZIP polynomial).
print(hex(zlib.crc32(b'hello'))) # 0x3610a686
# Manual CRC-32 implementation for instruction.
def build_crc32_table(poly=0xEDB88320):
table = []
for i in range(256):
c = i
for _ in range(8):
c = (c >> 1) ^ poly if c & 1 else c >> 1
table.append(c)
return table
CRC32 = build_crc32_table()
def crc32_manual(data: bytes) -> int:
crc = 0xFFFFFFFF
for b in data:
crc = (crc >> 8) ^ CRC32[(crc ^ b) & 0xFF]
return crc ^ 0xFFFFFFFF
print(hex(crc32_manual(b'hello'))) # 0x3610a686
# CRC-8 with polynomial 0x07 (used by SMBus and some I2C devices).
def crc8(data: bytes, poly=0x07) -> int:
crc = 0
for b in data:
crc ^= b
for _ in range(8):
crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
return crc
print(hex(crc8(bytes([0xAB])))) # 0x13
Common CRC bugs and edge cases
- Wrong polynomial form. CRC-32 has at least four equivalent representations: 0x04C11DB7 (forward), 0xEDB88320 (reflected), 0x82608EDB (recipricoal), 0xDB710641 (Koopman). Mixing them produces incompatible CRCs. Always document which form and which initial value.
- Endianness. Some formats store the CRC big-endian, some little-endian. PNG uses big-endian; ZIP and gzip use little-endian. Wrong endianness fails verification even when the math is correct.
- Initial value. Standard CRC-32 starts with all-ones (0xFFFFFFFF) and XORs the final result with all-ones again — important for distinguishing zero-length and all-zero input. Skipping these XORs silently breaks compatibility with reference tables.
- Reflected input vs non-reflected. CRC-32 reflects each input byte; CRC-32 with no reflection produces different values for the same data. Catwiki lists the exact reflection rules for each named CRC.
- Using CRC for authentication. Don't. An attacker can forge a packet with the same CRC trivially — CRC is linear, so flipping bits at known positions XORs known deltas into the CRC. Use HMAC for tamper detection.
- Insufficient bit width. CRC-8 has a 1-in-256 collision probability for random errors. For large packets, this can mean catastrophic missed errors. Use CRC-32 (1 in 4 billion) or CRC-64 (1 in 1.8 × 10¹⁹) when the cost matters.
Performance in real systems
- Intel CRC32 instruction (SSE 4.2): CRC-32C at 8 bytes per cycle — ~30 GB/s on a 4 GHz core. Used everywhere CRC-32C appears (iSCSI, ext4, Btrfs).
- PCLMULQDQ-based folding: Implements standard CRC-32 (Ethernet polynomial) at 15-25 GB/s on modern x86 — fast enough to verify 100 Gbps Ethernet in software.
- Ethernet NICs: Hardware FCS generation runs at full line rate (100 Gbps Cisco ASIC, 400 Gbps Broadcom) with sub-microsecond latency added to the wire.
- gzip CRC overhead: ~3% of total decompression time on a typical text file — negligible compared to the Huffman and LZ77 stages.
- PNG chunk integrity: Each chunk's CRC catches storage corruption; decoders detect bit-rotted images instantly.
- Bluetooth Low Energy: 24-bit CRC on every packet — chosen for compactness against the 250 kbps physical layer.
The takeaway: CRC is one of those rare algorithms where the right answer hasn't changed since the 1960s. Polynomial division over GF(2), implemented with a 1KB table or a single CPU instruction, catches every kind of random error you'll see at the wire. It's free, it's deterministic, and it's everywhere.
Frequently asked questions
Why call it 'cyclic'?
Because the codes form a cyclic group under bit rotation: any cyclic shift of a valid codeword is also a valid codeword. This property comes from the polynomial structure — multiplying a polynomial by x (which shifts the bits) and reducing modulo the generator stays within the codespace. Cyclic codes are a foundational family in coding theory that includes CRCs, BCH codes, and Reed-Solomon, all sharing similar algebraic machinery.
How does CRC compute the remainder?
Treat the input bits as coefficients of a polynomial over GF(2). Append n zero bits to make room for the CRC (where n is the degree of the generator). Then divide by the generator polynomial using XOR-based long division: align the high bit of the divisor with the high bit of the dividend, XOR, slide right, repeat. The final remainder — a polynomial of degree less than n — is the CRC. Hardware does this with a linear feedback shift register; software uses precomputed table lookups for 8 or 16 bits at a time.
What does CRC-32 detect that simpler checksums miss?
CRC-32 catches: all single-bit errors (100%), all two-bit errors within a generator-dependent distance, all odd numbers of bit errors (because the generator includes x+1), all burst errors of length ≤32 bits (100%), 99.999999977% of bursts of length 33, and roughly 99.9999999% of all longer random error patterns. By contrast, simple sum checksums miss byte-level transpositions; CRC-32's structure detects them all. The miss rate for random errors is approximately 1 in 2^32 = 1 in 4.3 billion.
What's the standard CRC-32 polynomial?
0x04C11DB7 (also written as 0xEDB88320 in reflected form). This is the polynomial x³² + x²⁶ + x²³ + x²² + x¹⁶ + x¹² + x¹¹ + x¹⁰ + x⁸ + x⁷ + x⁵ + x⁴ + x² + x + 1. Used by Ethernet, ZIP, gzip, PNG, MPEG-2, ISO 3309, and almost every other CRC-32 specification. CRC-32C (Castagnoli, 0x1EDC6F41) is an alternative used by iSCSI, SCTP, and modern Intel CPUs (the CRC32 instruction implements it natively).
Is CRC cryptographically secure?
No — and it isn't trying to be. CRC is a hash designed to detect random transmission errors, not malicious tampering. Given a message m and its CRC c, an attacker can compute a modified m' with the same CRC c in seconds (CRC is a linear function — flipping bits at known positions produces known XOR deltas in the CRC). Use HMAC, GMAC, or Poly1305 for tamper-evident integrity. CRC is for the wire, not the adversary.
What's the complexity of computing a CRC?
O(n) bits or O(n/8) bytes using a precomputed 256-entry table. Hardware shift registers do one bit per clock; lookup-table software does 8 bits per memory access. On modern CPUs, the dedicated CRC32 instruction (SSE 4.2) processes 8 bytes per cycle for CRC-32C. PCLMULQDQ-based folding implementations reach 20+ GB/s — fast enough that CRC is effectively free on Ethernet at line rate.
When does CRC fail to detect an error?
When the error pattern is itself a multiple of the generator polynomial. CRC-32 misses an error pattern with probability ~1/2^32 ≈ 0.000000023% for uniformly random bit flips. The structure of CRCs guarantees zero false-negatives for the error patterns the generator was designed to detect (single bits, short bursts, odd numbers of flips), and probabilistic detection elsewhere. In practice, CRC-32 has been wrong for less than one out of every four billion packets — good enough for Ethernet, certainly not good enough to substitute for a cryptographic MAC.