Change Data Capture

Stream every row change without polling — straight from the WAL

CDC tails a database's write-ahead log and emits every row change as an event. Debezium, Maxwell, Postgres logical replication. No polling, no missed updates, sub-second lag.

  • Typical lag: ~50–500 ms end-to-end
  • Throughput: 10k–100k events/sec/connector
  • Primary DB load: ~5–15% extra WAL volume
  • Captures: inserts, updates, deletes, schema
  • Tools: Debezium, Maxwell, Fivetran, AWS DMS
  • Source: Postgres WAL, MySQL binlog, SQL Server TX log

How CDC works

Every transactional database writes a log of every change before applying it. In Postgres it's the write-ahead log (WAL); in MySQL it's the binlog; in SQL Server it's the transaction log. The database itself uses this log for crash recovery — replay the log from the last checkpoint and the data files are consistent again — and for streaming replication to read replicas.

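Change Data Capture starts from the observation that this log is already an event stream. Every commit is there, in order and durable, with the new image of every changed row and as much of the old image as the database is configured to log (in Postgres, a full before image requires REPLICA IDENTITY FULL; the default logs at most the old key columns). You don't have to ask the database what changed, because the database wrote it down for you. CDC tools connect to the log and read it.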

Postgres exposes the WAL via a logical replication slot: an in-database pointer that survives restarts and tracks how far a consumer has read. The connector — Debezium, in the most common stack — reads the slot, decodes binary WAL into rows, and publishes one Kafka message per row change. The slot ensures no events are lost on connector restart; the lag is whatever the network round-trip plus the broker publish takes, typically 50–200 ms.

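-- Postgres side: publish the tables to capture
CREATE PUBLICATION my_pub FOR TABLE orders, customers, line_items;

# Debezium Postgres connector properties (abridged) that bind a replication slot
connector.class = io.debezium.connector.postgresql.PostgresConnector
plugin.name = pgoutput
slot.name = debezium_orders
publication.name = my_pub
table.include.list = public.orders,public.customers,public.line_items

# There is no start-LSN setting: on restart the connector resumes from the
# position stored in its Kafka Connect offsets and confirmed on the slot.

# Every committed row change becomes one event:
#   { op: 'c' | 'u' | 'd', table, before, after, lsn, ts_ms }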
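The slot's "pointer that survives restarts" behaviour can be illustrated with a toy model. This is a minimal sketch, not a Postgres API: `ToySlot`, its integer LSNs, and its method names are all hypothetical, and the WAL is just an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class ToySlot:
    """Hypothetical in-memory stand-in for a replication slot (not a Postgres API)."""
    wal: list = field(default_factory=list)   # (lsn, change) pairs in commit order
    confirmed_lsn: int = 0                    # highest LSN the consumer has acknowledged

    def append(self, change):
        lsn = len(self.wal) + 1
        self.wal.append((lsn, change))
        return lsn

    def read_from_confirmed(self):
        # A (re)connecting consumer always resumes after the confirmed position.
        return [(lsn, c) for lsn, c in self.wal if lsn > self.confirmed_lsn]

    def confirm(self, lsn):
        # Acknowledging lets the database recycle WAL up to this point.
        self.confirmed_lsn = max(self.confirmed_lsn, lsn)


slot = ToySlot()
for change in ["insert order 1", "update order 1", "insert order 2"]:
    slot.append(change)

# The consumer processes the first change, acknowledges it, then crashes.
first_lsn, _ = slot.read_from_confirmed()[0]
slot.confirm(first_lsn)

# After a restart nothing is lost: the two unacknowledged changes are replayed.
replayed = slot.read_from_confirmed()
print(replayed)
```

The same property explains the main operational risk: anything above `confirmed_lsn` has to stay on disk until the consumer acknowledges it.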

What a CDC event looks like

The standard Debezium event for an update carries both the previous row state and the new row state, plus metadata about the source LSN and commit time:

{
  "op": "u",
  "ts_ms": 1716054200123,
  "source": {
    "db": "shop",
    "schema": "public",
    "table": "orders",
    "lsn": "0/16B3748",
    "txId": 4928301
  },
  "before": { "id": 42, "status": "PENDING", "total": 99.00 },
  "after":  { "id": 42, "status": "PAID",     "total": 99.00 }
}

A delete event carries a non-null before and a null after, and is followed by a separate Kafka tombstone (same key, null value) so log-compacted topics eventually drop the key. Inserts have before=null. Schema changes come on a separate topic; some tools (Maxwell) inline them.
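To make the event semantics concrete, here is a minimal sketch of a consumer that folds such events into an in-memory copy of the table. The function name `apply_event` and the convention of passing the message key separately (with `None` as the tombstone value) are assumptions for illustration, not part of any Debezium client API.

```python
def apply_event(replica, key, event):
    """Apply one change event to an in-memory replica keyed by primary key.

    `event` is None for a tombstone; otherwise it has `op`, `before`, `after`.
    """
    if event is None:
        # Tombstone: the preceding delete event already removed the row.
        replica.pop(key, None)
        return
    if event["op"] in ("c", "u"):
        replica[key] = event["after"]
    elif event["op"] == "d":
        replica.pop(key, None)


replica = {}
stream = [
    (42, {"op": "c", "before": None,
          "after": {"id": 42, "status": "PENDING", "total": 99.00}}),
    (42, {"op": "u", "before": {"id": 42, "status": "PENDING", "total": 99.00},
          "after": {"id": 42, "status": "PAID", "total": 99.00}}),
    (7, {"op": "c", "before": None,
         "after": {"id": 7, "status": "PENDING", "total": 15.50}}),
    (7, {"op": "d", "before": {"id": 7, "status": "PENDING", "total": 15.50},
         "after": None}),
    (7, None),  # tombstone following the delete
]
for key, event in stream:
    apply_event(replica, key, event)
print(replica)
```

Because every branch is an upsert or a delete by key, replaying the same events leaves the replica unchanged, which matters for at-least-once delivery.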

When to use CDC

  • You're populating analytics warehouses — Snowflake, BigQuery, ClickHouse. CDC keeps them within seconds of the OLTP primary without a nightly batch.
  • You're building event-driven services and don't want every team to add explicit event publication on every code path. CDC catches inserts that legacy code forgot to instrument.
  • You need cross-region replication beyond what the DB offers natively, including filtered or transformed replication.
  • You're invalidating caches downstream — when a row changes in Postgres, a Kafka consumer evicts the Redis key without the app code having to coordinate.
  • You're decommissioning a monolith — CDC the monolith's tables and let new services consume the event stream while the monolith keeps writing.

You probably don't need CDC if your service already writes events explicitly (the outbox pattern owns this case), if your read load is low enough that polling is fine, or if you have no downstream subscribers — CDC pays its overhead only when something consumes the stream.

CDC vs polling vs outbox vs trigger-based capture

| | CDC (WAL/binlog) | Polling SELECT | Outbox pattern | Trigger-based |
|---|---|---|---|---|
| Captures deletes | Yes | No (row is gone) | Yes (explicit) | Yes |
| Misses intra-poll changes | No | Yes (snapshot gap) | No | No |
| Latency | 50–500 ms typical | 0.5 × poll interval | 50–500 ms | ~10 ms |
| Load on primary | WAL retention only | SELECT load every interval | +1 row per change | Trigger overhead per write |
| Schema coupling | Strong (row = event) | Strong | Loose (event table) | Configurable |
| Code changes required | None (operate at infra) | Per-table queries | Add outbox writes | Add triggers |
| Ordering preserved | Per-DB commit order | Approximate | Per-aggregate (key) | Per-table |
| Throughput ceiling | 10k–100k events/sec | Limited by poll cost | Limited by poller | 10k–50k typical |

The pragmatic combo is CDC against an outbox table — you write the explicit event in your transaction (loose coupling) and Debezium streams it (low latency, no polling).
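The write side of that combination might look like the sketch below. It uses SQLite from the standard library purely so the example runs anywhere; the table layout, the `mark_paid` function, and the `OrderPaid` event type are hypothetical, and in the setup described above the outbox table would live in Postgres with Debezium streaming its inserts.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, aggregate_id INTEGER, type TEXT, payload TEXT)"
)


def mark_paid(conn, order_id):
    # Both writes commit or roll back together, so an event exists
    # if and only if the state change happened.
    with conn:
        conn.execute("UPDATE orders SET status = 'PAID' WHERE id = ?", (order_id,))
        conn.execute(
            "INSERT INTO outbox (aggregate_id, type, payload) VALUES (?, ?, ?)",
            (order_id, "OrderPaid", json.dumps({"order_id": order_id})),
        )


with conn:
    conn.execute("INSERT INTO orders (id, status) VALUES (42, 'PENDING')")
mark_paid(conn, 42)

outbox_rows = conn.execute("SELECT aggregate_id, type, payload FROM outbox").fetchall()
print(outbox_rows)
```

The payload is whatever shape you choose to publish, so the orders table can be refactored without changing what consumers see.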

What CDC actually costs

On the primary database, the only direct cost is WAL retention until the connector has consumed it. A 5-minute connector lag with 10 MB/s of WAL traffic means 3 GB of WAL pinned on disk — small. But if the connector falls behind for hours, you can run the primary out of disk, which is why every CDC deployment monitors pg_replication_slots.confirmed_flush_lsn lag obsessively.

Network: roughly the same bandwidth as your write rate, plus protocol overhead. 1k inserts/sec at 1 KB per row is 1 MB/s of CDC events — trivial.

Initial snapshot: a one-time cost when you first connect to a multi-GB table. Snapshotting 100 GB at 50 MB/s takes about 33 minutes; incremental snapshotting (introduced in Debezium 1.6) lets you snapshot in chunks without stopping the world.
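The three estimates above are simple rate arithmetic, which the following sketch reproduces so you can plug in your own numbers. The function names are illustrative, and decimal units (1 GB = 1000 MB) are assumed throughout.

```python
def pinned_wal_gb(wal_mb_per_s, connector_lag_s):
    """WAL the primary must retain while the connector is behind (decimal GB)."""
    return wal_mb_per_s * connector_lag_s / 1000


def cdc_bandwidth_mb_per_s(rows_per_s, row_kb):
    """Event bandwidth before protocol overhead (decimal MB/s)."""
    return rows_per_s * row_kb / 1000


def snapshot_minutes(table_gb, read_mb_per_s):
    """Time for a one-shot snapshot at a sustained read rate."""
    return table_gb * 1000 / read_mb_per_s / 60


print(pinned_wal_gb(10, 5 * 60))            # 3.0   (5 minutes of lag)
print(cdc_bandwidth_mb_per_s(1000, 1))      # 1.0
print(round(snapshot_minutes(100, 50), 1))  # 33.3
print(pinned_wal_gb(10, 6 * 3600))          # 216.0 (6 hours of lag)
```

The last line shows why lag measured in hours is the dangerous case: at the same write rate, six hours behind pins 216 GB.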

Operational complexity: a Debezium connector is a stateful component — Kafka Connect workers, replication slots that survive failover, schema-history topics. Most teams spend more time operating CDC than coding around it.

Pseudo-code

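// Conceptual CDC connector loop (pseudocode, not a real client API).

connect(db, slot_name, tables):
    if not db.slot_exists(slot_name):
        // Creating the slot fixes a consistent point; the snapshot reads as of it.
        snapshot_lsn = db.create_replication_slot(slot_name)
        for table in tables:
            for row in db.select_consistent(table, snapshot_lsn):
                // Debezium marks snapshot rows with op='r' (read)
                publish(table, op='r', before=null, after=row, lsn=snapshot_lsn)
        commit_offset(snapshot_lsn)

    // Resume from the last offset this connector durably committed.
    cursor = db.start_replication(slot_name, from = load_committed_offset())
    for wal_record in cursor:
        table = wal_record.table
        match wal_record.op:
            'INSERT':
                publish(table, op='c', before=null, after=wal_record.after, lsn=wal_record.lsn)
            'UPDATE':
                publish(table, op='u', before=wal_record.before, after=wal_record.after, lsn=wal_record.lsn)
            'DELETE':
                publish(table, op='d', before=wal_record.before, after=null, lsn=wal_record.lsn)
                publish_tombstone(table, key=wal_record.before.pk)
        // Commit only after publishing: a crash in between replays events, never drops them.
        commit_offset(wal_record.lsn)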

Python: a minimal Postgres logical-decoding consumer

import json

import psycopg2
import psycopg2.extras
from psycopg2 import errors

# LogicalReplicationConnection sets replication=database on the connection itself
conn = psycopg2.connect(
    "host=localhost dbname=shop user=postgres",
    connection_factory=psycopg2.extras.LogicalReplicationConnection,
)
cur = conn.cursor()

SLOT = "cdc_slot"

# Idempotent slot creation. wal2json (an extension that must be installed on
# the server) emits JSON text; the built-in pgoutput plugin emits a binary
# protocol that json.loads cannot parse.
try:
    cur.create_replication_slot(SLOT, output_plugin="wal2json")
except errors.DuplicateObject:
    pass  # slot already exists from an earlier run

cur.start_replication(
    slot_name=SLOT,
    options={"format-version": "2"},  # one JSON object per change
    decode=True,
)

def handle(msg):
    payload = json.loads(msg.payload) if msg.payload else {}
    print(f"LSN {msg.data_start}  {payload}")
    # Acknowledge only after processing so the server can recycle WAL up to here
    msg.cursor.send_feedback(flush_lsn=msg.data_start)

cur.consume_stream(handle)

This is functional but bare-bones. Production deployments use Debezium (Kafka Connect), Maxwell (MySQL), or AWS DMS, which handle schema evolution, offset tracking for at-least-once delivery, failover, and metric emission. Building from scratch is a six-month project; using Debezium is a one-week project.

Variants and extensions

  • Debezium. The reference open-source CDC, built on Kafka Connect. Connectors for Postgres, MySQL, MongoDB, SQL Server, Oracle, Cassandra. The de-facto standard for self-hosted CDC.
  • Maxwell. Lightweight MySQL-only CDC; emits JSON events directly to Kafka or Kinesis without Kafka Connect. Easier to operate at small scale.
  • AWS DMS / GCP Datastream. Managed CDC. You configure source and target endpoints; the cloud handles failover, slots, and snapshotting. Higher cost, lower operational burden.
  • Native logical replication. Postgres can replicate logical changes directly to another Postgres or any subscriber that speaks the wire protocol — no Kafka required, but no Kafka guarantees either.
  • Incremental snapshotting. Snapshot in chunks bounded by signaling rows; doesn't block ongoing replication. Debezium 1.6+, Fivetran, Striim.
  • Heartbeats and signal tables. CDC tools insert periodic heartbeat events into a known table so consumers can detect "no changes" vs "connector died" — and so replication slots advance on idle databases.
  • Event router (outbox). Debezium's EventRouter SMT reads from an outbox table and routes by the outbox row's type column to per-topic Kafka destinations. Bridges CDC and outbox patterns.
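The chunking idea behind incremental snapshotting can be sketched with keyset pagination. This is only the read side, under stated assumptions: it uses SQLite so it runs standalone, the `orders` table and `snapshot_in_chunks` helper are hypothetical, keys are assumed to be positive integers, and the watermark/signal mechanism that real tools use to reconcile chunks with concurrent log events is deliberately left out.

```python
import sqlite3


def snapshot_in_chunks(conn, chunk_size):
    """Yield lists of rows ordered by primary key, `chunk_size` rows at a time."""
    last_id = 0  # assumes positive integer keys
    while True:
        rows = conn.execute(
            "SELECT id, status FROM orders WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # resume after the highest key seen so far


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(i, "PENDING") for i in range(1, 8)]
)

chunks = list(snapshot_in_chunks(conn, chunk_size=3))
print([len(c) for c in chunks])
```

Each chunk is a short query, so log streaming can continue between chunks, and a restart only has to remember the last key it reached.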

Common pitfalls and edge cases

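  • Replication slot leaks. A stopped connector leaves its slot behind, and an unconsumed slot pins WAL on the primary indefinitely: by default Postgres won't recycle it, so the disk fills and the primary halts. Postgres 13+ can cap retention with max_slot_wal_keep_size, at the price of invalidating a slot that falls too far behind. Always alert on slot lag.
  • Lost slot on failover. Older Postgres versions didn't replicate slot state to standbys; a primary failover wiped the slot and forced a full re-snapshot. Postgres 16 added logical decoding on standbys and Postgres 17 added synchronization of logical slots to standbys for failover, but check what your specific version and managed service support.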
  • Schema changes break decoding. An ALTER TABLE that adds a column before the connector reads it can produce events the consumer doesn't understand. Schema registries (Confluent, Apicurio) enforce compatibility rules.
  • At-least-once + duplicates after restart. CDC offsets are flushed periodically. A crash between event publish and offset commit will re-publish the last batch. Consumers must dedupe — exactly the same constraint as the outbox pattern.
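  • Captured tables have no primary key. Logical decoding needs a replica identity to identify the rows touched by UPDATE and DELETE. Postgres REPLICA IDENTITY DEFAULT uses the PK; a table without one rejects UPDATE and DELETE once it belongs to a publication that publishes them, unless you set REPLICA IDENTITY FULL or USING INDEX. In practice such tables yield only inserts.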
  • Coupling consumers to row schema. Renaming a column in the DB renames it in every downstream Kafka topic. Without versioning, consumers break silently. Outbox-shaped events insulate you.
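For the duplicate-after-restart pitfall, one common remedy is to make the consumer idempotent by remembering the highest log position it has applied. The sketch below assumes a single ordered stream in which every event carries a distinct, increasing Postgres-style LSN string; `IdempotentApplier` and the event shape are hypothetical, and a real consumer would persist `last_applied` atomically with the side effect it guards.

```python
def parse_lsn(text):
    """Convert a Postgres LSN such as '0/16B3748' to a comparable integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


class IdempotentApplier:
    """Skips events at or below the last applied LSN."""

    def __init__(self):
        self.last_applied = -1
        self.applied = []

    def handle(self, event):
        lsn = parse_lsn(event["lsn"])
        if lsn <= self.last_applied:
            return False  # redelivered after a restart; already applied
        self.applied.append(event["id"])
        self.last_applied = lsn
        return True


applier = IdempotentApplier()
batch = [{"id": "a", "lsn": "0/16B3748"}, {"id": "b", "lsn": "0/16B3800"}]
for event in batch + batch:  # the second pass simulates a replayed batch
    applier.handle(event)
print(applier.applied)
```

Comparing the parsed integers rather than the raw strings matters, because LSN strings do not sort correctly as text once the hex digits change length.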

Frequently asked questions

How is CDC different from polling for changes?

Polling runs SELECT queries that compare snapshots. It misses updates that happen and revert between polls, scales poorly because every poll scans rows, and adds query load to the primary. CDC tails the database's own write-ahead log, sees every row change exactly as the DB wrote it (including deletes), and issues no queries against the tables: its cost on the primary is log decoding and WAL retention. Lag is typically 50–500 ms instead of the polling interval.
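The gap is easy to demonstrate with a toy model. The sketch below gives polling its most generous form, a full comparison of two snapshots, and still loses every change that happened between them; the `diff_snapshots` helper and the sample change log are made up for illustration.

```python
def diff_snapshots(old, new):
    """What a poller can infer by comparing two snapshots keyed by primary key."""
    changes = []
    for key in new.keys() - old.keys():
        changes.append(("insert", key))
    for key in old.keys() - new.keys():
        changes.append(("delete", key))
    for key in old.keys() & new.keys():
        if old[key] != new[key]:
            changes.append(("update", key))
    return sorted(changes)


# Ground truth: four committed changes between two polls.
log = [
    ("update", 1, "PAID"),
    ("update", 1, "PENDING"),   # reverts the previous change
    ("insert", 2, "PENDING"),
    ("delete", 2, None),        # row created and removed within the interval
]

before_poll = {1: "PENDING"}
table = dict(before_poll)
for op, key, value in log:
    if op == "delete":
        del table[key]
    else:
        table[key] = value

seen_by_polling = diff_snapshots(before_poll, table)
print(len(log), seen_by_polling)
```

A log-based consumer would have received all four entries in `log`; the poller sees two identical snapshots and reports nothing.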

What's the WAL / binlog and why is it the right source?

Every transactional database writes a log of every row change before applying it to the data files — that's how it recovers from crashes. Postgres calls it the WAL, MySQL the binlog, SQL Server the transaction log. CDC re-uses that existing infrastructure: the changes are already serialized in commit order, durable, and require no extra writes from the application. Tail the log, get every change for free.

What latency does CDC add?

Typical end-to-end latency is 50–500 ms: the sum of commit and WAL flush, log shipping over TCP, decoding, and broker publish. Debezium specifically lands around 50–200 ms for well-tuned Postgres setups. Sub-50 ms is achievable on the same host but uncommon in production because the broker hop dominates.

Does CDC support deletes?

Yes, and that's one of its main advantages. A polling SELECT against table X can't observe rows that no longer exist; CDC sees the DELETE record in the log, with the old row image the database logged for it (the key columns by default in Postgres, the full row under REPLICA IDENTITY FULL). Debezium emits tombstone events on deletes, which let Kafka topic compaction remove the keys cleanly.

How does CDC handle the initial snapshot?

On first connect, the connector takes a consistent snapshot of every captured table (a long-running SELECT under a repeatable-read transaction) and streams every row as a synthetic event that consumers usually treat as an insert; Debezium marks these with op "r" (read). Once snapshotting completes, it switches to streaming the WAL from the LSN it recorded at snapshot start. A snapshot of a multi-TB table can take hours; some tools support incremental snapshotting to avoid that one-shot pain.

What can break CDC?

Failovers that don't preserve the replication slot's LSN (lost slot = full re-snapshot), schema changes that the connector doesn't recognize, replication-slot retention filling the disk because the connector fell behind, and inadequate WAL retention causing the connector to fall off the end of the log. Operators monitor replication-slot lag the way they monitor disk space.
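Slot-lag monitoring reduces to subtracting two log positions. The sketch below assumes you have already fetched the current WAL position and the slot's confirmed_flush_lsn as Postgres LSN strings (for example from pg_replication_slots); the helper names and the 1 GB alert threshold are hypothetical choices, not recommendations.

```python
def lsn_to_int(text):
    """Convert a Postgres LSN such as '1/F0000000' to a byte position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slot_lag_bytes(current_wal_lsn, confirmed_flush_lsn):
    """Bytes of WAL the slot still holds back."""
    return lsn_to_int(current_wal_lsn) - lsn_to_int(confirmed_flush_lsn)


def should_alert(lag_bytes, threshold_bytes=1_000_000_000):  # hypothetical 1 GB threshold
    return lag_bytes > threshold_bytes


lag = slot_lag_bytes("2/40000000", "1/F0000000")
print(lag, should_alert(lag))
```

Alerting on the trend as well as the absolute value helps distinguish a connector that is briefly behind from one that has stopped consuming.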

CDC vs outbox pattern — when to pick which?

Direct CDC streams your domain tables: simple, automatic, but the event schema is your row schema — refactoring the table changes the wire format. The outbox pattern uses an explicit event table you control, which decouples event schema from storage at the cost of one extra write per change. Many teams run Debezium against an outbox table — getting the explicit schema AND the WAL-driven latency.