The PACELC Theorem

CAP's sequel — the tradeoff you pay every millisecond, not just during a partition

PACELC extends the CAP theorem: if there is a network Partition (P), a system trades Availability (A) for Consistency (C); Else (E), in normal operation, it trades Latency (L) for Consistency (C).

  • Proposed by: Daniel Abadi, 2010
  • Formalized: IEEE Computer, 2012
  • Partition branch: A vs C
  • Else branch: L vs C
  • Classifications: PA/EL · PC/EC · PC/EL · PA/EC

The one-sentence intuition

The CAP theorem tells you what your database does during a network partition — a moment when nodes can't talk to each other. But partitions are rare. A well-run cloud cluster might see one for a few minutes a quarter. CAP says nothing about the other 99.9% of the time, when the network is healthy and your database is still making a quiet, constant tradeoff on every single request.

That gap is exactly what PACELC fills. Daniel Abadi proposed it in 2010 (in a blog post and the paper "Consistency Tradeoffs in Modern Distributed Database System Design," later published in IEEE Computer in 2012) because he noticed that engineers kept defending design choices with CAP that CAP didn't actually justify. Why does Amazon's Dynamo serve stale reads when the network is fine? CAP can't answer that — there's no partition. The real answer is latency: waiting for all replicas to agree on every read would add a cross-region round trip, and Amazon decided a 100 ms shopping-cart delay was worse than occasionally showing a slightly old cart.

PACELC reads as a literal sentence: if Partition, then Availability or Consistency; Else, Latency or Consistency. The two letters before the slash say what you give up under partition; the two after say what you give up when everything is working. A system is summarized as one of four labels — PA/EL, PC/EC, PC/EL, or PA/EC.

How the two branches work

Think of every replicated write as a decision node. The system asks one question: can all my replicas hear each other right now?

The "P" branch — there is a partition. Some replicas are unreachable. You now physically cannot have both availability and strong consistency, because to keep data consistent you'd have to refuse the write until the missing replicas confirm — and they can't. This is exactly CAP's forced choice:

  • Choose A (availability): accept the write on the reachable side, let the two sides diverge, and reconcile later. The user is never blocked, but two clients can briefly see different values.
  • Choose C (consistency): refuse writes (or reads) on the minority side until the partition heals. No client ever sees stale data, but some clients get errors or timeouts.
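
The two partition-time choices above can be sketched as a small function. This is a minimal illustration, not any real database's logic: the names `partition_write` and `reachable` are hypothetical, and it assumes a PC system keeps accepting writes only on the side that still holds a strict majority of replicas.

```python
def partition_write(p_branch: str, reachable: int, total: int) -> dict:
    """Decide what happens to a write on one side of a partition.

    p_branch  -- 'A' (stay available) or 'C' (stay consistent)
    reachable -- replicas this side can still talk to, itself included
    total     -- replicas in the whole cluster
    """
    if p_branch not in ("A", "C"):
        raise ValueError("p_branch must be 'A' or 'C'")
    has_majority = reachable > total // 2
    if p_branch == "A":
        # Accept everywhere; the two sides may diverge until the partition heals.
        return {"accepted": True, "may_diverge": True}
    # PC: only a side holding a strict majority may accept the write.
    return {"accepted": has_majority, "may_diverge": False}


# A 5-replica cluster split 3 / 2:
print(partition_write("A", reachable=2, total=5))  # minority side still accepts
print(partition_write("C", reachable=2, total=5))  # minority side refuses
print(partition_write("C", reachable=3, total=5))  # majority side accepts
```

Under PA both sides keep accepting writes, which is exactly where the later reconciliation work comes from; under PC only one side can ever make progress.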

The "E" branch — else, no partition. Every replica is reachable. There's no impossibility here — you can have consistency. But it isn't free: to guarantee a read sees the latest write, the write has to be acknowledged by a quorum (or all) of replicas before it returns. That acknowledgment is a network round trip, and the round trip is bounded by physics — light takes ~50 ms to cross from New York to London and back through fiber.

  • Choose L (latency): acknowledge the write after the local replica applies it, replicate to the others in the background, and serve reads from whichever replica is closest. Fast, but a read might miss a just-finished write.
  • Choose C (consistency): block the write until enough replicas confirm. Every read is correct, but you pay the round trip on each operation.
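
To see where the round trip enters, the sketch below computes when a write can be acknowledged under each choice. It assumes the coordinator contacts all replicas in parallel and that `ack_times_ms` (a hypothetical input) lists how long each replica takes to confirm, local replica included. Real systems add queuing, disk, and retry time on top.

```python
def write_ack_ms(e_branch: str, ack_times_ms: list[float]) -> float:
    """Time until a write is acknowledged to the client.

    e_branch     -- 'L' (ack after the fastest replica) or 'C' (ack after a quorum)
    ack_times_ms -- per-replica confirmation times, requests sent in parallel
    """
    ordered = sorted(ack_times_ms)
    if e_branch == "L":
        # Latency wins: the fastest (usually local) replica is enough.
        return ordered[0]
    if e_branch == "C":
        # Consistency wins: wait for the slowest member of the fastest quorum.
        quorum = len(ordered) // 2 + 1
        return ordered[quorum - 1]
    raise ValueError("e_branch must be 'L' or 'C'")


# One local replica, one in a nearby zone, one on another continent.
acks = [0.4, 1.5, 85.0]
print(write_ack_ms("L", acks))  # 0.4
print(write_ack_ms("C", acks))  # 1.5 (a quorum of 2 avoids the far replica)

# Two of three replicas are remote, so the quorum must cross the ocean.
print(write_ack_ms("C", [0.4, 85.0, 85.0]))  # 85.0
```

The second and third calls show that the EC cost depends on where the quorum lives, not only on how many replicas exist.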

The crucial insight: the "E" branch is the one you live in almost all the time, so the L-vs-C choice defines your everyday performance profile. CAP's headline A-vs-C choice only bites during the rare partition.

The four PACELC classifications

Combining the two binary choices gives four labels. Two are common, one is common-but-overlooked, and one is logically valid but rare in practice.

  • PA/EL — give up consistency for availability during partitions, and give up consistency for latency when healthy. The "always fast, eventually consistent" camp. Examples: Amazon Dynamo, Apache Cassandra (default), Riak, Couchbase.
  • PC/EC — keep consistency under partition (sacrificing availability) and keep consistency when healthy (sacrificing latency). The "always correct, sometimes slow or unavailable" camp. Examples: VoltDB/H-Store, Google Spanner (it leans here, paying latency via TrueTime), classic single-primary SQL with synchronous replication.
  • PC/EL — refuse stale data during a partition, but serve fast possibly-stale reads when healthy. The pragmatic middle. Example: Yahoo's PNUTS, and many primary-backup systems with asynchronous read replicas.
  • PA/EC — sacrifice consistency only during a partition but enforce it strictly otherwise. Logically valid but rare: if you accept divergence under partition, you've usually already accepted asynchronous replication for latency, so you'd be EL not EC.

Many systems are tunable rather than fixed. Cassandra and DynamoDB let you pick a consistency level per request, so the same cluster can behave as PA/EL for one request and PC/EC for the next, depending on whether you ask for ONE or for QUORUM reads/writes.
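
The rule behind per-request levels is the quorum overlap condition R + W > N: when every read set must share at least one replica with every write set, a read in a healthy cluster reaches a replica holding the latest acknowledged write. The sketch below checks that condition. The level names mirror Cassandra's ONE, QUORUM, and ALL, but the helper functions are illustrative and not part of any driver API.

```python
def replicas_for(level: str, n: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    levels = {"ONE": 1, "QUORUM": n // 2 + 1, "ALL": n}
    return levels[level]


def reads_see_latest_write(read_level: str, write_level: str, n: int) -> bool:
    """True when every read set overlaps every write set (R + W > N)."""
    r = replicas_for(read_level, n)
    w = replicas_for(write_level, n)
    return r + w > n


N = 3  # replication factor
for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    ok = reads_see_latest_write(read_level, write_level, N)
    print(f"read={read_level:6s} write={write_level:6s} overlap={ok}")
```

With N = 3, ONE/ONE gives no overlap (the EL setting), while QUORUM/QUORUM and ONE/ALL both do. The last pair moves the whole cost onto writes, which can suit read-heavy workloads.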

PACELC classification of real systems

| System | PACELC class | Under partition | When healthy | Mechanism |
|---|---|---|---|---|
| Dynamo / Cassandra (default) | PA/EL | Available, may diverge | Fast, eventually consistent | Async replication, read-repair, hinted handoff |
| Riak / Couchbase | PA/EL | Available, may diverge | Low-latency local reads | Vector clocks / CRDTs reconcile conflicts |
| Google Spanner | PC/EC | Minority unavailable | Slower writes (commit-wait) | Paxos + TrueTime bounded clock uncertainty |
| VoltDB / H-Store | PC/EC | Stalls minority | Synchronous, latency-bound | Single-threaded, fully synchronous replication |
| Yahoo PNUTS | PC/EL | No stale reads | Fast local (timeline-consistent) reads | Per-record master, async replicas |
| MongoDB (majority writes) | PC/EC* | Minority steps down | Quorum-acked, latency-bound | Raft-like replica set, write concern |
| Cassandra (QUORUM/QUORUM) | PC/EC | Blocks below quorum | R + W > N round trip | Tunable per-query consistency |

The asterisk on MongoDB is the honest caveat that runs through this whole table: a system's class depends on its configuration. Default write concern, read preference, and consistency level can flip a database between quadrants. PACELC describes a design's default disposition, not an immutable law about the product.

What the latency actually costs

The "L vs C" tradeoff is not abstract — it's measured in milliseconds set by geography and the speed of light. Light in fiber travels at roughly 200,000 km/s (about ⅔ of c), so distance translates directly into a floor on round-trip time (RTT) that no engineering can beat:

  • Same datacenter: RTT ≈ 0.2–0.5 ms. Strong consistency is nearly free here — this is why single-region quorum systems feel fast.
  • Same region, different AZ: RTT ≈ 1–2 ms. A QUORUM write across three availability zones adds a couple of milliseconds.
  • Cross-continent (US East ↔ US West): RTT ≈ 60–70 ms. Every strongly-consistent write pays this.
  • Intercontinental (Virginia ↔ London): RTT ≈ 75–90 ms; Virginia ↔ Sydney ≈ 200+ ms.
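
The floor follows from simple arithmetic: round-trip time is at least twice the distance divided by the signal speed. The sketch below applies that formula with the 200,000 km/s figure quoted above. The distances are approximate great-circle values supplied here as assumptions; real fiber routes are longer and add switching delay, which is why measured RTTs sit above the floor.

```python
FIBER_KM_PER_S = 200_000  # roughly two thirds of c, as stated above


def rtt_floor_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber, in milliseconds."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


# Approximate great-circle distances (assumed for illustration).
routes_km = {
    "New York - London": 5_570,
    "New York - San Francisco": 4_130,
    "Virginia - Sydney": 15_700,
}

for route, km in routes_km.items():
    print(f"{route:26s} floor = {rtt_floor_ms(km):5.1f} ms")
```

Comparing these floors with the measured ranges in the list shows how much of the observed RTT is physics and how much is routing overhead.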

Now the cost of choosing C over L in the "E" branch is concrete. A PA/EL system serving a read from the local replica answers in well under 1 ms. A PC/EC system spanning two continents must wait for a quorum acknowledgment; call it 80 ms. That's not a 2× slowdown; it's an 80–400× slowdown on that operation (80 ms against a local read of 0.2–1 ms). For a page that issues 20 backend reads one after another, the EC cost alone is 20 × 80 ms = 1.6 s, while the same page under EL can finish in around 50 ms. That is precisely why Amazon's Dynamo paper justified eventual consistency on latency grounds, with no partition in sight, which is the very observation that motivated PACELC.

Modeling the decision in JavaScript

PACELC isn't an algorithm you run; it's a classification of policy. The clearest way to "implement" it is a decision function that, given the current network state and a system's two policies, tells you what guarantee a request gets. This makes the four quadrants executable:

// A PACELC policy: choices for the partition branch and the else branch.
// pBranch: 'A' (availability) or 'C' (consistency)
// eBranch: 'L' (latency)      or 'C' (consistency)
function classify(pBranch, eBranch) {
  return `P${pBranch}/E${eBranch}`;     // e.g. "PA/EL"
}

// Given a system policy and the live network state, decide how a write behaves.
function handleWrite(policy, { partitioned, replicas, region }) {
  if (partitioned) {
    // CAP's forced choice — we literally cannot have both.
    if (policy.pBranch === 'A') {
      return { accepted: true,  consistent: false,
               note: 'accepted on reachable side; reconcile on heal' };
    }
    return { accepted: false, consistent: true,
             note: 'refused on minority side until partition heals' };
  }

  // No partition: the choice is consistency vs latency, and latency is physics.
  const rttMs = crossRegionRtt(region);          // speed-of-light bound
  if (policy.eBranch === 'C') {
    // Block until a quorum acknowledges — pay the round trip.
    const acks = Math.floor(replicas / 2) + 1;   // quorum = ⌊N/2⌋ + 1
    return { accepted: true, consistent: true, latencyMs: rttMs, quorum: acks };
  }
  // Latency wins: ack locally, replicate in the background.
  return { accepted: true, consistent: false, latencyMs: 0.4,
           note: 'async replication; read-your-writes not guaranteed' };
}

function crossRegionRtt(region) {
  const table = { 'same-dc': 0.4, 'same-region': 1.5, 'cross-continent': 70, 'intercontinental': 85 };
  return table[region] ?? 70;
}

const dynamo  = { pBranch: 'A', eBranch: 'L' };  // PA/EL
const spanner = { pBranch: 'C', eBranch: 'C' };  // PC/EC

console.log(classify(dynamo.pBranch,  dynamo.eBranch));   // "PA/EL"
console.log(handleWrite(spanner, { partitioned: false, replicas: 5, region: 'cross-continent' }));
// { accepted: true, consistent: true, latencyMs: 70, quorum: 3 }
console.log(handleWrite(dynamo,  { partitioned: false, replicas: 5, region: 'cross-continent' }));
// { accepted: true, consistent: false, latencyMs: 0.4, ... }

The instructive part is that the same input (no partition, 5 replicas, cross-continent) produces a 70 ms strongly-consistent result for Spanner and a sub-millisecond stale-tolerant result for Dynamo. That single number gap is the entire point of the "E" branch.

A Python simulation of the latency gap

To make the cost tangible, here's a tiny simulation that fans out a read-heavy request across replicas and reports the end-to-end latency under each policy — the same logic, but quantified over many operations.

from dataclasses import dataclass

RTT = {"same-dc": 0.4, "same-region": 1.5,
       "cross-continent": 70.0, "intercontinental": 85.0}

@dataclass
class Policy:
    p_branch: str   # 'A' or 'C'
    e_branch: str   # 'L' or 'C'
    def label(self) -> str:
        return f"P{self.p_branch}/E{self.e_branch}"

def op_latency(policy: Policy, partitioned: bool, n_replicas: int, region: str) -> float:
    if partitioned:
        # Under partition there is no latency knob — it's accept-or-refuse.
        if policy.p_branch == 'A':
            return 0.4                       # answered locally, may be stale
        return float('inf')                  # refused: effectively unavailable
    if policy.e_branch == 'C':
        # Wait for a quorum (n_replicas // 2 + 1 acks). Simplification: every
        # remote replica is assumed to be one `region` round trip away.
        return RTT.get(region, 70.0)
    return 0.4                               # latency wins: local ack

def page_latency(policy, region, reads=20, partitioned=False, n=5):
    """A page that issues `reads` backend reads in parallel waves of 4."""
    per = op_latency(policy, partitioned, n, region)
    if per == float('inf'):
        return per
    waves = (reads + 3) // 4                  # 4-wide fan-out
    return per * waves

for policy in (Policy('A', 'L'), Policy('C', 'C'), Policy('C', 'L')):
    healthy = page_latency(policy, "cross-continent", partitioned=False)
    print(f"{policy.label():6s} healthy cross-continent page: {healthy:7.1f} ms")

# PA/EL  healthy cross-continent page:     2.0 ms
# PC/EC  healthy cross-continent page:   350.0 ms
# PC/EL  healthy cross-continent page:     2.0 ms   (EL behavior when healthy)

The simulation puts a number on the gap: a strongly-consistent (EC) page across continents takes 350 ms against 2 ms for a latency-optimized (EL) page, about 175× slower, with no partition involved. PC/EL gets the same fast healthy-path number as PA/EL; the two only diverge during an actual partition.

Refinements and related models worth knowing

Abadi's "EL is the part CAP forgot." The deepest takeaway is not the four labels but the argument that latency and consistency are fundamentally linked outside of partitions. Strong consistency demands synchronization; synchronization demands a round trip; a round trip is latency. This is true even with zero failures, which CAP can't express.

Tunable / per-operation PACELC. Cassandra's QUORUM vs ONE and DynamoDB's strongly consistent reads flag mean a single deployment isn't locked to one quadrant. The right mental model is per-request, not per-database.

Spanner's "C in spite of P" sleight of hand. Spanner advertises external consistency together with very high availability, which seems to break CAP. It doesn't: under a partition, Spanner's minority side simply becomes unavailable for writes (it's PC), and it pays commit-wait latency when healthy (it's EC). Google's own write-up (Eric Brewer, 2017) describes Spanner as technically a CP system that is "effectively CA" in practice; the "always available" feel comes from Google's private network making partitions extraordinarily rare, not from defeating the theorem.

The "harvest and yield" framing (Fox & Brewer, 1999). An earlier, finer-grained idea: rather than a binary up/down, degrade gracefully — return a partial result (reduced harvest) while staying available (high yield). PACELC's "A" branch is a coarse version of high yield.
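
As a rough illustration of the two metrics, the sketch below follows Fox and Brewer's definitions: yield is the fraction of requests answered, and harvest is the fraction of the complete data reflected in an answer. The scenario and the function names are hypothetical.

```python
def yield_ratio(answered: int, offered: int) -> float:
    """Yield: fraction of offered requests that received an answer."""
    return answered / offered


def harvest_ratio(shards_reached: int, shards_total: int) -> float:
    """Harvest: fraction of the complete data reflected in an answer."""
    return shards_reached / shards_total


# Scenario (assumed): 100 queries arrive while 1 of 4 shards is unreachable,
# and every query would ideally consult all 4 shards.

# Refuse rather than return partial data: nothing is answered.
print(f"refuse : yield={yield_ratio(0, 100):.2f}")

# Degrade gracefully: answer every query from the 3 reachable shards.
print(f"degrade: yield={yield_ratio(100, 100):.2f} harvest={harvest_ratio(3, 4):.2f}")
```

The second strategy keeps yield at 1.00 by accepting a harvest of 0.75, which is the graded version of the choice that PACELC's partition branch treats as binary.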

BASE vs ACID. PA/EL systems usually adopt BASE semantics (Basically Available, Soft state, Eventual consistency); PC/EC systems usually adopt ACID. PACELC is the deeper "why" behind that long-standing dichotomy.

Common misconceptions and edge cases

  • Treating CAP's "C" and ACID's "C" as the same. PACELC's C means linearizability (single-copy semantics), not transactional integrity constraints. A system can be EC (linearizable) yet have no transactions at all.
  • Assuming "AP" implies fast reads. CAP's AP is silent about normal-operation latency. You need PACELC to say whether an AP system is also EL. Most are, but it's a separate axis.
  • Calling a system "CP" or "AP" as if it were permanent. Consistency level, write concern, and read preference are usually runtime knobs. Always classify the configuration, not the product name.
  • Forgetting the "else" branch dominates. Engineers optimize endlessly for partition behavior that occurs minutes per quarter, while ignoring the L-vs-C choice paid on every one of billions of requests. PACELC exists to correct exactly this misallocation of attention.
  • Believing you can have PA/EC for free. If you tolerate divergence under partition (PA), you've almost certainly chosen asynchronous replication, which makes you EL not EC when healthy. The PA/EC quadrant is logically definable but operationally near-empty.
  • Ignoring read-your-writes. EL systems don't guarantee a client reads its own just-written value unless you add session/causal consistency on top. This bites users who write then immediately reload and see old data.
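
The read-your-writes problem in the last bullet, and one common remedy, can be sketched with a toy replicated store. Everything here is hypothetical: `Replica`, `Session`, and the version-token scheme are a simplified stand-in for session consistency, not any particular database's mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    version: int = 0
    data: dict = field(default_factory=dict)

    def apply(self, key: str, value: str, version: int) -> None:
        self.data[key] = value
        self.version = version


@dataclass
class Session:
    """Remembers the newest version this client has written."""
    last_written: int = 0

    def write(self, primary: Replica, key: str, value: str) -> None:
        self.last_written = primary.version + 1
        primary.apply(key, value, self.last_written)  # EL: ack immediately

    def read(self, local: Replica, primary: Replica, key: str, session_safe: bool):
        # With session consistency, skip replicas that lag behind our own writes.
        if session_safe and local.version < self.last_written:
            return primary.data.get(key)
        return local.data.get(key)


primary, follower = Replica(), Replica()
primary.apply("cart", "empty", 1)
follower.apply("cart", "empty", 1)  # follower is caught up so far

client = Session()
client.write(primary, "cart", "1 item")  # not yet replicated to the follower

print(client.read(follower, primary, "cart", session_safe=False))  # empty
print(client.read(follower, primary, "cart", session_safe=True))   # 1 item
```

The session-safe read pays a trip to the primary only when the local replica is known to be behind, so most reads keep their EL latency.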

Frequently asked questions

What does PACELC stand for?

It reads as a sentence: if Partition (P), then Availability (A) or Consistency (C); Else (E), Latency (L) or Consistency (C). The two letters before the slash describe behavior during a network partition; the two after describe behavior in normal operation. Daniel Abadi coined it in 2010 and formalized it in a 2012 IEEE Computer paper.

How is PACELC different from the CAP theorem?

CAP only describes what happens during a partition, which is a rare event. PACELC keeps CAP's partition-time choice but adds the part you live with 99.9% of the time: even with no partition, every synchronous replication step costs latency, so you constantly trade latency against consistency. PACELC's whole point is that the 'else' branch dominates real operational cost.

Is a PA/EL system the same as 'AP' in CAP?

Not quite. CAP's AP only says you stay available during partitions. PA/EL (Dynamo, Cassandra with default settings, Riak) additionally says you favor low latency over strong consistency even when the network is perfectly healthy — it replicates asynchronously and answers from the nearest replica. PACELC makes that everyday behavior explicit; CAP leaves it unspecified.

Can a system be PC/EL or PA/EC?

PC/EL exists and is common: PNUTS and many primary-backup SQL setups refuse stale reads during a partition (PC) but serve fast, possibly-stale reads from local replicas when healthy (EL). PA/EC, which sacrifices consistency only during partitions but enforces it strictly otherwise, is logically valid but rare: if you tolerate stale data during a partition, you usually tolerate it for latency too.

Why does consistency cost latency even when there is no partition?

Strong consistency requires that a write is acknowledged only after enough replicas have durably applied it (a quorum or all replicas). That acknowledgment is a network round trip, and if replicas sit in other regions, that round trip is bounded below by the speed of light. A cross-region round trip ranges from roughly 60 ms (US East to US West) to 200+ ms (Virginia to Sydney), so strong consistency adds at least that much latency to every write, partition or not.

Is the 'consistency' in PACELC the same as in CAP?

Yes — both mean linearizability (or close to it): every read sees the most recent committed write, as if there were a single copy of the data. It is not the 'C' of ACID, which is about transactional integrity constraints. Conflating the two is the most common source of confusion in both theorems.