Distributed Patterns
CQRS: Command Query Responsibility Segregation
Two models, one truth — optimize writing and reading independently
CQRS (Command Query Responsibility Segregation) splits an application into a write model that handles commands and a separate read model that serves queries, so each side can use its own schema, storage, and scaling strategy.
- Named by: Greg Young, ~2010
- Derived from: Meyer's CQS principle
- Read consistency: usually eventual
- Read scaling: independent replicas
- Pairs well with: event sourcing
The split: commands write, queries read
A normal application uses one model for everything. The same Order class, the same orders table, and the same ORM serve both "place an order" and "show me my last 20 orders." That works until the two jobs start fighting. The write side wants a normalized schema with foreign keys and constraints so it never saves a corrupt order. The read side wants a wide, denormalized row with the customer name, the product titles, and the shipping status already joined in — because a dashboard that does six joins per page load doesn't scale.
CQRS resolves the tension by refusing to share a model. The principle is older than the name: in 1988 Bertrand Meyer described Command-Query Separation (CQS) — a single method should either change state (a command) or return data (a query), never both. Greg Young coined CQRS around 2010 by lifting that idea from the method level to the architecture level. Now two whole models are split: a write model (also called the command model or aggregate) accepts commands like PlaceOrder and enforces every business rule, and a read model (a projection or view) answers queries like GetOrderSummary with pre-shaped data.
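To make the method-level rule concrete before it is lifted to the architecture level, here is a minimal sketch of CQS in Python. The `Counter` class and its method names are hypothetical, invented for illustration; they do not come from Meyer or Young.

```python
class Counter:
    """Hypothetical example class used only to illustrate CQS."""

    def __init__(self) -> None:
        self._value = 0

    # Violates CQS: changes state AND returns data in one call.
    def increment_and_get(self) -> int:
        self._value += 1
        return self._value

    # CQS-compliant command: mutates state and returns nothing.
    def increment(self) -> None:
        self._value += 1

    # CQS-compliant query: returns data and has no side effects.
    def value(self) -> int:
        return self._value


counter = Counter()
counter.increment()             # command
counter.increment()             # command
first_read = counter.value()    # query
second_read = counter.value()   # repeating a query changes nothing
```

Because `value()` has no side effects, it can be called any number of times without altering what later calls return. CQRS applies the same discipline to whole models instead of single methods.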
The flow has four moving parts:
- Command — an intent to change something, such as CancelOrder(orderId). It's imperative, named after the business action, and either succeeds or is rejected.
- Write model — validates the command, applies it to the authoritative state, and persists the change. This is the single source of truth.
- Synchronization — the change is propagated to the read side, synchronously inside the same transaction or (more often) asynchronously via an event or a change-data-capture feed.
- Read model — a separate, query-optimized store the UI reads from. It holds no business logic; it's a cache of answers.
Crucially, queries never touch the write model and commands never return query data. A command returns success/failure (and maybe an id), not a populated object to render.
The mechanism: projections and the consistency gap
The read model is built by a projection: a function that consumes the write side's changes and folds them into a query-friendly shape. If you pair CQRS with event sourcing, the write side emits a stream of events and the projection is literally a fold over that stream:
readModel = events.reduce(apply, emptyView)
That's an O(n) replay over n events to rebuild a view from scratch, and O(1) amortized to apply each new event incrementally. The key consequence is timing. If the projection runs asynchronously, there's a window — the propagation lag — between a successful command and the read model reflecting it. During that window a query returns stale data. This is not a bug; it's the defining trade-off. The read model is eventually consistent with the write model.
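The fold can be sketched in Python with `functools.reduce`. The event dictionaries and the `apply` function below are illustrative assumptions, not a prescribed event format. The sketch shows that a full replay and an incremental, event-by-event update arrive at the same view.

```python
from functools import reduce


def apply(view, event):
    """Fold one event into the view and return the updated view."""
    kind, order_id = event["type"], event["order_id"]
    if kind == "OrderPlaced":
        view[order_id] = {"total": event["total"], "status": "PLACED"}
    elif kind == "OrderCancelled" and order_id in view:
        view[order_id]["status"] = "CANCELLED"
    return view


events = [
    {"type": "OrderPlaced", "order_id": "A1", "total": 49},
    {"type": "OrderPlaced", "order_id": "B2", "total": 15},
    {"type": "OrderCancelled", "order_id": "A1"},
]

# Full rebuild: one pass over all n events, starting from an empty view.
rebuilt = reduce(apply, events, {})

# Incremental: keep the view around and apply each new event as it arrives.
live = {}
for event in events:
    live = apply(live, event)
```

Both `rebuilt` and `live` end with order A1 cancelled and B2 placed. The only difference is when the work happens.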
You choose where to sit on the consistency spectrum:
| Sync strategy | Read consistency | Lag | Scaling cost |
|---|---|---|---|
| Same-transaction projection | Strong | 0 | Write and read coupled; no independent scaling |
| Async via outbox / event bus | Eventual | ms–seconds | Read replicas scale freely |
| Batch rebuild (nightly) | Eventual | hours | Cheapest; fine for analytics |
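For the async row of the table, here is a rough sketch of the outbox idea using Python's built-in `sqlite3` module. The table names, the `place_order` and `relay` functions, and the `published` flag are assumptions made for illustration. A production relay would typically run as a separate process that publishes to a real broker.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT NOT NULL)")
db.execute(
    "CREATE TABLE outbox ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT NOT NULL,"
    " published INTEGER NOT NULL DEFAULT 0)"
)


def place_order(order_id):
    # One transaction: the state change and its event commit or roll back together.
    with db:
        db.execute("INSERT INTO orders VALUES (?, 'PLACED')", (order_id,))
        db.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "OrderPlaced", "order_id": order_id}),),
        )


def relay(publish):
    """Publish pending outbox rows in sequence order, then mark them as sent."""
    rows = db.execute(
        "SELECT seq, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))  # a crash after this causes a re-send, not a loss
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))


delivered = []
place_order("A1")
try:
    place_order("A1")  # duplicate key: rejected, so no second event is recorded
except sqlite3.IntegrityError:
    pass
relay(delivered.append)
```

Because the relay marks a row only after publishing it, delivery is at-least-once. That is one reason the projection on the other end needs to tolerate duplicates.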
The math that makes CQRS attractive is read amplification. If reads outnumber writes 100:1 — typical for a product catalog or a social feed — then putting reads on their own denormalized store backed by k read replicas multiplies read throughput roughly k× without touching write latency. The write model stays small and normalized; you don't pay for read-side denormalization on the hot write path.
When to use CQRS — and when not to
- Asymmetric read/write load. Reads vastly outnumber writes, and you want to scale them separately.
- Divergent shapes. The data you write (normalized, transactional) looks nothing like the data you read (denormalized, search-indexed, aggregated).
- Multiple read models from one write model. The same orders need to feed a customer dashboard, a fraud detector, a full-text search index, and a BI warehouse — each wants a different projection.
- Complex write-side domains. A rich aggregate with many invariants benefits from a model uncluttered by reporting concerns; it pairs naturally with event sourcing and Domain-Driven Design.
- Collaboration and audit. Task-based commands (ApproveLoan, not UPDATE loans SET status=...) capture intent, which is invaluable for audit trails.
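The "multiple read models from one write model" case can be sketched as several small projections consuming the same events. Everything named below (the event fields, the three projection functions, and the fraud threshold of 500) is invented for illustration.

```python
from collections import Counter, defaultdict

orders_by_customer = defaultdict(list)  # shape for a customer dashboard
revenue_by_customer = Counter()         # shape for reporting
flagged_for_review = set()              # shape for a fraud check


def project_dashboard(event):
    orders_by_customer[event["customer"]].append(event["order_id"])


def project_revenue(event):
    revenue_by_customer[event["customer"]] += event["total"]


def project_fraud(event, threshold=500):  # threshold is an invented value
    if event["total"] >= threshold:
        flagged_for_review.add(event["order_id"])


projections = [project_dashboard, project_revenue, project_fraud]

events = [
    {"type": "OrderPlaced", "order_id": "A1", "customer": "Ada", "total": 49},
    {"type": "OrderPlaced", "order_id": "B2", "customer": "Ada", "total": 15},
    {"type": "OrderPlaced", "order_id": "C3", "customer": "Lin", "total": 900},
]

# Every projection sees every event and keeps only what its view needs.
for event in events:
    for project in projections:
        project(event)
```

Adding a fourth read model means adding one more function to `projections`. The write side does not change.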
Don't use it when reads and writes share the same shape and scale — a basic CRUD admin tool. CQRS roughly doubles the code you maintain, introduces a sync mechanism that can lag or fail, and forces every read path to tolerate staleness. Greg Young's own warning is blunt: "Most people using CQRS… should not have done so." Apply it surgically, per bounded context, not across the whole system.
CQRS vs the alternatives
| | CQRS (async) | CRUD / single model | Read replicas only | Materialized views | CQRS + Event Sourcing |
|---|---|---|---|---|---|
| Read/write models | Separate | Shared | Shared schema | Shared base table | Separate |
| Read shape vs write shape | Independent | Identical | Identical | Derived | Independent |
| Consistency | Eventual | Strong | Eventual (replica lag) | Strong or refreshed | Eventual |
| Different storage engines | Yes | No | No (same engine) | No | Yes |
| Audit / time travel | If you log commands | No | No | No | Yes (full event log) |
| Operational complexity | High | Low | Medium | Low | Highest |
| Best for | Read-heavy, divergent shapes | Simple apps | Read scaling, same shape | A few precomputed reports | Complex domains needing audit |
The honest comparison: if you only need to scale reads and the read shape equals the write shape, plain read replicas are far simpler and give you most of the benefit. CQRS earns its complexity only when the read and write shapes genuinely diverge, or when you need many different projections from one source of truth.
What the numbers actually say
- Read amplification scales linearly. A denormalized read view that turns a 6-table join (say 8 ms per query) into a single indexed row lookup (~0.3 ms) is roughly a 25× per-query speedup before you even add replicas. Add 5 read replicas and aggregate read throughput climbs another ~5×.
- Propagation lag is the real cost. An async projection over Kafka or an outbox typically lands the read model within 10–500 ms of the write. Under load or during a consumer restart it can spike to seconds — which is why "I cancelled the order but it still shows active" bugs are endemic to careless CQRS.
- You write roughly 2× the model code. Two models, two persistence mappings, plus a projection per read view. That's the standing tax you pay forever, not a one-time setup cost.
- Rebuild time is bounded by event count. With event sourcing, rebuilding a read model means replaying the full event log: at, say, 50k events/second per projection, a 100-million-event stream takes ~33 minutes to rebuild. That is fast enough to fix a buggy projection by replay, slow enough that you cache snapshots.
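The arithmetic behind these figures is easy to check. The sketch below only reuses the illustrative numbers quoted in the list (they are not measurements) and assumes replicas scale read throughput close to linearly.

```python
# Per-query speedup from replacing the join with a single-row lookup.
join_ms, lookup_ms = 8.0, 0.3
per_query_speedup = join_ms / lookup_ms          # about 26.7, i.e. "roughly 25x"

# Aggregate gain once replicas are added (assumes near-linear scaling).
replicas = 5
aggregate_gain = per_query_speedup * replicas    # about 133x

# Time to rebuild a projection by replaying the whole log.
events_total = 100_000_000
events_per_second = 50_000
rebuild_minutes = events_total / events_per_second / 60   # about 33.3

print(f"{per_query_speedup:.1f}x per query, {aggregate_gain:.0f}x aggregate")
print(f"rebuild takes about {rebuild_minutes:.1f} minutes")
```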
JavaScript implementation
A minimal CQRS slice: a command side that validates and appends events, an async projection that builds a read model, and a query side that only reads the projection. Note that the command handler returns an id — never the rendered view.
// ---- Write side: commands mutate the authoritative state ----
class OrderWriteModel {
  constructor(bus) {
    this.orders = new Map(); // orderId -> authoritative order state
    this.bus = bus;
  }
  placeOrder({ orderId, customerId, total }) {
    if (this.orders.has(orderId)) throw new Error('duplicate order');
    if (total <= 0) throw new Error('total must be positive'); // invariant
    this.orders.set(orderId, { customerId, total, status: 'PLACED' });
    this.bus.emit({ type: 'OrderPlaced', orderId, customerId, total });
    return orderId; // command returns an id, NOT a view
  }
  cancelOrder({ orderId }) {
    const o = this.orders.get(orderId);
    if (!o) throw new Error('no such order');
    if (o.status === 'SHIPPED') throw new Error('cannot cancel shipped order');
    o.status = 'CANCELLED';
    this.bus.emit({ type: 'OrderCancelled', orderId });
  }
}

// ---- Read side: a projection folds events into a query view ----
class OrderSummaryProjection {
  constructor(bus, nameOf) {
    this.view = new Map(); // orderId -> denormalized summary
    nameOf = nameOf || (id => `Customer ${id}`);
    bus.on(e => { // async in production (queue/CDC)
      if (e.type === 'OrderPlaced') {
        this.view.set(e.orderId, {
          orderId: e.orderId,
          customer: nameOf(e.customerId), // denormalized join, done once
          total: e.total,
          status: 'PLACED',
        });
      } else if (e.type === 'OrderCancelled') {
        const v = this.view.get(e.orderId);
        if (!v) return; // guard: event may arrive out of order
        v.status = 'CANCELLED';
      }
    });
  }
  // ---- Query side: read-only, no business logic ----
  getSummary(orderId) { return this.view.get(orderId) || null; }
  byStatus(status) { return [...this.view.values()].filter(v => v.status === status); }
}

// Wiring: a synchronous in-memory bus, for the demo only
const bus = {
  handlers: [],
  on(h) { this.handlers.push(h); },
  emit(e) { this.handlers.forEach(h => h(e)); },
};
const writes = new OrderWriteModel(bus);
const reads = new OrderSummaryProjection(bus);
const id = writes.placeOrder({ orderId: 'A1', customerId: 7, total: 49 }); // id === 'A1'
writes.cancelOrder({ orderId: 'A1' });
console.log(reads.getSummary('A1')); // { orderId:'A1', customer:'Customer 7', total:49, status:'CANCELLED' }
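The synchronous emit here makes the read model strongly consistent for the demo. In production you'd publish to a queue or a change-data-capture stream, and the projection would lag. That is when the if (!v) return guard matters, because events can arrive late or out of order. Note that the guard only prevents a crash: a cancellation that arrives before its OrderPlaced is silently dropped, so a real projection should also buffer or version its events.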
Python implementation
from dataclasses import dataclass

# ---- Events ----
@dataclass
class OrderPlaced:
    order_id: str
    customer_id: int
    total: float

@dataclass
class OrderCancelled:
    order_id: str

# ---- Write side ----
class OrderWriteModel:
    def __init__(self, bus):
        self.orders = {}  # order_id -> authoritative order state
        self.bus = bus

    def place_order(self, order_id, customer_id, total):
        if order_id in self.orders:
            raise ValueError("duplicate order")
        if total <= 0:
            raise ValueError("total must be positive")  # invariant
        self.orders[order_id] = {"customer_id": customer_id,
                                 "total": total, "status": "PLACED"}
        self.bus.emit(OrderPlaced(order_id, customer_id, total))
        return order_id  # return an id, not a view

    def cancel_order(self, order_id):
        o = self.orders.get(order_id)
        if not o:
            raise ValueError("no such order")
        if o["status"] == "SHIPPED":
            raise ValueError("cannot cancel shipped")
        o["status"] = "CANCELLED"
        self.bus.emit(OrderCancelled(order_id))

# ---- Read side: projection folds events into a denormalized view ----
class OrderSummaryProjection:
    def __init__(self, bus, name_of=lambda i: f"Customer {i}"):
        self.view = {}  # order_id -> denormalized summary
        self.name_of = name_of
        bus.subscribe(self.apply)

    def apply(self, e):  # called async in production
        if isinstance(e, OrderPlaced):
            self.view[e.order_id] = {
                "order_id": e.order_id,
                "customer": self.name_of(e.customer_id),  # join done once
                "total": e.total, "status": "PLACED"}
        elif isinstance(e, OrderCancelled):
            v = self.view.get(e.order_id)
            if v is None:  # guard against out-of-order delivery
                return
            v["status"] = "CANCELLED"

    # ---- Query side ----
    def summary(self, order_id):
        return self.view.get(order_id)

    def by_status(self, status):
        return [v for v in self.view.values() if v["status"] == status]

# A synchronous in-memory bus, for the demo only
class Bus:
    def __init__(self):
        self.subs = []

    def subscribe(self, h):
        self.subs.append(h)

    def emit(self, e):
        for h in self.subs:
            h(e)

bus = Bus()
writes = OrderWriteModel(bus)
reads = OrderSummaryProjection(bus)
writes.place_order("A1", 7, 49.0)
writes.cancel_order("A1")
print(reads.summary("A1"))  # {'order_id': 'A1', 'customer': 'Customer 7', 'total': 49.0, 'status': 'CANCELLED'}
The asymmetry is the whole point. The write classes are full of raise statements — they enforce invariants. The projection has none; it trusts the write side and only reshapes data. The query methods (summary, by_status) are pure reads with zero validation.
Variants worth knowing
Single-database CQRS. The lightest version: one database, separate read and write models in code, often with the read side hitting denormalized views or read replicas. No separate store, no async pipeline — you get the modeling clarity without the operational tax. A sensible starting point.
CQRS with separate stores (async). Write to PostgreSQL, project to Elasticsearch / Redis / a denormalized read DB via an event bus. This unlocks independent storage engines and independent scaling, at the cost of eventual consistency and a sync pipeline to operate.
CQRS + Event Sourcing (CQRS/ES). The write model stores events as the source of truth instead of current state; read models are projections of the event log. You gain a full audit trail, time travel, and the ability to spin up new read models by replaying history. This is the heaviest variant and the one most people mistake for "CQRS itself" — they are separable.
Eager vs lazy projections. Eager projections update the read model as events arrive (low read latency, constant background work). Lazy / on-demand projections compute the view at query time from recent events (no background work, higher read latency). Snapshots are the usual compromise: periodically persist a folded view so replay starts from the snapshot, not from event zero.
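A snapshot can be sketched as a (version, view) pair that lets replay skip everything it already covers. The tuple-shaped events and the `rebuild` helper below are assumptions for illustration. A real store would seek straight to the snapshot's position instead of scanning past earlier events.

```python
def apply(view, event):
    """Fold one (version, order_id, status) event into a copy of the view."""
    _version, order_id, status = event
    updated = dict(view)
    updated[order_id] = status
    return updated


def rebuild(log, snapshot=None):
    """Replay the log, skipping events the snapshot has already folded in."""
    version, view = snapshot if snapshot else (0, {})
    replayed = 0
    for event in log:
        if event[0] <= version:
            continue  # already covered by the snapshot
        view = apply(view, event)
        version = event[0]
        replayed += 1
    return version, view, replayed


log = [(1, "A1", "PLACED"), (2, "B2", "PLACED"), (3, "A1", "CANCELLED")]

# Persist a snapshot after the first two events...
snap_version, snap_view, _ = rebuild(log[:2])
snapshot = (snap_version, snap_view)

# ...then compare a replay from event zero with a replay from the snapshot.
from_zero = rebuild(log)
from_snapshot = rebuild(log, snapshot)
```

Both replays end at the same version and view. The one that starts from the snapshot applies a single event instead of three.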
Common bugs and edge cases
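- Assuming read-your-own-writes. A user clicks "Save," the page re-queries the read model, and the change isn't there yet. Fix by returning the new state from the command, routing that user to the write model briefly, or polling a version token. The first two deliberately bend the rules that commands return no view data and queries never touch the write model, so keep them narrowly scoped to this one case.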
- Out-of-order or duplicate events. Async delivery can reorder or redeliver. Projections must be idempotent and tolerate gaps: guard with if (!view) return and key updates by event version, not arrival order.
- Querying the write model. The single most common violation: reaching into the aggregate to render a list because the read model doesn't have the field yet. It couples the sides and defeats the pattern. Add the field to a projection instead.
- Returning data from commands. A command that returns a fully rendered view re-introduces the read/write coupling CQRS exists to remove. Return an id or an ack; let the client query the read side.
- Dual-write without an outbox. Writing to the database and publishing an event as two separate operations can lose events if the process crashes between them. Use the outbox pattern so the event is committed atomically with the state change.
- Applying it everywhere. CQRS is a per-context decision. Forcing it on simple CRUD modules adds complexity with no payoff — the classic over-engineering trap.
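One way to make a projection idempotent and order-tolerant, as the second item above recommends, is to key updates by a per-order event version. The sketch below assumes each event carries a `version` field that starts at 1 and increases by 1 per order. That field, and the `VersionedProjection` class itself, are illustrative assumptions.

```python
class VersionedProjection:
    def __init__(self):
        self.view = {}     # order_id -> {"status": ..., "version": n}
        self.pending = {}  # order_id -> {version: event} waiting for a gap to close

    def handle(self, event):
        order_id, version = event["order_id"], event["version"]
        current = self.view.get(order_id, {}).get("version", 0)
        if version <= current:
            return  # duplicate or stale redelivery: safe to ignore
        waiting = self.pending.setdefault(order_id, {})
        waiting[version] = event
        # Apply buffered events strictly in version order, stopping at a gap.
        while current + 1 in waiting:
            current += 1
            self._apply(waiting.pop(current))
        if not waiting:
            del self.pending[order_id]

    def _apply(self, event):
        status = "PLACED" if event["type"] == "OrderPlaced" else "CANCELLED"
        self.view[event["order_id"]] = {"status": status, "version": event["version"]}


placed = {"type": "OrderPlaced", "order_id": "A1", "version": 1}
cancelled = {"type": "OrderCancelled", "order_id": "A1", "version": 2}

projection = VersionedProjection()
projection.handle(cancelled)  # arrives early: buffered, view stays empty
early_view = dict(projection.view)
projection.handle(placed)     # closes the gap: versions 1 and 2 apply in order
projection.handle(placed)     # redelivery: ignored
```

Unlike a bare "skip if missing" guard, the early cancellation is kept and applied once the event it depends on arrives.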
Frequently asked questions
What is the difference between CQRS and CRUD?
CRUD uses one model and one schema for both reading and writing — the same table backs SELECT and UPDATE. CQRS splits them: commands mutate a write model optimized for validation and consistency, while queries hit a separate read model optimized for the exact shapes the UI needs. The two models can live in different databases.
Does CQRS require event sourcing?
No. CQRS only mandates separate read and write models. Event sourcing is a complementary pattern often paired with it — the write side stores a log of events and the read side projects those events into query-friendly views — but you can do CQRS with two ordinary SQL tables and a synchronization job. Greg Young, who named the pattern, has explicitly said the two are independent.
Is the read model always eventually consistent?
Only if the read model is updated asynchronously, which is the common case. If you update both models inside the same transaction (synchronous projection) the read side is strongly consistent but you lose the independent-scaling benefit. Most CQRS systems accept a propagation lag of milliseconds to seconds in exchange for read scalability.
When should you NOT use CQRS?
Avoid it for simple CRUD applications where reads and writes share the same shape and the same scale. CQRS doubles your data models, adds a synchronization mechanism, and forces you to reason about stale reads. For a basic admin panel or a low-traffic form, that complexity buys you nothing. Apply it per bounded context, not application-wide.
How do you handle a user reading their own write under CQRS?
This is the read-your-own-writes problem. Common fixes: return the new state directly from the command handler so the UI doesn't need to re-query; route that user's reads to the write model briefly; or include a version token the client polls until the read model catches up. Picking one is the central UX cost of asynchronous CQRS.
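The version-token option might look like the following sketch. The `ReadModel` class, the `wait_for_version` helper, and its timeout values are hypothetical. A `threading.Timer` stands in for an asynchronous projection that lands about 50 ms after the write.

```python
import threading
import time


class ReadModel:
    def __init__(self):
        self.version = 0
        self.view = {}

    def apply(self, version, order_id, status):
        self.view[order_id] = status
        self.version = version  # bump the version last, after the data is visible


def wait_for_version(read_model, token, timeout_s=1.0, poll_s=0.01):
    """Poll until the read model has caught up to `token`, or give up."""
    deadline = time.monotonic() + timeout_s
    while read_model.version < token:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True


reads = ReadModel()
token = 1  # the version the command handler reported for this write

# Simulate propagation lag: the projection applies the change 50 ms later.
threading.Timer(0.05, reads.apply, args=(1, "A1", "CANCELLED")).start()

stale = reads.view.get("A1")                # read before the projection catches up
caught_up = wait_for_version(reads, token)  # block until the token is visible
fresh = reads.view.get("A1")
```

If the timeout expires, the caller still has to choose between showing stale data, showing a pending state, or falling back to one of the other fixes.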
Can the read and write sides use different databases?
Yes — that is one of the main payoffs. A write model might use a normalized PostgreSQL schema for transactional integrity, while the read model uses Elasticsearch for full-text search, Redis for hot lookups, and a denormalized table for dashboards — all projected from the same stream of changes.