Concurrency Patterns

Actor Model

Independent units, private state, one message at a time — no locks, no shared memory

An actor owns private state, processes one message at a time from a mailbox, and talks to others only via messages. Erlang, Akka. WhatsApp ran 2M conns per server on it.

ConceivedCarl Hewitt · 1973
Erlang process~350 bytes at spawn
WhatsApp record2M TCP conns/server
Mailbox semanticsFIFO · async · unbounded by default
Shared statenone — message passing only
Famous runtimesErlang/OTP, Akka, Elixir, Orleans, Pony

Interactive visualization

Three actors, each with its own mailbox. Watch messages flow in, one-at-a-time processing, and never any shared state.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How the actor model works

Carl Hewitt proposed the actor model in 1973 as an alternative to the prevailing shared-memory, lock-based view of concurrency. The thesis was simple: take the principle "everything is an object" from Smalltalk, drop the assumption that those objects can call each other's methods directly, and replace synchronous calls with asynchronous messages. What falls out is a programming model that scales naturally to millions of concurrent entities, recovers cleanly from failure, and never needs a mutex.

An actor is three things bundled together:

State. A private bag of variables only the actor can touch. No other thread, no other actor can read or write it.
Behavior. A handler — usually called receive — that processes one message. Behavior can change between messages: an actor might switch from "logged-out" to "logged-in" state.
Mailbox. A FIFO queue of inbound messages. Other actors enqueue; the actor dequeues one at a time and runs its behavior.

The "one message at a time" rule is the entire reason actors are safe. Inside the message handler, the actor's state is single-threaded — exactly one logical thread of control is touching it. You can write straight-line imperative code on the state without any synchronization, because the runtime guarantees no two handlers run concurrently for the same actor.

To talk to another actor, you send a message. The send is fire-and-forget: it goes into the recipient's mailbox and returns immediately. There is no reply unless your message includes a "reply to me at this address" reference and the recipient explicitly sends one back. This asynchronous, non-blocking style is what lets the runtime schedule millions of actors onto a small number of OS threads.

Worked example — a counter actor

Imagine a counter actor with state {count: 0}. It accepts three messages: :inc, :dec, and {:get, reply_to}. Three other actors A, B, C run concurrently and send a hundred increments each, plus a final query.

Actor A sends 100 × :inc → mailbox
Actor B sends 100 × :inc → mailbox
Actor C sends 100 × :inc → mailbox
Actor Q sends {:get, Q} → mailbox

Counter mailbox (FIFO arrival order):
  :inc, :inc, :inc, :inc, :inc, ..., {:get, Q}

Counter processes one at a time:
  count: 0 → 1 → 2 → 3 → ... → 300
  Receives {:get, Q} → sends 300 to Q

Q receives 300.

No locks were taken. No race conditions occurred. The same workload in a shared-memory model would either need a synchronized block around every increment (slow, contended) or an atomic fetch_add (fast, but only because the hardware does the locking for you). The actor model gets the same answer by serializing the work inside the actor itself.

The cost: each :inc message goes through the mailbox, which is a few hundred nanoseconds of overhead in Erlang. For a hot counter this is slower than an atomic add. The benefit: the same pattern scales without change to a distributed counter spread across a hundred machines — the messages travel over the network instead of through a queue, and the model is identical.

Processes and mailboxes at scale

Erlang/OTP's killer feature is the cost of a process. A spawn allocates about 350 bytes — a small heap, a tiny stack, a mailbox pointer, and bookkeeping. The scheduler runs on N OS threads (where N matches your CPU cores) and multiplexes potentially millions of actors over them. When an actor's mailbox has a message and the actor is reducible (hasn't exceeded its time slice), the scheduler picks it up; otherwise it sleeps in a queue at near-zero cost.

WhatsApp's 2014 engineering post described running 2 million simultaneous TCP connections on a single FreeBSD server using Erlang — one actor per connection, plus auxiliary actors for routing and session state. The math: 2M × ~10 KB per session-of-actors ≈ 20 GB resident, comfortably within one machine. The same setup in a thread-per-connection model would need 2M × 2 MB = 4 TB of stack space alone, an impossibility.

Akka on the JVM uses a similar architecture: actors are POJOs scheduled by a fork-join pool. An Akka actor costs about 400 bytes plus the bookkeeping; the documented practical limit is a few million actors per JVM, gated by heap. Microsoft's Orleans uses a "virtual actor" twist: actors are activated on demand from durable storage and deactivated when idle, so you can address billions of logical actors with only a working set in memory.

Let it crash — supervision trees

The actor model's second big idea is supervision. Each actor has a parent — its supervisor — that watches it. When an unhandled exception kills the actor, the supervisor decides how to recover:

One-for-one. Restart only the failed child with fresh state. Use when siblings are independent.
One-for-all. Kill and restart every sibling. Use when siblings share state or depend on a startup ordering.
Rest-for-one. Restart the failed child and any later siblings in start order. Use for pipelines where downstream depends on upstream's identity.
Escalate. The supervisor itself dies, propagating the failure to its parent.

Each strategy comes with a frequency limit — typically "if this child crashes more than 3 times in 5 seconds, escalate." The result is a self-healing tree: transient errors restart cleanly, persistent errors escalate up to where a human or higher logic can intervene. The Erlang slogan "let it crash" means: don't try to handle every exception in-line — write the happy path, and let the supervisor restart you from a known-good baseline. This produces less code, fewer bugs, and demonstrably higher uptime: Ericsson's AXD301 telecom switch built on Erlang/OTP reportedly achieved nine 9s of availability.

Variants

Pure actors (Hewitt's original). No shared memory at all; every actor reference is just a name. Erlang follows this almost completely; Akka allows some shared-immutable references for performance.
Typed actors. Each actor has a static message protocol checked by the compiler. Akka Typed (since 2018), Pony, and Caf++ enforce that you can only send messages the recipient knows how to handle.
Virtual actors. Microsoft Orleans pioneered this: actors are addressable by ID without being explicitly spawned. The runtime activates them on demand from durable storage and deactivates idle ones. Lets you address billions of logical entities with bounded memory.
Reactive streams. Akka Streams adds backpressured, bounded mailboxes that signal upstream when downstream is overwhelmed. This bridges actors with the dataflow model.
Distributed actors. Erlang's distribution layer and Akka Cluster make remote messages look identical to local ones — same mailbox, same FIFO, same semantics. Adds the eight fallacies of distributed computing on top, but the programming surface is preserved.

Actor model vs other concurrency models

Model	Communication	State sharing	Strength
Actors	Async messages to mailboxes	None — private per actor	Millions of entities, supervision, location transparency
Threads + locks	Shared memory + mutex	Shared, manually protected	Tight in-process control; lowest latency for shared structures
CSP (Go channels)	Sync/async channels	Shared discouraged but allowed	In-process pipelines, structured concurrency
Async/await	Futures/promises	Shared with cooperative yields	I/O-bound flows, no per-task state
STM (transactions)	Shared memory + retry	Shared, optimistic	Complex multi-variable updates
Dataflow / reactive	Streams of values	None inside operators	Pipeline transformations with backpressure

Real systems combine these. A typical Akka service has actors for per-entity state, Akka Streams for ingest with backpressure, futures for short async tasks, and the JVM's java.util.concurrent for shared low-level data structures. Each layer uses the right abstraction for the job.

Erlang code — a chat-room actor

-module(chat_room).
-behaviour(gen_server).
-export([start_link/0, join/2, leave/2, send/3]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

join(Pid, User) -> gen_server:cast(?MODULE, {join, Pid, User}).
leave(Pid, User) -> gen_server:cast(?MODULE, {leave, Pid, User}).
send(From, Msg, _) -> gen_server:cast(?MODULE, {msg, From, Msg}).

init([]) -> {ok, #{members => #{}}}.

handle_cast({join, Pid, User}, State) ->
    {noreply, State#{members := maps:put(Pid, User, maps:get(members, State))}};

handle_cast({leave, Pid, _}, State) ->
    {noreply, State#{members := maps:remove(Pid, maps:get(members, State))}};

handle_cast({msg, From, Msg}, State = #{members := M}) ->
    User = maps:get(From, M, "anon"),
    maps:foreach(fun(Pid, _) -> Pid ! {chat, User, Msg} end, M),
    {noreply, State}.

This is one actor — a gen_server — holding the chat-room membership map. Joins, leaves, and broadcasts arrive as cast messages in its mailbox; it processes them strictly in order. Even with thousands of users sending messages at the same wall-clock moment, the room actor sees them one-at-a-time, so the broadcast iteration is never racing with a leave.

Akka code (Scala) — the same counter

import akka.actor.typed._
import akka.actor.typed.scaladsl._

object Counter {
  sealed trait Cmd
  case object Inc extends Cmd
  case object Dec extends Cmd
  final case class Get(replyTo: ActorRef[Int]) extends Cmd

  def apply(): Behavior[Cmd] = behave(0)

  private def behave(n: Int): Behavior[Cmd] = Behaviors.receiveMessage {
    case Inc => behave(n + 1)
    case Dec => behave(n - 1)
    case Get(replyTo) =>
      replyTo ! n
      Behaviors.same
  }
}

// Spawning:
val system = ActorSystem(Counter(), "counter")
system ! Counter.Inc
system ! Counter.Inc
system ! Counter.Get(probe.ref)

Note how the state n isn't a mutable variable — each transition returns a new behavior parameterized by the new count. This is idiomatic Akka Typed: state is the parameter, behavior is the function. No mutex, no volatile, no synchronized; the typed protocol means the compiler also catches "you sent a String to a Counter" errors at build time.

Common pitfalls

Blocking inside a message handler. A handler that performs a synchronous DB call freezes the actor for the duration — every other message in its mailbox queues up behind it. If many actors share a dispatcher thread pool, blocking handlers can starve the whole system. Use a dedicated blocking-IO dispatcher or convert the call to async.
Unbounded mailboxes. Defaults are usually unbounded, which means a slow consumer paired with a fast producer crashes the actor with OOM. Use bounded mailboxes with a back-pressure or drop-oldest policy for any unbounded producer.
Shared mutable state via closures. If you close over a HashMap and modify it from inside a message handler, you've broken the model. Akka especially makes this easy to do by accident in Scala. Keep all state in the actor's behavior parameters or fields, not captured externals.
Trying to enforce cross-actor invariants. "Account A and Account B must always sum to 100" cannot be enforced with two account actors and asynchronous messages. Either combine the two accounts into one actor or use an explicit saga / two-phase pattern across them.
Ignoring backpressure. A fast sender that never checks if the recipient is keeping up will fill the mailbox and OOM the receiver. Either size mailboxes and use a producer that respects ask + ackpressure, or use Akka Streams / reactive Streams between actors.
One huge "god actor." A single actor that handles everything serializes all work — you lose parallelism. Split state along natural boundaries: one chat-room actor per room, one user actor per user.
Synchronous ask in hot loops. Akka's ask creates a temporary actor and a future per call; doing it in a loop allocates heavily. Prefer pure tell + reply messages, or batch via streams.

Performance and impact

The per-message cost in Erlang is about 300-700 nanoseconds for an intra-node send to a local actor, dominated by the queue lock-free enqueue and the scheduler check. Akka is similar — around 400 ns local. This is slower than a direct method call (~1 ns) by two and a half orders of magnitude, so for a hot inner loop you would not wrap every integer add in an actor. But for the workload actors target — entities receiving network events at human-readable rates — the overhead is invisible.

At scale, the metric that matters is concurrent entities per machine. WhatsApp's published 2 million TCP connections per server figure remains the textbook example. Discord, RabbitMQ, Ericsson's switches, and the Bleacher Report live-update system all sit on Erlang or Elixir for the same reason: millions of long-lived stateful entities, modest message rate each, mandatory fault tolerance. Akka customers — Walmart, Verizon, the Norwegian tax authority — quote similar shapes on the JVM.

The actor model is not a free lunch. It trades shared-memory speed for isolation, and shared-state invariants for asynchrony. But where its shape fits — many independent stateful entities, must survive partial failure, must scale horizontally — nothing else lets you write straightforward sequential code per entity and get a million of them on one box.

Frequently asked questions

What is the actor model in one sentence?

An actor is a unit of computation that owns private state, communicates with other actors only by asynchronous messages to their mailboxes, and processes messages one at a time. Because each actor's state is only ever touched by its own message-handling thread, you get concurrency without locks — and because messages are immutable, races and corrupted state are impossible by construction.

Why does the actor model scale where threads don't?

An OS thread costs 1-8 MB of stack and roughly 1 ms to spawn; you cannot have millions of them per machine. An Erlang process costs about 350 bytes at spawn and a few microseconds — the BEAM VM ran WhatsApp's reported 2 million concurrent TCP connections per server in 2014. Akka actors are similar (~400 bytes). A scheduler multiplexes millions of actors onto a small pool of OS threads (one per CPU core), so 'concurrency per machine' is gated by memory, not by the kernel thread limit.

How is the actor model different from goroutines and channels?

Goroutines + channels (Go's CSP model) and actors are both message-passing concurrency, but with two differences. Identity: actors have a stable address (a pid or ActorRef) you can send to; channels are first-class objects you have to pass around. Mailboxes: every actor has exactly one inbound mailbox; channels are many-to-many and decoupled from any particular goroutine. In practice CSP is fine for in-process pipelines; actors win when you need supervision, location transparency (local vs remote), and per-entity state (millions of users-as-actors).

Why are mailboxes a queue, not a stack or random?

Most actor systems use a FIFO mailbox so causally-related messages from the same sender are processed in order — this is what 'tell me your name, then tell me your age' relies on. Some systems offer priority mailboxes (urgent shutdown messages jump the queue) or bounded mailboxes (back-pressure: senders block or are rejected when the mailbox is full). Random ordering would be a correctness disaster — a 'set X=5' followed by 'increment X' must be processed in that order, not reverse.

How does 'let it crash' work in Erlang and Akka?

Each actor has a supervisor — usually its parent — that watches for crashes. When an actor throws an unhandled exception, the runtime kills it and notifies the supervisor, which decides: restart it with a fresh state (the default), restart all its siblings, or escalate up. Because actors' state is private and reset on restart, you don't get half-corrupted state — the system returns to a known-good baseline. The Erlang OTP supervision tree is the canonical pattern; Akka mirrors it. Trying to handle every exception in-line creates more bugs than it fixes; let it crash, restart from clean state.

What are the downsides of the actor model?

Three. (1) Debugging is harder — there's no stack trace across actors; you need distributed tracing. (2) Cross-actor invariants are slippery — if two actors hold half a constraint each, no transaction enforces it. (3) Performance has overhead: each message is a queue + dispatch + heap allocation, so for hot in-process loops with shared state, direct method calls or lock-free data structures are faster. The model shines at scale (millions of independent entities) and at fault-tolerance, not at micro-benchmarks.

When should I reach for actors vs. async/await?

Use async/await for I/O-bound workflows that don't have long-lived per-entity state — request handlers, scrapers, ETL. Use actors when you have millions of stateful entities (chat rooms, games, IoT devices, user sessions) that each receive a stream of events and you want supervision and location-transparent addressing. WhatsApp's design choice — each user connection is an actor with its own mailbox and process — is why one machine could hold 2M of them.