Condition Variable

Sleep on a predicate, release the lock, wake on a signal

A condition variable lets a thread wait until a predicate becomes true — atomically releasing a held mutex while blocked, re-acquiring it on wakeup. The monitor pattern, every time.

  • Used with: A mutex (always)
  • wait() cost (uncontended): ~50-100 ns
  • wait() cost (true sleep): ~3-5 µs (context switch)
  • Operations: wait, signal, broadcast
  • Spurious wakeups: Allowed by POSIX
  • Predicate check: while-loop mandatory

How a condition variable works

Two threads, one shared queue. The consumer wants to pop an item, but the queue is empty. The consumer could spin in a tight loop checking queue.empty(), burning a whole CPU core to poll a boolean — wasteful at best, deadlock-prone at worst. A condition variable lets the consumer say, instead: wake me up when this is no longer empty.

The mechanics rest on three atomic operations:

  • wait(cv, mutex): atomically releases the mutex and enqueues the calling thread on the condvar's wait queue. The thread sleeps. When signaled, the thread is dequeued, re-acquires the mutex (which may itself block), and only then returns from wait().
  • signal(cv): wakes one waiter, if any. POSIX guarantees at least one, so an implementation may occasionally wake more. Picks a waiter from the queue and marks it runnable. No-op if the queue is empty (the signal is not "stored").
  • broadcast(cv): wakes every waiter currently on the queue. Each woken thread re-acquires the mutex in turn; only one runs at a time.

The pattern looks like this:

// Consumer:
lock(mutex);
while (!predicate())     // ← while, not if
    wait(cv, mutex);
do_work();
unlock(mutex);

// Producer:
lock(mutex);
update_state();
signal(cv);              // or broadcast(cv)
unlock(mutex);

The while-loop is non-negotiable. Three things can wake a waiter: a matching signal, a broadcast that doesn't apply to this thread's specific predicate, or a spurious wakeup the OS is allowed to inflict for its own reasons. Only the predicate check tells you which.
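The pattern above can be sketched as a one-shot event in C with pthreads. This is a minimal illustration, not a library API: event_t, event_wait, event_set, and event_demo are hypothetical names. Because the waiter tests the ready flag before it ever sleeps, the sketch behaves the same whether the setter runs before or after the waiter reaches wait(), which is how the predicate compensates for signals not being stored.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical one-shot event; all names here are illustrative. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cv;
    bool            ready;   /* the predicate, protected by mutex */
    int             value;   /* payload published together with ready */
} event_t;

typedef struct { event_t *event; int value; } setter_arg_t;

/* Block until the event has been set, then return its payload. */
static int event_wait(event_t *e) {
    pthread_mutex_lock(&e->mutex);
    while (!e->ready)                          /* re-check after every wakeup */
        pthread_cond_wait(&e->cv, &e->mutex);
    int v = e->value;
    pthread_mutex_unlock(&e->mutex);
    return v;
}

/* Publish the payload under the mutex and wake one waiter. */
static void event_set(event_t *e, int v) {
    pthread_mutex_lock(&e->mutex);
    e->value = v;
    e->ready = true;
    pthread_cond_signal(&e->cv);
    pthread_mutex_unlock(&e->mutex);
}

static void *setter(void *arg) {
    setter_arg_t *a = arg;
    event_set(a->event, a->value);
    return NULL;
}

/* Returns the value observed by the waiting thread. */
int event_demo(int v) {
    event_t e;
    pthread_mutex_init(&e.mutex, NULL);
    pthread_cond_init(&e.cv, NULL);
    e.ready = false;
    e.value = 0;

    setter_arg_t arg = { &e, v };
    pthread_t t;
    pthread_create(&t, NULL, setter, &arg);
    int seen = event_wait(&e);     /* correct even if setter already ran */
    pthread_join(t, NULL);

    pthread_cond_destroy(&e.cv);
    pthread_mutex_destroy(&e.mutex);
    return seen;
}
```

Replacing the while with an if would still pass most runs of this sketch, which is exactly why the bug is hard to catch in testing.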

Spurious wakeups, and why POSIX allows them

POSIX defines pthread_cond_wait as allowed to return at any time, with no signal having been delivered. This sounds bizarre — why would the standard let an implementation be wrong?

Two reasons. First, on multiprocessor systems, preventing spurious wakeups requires extra synchronization that costs more than the re-check it would save. Second, on Linux the implementation is built on futex waits, and the underlying system call can return early (for example with EINTR when a signal handler runs) before any condvar signal arrives. pthread_cond_wait itself never reports EINTR to the caller, so such an event can only surface as an unexplained wakeup. Forcing every waiter to re-check the predicate anyway means none of these edge cases need special handling. The while-loop is the contract.

Empirically, spurious wakeups on Linux happen approximately never under normal load — but "approximately never" is not "never," and the bugs that show up in production from if-check waiters are vicious. They reproduce only under load, only on specific kernels, and only after the system has been running for hours.

Signal vs broadcast

Decision | Use signal | Use broadcast
State change wakes | One waiter | All waiters
Cost (many waiters) | O(1) wake | O(n) wake
Use when | Any waiter can handle the change | State affects all waiters differently
Producer-consumer (1 item) | Yes: wake one consumer | No: thundering herd
Producer-consumer (N items in bulk) | Signal N times, or... | broadcast once
Different waiters, different predicates | Risky: might wake wrong thread | Safer: all re-check
Shutdown / phase change | No | Yes: wake everyone

The classic "thundering herd" failure: 100 consumers wait on a queue, the producer broadcasts when a single item arrives, all 100 wake up, 99 re-check the predicate, see the queue is empty again, and go back to sleep — burning 99 context switches for one delivery. Signal would wake one.
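The opposite case, where broadcast is the right call, is a phase change that every waiter cares about. The following is a minimal sketch in C with pthreads, assuming a hypothetical "gate" that opens once; gate_open, wait_at_gate, and open_gate_demo are illustrative names. Using signal at the marked line would release only one of the waiters and leave the rest asleep.

```c
#include <pthread.h>
#include <stdbool.h>

#define WAITERS 8

static pthread_mutex_t gate_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cv    = PTHREAD_COND_INITIALIZER;
static bool gate_open = false;   /* predicate shared by every waiter */
static int  passed    = 0;       /* threads that got through the gate */

static void *wait_at_gate(void *unused) {
    (void)unused;
    pthread_mutex_lock(&gate_mutex);
    while (!gate_open)
        pthread_cond_wait(&gate_cv, &gate_mutex);
    passed++;                    /* still under the mutex */
    pthread_mutex_unlock(&gate_mutex);
    return NULL;
}

/* Starts WAITERS threads, opens the gate once, and returns how many
   threads passed through it. */
int open_gate_demo(void) {
    pthread_t threads[WAITERS];
    for (int i = 0; i < WAITERS; i++)
        pthread_create(&threads[i], NULL, wait_at_gate, NULL);

    pthread_mutex_lock(&gate_mutex);
    gate_open = true;
    pthread_cond_broadcast(&gate_cv);   /* the change applies to all waiters */
    pthread_mutex_unlock(&gate_mutex);

    for (int i = 0; i < WAITERS; i++)
        pthread_join(threads[i], NULL);
    return passed;
}
```

The woken threads still pass through the mutex one at a time, so the O(n) wake cost from the table applies here; it is acceptable because the event happens once.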

When to reach for a condition variable

  • Producer-consumer queues. The classic. Consumers wait on "queue non-empty," producers signal on push.
  • Bounded buffers. Two condvars — one for "not full," one for "not empty." Producers wait on the first, consumers on the second.
  • Thread-pool task dispatch. Workers wait on "task available." Submitter signals when enqueuing a task; broadcast on shutdown.
  • Phase synchronization. Threads wait for a phase to begin or complete. Often a barrier is simpler, but condvars give finer control.
  • Lazy initialization with multiple waiters. First caller initializes; subsequent callers wait until "initialized == true."

If the wait is for a counted resource (N permits available), reach for a semaphore. If the wait is for a one-shot event, a std::promise (C++) or a CountDownLatch (Java) is clearer. Condvars excel when the predicate is arbitrary boolean state.
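As one illustration of the list above, lazy initialization with multiple waiters might look like the following in C with pthreads. It is a sketch under stated assumptions: expensive_init is a hypothetical stand-in for slow setup work, and get_resource, init_state, and lazy_init_demo are illustrative names. The first caller claims the job and runs it outside the lock; later callers wait on the predicate init_state == READY.

```c
#include <pthread.h>

enum { UNINIT, INITIALIZING, READY };

static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  init_cv    = PTHREAD_COND_INITIALIZER;
static int init_state = UNINIT;   /* protected by init_mutex */
static int resource   = 0;        /* valid once init_state == READY */
static int init_calls = 0;        /* how many times the initializer ran */

/* Hypothetical stand-in for slow setup work. */
static int expensive_init(void) { return 1234; }

int get_resource(void) {
    pthread_mutex_lock(&init_mutex);
    if (init_state == UNINIT) {
        init_state = INITIALIZING;            /* claim the job */
        pthread_mutex_unlock(&init_mutex);    /* do not hold the lock while slow */
        int fresh = expensive_init();
        pthread_mutex_lock(&init_mutex);
        resource = fresh;
        init_calls++;
        init_state = READY;
        pthread_cond_broadcast(&init_cv);     /* every waiter can proceed */
    }
    while (init_state != READY)               /* no-op for the initializer */
        pthread_cond_wait(&init_cv, &init_mutex);
    int result = resource;
    pthread_mutex_unlock(&init_mutex);
    return result;
}

static void *caller(void *out) {
    *(int *)out = get_resource();
    return NULL;
}

/* Races four callers; returns how many times expensive_init() ran,
   or -1 if any caller saw a wrong value. */
int lazy_init_demo(void) {
    pthread_t threads[4];
    int results[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&threads[i], NULL, caller, &results[i]);
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);
    for (int i = 0; i < 4; i++)
        if (results[i] != 1234)
            return -1;
    return init_calls;
}
```

For plain run-once initialization with no extra state, pthread_once is the simpler tool; the condvar version earns its keep when callers need to wait on a richer predicate.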

Pseudo-code: bounded producer-consumer

// Shared:
buffer = ring_buffer(capacity = 10)
mutex
not_empty = condition_variable
not_full  = condition_variable

producer():
    while true:
        item = produce_one()
        lock(mutex)
        while buffer.full():
            wait(not_full, mutex)
        buffer.push(item)
        signal(not_empty)
        unlock(mutex)

consumer():
    while true:
        lock(mutex)
        while buffer.empty():
            wait(not_empty, mutex)
        item = buffer.pop()
        signal(not_full)
        unlock(mutex)
        consume(item)

JavaScript (Atomics.wait/notify equivalent)

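JavaScript has no native condition variable, but SharedArrayBuffer + Atomics provides futex-like building blocks. Atomics.wait(array, index, expected) sleeps only if the value at that index still equals expected, and returns when another agent calls Atomics.notify on the same index; Atomics.notify wakes up to N waiters. Because Atomics.wait blocks, it is meant for Workers and is not allowed on a browser's main thread.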

// Shared state across Workers (Atomics.wait must run in a Worker)
const sab = new SharedArrayBuffer(8);
const state = new Int32Array(sab);
// state[0] = item count (also the wait address for "non-empty")
// state[1] = mutex (0 = unlocked, 1 = locked)

function lock() {
  // Try to CAS 0 -> 1; on failure, sleep until the holder notifies
  while (Atomics.compareExchange(state, 1, 0, 1) !== 0) {
    Atomics.wait(state, 1, 1);
  }
}

function unlock() {
  Atomics.store(state, 1, 0);
  Atomics.notify(state, 1, 1);           // wake one thread blocked in lock()
}

function consumerWait() {
  lock();
  // Wait until queue non-empty
  while (Atomics.load(state, 0) === 0) {
    unlock();                            // release mutex AND wake lock() waiters
    Atomics.wait(state, 0, 0);           // sleeps only if the count is still 0
    lock();
  }
  Atomics.sub(state, 0, 1);              // pop
  unlock();
}

function producerSignal() {
  lock();
  Atomics.add(state, 0, 1);
  Atomics.notify(state, 0, 1);           // signal one consumer
  unlock();
}

C with pthreads

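#include <pthread.h>

int  produce(void);       /* supplied by the application */
void process(int item);   /* supplied by the application */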

#define CAP 10
int buffer[CAP];
int head = 0, tail = 0, count = 0;

pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  ne = PTHREAD_COND_INITIALIZER;  // not_empty
pthread_cond_t  nf = PTHREAD_COND_INITIALIZER;  // not_full

void *consumer(void *_) {
    for (;;) {
        pthread_mutex_lock(&m);
        while (count == 0)
            pthread_cond_wait(&ne, &m);     // ← while-loop guards spurious + race
        int item = buffer[head];
        head = (head + 1) % CAP;
        count--;
        pthread_cond_signal(&nf);
        pthread_mutex_unlock(&m);
        process(item);
    }
}

void *producer(void *_) {
    for (;;) {
        int item = produce();
        pthread_mutex_lock(&m);
        while (count == CAP)
            pthread_cond_wait(&nf, &m);
        buffer[tail] = item;
        tail = (tail + 1) % CAP;
        count++;
        pthread_cond_signal(&ne);
        pthread_mutex_unlock(&m);
    }
}

Common pitfalls

  • Using if instead of while. The number-one bug. Works fine for months, then production wakes up in a half-initialized state and corrupts data. Always while.
  • Signaling without holding the mutex. POSIX allows it, but a wakeup can race with a state change in a third thread, producing a "wakeup that finds the predicate false again." That is harmless if you used while, and a data-corrupting logic bug if you used if.
  • Forgetting that wait() releases the mutex. If you call wait() expecting the mutex to stay locked, every other thread can suddenly enter the critical section. Anything you cached from before the wait is potentially stale; re-read shared state after waking.
  • One condvar for two predicates. Two consumer types waiting on a single condvar — a signal meant for type A wakes a type B that immediately re-sleeps. Use broadcast, or split into two condvars.
  • Lost wakeups via separate locks. Signaling under a different mutex than the wait uses breaks the atomicity guarantee. The signal can land before the waiter has added itself to the queue.
  • Destroying a condvar with waiters on it. Calling pthread_cond_destroy on a condvar that still has sleepers is undefined behavior. Always broadcast a shutdown signal and join all threads before destroying.
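Several of these pitfalls meet in thread-pool shutdown. The following is a rough sketch in C with pthreads, assuming a hypothetical pool that only counts tasks; pool_t, worker, and pool_demo are illustrative names, and a real pool would carry task data. It shows a compound predicate checked in a while-loop, signal for single tasks, broadcast for shutdown, and destruction only after every thread has been joined.

```c
#include <pthread.h>
#include <stdbool.h>

#define WORKERS 4

/* Hypothetical pool state; a real pool would hold a task queue. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  work_cv;
    int  pending;          /* tasks submitted but not yet taken */
    int  completed;        /* tasks finished */
    bool shutting_down;
} pool_t;

static void *worker(void *arg) {
    pool_t *p = arg;
    pthread_mutex_lock(&p->mutex);
    for (;;) {
        while (p->pending == 0 && !p->shutting_down)
            pthread_cond_wait(&p->work_cv, &p->mutex);
        if (p->pending == 0)               /* shutting down and drained */
            break;
        p->pending--;                      /* take one task */
        pthread_mutex_unlock(&p->mutex);
        /* ... the task itself would run here, outside the lock ... */
        pthread_mutex_lock(&p->mutex);
        p->completed++;
    }
    pthread_mutex_unlock(&p->mutex);
    return NULL;
}

/* Submits `tasks` tasks, shuts the pool down, and returns how many
   tasks the workers completed. */
int pool_demo(int tasks) {
    pool_t p;
    pthread_t threads[WORKERS];
    pthread_mutex_init(&p.mutex, NULL);
    pthread_cond_init(&p.work_cv, NULL);
    p.pending = 0;
    p.completed = 0;
    p.shutting_down = false;

    for (int i = 0; i < WORKERS; i++)
        pthread_create(&threads[i], NULL, worker, &p);

    for (int i = 0; i < tasks; i++) {
        pthread_mutex_lock(&p.mutex);
        p.pending++;
        pthread_cond_signal(&p.work_cv);   /* one task, one worker */
        pthread_mutex_unlock(&p.mutex);
    }

    pthread_mutex_lock(&p.mutex);
    p.shutting_down = true;
    pthread_cond_broadcast(&p.work_cv);    /* every worker must see this */
    pthread_mutex_unlock(&p.mutex);

    for (int i = 0; i < WORKERS; i++)
        pthread_join(threads[i], NULL);    /* no waiters remain after this */

    pthread_cond_destroy(&p.work_cv);      /* safe only now */
    pthread_mutex_destroy(&p.mutex);
    return p.completed;
}
```

Workers exit only when the pool is both shutting down and drained, so tasks submitted before shutdown are not dropped in this sketch.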

Performance characteristics

Modern Linux glibc implements condvars on top of futex. The hot path, signal with no waiters, is a single atomic increment of a sequence counter, around 5 ns. When the predicate is already true, the while-loop never calls wait() at all, so the thread pays only for the mutex lock and unlock. A true sleep/wake transition costs:

  • Futex syscall: ~100-200 ns each way
  • Context switch: ~1-3 µs (cache pollution dominates)
  • Re-acquiring the mutex: ~25-100 ns uncontended

Total round-trip wait→wake→re-acquire is typically 3-5 µs. Compare to polling with a 1 ms sleep (a common naive pattern): up to 1000 µs of latency in the worst case and about 500 µs on average, plus a wakeup every millisecond even when nothing has changed. Condvars are not just cleaner: their worst-case latency is roughly 200-300× lower, and they use 0% CPU while idle.

The pathological case is broadcast with 100+ waiters: each wake costs a context switch, and they serialize on the mutex. Wake-storm cost is O(n) and can stall the system for milliseconds. If you find yourself needing to wake N waiters frequently, consider per-waiter condvars or a lock-free queue with atomics instead.
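Figures like these depend heavily on hardware, kernel, and load, so it is worth measuring on the target machine. The following is a minimal ping-pong sketch in C with pthreads; avg_round_trip_ns, echo, and turn are illustrative names, not part of any library. Each round trip contains two wakeups (main to echo thread and back), and the first round also absorbs thread start-up, so treat the result as a rough estimate.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime under strict -std modes */
#endif
#include <pthread.h>
#include <time.h>

static pthread_mutex_t pp_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  pp_cv    = PTHREAD_COND_INITIALIZER;
static int turn = 0;              /* 0 = main's turn, 1 = echo thread's turn */

/* Waits for its turn, then hands the turn straight back. */
static void *echo(void *arg) {
    int rounds = *(int *)arg;
    pthread_mutex_lock(&pp_mutex);
    for (int i = 0; i < rounds; i++) {
        while (turn != 1)
            pthread_cond_wait(&pp_cv, &pp_mutex);
        turn = 0;
        pthread_cond_signal(&pp_cv);
    }
    pthread_mutex_unlock(&pp_mutex);
    return NULL;
}

/* Average wall-clock time of one main -> echo -> main hand-off, in ns. */
double avg_round_trip_ns(int rounds) {
    pthread_t t;
    struct timespec start, end;

    turn = 0;
    pthread_create(&t, NULL, echo, &rounds);

    clock_gettime(CLOCK_MONOTONIC, &start);
    pthread_mutex_lock(&pp_mutex);
    for (int i = 0; i < rounds; i++) {
        turn = 1;                          /* pass the turn to the echo thread */
        pthread_cond_signal(&pp_cv);
        while (turn != 0)                  /* sleep until it comes back */
            pthread_cond_wait(&pp_cv, &pp_mutex);
    }
    pthread_mutex_unlock(&pp_mutex);
    clock_gettime(CLOCK_MONOTONIC, &end);

    pthread_join(t, NULL);
    double ns = (end.tv_sec - start.tv_sec) * 1e9
              + (end.tv_nsec - start.tv_nsec);
    return ns / rounds;
}
```

Calling avg_round_trip_ns(100000) and halving the result gives a per-wakeup estimate to compare against the 3-5 µs figure above.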

Frequently asked questions

Why must wait() always be inside a while-loop?

Because wakeups are advisory, not authoritative. The waiter must re-check the predicate before proceeding for two reasons: spurious wakeups (the OS may unblock a waiter even when nobody signaled), and racy hand-offs (between signal and the waiter re-acquiring the mutex, a third thread may have consumed the resource). An if-check skips the re-check, so the thread proceeds as though the predicate held (popping from an empty queue, for example), producing corruption bugs that only show up under load.

What's the difference between signal and broadcast?

signal (pthread_cond_signal) wakes one waiting thread (POSIX guarantees at least one). broadcast (pthread_cond_broadcast) wakes all waiting threads. Use signal when any single waiter can handle the new state (one item produced → wake one consumer). Use broadcast when the state change is global or when waiters wait on different sub-predicates of the same condition variable.

Why does wait() take a mutex argument?

Because releasing the mutex and adding the thread to the wait queue must be a single atomic step. If wait() unlocked the mutex first and then enqueued, a signal arriving in between would be lost, and the waiter would block forever. The implementation registers the thread as a waiter before the mutex becomes available to anyone else, so any signal sent after the release is guaranteed to find it.

What is a spurious wakeup and why does it happen?

A spurious wakeup is when wait() returns without any thread having called signal or broadcast. POSIX permits this because some implementations would otherwise need extra synchronization to prevent it — and forcing the waiter to re-check the predicate anyway means the extra cost isn't worth paying. Spurious wakeups are rare on Linux but do happen, especially after signal handlers or thread cancellation.

When should I use a condition variable versus a semaphore?

Semaphores count resources; condition variables wait on predicates. If you're tracking N permits (slots in a fixed-size buffer, available DB connections), use a semaphore — its count is the abstraction you want. If you're waiting for an arbitrary boolean condition over shared state (queue non-empty AND not in shutdown mode), use a condvar with the protecting mutex. Condvars compose; semaphores don't.
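One way to see the relationship: a counting semaphore is a condition variable whose predicate happens to be "permits > 0". The following is a minimal sketch in C with pthreads; the csem_ names are hypothetical and unrelated to POSIX sem_t. Changing the while condition is all it takes to wait on something richer than a count.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical condvar-backed semaphore; the csem_ names are illustrative. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  available;   /* signaled when permits goes up */
    int             permits;     /* the counted resource */
} csem_t;

void csem_init(csem_t *s, int permits) {
    pthread_mutex_init(&s->mutex, NULL);
    pthread_cond_init(&s->available, NULL);
    s->permits = permits;
}

/* Block until a permit is free, then take it. */
void csem_acquire(csem_t *s) {
    pthread_mutex_lock(&s->mutex);
    while (s->permits == 0)
        pthread_cond_wait(&s->available, &s->mutex);
    s->permits--;
    pthread_mutex_unlock(&s->mutex);
}

/* Take a permit only if one is free right now. */
bool csem_try_acquire(csem_t *s) {
    pthread_mutex_lock(&s->mutex);
    bool ok = s->permits > 0;
    if (ok)
        s->permits--;
    pthread_mutex_unlock(&s->mutex);
    return ok;
}

/* Return a permit and wake one blocked acquirer, if any. */
void csem_release(csem_t *s) {
    pthread_mutex_lock(&s->mutex);
    s->permits++;
    pthread_cond_signal(&s->available);
    pthread_mutex_unlock(&s->mutex);
}

/* Single-threaded walk-through; returns the permits left at the end,
   or -1 if a try_acquire gave an unexpected answer. */
int csem_demo(void) {
    csem_t s;
    csem_init(&s, 2);
    csem_acquire(&s);                      /* 2 -> 1 */
    csem_acquire(&s);                      /* 1 -> 0 */
    bool third = csem_try_acquire(&s);     /* fails: no permits left */
    csem_release(&s);                      /* 0 -> 1 */
    bool fourth = csem_try_acquire(&s);    /* succeeds: 1 -> 0 */
    csem_release(&s);                      /* 0 -> 1 */
    int left = (!third && fourth) ? s.permits : -1;
    pthread_cond_destroy(&s.available);
    pthread_mutex_destroy(&s.mutex);
    return left;
}
```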

How expensive is pthread_cond_wait?

Uncontended condvar operations are ~50-100 ns on Linux because the implementation is built on futex: userspace can fast-path the signal when there are no waiters. A true sleep/wake cycle adds a futex syscall in each direction (~100-200 ns each) plus a context switch, for ~3-5 µs total. That is still roughly 200-300× lower than the up-to-1 ms latency of polling with a 1 ms sleep.

Can I signal without holding the mutex?

Yes, POSIX allows it: pthread_cond_signal does not require the mutex to be held. As long as the predicate itself was changed under the mutex, no wakeup is lost. You still almost always want to signal while holding the mutex: signaling after unlocking opens a window in which another thread can change the predicate between your update and the signal, so the woken thread may find the condition no longer true, and POSIX only promises predictable scheduling behavior when the signaler holds the mutex. The cost of holding the mutex for one extra call is negligible.