Concurrency Patterns
Async / Await
Suspend at await, resume when the future completes — no thread blocked
async/await is syntactic sugar over promise/future chaining — a function suspends at await points and resumes when the awaited future completes. Cooperative multitasking, no thread per task.
- Spawn cost: ~1 µs (vs ~1 ms thread)
- Per-await overhead: ~100 ns (JS/Python), ~10 ns (Rust)
- Memory per task: ~200 B-2 KB heap frame
- Scheduling: Cooperative (only at await points)
- Max concurrent tasks: Millions per process
- Famous runtimes: Node.js, asyncio, Tokio, C# Task
How async/await works
The big idea is to look like synchronous code while behaving like a callback chain. When you write:
async function fetchUser(id) {
  const response = await http.get(`/users/${id}`);
  const user = await response.json();
  return user;
}
The compiler rewrites this function as a small state machine. Each await is a potential suspension point. When the function hits an await, three things happen: it stores its local variables in a heap-allocated frame, registers a wakeup callback with the future being awaited, and returns control to the caller. The thread that was running the function is now free to do anything else.
When the future eventually resolves — the HTTP response arrives, the file is read, the timer fires — the runtime's event loop calls the registered wakeup. The wakeup restores the local variables and jumps back into the function at the line after the await. To the programmer the code reads top-to-bottom; to the machine it is a chain of callbacks stitched together by the compiler.
The implication is enormous: a single OS thread can multiplex tens of thousands of in-flight async tasks, because the thread only does work while some task is actually making progress. When a task suspends at an await, the thread immediately picks up another ready task. Compare that with threads, where each blocked thread ties up 1-8 MB of reserved stack and a slot in the kernel scheduler.
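The suspend-and-resume cycle described above can be observed directly by driving a Python coroutine by hand, without any event loop. This is a minimal sketch: Ticket and drive are hypothetical names invented for the illustration, and a real runtime would do the driving for you.

```python
class Ticket:
    """A minimal awaitable (hypothetical): suspends once, then returns a value."""

    def __init__(self):
        self.value = None

    def __await__(self):
        # Yielding here is what suspends the coroutine and hands control
        # back to whoever is driving it.
        yield self
        return self.value


async def fetch_user(log):
    log.append("start")
    ticket = Ticket()
    log.append("suspending")
    user = await ticket  # suspension point
    log.append(f"resumed with {user}")
    return user


def drive():
    """Plays the role of the runtime: step the coroutine, complete the ticket, step again."""
    log = []
    coro = fetch_user(log)
    pending = coro.send(None)  # runs until the first await and returns the Ticket
    log.append("caller is free to do other work")
    pending.value = {"id": 7}  # the awaited operation "completes"
    try:
        coro.send(None)  # resume right after the await
    except StopIteration as done:
        return log, done.value  # a finished coroutine reports its result this way


if __name__ == "__main__":
    for line in drive()[0]:
        print(line)
```

The local variables of fetch_user survive between the two send calls because they live in the coroutine's frame object, not on the caller's stack.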
The event loop
An async runtime is two things: a queue of ready tasks and a poller for not-yet-ready futures. The main loop is conceptually:
while running:
    task = ready_queue.pop()
    if task is None:
        events = io_poller.poll(timeout=until_next_timer)
        for event in events:
            ready_queue.push(event.task)
        continue
    task.step()  # runs until next await or completion
That single line — "runs until next await" — is the whole game. Tasks are not interrupted mid-execution. They run until they voluntarily yield by awaiting. This is cooperative scheduling, in contrast to threads' preemptive scheduling where the kernel can stop you at any instruction.
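To make the loop concrete, here is a toy version in Python. It is a sketch under simplifying assumptions: the I/O poller is replaced by a timer heap with a simulated clock, and Sleep, run, worker, and demo are hypothetical names, not part of asyncio or any real runtime.

```python
import heapq
from collections import deque


class Sleep:
    """Awaitable that asks the toy loop to resume the task after `delay` ticks."""

    def __init__(self, delay):
        self.delay = delay

    def __await__(self):
        yield self  # hand the request to the loop and suspend


def run(*coros):
    """Toy event loop: a ready queue plus a timer heap standing in for the poller."""
    ready = deque(coros)
    timers = []  # heap of (wake_time, sequence, coroutine)
    now, seq = 0, 0
    while ready or timers:
        if not ready:
            # Nothing runnable: "poll" by jumping the simulated clock to the
            # next timer and moving its task to the ready queue.
            now, _, coro = heapq.heappop(timers)
            ready.append(coro)
            continue
        coro = ready.popleft()
        try:
            request = coro.send(None)  # task.step(): run until the next await
        except StopIteration:
            continue  # the task finished
        seq += 1  # unique tie-breaker so the heap never compares coroutines
        heapq.heappush(timers, (now + request.delay, seq, coro))
    return now


async def worker(name, delay, log):
    for step in range(2):
        log.append((name, step))
        await Sleep(delay)


def demo():
    log = []
    end = run(worker("a", 3, log), worker("b", 1, log))
    return log, end


if __name__ == "__main__":
    print(demo())
```

Each call to coro.send(None) runs exactly one stretch of a task, from one await to the next, which is the cooperative behaviour the pseudo-code describes.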
Cooperative scheduling has one beautiful property: two tasks on the same event loop never run at the same instant, and a task can only be interrupted at an await, so the code between two awaits executes atomically with respect to other tasks. Single-threaded JavaScript and Python async code is therefore free of data races by construction, although shared state can still change across an await, so read-await-write sequences need care. Cooperative scheduling also has one ugly property: if a task fails to yield (it runs a long synchronous computation or calls a blocking syscall), every other task on the loop stalls until it finishes.
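One nuance is worth a small illustration: tasks on one loop never run simultaneously, but they do interleave at await points, so a value read before an await may be stale after it. The sketch below uses a hypothetical account dictionary and an asyncio.Lock to show the difference; the function names are invented for the example.

```python
import asyncio


async def withdraw_unsafe(account, amount):
    current = account["balance"]            # read
    await asyncio.sleep(0)                  # yield: other tasks may run here
    account["balance"] = current - amount   # write based on a possibly stale read


async def withdraw_safe(account, amount, lock):
    async with lock:                        # serialize the read-await-write section
        current = account["balance"]
        await asyncio.sleep(0)
        account["balance"] = current - amount


async def demo():
    unsafe = {"balance": 100}
    await asyncio.gather(withdraw_unsafe(unsafe, 30), withdraw_unsafe(unsafe, 50))

    safe = {"balance": 100}
    lock = asyncio.Lock()
    await asyncio.gather(withdraw_safe(safe, 30, lock), withdraw_safe(safe, 50, lock))
    return unsafe["balance"], safe["balance"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the unsafe version one withdrawal overwrites the other, so the final balance is not 20. Code with no await between the read and the write would not have this problem, which is the guarantee the paragraph above describes.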
Async vs threads
The comparison everyone wants to make. The honest answer: they solve different problems.
| | Async / Await | OS Threads |
|---|---|---|
| Per-task spawn cost | ~1 µs (heap frame) | ~1 ms (clone syscall + stack) |
| Per-task memory | ~200 B-2 KB | 1-8 MB stack |
| Max practical concurrent | Millions per process | Thousands per process |
| Scheduling | Cooperative (await is yield) | Preemptive (kernel decides) |
| Data races between tasks | None (single-threaded loop) | Yes — locks required |
| CPU-bound work | Blocks the event loop | True parallelism |
| I/O-bound work | Excellent fit | One thread idles per blocked call |
| Debugging | Stack traces lose context | Native debugger support |
The crisp rule: async for I/O, threads (or a thread pool) for CPU. Modern runtimes combine both — Tokio has a multi-threaded async scheduler plus a separate blocking pool; Node.js has a single-threaded JS loop plus libuv's thread pool; C# has Task with both async I/O and parallelism.
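In Python the same split can be sketched with asyncio.to_thread (available since Python 3.9): the waiting stays on the event loop while the synchronous computation runs on a worker thread. The functions hash_blob, fake_io, and handle_request are hypothetical stand-ins, and asyncio.sleep models an I/O wait.

```python
import asyncio
import hashlib


def hash_blob(blob: bytes) -> str:
    """CPU-bound, synchronous work (hashlib releases the GIL for large inputs)."""
    return hashlib.sha256(blob).hexdigest()


async def fake_io(name: str, delay: float) -> str:
    """Stand-in for an I/O-bound call; asyncio.sleep models the waiting."""
    await asyncio.sleep(delay)
    return name


async def handle_request(blob: bytes):
    # The I/O wait stays on the event loop; the hashing runs on a worker
    # thread, so the loop can keep serving other tasks in the meantime.
    digest, source = await asyncio.gather(
        asyncio.to_thread(hash_blob, blob),
        fake_io("db", 0.01),
    )
    return digest, source


if __name__ == "__main__":
    print(asyncio.run(handle_request(b"x" * 4096)))
```

For pure-Python computation that holds the GIL, a process pool passed to loop.run_in_executor is the usual alternative to threads.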
Variants across languages
- JavaScript / TypeScript. Single-threaded event loop. Every async task runs on the main thread. async functions return Promises; await resolves them. Worker threads exist but communicate via message passing, with no shared memory by default.
- Python (asyncio). Single-threaded event loop, similar to JS. async def creates a coroutine; await suspends it. Coroutines need a running loop: call asyncio.run(main()) to start one. CPU-bound work blocks every task on the loop, and the GIL limits how much moving pure-Python computation to threads can help.
- Rust (Tokio / async-std). Multi-threaded scheduler by default. Futures are inert: they do nothing until polled. .await compiles to a state-machine poll. Tasks can migrate between OS threads, so shared state needs atomics, locks, or message channels. Zero-cost when no async actually happens.
- C# (Task). Multi-threaded by default; await resumes on a thread-pool thread (or on the captured SynchronizationContext for UI code). The async keyword is mostly a hint to the compiler; the meaningful action is on await.
- Go. No explicit async/await: every go func() spawns a goroutine that the runtime multiplexes onto OS threads. Goroutines are cheap (about 2 KB initial stack) and preemptively scheduled by the Go runtime (asynchronously since Go 1.14; earlier releases could only preempt at function calls). Channels for coordination.
When to use async/await
- Network-bound services. HTTP APIs, RPC servers, scrapers, proxies — anything where most wall time is spent waiting on a socket. Async beats threads by an order of magnitude on memory and concurrency limits.
- Real-time client UIs. Browsers, mobile apps, desktop apps. The UI thread can fetch data with await without blocking input handling.
- Pipelines of dependent I/O. "Read row, look up related thing, write result" reads naturally as await db.row(); await api.lookup(); await db.write(); which would be a callback nightmare without await.
- Cancellation and timeouts. Async runtimes give you cancellable tasks (CancellationToken, AbortController, Tokio's select). Cancelling a thread cleanly is much harder.
Avoid async when the task is purely CPU-bound, when you need true real-time guarantees (cooperative scheduling has unbounded latency under load), or when you need to interop heavily with sync libraries that block.
Pseudo-code: an async function as a state machine
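// Original:
async function load() {
  let a = await fetchA();
  let b = await fetchB(a);
  return a + b;
}

// Compiler-generated state machine (pseudo-code):
class LoadStateMachine {
  state = 0
  a = null
  fa = null  // pending fetchA() future
  fb = null  // pending fetchB(a) future

  poll(waker) {
    switch (this.state) {
      case 0:
        this.fa = fetchA()
        this.state = 1
        // fall through: try to make progress immediately
      case 1:
        if (!this.fa.is_ready()) return Pending  // waker fires when fa completes
        this.a = this.fa.result()
        this.fb = fetchB(this.a)
        this.state = 2
        // fall through
      case 2:
        if (!this.fb.is_ready()) return Pending  // waker fires when fb completes
        return Ready(this.a + this.fb.result())
    }
  }
}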
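The pseudo-code can be turned into something executable. The following Python sketch hand-writes the same three-state machine; SimpleFuture, its on_ready method, and the PENDING sentinel are hypothetical stand-ins for whatever a real runtime provides, not an actual compiler output.

```python
PENDING = object()  # sentinel: "not finished yet"


class SimpleFuture:
    """Hypothetical stand-in for a runtime future: one result, one waker."""

    def __init__(self):
        self._result = PENDING
        self._waker = None

    def is_ready(self):
        return self._result is not PENDING

    def result(self):
        return self._result

    def on_ready(self, waker):
        self._waker = waker

    def set_result(self, value):
        self._result = value
        if self._waker is not None:
            self._waker()  # tell the runtime the task can be polled again


class LoadStateMachine:
    """Hand-written equivalent of: a = await fetch_a(); b = await fetch_b(a); return a + b."""

    def __init__(self, fetch_a, fetch_b):
        self.fetch_a, self.fetch_b = fetch_a, fetch_b
        self.state = 0
        self.a = None
        self.fut = None  # the future currently being awaited

    def poll(self, waker):
        if self.state == 0:
            self.fut = self.fetch_a()
            self.state = 1
        if self.state == 1:
            if not self.fut.is_ready():
                self.fut.on_ready(waker)
                return PENDING
            self.a = self.fut.result()
            self.fut = self.fetch_b(self.a)
            self.state = 2
        if self.state == 2:
            if not self.fut.is_ready():
                self.fut.on_ready(waker)
                return PENDING
            return self.a + self.fut.result()


def demo():
    fa, fb = SimpleFuture(), SimpleFuture()
    wakeups = []
    machine = LoadStateMachine(lambda: fa, lambda a: fb)

    def waker():
        wakeups.append("wake")

    first = machine.poll(waker)   # fetch_a is not done yet
    fa.set_result(2)              # fires the waker
    second = machine.poll(waker)  # advances to state 2; fetch_b is not done yet
    fb.set_result(40)
    third = machine.poll(waker)   # both results are in
    return first is PENDING, second is PENDING, third, wakeups


if __name__ == "__main__":
    print(demo())
```

The consecutive if statements play the role of the fall-through cases: a poll keeps advancing through states until it reaches a future that is not ready.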
JavaScript implementation
// Sequential: total latency = A_time + B_time.
async function loadUserSeq(id) {
  const profile = await fetch(`/users/${id}`).then(r => r.json());
  const posts = await fetch(`/users/${id}/posts`).then(r => r.json());
  return { profile, posts };
}

// Parallel: total latency = max(A_time, B_time).
async function loadUserPar(id) {
  const [profile, posts] = await Promise.all([
    fetch(`/users/${id}`).then(r => r.json()),
    fetch(`/users/${id}/posts`).then(r => r.json()),
  ]);
  return { profile, posts };
}

// Cancellation via AbortController.
async function loadWithTimeout(url, ms) {
  const ctrl = new AbortController();
  const timer = setTimeout(() => ctrl.abort(), ms);
  try {
    const res = await fetch(url, { signal: ctrl.signal });
    return await res.json();
  } finally {
    clearTimeout(timer); // always clear the timer, on success or failure
  }
}

// Common bug: fire-and-forget without await.
async function process(items) {
  for (const item of items) {
    handle(item); // BUG: handle returns a Promise that nobody awaits;
                  // any errors become unhandled rejections.
  }
}

// Fix: await every Promise (the handlers run concurrently).
async function processFixed(items) {
  // Wrap handle so it receives only the item, not map's index and array arguments.
  await Promise.all(items.map((item) => handle(item)));
}
Python implementation
import asyncio
import time

import aiohttp

# Placeholder host, an assumption for illustration: aiohttp needs an absolute
# URL or a session base_url to resolve paths such as /users/1.
BASE_URL = "https://api.example.com"


async def fetch_user(session, user_id):
    async with session.get(f"/users/{user_id}") as resp:
        return await resp.json()


async def main():
    async with aiohttp.ClientSession(base_url=BASE_URL) as session:
        # Run 100 fetches concurrently; all share one thread.
        tasks = [fetch_user(session, i) for i in range(100)]
        users = await asyncio.gather(*tasks)
        return users


asyncio.run(main())


# Mixing blocking code: push it to a thread pool.
async def slow_thing():
    # WRONG: blocks the entire event loop for 5 seconds.
    # time.sleep(5)
    # RIGHT: runs on a thread pool, event loop keeps moving.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, time.sleep, 5)


# Cancellation via a timeout.
async def with_timeout():
    try:
        return await asyncio.wait_for(slow_thing(), timeout=2.0)
    except asyncio.TimeoutError:
        return None
Common pitfalls
- Blocking the event loop. Any sync call that takes meaningful time (file I/O without aiofiles, network calls without an async client, CPU-heavy parsing) freezes every other task. Symptom: latency spikes correlated with specific request types.
- Forgetting await. The async function fires, returns its Promise/coroutine, and you discard it. Work either silently completes off in the void or, in Python, never starts at all. Linters catch most cases; reviewers catch the rest.
- Sequential awaits when parallel is possible. await a; await b; waits sequentially. If A and B are independent, use Promise.all or asyncio.gather to overlap them.
- Async functions that aren't async. If your function never awaits, marking it async only adds Promise allocation overhead. Either await something or drop the async keyword.
- Cancellation not respected. Tasks that catch CancelledError and continue, or that have no await points to be cancelled at, leak resources. Always propagate cancellation or document why you don't.
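- Mixing runtimes. Nesting or mixing runtimes tends to fail or deadlock: asyncio.run() raises RuntimeError when called from inside a running loop, and futures tied to one Rust runtime (for example Tokio's timers and sockets) generally do not work when driven by another such as async-std. Pick one runtime per process.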
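For the cancellation pitfall in particular, a small asyncio sketch shows what respecting cancellation can look like: clean up in finally and re-raise CancelledError instead of swallowing it. The worker and its events list are hypothetical, and the sleeps stand in for real I/O.

```python
import asyncio


async def worker(events):
    events.append("acquired")           # pretend a connection was opened
    try:
        while True:
            await asyncio.sleep(0.01)   # every await is a point where cancellation can land
    except asyncio.CancelledError:
        events.append("cancelled")
        raise                           # re-raise so the caller sees the cancellation
    finally:
        events.append("released")       # cleanup runs on completion and on cancellation


async def demo():
    events = []
    task = asyncio.create_task(worker(events))
    await asyncio.sleep(0.03)           # let the worker start and loop a few times
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        events.append("caller saw cancel")
    return events, task.cancelled()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Had the worker caught CancelledError without re-raising, the task would have finished normally and task.cancelled() would report False, hiding the cancellation from the caller.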
Performance analysis
A modern async runtime can dispatch around 5-10 million await resumptions per second on a single CPU core. Each suspension/resumption costs roughly 100 nanoseconds in JavaScript V8 (microtask enqueue + dequeue plus a small allocation), about the same in Python's asyncio, and as low as 5-10 nanoseconds in Rust where the entire state machine compiles to inline code with no heap allocation in the happy path.
Memory: a Tokio task is around 200 bytes resident plus the actual state-machine frame, typically 50-300 bytes more for a few locals. A million Tokio tasks fit comfortably in 1 GB. A million OS threads would need terabytes of address space.
The catch: latency on an async event loop is bounded by the longest stretch any task runs without yielding. Because scheduling is cooperative, a single 100-millisecond CPU-bound task delays every task that becomes ready in the meantime by up to 100 ms. That is why production async systems carefully wrap CPU work in run_in_executor or spawn_blocking to push it off the loop.
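The stall is easy to measure. The sketch below runs a heartbeat task that records the gap between its wakeups, then performs a 100 ms blocking call either on the loop thread or on a worker thread. The names heartbeat and max_gap are hypothetical, time.sleep stands in for any blocking call, and exact numbers depend on the machine.

```python
import asyncio
import time


async def heartbeat(gaps, stop):
    """Records the time between consecutive wakeups; a large gap means the loop stalled."""
    last = time.monotonic()
    while not stop.is_set():
        await asyncio.sleep(0.01)
        now = time.monotonic()
        gaps.append(now - last)
        last = now


async def max_gap(offload: bool) -> float:
    gaps, stop = [], asyncio.Event()
    beat = asyncio.create_task(heartbeat(gaps, stop))
    await asyncio.sleep(0.03)                     # let the heartbeat settle
    if offload:
        await asyncio.to_thread(time.sleep, 0.1)  # blocking call on a worker thread
    else:
        time.sleep(0.1)                           # blocking call on the loop thread
    await asyncio.sleep(0.03)
    stop.set()
    await beat
    return max(gaps)


if __name__ == "__main__":
    print("blocking on the loop:", asyncio.run(max_gap(offload=False)))
    print("offloaded to a thread:", asyncio.run(max_gap(offload=True)))
```

With the blocking call on the loop thread, the largest heartbeat gap is at least the full 100 ms; with the call offloaded, the heartbeat should keep ticking close to its 10 ms interval.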
Frequently asked questions
How does async/await differ from a thread?
A thread is an OS-managed unit of execution with its own stack (typically 1-8 MB reserved) and is preemptively scheduled by the kernel. An async task is a userspace state machine: its frame is a few hundred bytes on the heap, it has no dedicated stack, and it is cooperatively scheduled by the runtime. Per-task spawn cost drops from about 1 millisecond to about 1 microsecond, and you can have millions of concurrent tasks instead of thousands of threads.
What actually happens at the await keyword?
The compiler rewrites the async function as a state machine. When the function hits an await on a not-yet-completed future, the state machine saves its local variables and its current position, registers a wakeup callback with the future, and returns control to the caller. When the future completes, the runtime's event loop calls the wakeup, which resumes the state machine from the saved point. To the programmer it looks sequential; to the machine it is a chain of callbacks.
Why is JavaScript async so much simpler than other languages?
JavaScript runs on a single-threaded event loop, so there is no data race between async tasks — only one task runs at a time, switching only at await points. Python's asyncio is the same. Rust and C# run async on a multi-threaded scheduler where tasks can migrate between OS threads, which means async code can race just like multithreaded code — typically requiring atomics, locks, or message passing for shared state.
When should I use async vs threads?
Async wins for I/O-bound work (HTTP calls, database queries, file reads) because most of the time is spent waiting and no real CPU work happens. Threads win for CPU-bound work (image processing, hashing, parsing) because a CPU-bound task never yields, so on an event loop it blocks every other task sharing that thread. Many runtimes give you both: an async event loop for I/O plus a thread pool that async code can dispatch CPU work to.
What is the cost of a promise or future allocation?
An await in JavaScript or Python costs roughly 100 nanoseconds: an object allocation, a microtask enqueue, and a microtask dequeue. Rust's async/await compiles to a plain state machine that needs no heap allocation per await in the common case, putting the per-await cost at roughly 5-10 nanoseconds. Compare to thread context switches of 1-5 microseconds plus syscall overhead.
What happens if I forget to await a promise?
In JavaScript the promise still runs to completion, but you lose its result and any exception it throws becomes an unhandled rejection. In Python asyncio the coroutine is created but never scheduled — you get a 'coroutine was never awaited' warning and the work simply does not happen. In Rust, a future is inert until polled, so a non-awaited future does nothing at all. Compilers and linters often warn about unawaited futures specifically because this bug class is so common.
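The Python behaviour can be checked in a few lines. In this sketch save and store are hypothetical names; the coroutine that is never awaited is closed explicitly only to keep the example free of the "never awaited" warning.

```python
import asyncio


async def save(record, store):
    await asyncio.sleep(0)
    store.append(record)


async def demo():
    store = []

    coro = save("forgotten", store)  # calling an async def only builds a coroutine object
    await asyncio.sleep(0.01)        # the loop runs, but the coroutine was never scheduled
    ran_without_await = "forgotten" in store
    coro.close()                     # discard it explicitly to avoid the warning

    await save("awaited", store)     # awaited: runs to completion before the next line

    task = asyncio.create_task(save("background", store))  # scheduled: runs concurrently
    await task                       # keep a reference and await it eventually
    return ran_without_await, store


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If background work really is wanted, asyncio.create_task is the explicit way to ask for it, and keeping a reference to the task lets errors surface when it is eventually awaited.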
Can I mix async code with synchronous code?
You can call sync from async, but if the sync function blocks (file I/O, sleep, network) it blocks the whole event loop and starves every other task. Wrap blocking calls in run_in_executor (Python), Task.Run (C#), or spawn_blocking (Rust) to push them to a thread pool. Calling async from sync is harder: you must either run an event loop (asyncio.run) or use a runtime's block_on, and a synchronous function cannot await anything without itself becoming async.