Cognitive Load

Working memory's narrow bottleneck and how learning slips through it

Cognitive load is the total mental effort imposed on working memory at any moment. John Sweller's Cognitive Load Theory (1988) distinguishes intrinsic load (from the task's inherent difficulty), extraneous load (from how the task is presented), and germane load (from the work of building lasting schemas). Working memory holds roughly 4 chunks (Cowan, 2001), a sharp downward revision of Miller's classic 7±2, and learning suffers when a task demands more than that. The theory has driven practical changes in textbooks, software interfaces, and worked-example design across education.

  • Founder: John Sweller (1988)
  • Three load types: Intrinsic, extraneous, germane
  • Working memory capacity: ~4 chunks (Cowan, 2001); ~7±2 (Miller, 1956; older estimate)
  • Duration without rehearsal: ~15-30 seconds
  • Worked-example effect: Studied solutions outperform problem-solving for novices
  • Modern unification: Schema construction reduces effective load

Why cognitive load matters

  • Instructional design. Reducing extraneous load often improves learning more than adding content does.
  • Software UX. Cluttered interfaces overload novices and stall adoption.
  • Aviation and medicine. Cockpit and dashboard design accounts for load in critical moments.
  • Reading instruction. Decoding fluency frees capacity for comprehension.
  • Mathematics. Worked examples beat discovery learning for novices.
  • Multimedia learning. Mayer's principles formalize load-based design rules.
  • Workplace training. Microlearning and chunking improve transfer to job tasks.

Common misconceptions

  • Working memory holds 7 items. Modern estimates put focused capacity at ~4 chunks.
  • More information helps learning. Beyond a point, extra information overflows working memory and learning suffers.
  • Discovery learning is always best. For novices, worked examples typically outperform unguided problem-solving.
  • Multimedia is always better. On-screen text that duplicates narration usually hurts rather than helps.
  • Load only matters for hard tasks. Even simple tasks can fail if extraneous load is high.
  • Experts and novices need the same materials. The expertise reversal effect shows that what helps novices can hinder experts.

Frequently asked questions

What are the three types of cognitive load?

Intrinsic load reflects the task's inherent complexity: how many elements must be held and integrated at once. Solving 7×8 has lower intrinsic load than solving a multi-step word problem. Extraneous load comes from how information is presented; poor formatting forces processing that doesn't serve learning. Germane load is the productive effort of constructing schemas, the organized patterns that let learners treat complex material as if it were simple.

How big is working memory really?

Miller's "magical number 7±2" (1956) drew on span tasks with simple items such as digits, where people are free to rehearse and group what they hear. Cowan (2001) reviewed the evidence and argued that the focus of attention holds about 4 chunks when rehearsal and grouping are prevented. The two figures describe the same system under different conditions: roughly 7 items when rehearsal and grouping are available, roughly 4 chunks when they are not. The pragmatic implication: instructional designers should not ask novices to process more than 3-4 elements at once, even when the material seems "small" to experts.

What is chunking?

Chunking groups elements into a single meaningful unit, expanding effective capacity. Chess masters do not see 32 pieces; they see ~4-7 strategic configurations (Chase & Simon, 1973). Skilled readers see whole words, not letters. Schemas built through practice transform sequences of low-level information into single chunks, which is how expertise effectively expands working memory without changing its biological capacity.
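The capacity arithmetic behind chunking can be sketched in a few lines of Python. This is an illustration only, not a cognitive model: the helper names `chunk` and `fits_in_working_memory`, the limit of 4, and the example digit string are all assumptions made here, and fixed-size grouping merely stands in for the meaningful, schema-based grouping the text describes.

```python
# Assumed limit for illustration: roughly 4 chunks (Cowan, 2001).
WORKING_MEMORY_LIMIT = 4


def chunk(items, size):
    """Group a flat sequence into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fits_in_working_memory(units, limit=WORKING_MEMORY_LIMIT):
    """Return True if the number of units to hold is within the assumed limit."""
    return len(units) <= limit


digits = "4915550123"        # ten separate digits (made-up number)
grouped = chunk(digits, 3)   # ['491', '555', '012', '3']

print(len(digits), fits_in_working_memory(digits))    # 10 False
print(len(grouped), fits_in_working_memory(grouped))  # 4 True
```

The ten digits exceed the assumed limit when held one by one, but the same digits held as four groups fit. The information is unchanged; only the number of units to track drops.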

What is the worked-example effect?

Sweller and Cooper (1985) showed that novices learn algebra better by studying complete worked solutions than by solving problems themselves. Problem-solving for novices burns working memory on means-end search, leaving little capacity for schema construction. As expertise grows, the worked-example advantage reverses (the expertise reversal effect; Kalyuga et al., 2003), and experienced learners benefit more from active problem-solving.

What is the split-attention effect?

When learners must integrate two related sources, such as a diagram and a text caption placed elsewhere, the integration itself consumes working memory. Tarmizi and Sweller (1988) showed that geometry problems with annotations placed within the diagram produced better performance than the same problems with separated annotations. The implication for textbooks, slides, and software: keep related information spatially and temporally integrated.

What is the redundancy effect?

When the same information is presented in two forms, such as narration and identical on-screen text, performance often falls below that of either form alone. Learners process both streams because neither can be ignored, so capacity is wasted. Mayer's multimedia principles (1997 onward) turn many of these effects into design rules: prefer narration over duplicated text, use coherent visuals, signal critical content, and avoid decorative graphics that add load without benefit.

How is cognitive load measured?

Self-report rating scales are the most common method: learners rate their mental effort on a 9-point scale (Paas, 1992). Dual-task paradigms measure performance on a secondary task during learning. Pupillometry tracks load in real time through changes in pupil diameter. EEG measures, including frontal theta activity, correlate with working memory engagement. No single measure is the gold standard; converging measures give the most reliable picture.
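As a rough sketch of how the first two measures might be summarized, the Python below averages 9-point effort ratings and computes the slowdown on a secondary task. The function names, the choice of reaction time as the secondary-task measure, and all sample numbers are assumptions for illustration; real studies use more careful designs and statistics.

```python
from statistics import mean


def mean_effort(ratings):
    """Average self-reported effort ratings on a 1-9 scale (9 = highest effort)."""
    if not all(1 <= r <= 9 for r in ratings):
        raise ValueError("ratings must be on the 1-9 scale")
    return mean(ratings)


def dual_task_cost(baseline_rt_ms, during_learning_rt_ms):
    """Mean slowdown (ms) on the secondary task while learning vs. baseline.

    A larger slowdown is read as higher load from the primary task.
    """
    return mean(during_learning_rt_ms) - mean(baseline_rt_ms)


# Made-up sample data for one learner.
ratings = [6, 7, 5, 8, 6]
baseline = [300, 310, 290]   # secondary-task reaction times alone
loaded = [400, 410, 390]     # same task performed during learning

print(mean_effort(ratings))              # 6.4
print(dual_task_cost(baseline, loaded))  # 100
```

Reporting both numbers side by side follows the converging-measures idea: a high effort rating paired with a large secondary-task slowdown is stronger evidence of load than either alone.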