Developmental Psychology

Stanford Marshmallow Experiment

Mischel's delay-of-gratification studies — and what the 2018 replication actually showed

The Stanford marshmallow experiment is a series of delay-of-gratification studies conducted by Walter Mischel and colleagues at Stanford's Bing Nursery School from the late 1960s onward. Children aged three to six chose between one marshmallow immediately and two if they waited about fifteen minutes alone with the treat. Longitudinal follow-ups by Mischel and colleagues linked preschool waiting time to SAT scores, BMI, and self-control years to decades later. A 2018 conceptual replication by Watts, Duncan, and Quan, using a sample roughly ten times larger and more diverse, found a smaller predictive association that shrank by about two-thirds once family background was controlled. The result recast the test as a measure of resources as much as willpower.

  • Original studies: Mischel et al., 1968-1972 onward
  • Site: Bing Nursery School, Stanford
  • Original sample: ~90 children, mostly from the Stanford community
  • 2018 replication: Watts, Duncan, Quan (n=918, diverse)
  • Replication finding: Smaller effect, reduced by about two-thirds after SES controls
  • Modern interpretation: Trust + resources as much as willpower

Why the marshmallow studies matter

  • Self-control research. Foundational paradigm for executive function in childhood.
  • Cooling strategies. Cognitive restructuring outperforms raw willpower.
  • Replication science. Case study in how original effect sizes shrink with broader samples.
  • SES research. Family background explains much of what looked like willpower.
  • Trust and rational delay. Eating early can be rational when adults are unreliable.
  • Cultural reach. Featured in The New Yorker, TED talks, and parenting books.
  • Neuroscience. The fMRI follow-up by Casey et al. (2011) linked prefrontal differences in adulthood to childhood wait time.

Common misconceptions

  • Wait time is pure willpower. Trust, SES, and strategy all contribute substantially.
  • The original effect was huge. Stanford-sample correlations of about 0.4 fell to about 0.28 in the diverse 2018 replication, and to about 0.10 after SES controls.
  • Eating the marshmallow signals weakness. When adults are unreliable, eating immediately can be the rational choice.
  • Training willpower transfers broadly. Executive function gains tend to be domain-specific.
  • Mischel's sample was representative. Stanford faculty children were not.
  • Replication failed. The effect replicated but shrank after SES controls.
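Why an effect can replicate and still shrink after controls is easier to see with the standard first-order partial correlation formula. The sketch below is illustrative only: the two SES correlations are hypothetical values chosen for the example, not figures from Watts, Duncan, and Quan, and the function and variable names are assumptions of this sketch.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, holding z constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Unadjusted wait-time/outcome correlation as reported in this article.
r_wait_outcome = 0.28
# HYPOTHETICAL values, chosen only to illustrate confounding by family background.
r_wait_ses = 0.45
r_outcome_ses = 0.50

adjusted = partial_correlation(r_wait_outcome, r_wait_ses, r_outcome_ses)
print(f"unadjusted r = {r_wait_outcome:.2f}, adjusted for SES = {adjusted:.2f}")
```

If family background correlates with both wait time and later outcomes, part of the raw correlation belongs to the shared cause, so the adjusted value is smaller without being zero. When the third variable is unrelated to both, the formula returns the raw correlation unchanged.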

Frequently asked questions

What was the original procedure?

A child sat alone in a quiet room with a single marshmallow (or pretzel, or cookie). The experimenter explained, in effect: "I'll be back in about fifteen minutes. If the marshmallow is still here when I return, you can have two. If you don't want to wait, you can eat this one, but then you won't get the second." A bell let the child summon the experimenter early. About a third of children waited the full fifteen minutes. The dependent measure was wait time, and the strategies children used (covering their eyes, singing, distracting themselves) were also recorded.

What did the longitudinal follow-ups find?

Follow-ups by Mischel, Shoda, and Peake (1988) and Shoda, Mischel, and Peake (1990) linked preschool wait time to adolescent SAT scores (correlation around 0.4 in the high-status sample), parent-rated competence, and resistance to temptation. Casey et al. (2011) scanned now-adult participants and found differences in prefrontal (inferior frontal gyrus) and ventral striatal activity related to original wait time. The study became a touchstone for grit and self-control narratives in popular psychology.

How does the 2018 replication change the picture?

Watts, Duncan, and Quan (2018) used data from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) Study of Early Child Care and Youth Development: 918 children sampled across SES strata. They found a smaller predictive correlation than the original studies (r ≈ 0.28 unadjusted, dropping to r ≈ 0.10 after controlling for family income, parent education, and home environment). Wait time was still informative but explained much less unique variance than the Stanford sample suggested.
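To make "much less unique variance" concrete, a correlation can be squared to estimate the share of outcome variance it accounts for. The sketch below applies that arithmetic to the three figures quoted in this article. Treating the adjusted 0.10 as if it were a simple correlation is a simplifying assumption, and the function name is illustrative.

```python
def variance_explained(r: float) -> float:
    """Share of outcome variance accounted for by a correlation of size r."""
    return r ** 2


# Correlations as quoted in this article (approximate values).
reported = {
    "Stanford sample": 0.40,
    "2018 unadjusted": 0.28,
    "2018 after controls": 0.10,  # assumption: treated here as a plain correlation
}

for label, r in reported.items():
    print(f"{label}: r = {r:.2f}, variance explained = {variance_explained(r):.1%}")
```

On these figures the share falls from about 16% to about 8% to about 1%, which is why a correlation that looks only modestly smaller can carry far less predictive weight.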

Was the Stanford sample biased?

Yes — overwhelmingly children of Stanford faculty and staff. Mischel acknowledged the sample's homogeneity. The original cohort grew up with stable food access, educated parents, and reliable caregivers. In such environments, willpower may be the rate-limiting factor for life outcomes. In samples where adults are unreliable or food is scarce, eating the marshmallow immediately can be the rational choice.

What's the trust-environment interpretation?

Kidd, Palmeri, and Aslin (2013) had children interact with either a reliable or an unreliable adult before the test. Children in the unreliable condition waited about a quarter as long (roughly 3 minutes versus 12 on average). The result implies that wait time partly measures whether the child trusts the experimenter to deliver the second marshmallow. Children from low-trust environments rationally take the certain first treat, which is a strategy rather than a deficit.
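The "rational delay" argument can be written as a simple expected-value comparison. This is a toy model under stated assumptions, not the analysis used by Kidd and colleagues: the child is assumed to value treats linearly, and the `fallback` and `wait_cost` parameters (what the child ends up with if the promise is broken, and the unpleasantness of waiting) are hypothetical additions for illustration.

```python
def expected_value_of_waiting(
    p_delivery: float,
    large: float = 2.0,
    fallback: float = 0.0,
    wait_cost: float = 0.0,
) -> float:
    """Expected payoff of waiting, given the chance the adult delivers."""
    return p_delivery * large + (1 - p_delivery) * fallback - wait_cost


def should_wait(p_delivery: float, small: float = 1.0, **kwargs: float) -> bool:
    """Wait only if the expected payoff beats the certain immediate treat."""
    return expected_value_of_waiting(p_delivery, **kwargs) > small


# A child who trusts the adult versus one who has just seen a broken promise.
print(should_wait(0.9))  # trusting child
print(should_wait(0.4))  # distrustful child
```

With the default values, waiting pays off only when the child believes delivery is more likely than not. Adding a cost for waiting raises that threshold further, so a child with low trust who eats immediately is maximizing expected treats, not failing at self-control.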

What strategies helped children wait?

Mischel and Baker (1975) showed that cognitive restructuring matters most. Children told to think of the marshmallow as a "puffy cloud" waited longer than those told to think of its taste. Distraction (singing, covering the eyes) worked better than simply trying to resist. Mischel called this "cooling": converting hot, tempting stimuli into cool, abstract representations. The technique has been carried over into adult self-control interventions.

Should parents train delay of gratification?

The case for training is weaker after the 2018 replication, because the test's predictive power comes partly from underlying SES and trust. Direct interventions to improve self-control (the Tools of the Mind curriculum evaluated by Diamond and colleagues, executive function training) show modest gains that often fail to transfer. Building reliable, predictable home environments and teaching specific cooling strategies is better supported than generic willpower drills.