Richard Feynman gave a commencement address at Caltech in 1974 that named a failure mode so precisely it still hasn’t been fixed. He called it cargo cult science — research that has all the outward appearance of the scientific method but lacks its essential ingredient: the commitment to proving yourself wrong. Fifty years later, entire research programs are built on the exact error he described. The crisis isn’t behind us. It’s the default operating condition of modern academic science.
What Feynman Actually Said
The address is worth reading in full, because Feynman wasn’t making a general complaint about sloppy thinking. He was making a precise structural argument. He described the cargo cults of the South Pacific: island peoples who had watched planes loaded with goods land at Allied airstrips during World War II and who, after the war ended, built replica airstrips of their own, complete with wooden headphones and bamboo control towers, hoping to summon the cargo back. The form was perfect. The understanding was absent.
Feynman’s charge was that this is what large portions of psychology, sociology, and the softer sciences had become: exquisitely well-dressed rituals that looked like science without the self-critical engine that makes science actually work. He called the essential missing ingredient “a kind of utter honesty” — specifically, the obligation to report everything that might undermine your own result. Not just the numbers that worked. Everything.

The Replication Crisis as Confirmation
In 2015, the Open Science Collaboration published a landmark study in Science that attempted to replicate 100 published psychology findings. Fewer than 40 percent replicated successfully under the same conditions. The study didn’t find fraud. It found something structurally worse: a literature built on results that were obtained in good faith the first time but couldn’t be reliably reproduced. They had been published because they confirmed hypotheses and reported significant p-values, and they had never been tested for robustness because the incentive to do so was effectively zero.
The pattern extends beyond psychology. A 2012 commentary by Glenn Begley and Lee Ellis reported that scientists at Amgen, where Begley had worked, could reproduce only 6 of 53 landmark cancer biology papers, about 11 percent. Six. The other 47 had passed peer review, accumulated citations, and in some cases formed the basis for clinical trials. The experimental airstrips were elaborate. No planes landed.
The Incentive System That Makes This Rational
The conventional response to replication failures is to reach for individual blame — sloppy researchers, p-hacking, outright fraud in extreme cases. These things exist. But framing the crisis as a character problem misses what makes it so durable. The incentive structure of academic science systematically rewards cargo cult behavior and punishes its alternative.
Publication in high-impact journals requires novel, positive results. Replications of existing work are difficult to publish. Null results — the experiments that found nothing, which are often the most scientifically valuable — are nearly impossible to place. A researcher who runs five careful experiments, finds a significant result in one and nothing in four others, faces a stark choice: report all five and struggle to publish, or report the one and build a career. The system has selected, over decades, for the second option.
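The five-experiment scenario can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the five experiments are independent, that no real effect exists in any of them, and that the significance threshold is the conventional 0.05. The function names and the number of simulated labs are hypothetical choices made for this example.

```python
import random

ALPHA = 0.05  # conventional significance threshold (an assumption of this sketch)


def chance_of_false_positive(n_experiments: int, alpha: float = ALPHA) -> float:
    """Probability that at least one of n independent tests of a true null is significant."""
    return 1.0 - (1.0 - alpha) ** n_experiments


def simulate(n_experiments: int, n_labs: int, alpha: float = ALPHA, seed: int = 0) -> float:
    """Fraction of simulated labs that get at least one significant result with no real effect."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_labs):
        # When the null hypothesis is true, a p-value is uniformly distributed on [0, 1].
        p_values = [rng.random() for _ in range(n_experiments)]
        if min(p_values) < alpha:
            hits += 1
    return hits / n_labs


analytic = chance_of_false_positive(5)
simulated = simulate(5, n_labs=100_000)
print(f"analytic:  {analytic:.3f}")
print(f"simulated: {simulated:.3f}")
```

Under these assumptions, roughly one lab in four or five that runs five experiments on a nonexistent effect will have a significant result to report. If only that result is written up, the published record cannot distinguish it from a real finding.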
John Ioannidis, in his 2005 paper “Why Most Published Research Findings Are False,” modeled this mathematically and concluded that in fields with small studies, small effects, and flexible analyses, a published claim is more likely to be false than true. That conclusion should have restructured science funding and publishing immediately. It didn’t, because the people who would have to restructure it were the people the existing system had elevated. Incentive systems are conservative by nature. This problem sits at the core of what Karl Popper was trying to solve when he built his falsifiability framework: the demarcation problem between real science and its imposters is older than the replication crisis, and harder to fix.
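The core of the Ioannidis model fits in a few lines. The sketch below computes the positive predictive value (PPV), the probability that a claimed finding is true, from the pre-study odds R that a tested relationship is real, the type I error rate alpha, the type II error rate beta, and a bias term u (the share of would-be negative analyses that end up reported as positive). The formula follows the paper; the function name and the three example scenarios are hypothetical values chosen for illustration, not figures taken from it.

```python
def ppv(R: float, alpha: float = 0.05, beta: float = 0.2, u: float = 0.0) -> float:
    """Post-study probability that a claimed (significant) finding is true.

    R     -- pre-study odds of a true relationship (true : no relationship)
    alpha -- type I error rate
    beta  -- type II error rate (power is 1 - beta)
    u     -- bias: fraction of otherwise negative analyses reported as positive
    """
    # True relationships reported as positive: detected, or missed but pushed through by bias.
    true_positives = (1 - beta) * R + u * beta * R
    # Null relationships reported as positive: chance false alarms, plus bias on the rest.
    false_positives = alpha + u * (1 - alpha)
    return true_positives / (true_positives + false_positives)


# Illustrative scenarios (assumed values, not taken from the paper):
scenarios = {
    "well-powered, even odds": ppv(R=1.0, beta=0.2),
    "underpowered, long-shot": ppv(R=0.1, beta=0.8),
    "same, with 20% bias":     ppv(R=0.1, beta=0.8, u=0.2),
}
for name, value in scenarios.items():
    print(f"{name:26s} PPV = {value:.3f}")
```

With no bias, the PPV exceeds one half only when (1 - beta) * R is greater than alpha. Low power and long-shot hypotheses push a field below that line, and any bias in reporting pushes it further.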
The Social Pressure Inside the Lab
There’s a layer below the institutional incentives that Feynman gestured at but didn’t fully develop: the social dynamics inside research groups. A graduate student who finds that the lab’s core hypothesis doesn’t hold up is not in a neutral position. That hypothesis is often the intellectual identity of the principal investigator. The lab’s funding, reputation, and future grant applications depend on it. Delivering a negative result isn’t just scientifically inconvenient — it can read as a kind of insubordination.
This is how paradigms ossify before they become paradigms in Kuhn’s formal sense. The process doesn’t require malice. It requires only that career survival depends on not asking certain questions too loudly. The social fabric of a lab, a department, a field, rewards alignment. Feynman’s “utter honesty” demands a kind of professional courage that the system structurally discourages. The history of evolutionary biology offers a version of this same dynamic — the war between punctuated equilibrium and gradualism was as much a fight over institutional positioning as it was a scientific disagreement.

What Rigorous Science Actually Costs
The boring, unsexy answer to the replication crisis is pre-registration: researchers declare their hypotheses, sample sizes, and analysis plans before collecting data, which makes post-hoc fishing for significance far harder to do and far easier to detect. The Center for Open Science has championed this approach, and the results from pre-registered studies are instructive. Effect sizes shrink. Significance rates drop. The literature gets smaller and more reliable, which means what looked like knowledge was, in substantial part, well-organized noise.
Pre-registration helps. It doesn’t solve the underlying problem, because it doesn’t change what journals want to publish or what grant committees want to fund. A pre-registered study that finds nothing is still a career setback in most fields. The fix requires changing what the system rewards — how journals, tenure committees, and funding bodies assign value. That is a political problem dressed in a lab coat.
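Why effect sizes shrink when every result is reported can be shown with a small simulation. This is a minimal sketch under stated assumptions, not a model of any real literature: a hypothetical true effect of 0.2 standard deviations, 20 participants per group, known unit variance (so a simple z-test applies), and a "journal" that publishes only significant results. All names and parameter values are invented for the example.

```python
import random
import statistics
from math import sqrt

TRUE_EFFECT = 0.2   # assumed true effect, in standard deviation units
N_PER_GROUP = 20    # assumed sample size per group
N_STUDIES = 5000    # number of simulated studies


def run_study(rng: random.Random) -> tuple[float, bool]:
    """Simulate one two-group study; return (estimated effect, is it significant?)."""
    treated = [rng.gauss(TRUE_EFFECT, 1.0) for _ in range(N_PER_GROUP)]
    control = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
    estimate = statistics.fmean(treated) - statistics.fmean(control)
    # Variance is known to be 1 in this toy setup, so use a two-sided z-test at 0.05.
    standard_error = sqrt(2.0 / N_PER_GROUP)
    significant = abs(estimate / standard_error) > 1.96
    return estimate, significant


rng = random.Random(42)
results = [run_study(rng) for _ in range(N_STUDIES)]

all_estimates = [est for est, _ in results]
published = [est for est, sig in results if sig]  # the significance filter

mean_all = statistics.fmean(all_estimates)
mean_published = statistics.fmean(published)
print(f"true effect:                   {TRUE_EFFECT:.2f}")
print(f"mean estimate, all studies:    {mean_all:.2f}")
print(f"mean estimate, published only: {mean_published:.2f}")
print(f"share of studies published:    {len(published) / N_STUDIES:.2f}")
```

In this toy setup the full set of studies recovers the true effect on average, while the filtered set overstates it several times over, because only the studies that happened to overshoot clear the significance bar. Reporting every pre-registered outcome removes the filter, and the apparent effect falls back toward its real size.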
Feynman’s Deeper Charge
What made the 1974 address uncomfortable then, and more uncomfortable now, is that Feynman wasn’t only criticizing soft science. He was criticizing science education — the way students are taught to revere results rather than process, to treat published findings as ground truth rather than provisional claims awaiting challenge. Cargo cult science reproduces itself because the scientists who practice it were trained to mistake the ritual for the substance.
The problem isn’t bad scientists. Most researchers working within broken incentive structures are doing exactly what they were trained to do and what the system rewards them for doing. The problem is that a method designed to be self-correcting has been operationalized in a way that makes self-correction economically irrational. Feynman saw it coming. He described it with enough precision that nothing in the subsequent fifty years requires his diagnosis to be substantially revised. That is either a tribute to his clarity or an indictment of what science’s institutions have done with it.
Sources
- Feynman, R. P. (1974). “Cargo Cult Science.” Caltech Commencement Address. calteches.library.caltech.edu
- Open Science Collaboration. (2015). “Estimating the reproducibility of psychological science.” Science, 349(6251). doi.org/10.1126/science.aac4716
- Begley, C. G., & Ellis, L. M. (2012). “Raise standards for preclinical cancer research.” Nature, 483, 531–533. doi.org/10.1038/483531a
- Ioannidis, J. P. A. (2005). “Why Most Published Research Findings Are False.” PLOS Medicine, 2(8). doi.org/10.1371/journal.pmed.0020124
- Center for Open Science: osf.io







