The Prospective Replication Project

Jonathan Schooler, UC Santa Barbara

Leif Nelson, UC Berkeley

Jon Krosnick, Stanford University

Brian Nosek, University of Virginia

Scientists are concerned about the irreproducibility of scientific findings. Across medicine, psychology, economics, and genetics, there is accumulating evidence that findings are smaller, less robust, or simply less true than originally believed. Such concerns have given rise to the field of meta-science, which uses quantitative methods to understand how scientific practices influence the veracity of scientific conclusions.

To date, the understanding of reproducibility has been impeded by two related challenges: 1) the lack of transparency of the scientific record, and 2) the retrospective nature of reproducibility studies. The suggestion that reproducibility issues are due to publication bias and selective reporting hinges on the presumption of a large body of unpublished studies with negative outcomes. Although the existence of such studies can hardly be doubted, the contribution of unpublished findings to reproducibility issues is difficult to assess. Compounding this problem, replications of published studies do little, in and of themselves, to establish why initial studies frequently report inflated results.

Our project aims to overcome these obstacles. We are conducting a multi-site prospective multi-replication study. Four labs are engaged in their business-as-usual investigations to discover new experimental effects. As new effects are discovered, they are systematically replicated by the originating laboratory and by the others. In so doing, this project both assesses the individual hypotheses explored in each study and provides the context for a deeper meta-scientific understanding. We will evaluate the evidence for different accounts of variation in the reproducibility of scientific findings, including false positive effects, selective reporting, publication bias, and changes in procedure or sampling.

This project has three primary goals: 1) To develop a gold-standard replication protocol, in which every effort is made to design experiments and implement replications in a manner that will maximize the likelihood of full replication. 2) To examine whether replications of newly devised experimental protocols are associated with declining effect sizes, even when all reasonable efforts are made to minimize such declines. 3) If declining effect sizes are still observed, to identify their possible locus by, for example, assessing whether other labs can replicate the findings as effectively as the originating lab.

The project is implementing the following meta-scientific innovations: 1) Exclusively conducting replications of newly devised experiments. When previously published experiments are replicated, there is no way to assess how many unpublished studies and/or analyses are hidden from view. We eliminate this concern by focusing on newly devised experiments, developed and reported in full. 2) Carefully logging all aspects of the research protocols and analyses. By keeping a complete record of the research process (paradigm development, methodology, population demographics, all analyses), we not only maximize the likelihood of a good replication but also leave the evidence needed to identify why a finding might not replicate. 3) Tightly constraining and adequately powering all studies. To maximize the comparability and statistical power of the studies, all experiments will be constrained to the same highly powered design: 1500 participants with two conditions and one dependent measure. 4) Engaging in multiple replication attempts. All studies will be replicated by each of the participating laboratories, thereby providing a powerful test of the robustness of each effect. 5) Systematic blinding of the outcomes of studies. To assess whether knowing the outcome of a study influences subsequent replications, the project manipulates the timing at which the outcomes of replication studies are analyzed and reported to the rest of the team.
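To give a rough sense of what "highly powered" means for the design described above, the following sketch estimates the power of a two-condition comparison. It rests on several assumptions that are not stated in this description: that the 1500 participants are the total sample and are split evenly (750 per condition), that the single dependent measure is compared with a two-sided test at alpha = 0.05, and that a normal approximation to the two-sample test is adequate at this sample size. The function names are illustrative only and are not part of the project's analysis plan.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def two_group_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test.

    d is the standardized mean difference (Cohen's d). Uses the normal
    approximation, which is an assumption of this sketch.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = d * math.sqrt(n_per_group / 2)
    # Probability of rejecting in either tail.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def minimum_detectable_effect(n_per_group, power=0.80, alpha=0.05):
    """Smallest d detectable with the requested power (same approximation)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return (z_crit + z_power) * math.sqrt(2 / n_per_group)


if __name__ == "__main__":
    n = 750  # assumed: 1500 participants split evenly across two conditions
    for d in (0.1, 0.2, 0.3):
        print(f"d = {d:.1f}: power = {two_group_power(d, n):.3f}")
    print(f"d detectable at 80% power: {minimum_detectable_effect(n):.3f}")
```

Under these assumptions, an effect of d = 0.2 would be detected with power of about 0.97, and the smallest effect detectable with 80% power is about d = 0.14. If the 1500 participants are instead allocated per condition, or a different test is used, the figures change accordingly.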

This research is supported by the Fetzer Franklin Fund of the John E. Fetzer Memorial Trust.