Estimating the reproducibility of psychological science

Published on Aug 28, 2015 in Science 41.845 · DOI: 10.1126/SCIENCE.AAC4716
Alexander A. Aarts
Estimated H-index: 1
Joanna E. Anderson
Estimated H-index: 2
(DRDC: Defence Research and Development Canada)
+ 267 Authors
Kellylynn Zuni
Estimated H-index: 2
(Adams State University)
Reproducibility is a defining feature of science, but the extent to which it characterizes current research is unknown. We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects. Correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.
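Two of the criteria named in the abstract (a statistically significant replication, and an original effect size falling inside the replication's 95% confidence interval) can be illustrated with a minimal sketch. It assumes effect sizes are expressed as correlation coefficients and uses the Fisher z approximation for the interval; the function names and the example numbers are hypothetical and are not taken from the study's analysis code.

```python
import math

# Two-sided 95% critical value of the standard normal distribution.
Z_CRIT_95 = 1.959963984540054


def fisher_ci(r, n, z_crit=Z_CRIT_95):
    """Approximate confidence interval for a correlation r with sample size n.

    Uses the Fisher z transformation, whose standard error is 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def original_in_replication_ci(r_original, r_replication, n_replication):
    """True if the original effect lies inside the replication's 95% CI."""
    lo, hi = fisher_ci(r_replication, n_replication)
    return lo <= r_original <= hi


def replication_significant(r_replication, n_replication):
    """True if the replication's 95% CI excludes zero (two-sided p < .05)."""
    lo, hi = fisher_ci(r_replication, n_replication)
    return lo > 0 or hi < 0


if __name__ == "__main__":
    # Hypothetical numbers: original r = 0.35, replication r = 0.20 with n = 103.
    print(fisher_ci(0.20, 103))
    print(original_in_replication_ci(0.35, 0.20, 103))
    print(replication_significant(0.20, 103))
```

The two checks can disagree: a replication may be significant while its interval excludes the original estimate, or the reverse, which is one reason the abstract reports several indicators rather than a single replication rate.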
#1 Brian A. Nosek, H-Index: 93
#2 George Alter, H-Index: 21
Last. Tal Yarkoni, H-Index: 48
Transparency, openness, and reproducibility are readily recognized as vital features of science (1, 2). When asked, most scientists embrace these features as disciplinary norms and values (3). Therefore, one might expect that these valued features would be routine in daily practice. Yet, a growing body of evidence suggests that this is not the case (4–6).
1,161 Citations
#1 Uri Simonsohn (UPenn: University of Pennsylvania), H-Index: 38
This article introduces a new approach for evaluating replication results. It combines effect-size estimation with hypothesis testing, assessing the extent to which the replication results are consistent with an effect size big enough to have been detectable in the original study. The approach is demonstrated by examining replications of three well-known findings. Its benefits include the following: (a) differentiating “unsuccessful” replication attempts (i.e., studies yielding p > .05) that are...
379 Citations
#1 Tim Errington (Center for Open Science), H-Index: 11
#2 Elizabeth Iorns, H-Index: 15
Last. Brian A. Nosek (UVA: University of Virginia), H-Index: 93
It is widely believed that research that builds upon previously published findings has reproduced the original work. However, it is rare for researchers to perform or publish direct replications of existing results. The Reproducibility Project: Cancer Biology is an open investigation of reproducibility in preclinical cancer biology research. We have identified 50 high impact cancer biology articles published in the period 2010-2012, and plan to replicate a subset of experimental results from eac...
179 Citations
#1 John P. A. Ioannidis, H-Index: 204
#2 Marcus R. Munafò (UoB: University of Bristol), H-Index: 97
Last. Sean P. David (Stanford University), H-Index: 35
Recent systematic reviews and empirical evaluations of the cognitive sciences literature suggest that publication and other reporting biases are prevalent across diverse domains of cognitive science. In this review, we summarize the various forms of publication and reporting biases and other questionable research practices, and overview the available methods for probing into their existence. We discuss the available empirical evidence for the presence of such biases across the neuroimaging, anim...
285 Citations
#1 Sanford L. Braver (UCR: University of California, Riverside), H-Index: 5
#2 Felix Thoemmes (Cornell University), H-Index: 20
Last. Robert Rosenthal (UCR: University of California, Riverside), H-Index: 121
The current crisis in scientific psychology about whether our findings are irreproducible was presaged years ago by Tversky and Kahneman (1971), who noted that even sophisticated researchers believe in the fallacious Law of Small Numbers—erroneous intuitions about how imprecisely sample data reflect population phenomena. Combined with the low power of most current work, this often leads to the use of misleading criteria about whether an effect has replicated. Rosenthal (1990) suggested more appr...
198 Citations
#1 Brian A. Nosek (UVA: University of Virginia), H-Index: 93
#2 Daniel Lakens (TU/e: Eindhoven University of Technology), H-Index: 31
Ignoring replications and negative results is bad for science. This special issue presents a novel publishing format – Registered Reports – as a partial solution. Peer review occurs prior to data collection, design and analysis plans are preregistered, and results are reported regardless of outcome. Fourteen Registered Reports of replications of important published results in social psychology are reported with strong confirmatory tests. Further, the articles demonstrate open science practices s...
335 Citations
#1 Geoff Cumming (La Trobe University), H-Index: 33
We need to make substantial changes to how we conduct research. First, in response to heightened concern that our published research literature is incomplete and untrustworthy, we need new requirements to ensure research integrity. These include prespecification of studies whenever possible, avoidance of selection and other inappropriate data-analytic practices, complete reporting, and encouragement of replication. Second, in response to renewed recognition of the severe flaws of null-hypothesi...
1,748 Citations
#1 Richard A. Klein (UF: University of Florida), H-Index: 8
#2 Kate A. Ratliff (UF: University of Florida), H-Index: 16
Last. Brian A. Nosek (UVA: University of Virginia), H-Index: 93
Although replication is a central tenet of science, direct replications are rare in psychology. This research tested variation in the replicability of 13 classic and contemporary effects across 36 independent samples totaling 6,344 participants. In the aggregate, 10 effects replicated consistently. One effect – imagined contact reducing prejudice – showed weak support for replicability. And two effects – flag priming influencing conservatism and currency priming influencing system justification ...
566 Citations
#1 Katherine S. Button (UoB: University of Bristol), H-Index: 16
#2 John P. A. Ioannidis (Stanford University), H-Index: 204
Last. Marcus R. Munafò, H-Index: 97
A study with low statistical power has a reduced chance of detecting a true effect, but it is less well appreciated that low power also reduces the likelihood that a statistically significant result reflects a true effect. Here, we show that the average statistical power of studies in the neurosciences is very low. The consequences of this include overestimates of effect size and low reproducibility of results. There are also ethical dimensions to this problem, as unreliable research is ineffici...
4,039 Citations
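The claim in the entry above, that low power reduces the likelihood that a significant result reflects a true effect, can be made concrete with the standard positive predictive value (PPV) formula, PPV = (power × R) / (power × R + α), where R is the pre-study odds that a tested effect is real. The sketch below is illustrative only: the function name and the default values for `alpha` and `prior_odds` are assumptions chosen for the example, not figures from the listed article.

```python
def positive_predictive_value(power, alpha=0.05, prior_odds=0.25):
    """Probability that a statistically significant finding reflects a true effect.

    power      -- probability of detecting a true effect (1 - beta)
    alpha      -- type I error rate
    prior_odds -- assumed pre-study odds R that a tested effect is real
    """
    if not (0 < power <= 1):
        raise ValueError("power must be in (0, 1]")
    if not (0 < alpha < 1):
        raise ValueError("alpha must be in (0, 1)")
    if prior_odds <= 0:
        raise ValueError("prior_odds must be positive")
    true_positive_rate = power * prior_odds
    return true_positive_rate / (true_positive_rate + alpha)


if __name__ == "__main__":
    # Same assumed prior odds (1 true effect per 4 null effects), two power levels.
    for power in (0.8, 0.2):
        print(power, positive_predictive_value(power))
```

Under these assumed inputs, PPV is 0.8 at 80% power and 0.5 at 20% power, so holding α and the prior odds fixed, a drop in power alone lowers the share of significant results that are true.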
Reproducibility is a defining feature of science. However, because of strong incentives for innovation and weak incentives for confirmation, direct replication is rarely practiced or published. The Reproducibility Project is an open, large-scale, collaborative effort to systematically examine the rate and predictors of reproducibility in psychological science. So far, 72 volunteer researchers from 41 institutions have organized to openly and transparently replicate studies published in three pro...
348 Citations
Cited By 3760
#1 Marlene Sophie Altenmüller (LMU: Ludwig Maximilian University of Munich)
#2 Mario Gollwitzer (LMU: Ludwig Maximilian University of Munich), H-Index: 32
Science is unthinkable without collaboration between scientists. Yet, science is also unthinkable without competition (i.e., competing for the best and most solid arguments and limited, precious resources). In this review, we argue that scientific work routines represent social dilemmas and that two facets of prosociality help researchers solve these dilemmas: (i) sacrificing personal profit for the sake of collective profit (i.e., cooperation) and (ii) deciding to make onesel...
Hypothesis testing is a central statistical method in the biomedical sciences. The ongoing debate about the concept of statistical significance and the reliability of null hypothesis significance tests (NHST) and p-values has brought the advent of various Bayesian hypothesis tests as possible alternatives, which often employ the Bayes factor. However, careful calibration of the prior parameters is necessary for the type I error rates or power of these alternatives to be any be...
#2 Manfred E. Beutel, H-Index: 66
Last. Falk Leichsenring (University of Rostock), H-Index: 52
#1 Amanda R. Carrico (CU: University of Colorado Boulder), H-Index: 18
Behavior change is widely recognized as an important strategy for mitigating and adapting to climate change. Yet, positive or negative behavioral spillover effects—if prevalent—have the potential to render behavioral interventions more or less effective. Behavioral spillover occurs when the adoption of one behavior targeted by an intervention changes the likelihood that an individual will adopt one or more nontargeted behaviors. As spillover is defined as a causal process, methods that isolate c...
2 Citations
#1 Mark D. Packard (UNR: University of Nevada, Reno), H-Index: 5
#2 Per L. Bylund (OSU: Oklahoma State University–Stillwater), H-Index: 3
The aim of this article is to expound the subjectivist position on the concept of ‘rationality.’ To begin, we review the longstanding and still ongoing debate in philosophy over the differences (or not) between the natural and social sciences. While positivism, which supposes no difference between the sciences, has been the tradition whence the economic rationality construct (homo economicus and its modern variants) has derived, a longstanding interpretivist tradition holds th...
#1 Jennifer Marie Logg (Georgetown University), H-Index: 8
#2 Charles Dorison (NU: Northwestern University), H-Index: 2
In the past decade, the social and behavioral sciences underwent a methodological revolution, offering practical prescriptions for improving the replicability and reproducibility of research results. One key to reforming science is a simple and scalable practice: pre-registration. Pre-registration constitutes pre-specifying an analysis plan prior to data collection. A growing chorus of articles discusses the prescriptive, field-wide benefits of pre-registration. To increase ad...
#1 Paz Fortier (McMaster University), H-Index: 2
#2 Louis A. Schmidt (McMaster University), H-Index: 64
We propose three practical suggestions that might be used in tandem to help address enduring replication challenges: (1) a methodology checklist to increase awareness of potentially overlooked variables and scaffold the design of replication attempts, (2) a means of facilitating visual side-by-side comparisons of reference and replication methodologies to increase replication fidelity, and (3) a broader methodology fidelity rating system to capture the nuance associated with direct repl...
#1 Nicola J. Buckland (University of Sheffield), H-Index: 8
#2 Eva Kemps (Flinders University), H-Index: 50
This research aimed to replicate a previous UK-based finding that low craving control predicts increased intake of high energy density foods (HED) during the COVID-19 lockdown, and extend this finding to adults living in Victoria, Australia. The study also assessed whether acceptance coping moderates the relationship between craving control and increased HED food intake, and examined the associations between trait disinhibition, perceived stress and changes to HED food intake. An online survey c...
#1 Jacqueline F. I. Anderson (University of Melbourne), H-Index: 12
#2 Amy S. Jordan (University of Melbourne), H-Index: 47
The literature examining the relationship between sleep disturbance, fatigue, and cognition in premorbidly healthy civilian adults after mTBI is very limited. The current study aimed to investigate the relationships of sleep disturbance and fatigue with cognition while controlling for psychological distress and age. Using a prospective observational design, we assessed 60 premorbidly healthy individuals approximately 8 weeks after mTBI. Participants were assessed with the Pittsburgh Sleep Qualit...
#1 David M. Markowitz (UO: University of Oregon), H-Index: 8
#2 Paul Slovic (UO: University of Oregon), H-Index: 159
Dehumanization is a topic of significant interest for academia and society at large. Empirical studies often have people rate the evolved nature of outgroups and prior work suggests immigrants are common victims of less-than-human treatment. Despite existing work that suggests who dehumanizes particular outgroups and who is often dehumanized, the extant literature knows less about why people dehumanize outgroups such as immigrants. The current work takes up this opportunity by examining why peop...