Two examples of original and replication study pairs from the Reproducibility Project: Cancer Biology (Errington et al., 2021) that meet the non-significance replication success criterion. Shown are standardized mean difference effect estimates with 95% confidence intervals, sample sizes n, and two-sided p-values p for the null hypothesis that the effect is absent. The effect estimate, 95% confidence interval, and p-value (pMA) from a fixed-effect meta-analysis of the original and replication studies are shown in gray.
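
The pooled quantities shown in gray can be illustrated with a minimal sketch of an inverse-variance fixed-effect meta-analysis. It assumes that each study supplies an effect estimate and a standard error and that a normal approximation is adequate; the function name `fixed_effect_meta` and the input values are hypothetical and are not taken from the RPCB data.

```python
from math import sqrt
from statistics import NormalDist


def fixed_effect_meta(estimates, std_errors, level=0.95):
    """Inverse-variance fixed-effect pooling (normal approximation assumed)."""
    weights = [1.0 / se**2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total
    pooled_se = sqrt(1.0 / total)
    # Two-sided confidence interval at the requested level
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    ci = (pooled - z_crit * pooled_se, pooled + z_crit * pooled_se)
    # Two-sided p-value for the null hypothesis of no effect
    p_value = 2 * (1 - NormalDist().cdf(abs(pooled / pooled_se)))
    return pooled, pooled_se, ci, p_value


# Hypothetical original and replication estimates on the SMD scale
pooled, pooled_se, ci, p_ma = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
print(f"pooled = {pooled:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), pMA = {p_ma:.4f}")
```

With equal standard errors the pooled estimate is the simple average of the two estimates, and the pooled standard error shrinks by a factor of the square root of two.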

Null hypothesis (H0) and alternative hypothesis (H1) for superiority and equivalence tests (with equivalence margin Δ > 0).

Effect estimates on standardized mean difference (SMD) scale with 90% confidence interval for the 15 “null results” and their replication studies from the Reproducibility Project: Cancer Biology (Errington et al., 2021). The title above each plot indicates the original paper, experiment and effect numbers. Two original effect estimates from original paper 48 were statistically significant at p < 0.05, but were interpreted as null results by the original authors and therefore treated as null results by the RPCB. The two examples from Figure 1 are indicated in the plot titles. The dashed gray line represents the value of no effect (SMD = 0), while the dotted red lines represent the equivalence range with a margin of Δ = 0.74, classified as “liberal” by Wellek (2010, Table 1.1). The p-value pTOST is the maximum of the two one-sided p-values for the null hypotheses of the effect being greater/less than +Δ and −Δ, respectively. The Bayes factor BF01 quantifies the evidence for the null hypothesis H0 : SMD = 0 against the alternative H1 : SMD ≠ 0 with normal unit-information prior assigned to the SMD under H1.
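
The two quantities reported in each panel can be sketched as follows, assuming a normal approximation for the effect estimate with known standard error. The function names, the example inputs, and the prior standard deviation of 2 used in the call are illustrative assumptions, not values read from the figure.

```python
from math import sqrt
from statistics import NormalDist


def p_tost(estimate, std_error, margin):
    """Maximum of the two one-sided p-values of the TOST procedure."""
    std_normal = NormalDist()
    # H0: effect >= +margin, rejected when the estimate lies well below +margin
    p_upper = std_normal.cdf((estimate - margin) / std_error)
    # H0: effect <= -margin, rejected when the estimate lies well above -margin
    p_lower = 1 - std_normal.cdf((estimate + margin) / std_error)
    return max(p_upper, p_lower)


def bf01(estimate, std_error, prior_sd):
    """Bayes factor for H0: effect = 0 against H1: effect ~ N(0, prior_sd^2)."""
    # Marginal density of the estimate under each hypothesis
    density_h0 = NormalDist(0.0, std_error).pdf(estimate)
    density_h1 = NormalDist(0.0, sqrt(std_error**2 + prior_sd**2)).pdf(estimate)
    return density_h0 / density_h1


# Hypothetical study: SMD estimate 0.3 with standard error 0.2
print(p_tost(0.3, 0.2, margin=0.74))
print(bf01(0.3, 0.2, prior_sd=2.0))  # prior_sd = 2 is an assumed value
```

A small pTOST indicates that the estimate is compatible only with effects inside the equivalence range, while a BF01 above 1 indicates that the data favor the null hypothesis over the alternative.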

Number of successful replications of original null results in the RPCB as a function of the margin Δ of the equivalence test (pTOST ≤ α in both studies for α = 0.1, 0.05, 0.01) or the standard deviation of the zero-mean normal prior distribution for the SMD effect size under the alternative H1 of the Bayes factor test (BF01 ≥ γ in both studies for γ = 3, 6, 10).
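
A sensitivity analysis of this kind can be sketched by recomputing the equivalence-test criterion over a grid of margins. The sketch below assumes normal approximations and counts a pair as successful when the TOST p-value does not exceed α in both the original and the replication study; the study pairs are hypothetical and do not reproduce the RPCB results.

```python
from statistics import NormalDist


def p_tost(estimate, std_error, margin):
    """Maximum of the two one-sided TOST p-values (normal approximation)."""
    std_normal = NormalDist()
    p_upper = std_normal.cdf((estimate - margin) / std_error)
    p_lower = 1 - std_normal.cdf((estimate + margin) / std_error)
    return max(p_upper, p_lower)


def count_successes(pairs, margin, alpha):
    """Count pairs with p_tost <= alpha in both original and replication."""
    count = 0
    for est_orig, se_orig, est_rep, se_rep in pairs:
        original_ok = p_tost(est_orig, se_orig, margin) <= alpha
        replication_ok = p_tost(est_rep, se_rep, margin) <= alpha
        if original_ok and replication_ok:
            count += 1
    return count


# Hypothetical pairs: (original estimate, original SE, replication estimate, replication SE)
pairs = [(0.0, 0.1, 0.0, 0.1), (0.5, 0.3, 0.1, 0.3)]
for margin in (0.1, 0.74, 2.0):
    print(margin, count_successes(pairs, margin, alpha=0.05))
```

The count cannot decrease as the margin widens, since a wider equivalence range makes both one-sided null hypotheses easier to reject. The Bayes factor criterion can be examined the same way by varying the prior standard deviation instead of the margin.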

Effect estimates on Fisher z-transformed correlation scale with 90% confidence interval for the “null results” and their replication studies from the Reproducibility Project: Psychology (RPP, Open Science Collaboration, 2015) and the Experimental Philosophy Replicability Project (EPRP, Cova et al., 2018). The dashed gray line represents the value of no effect (z = 0), while the dotted red lines represent the equivalence range with a margin of Δ = 0.74. The p-value pTOST is the maximum of the two one-sided p-values for the null hypotheses of the effect being greater/less than +Δ and −Δ, respectively. The Bayes factor BF01 quantifies the evidence for the null hypothesis H0 : z = 0 against the alternative H1 : z ≠ 0 with normal prior centered around zero and standard deviation of 2 assigned to the effect size under H1.

Sensitivity analyses for the “null results” and their replication studies from the Reproducibility Project: Psychology (RPP, Open Science Collaboration, 2015) and the Experimental Philosophy Replicability Project (EPRP, Cova et al., 2018). The Bayes factor of the replication of Ranganath and Nosek (2008) decreases very quickly and is only shown for a limited range.