Inferring from P-values

From The Embassy of Good Science

What is this about?

Inferring from P-values is a conventional scientific procedure [1]. However, the method is frequently misused, resulting in the publication of false positive results [2]. This misuse is one of the reasons why the American Statistical Association (ASA) released a policy statement on P-values, which remains the subject of ongoing discussion in the scientific community.

  1. Kennedy-Shaffer L. Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing. Am Stat. 2019;73(sup1):82-90.
  2. Ioannidis JPA. What Have We (Not) Learnt from Millions of Scientific Papers with P Values. Am Stat. 2019;73(sup1):20-25.

Why is this important?

In the early 20th century, the concept of the P-value was introduced, along with a decision rule stating that if p < 0.05, the null hypothesis should be rejected [1]. In other words, when a P-value is less than 0.05, the results are regarded as “statistically significant” [2].
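The decision rule above can be sketched in a few lines of Python. This is an illustrative example (the sample numbers are invented, not from the cited sources): a two-sided one-sample z-test, computed with only the standard library, followed by the conventional comparison against 0.05.

```python
import math

def two_sided_p_value(sample_mean, null_mean, std_dev, n):
    """Two-sided p-value for H0: population mean == null_mean,
    assuming the population standard deviation is known."""
    z = (sample_mean - null_mean) / (std_dev / math.sqrt(n))
    # P(|Z| >= |z|) for standard normal Z, via the complementary
    # error function: P(Z >= z) = 0.5 * erfc(z / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical study: n = 100 observations, sample mean 103 against
# a null mean of 100, known standard deviation 15.
p = two_sided_p_value(sample_mean=103.0, null_mean=100.0, std_dev=15.0, n=100)
print(f"p = {p:.4f}")  # p = 0.0455
print("reject H0" if p < 0.05 else "fail to reject H0")
```

Note how brittle the rule is: the same data with a sample mean of 102.9 instead of 103 would cross to the other side of the threshold, which is exactly the kind of dichotomous thinking the ASA statement warns against.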

Journals widely favour inference from P-values when selecting work for publication, which puts researchers under considerable pressure to publish “statistically significant” results [2]. According to recent findings, 96% of abstracts and full-text articles in the biomedical literature from 1990 to 2015 reported p < 0.05, a proportion considered “too good to be true” and indicative of selective reporting [3].

Developments in decision theory, information theory, mathematical modelling and computing in the second half of the 20th century shed a completely different light on the use of P-values and statistical inference in general [1]. By 2016, mounting criticisms of the use and interpretation of P-values prompted the ASA to publish a policy statement [4].

  1. Kennedy-Shaffer L. Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing. Am Stat. 2019;73(sup1):82-90.
  2. Singh Chawla D. Statistics experts urge scientists to rethink p-value. Spectrum. 2019 March 25. [cited 2020 Aug 18]. Available from: https://www.spectrumnews.org/news/statistics-experts-urge-scientists-rethink-p-value/.
  3. Ioannidis JPA. What Have We (Not) Learnt from Millions of Scientific Papers with P Values. Am Stat. 2019;73(sup1):20-25.
  4. Wasserstein RL, Lazar NA. The ASA Statement on p-Values: Context, Process, and Purpose. Am Stat. 2016;70(2):129-33.

For whom is this important?

What are the best practices?

The ASA statement on P-values gives instructions on their correct use, with the goal of improving interpretation in quantitative science. The ASA’s overall conclusion is that scientific inferences should not be based exclusively on a P-value threshold, because a P-value by itself does not provide substantial evidence regarding a model or hypothesis, measure the size of an effect, or determine the importance of a result. Researchers should use P-values within a proper context, because otherwise their use can lead to selective reporting [1]. Good scientific inference requires full and transparent reporting of data and methods [1]. There are other methods that researchers can use alongside or instead of P-values, which mostly focus on estimation rather than testing. These include confidence, credibility or prediction intervals, Bayesian methods, decision-theoretic modelling and false discovery rates [1].
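One of the alternatives listed above, the false discovery rate, is commonly controlled with the Benjamini–Hochberg step-up procedure. The sketch below is a standalone illustration of that procedure (the p-values are invented for the example), not code from the ASA statement:

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of hypotheses rejected by the
    Benjamini-Hochberg procedure at the given FDR level."""
    m = len(p_values)
    # Sort p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * fdr.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])

# Six hypothetical tests: only the first two survive FDR control,
# even though four of them are below the naive 0.05 threshold.
ps = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
print(benjamini_hochberg(ps))  # [0, 1]
```

The example shows the practical difference between the two mindsets: a fixed p < 0.05 cut-off would declare four “discoveries” here, while controlling the expected proportion of false discoveries retains only two.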

Since its release in 2016, the ASA statement has been cited about 1,700 times and downloaded nearly 300,000 times. In 2017, the ASA organized a symposium on statistical methods, which resulted in 43 articles on the responsible use of P-values [2]. Statisticians and scientists are currently considering “a world beyond p < 0.05” [3], suggesting a wide spectrum of solutions and possibilities. One proposal is to lower the P-value threshold for statistical significance from 0.05 to 0.005 [3][4]. By contrast, others argue that reproducibility of results and pre-registration are the best means of preventing selection bias [5]. Others still recommend including more information when reporting P-values, such as the researcher’s confidence in the P-value or their assessment of the likelihood that a statistically significant finding is, in fact, a false positive [6].
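The effect of lowering the threshold can be made concrete with a small simulation. This is an illustrative sketch (not taken from the cited papers): when the null hypothesis is true, roughly 5% of tests fall below 0.05 by chance alone, and roughly 0.5% fall below 0.005.

```python
import math
import random

random.seed(42)

def null_p_value(n=50):
    """p-value of a two-sided z-test on n draws from the null N(0, 1)."""
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    # Standard error of the mean is 1 / sqrt(n) here.
    z = (sum(sample) / n) / (1.0 / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

# Simulate many experiments in which there is truly no effect.
trials = 20_000
ps = [null_p_value() for _ in range(trials)]
for alpha in (0.05, 0.005):
    rate = sum(p < alpha for p in ps) / trials
    print(f"false positive rate at alpha={alpha}: {rate:.4f}")
```

The tighter threshold cuts the false positive rate by roughly a factor of ten, which is the appeal of the 0.005 proposal; critics reply that it also reduces power and does nothing about selective reporting, hence the competing emphasis on pre-registration and fuller reporting.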

Critiques, initiatives and recommendations require not only further academic discussion, but also significant educational reforms in statistics [7].

  1. Wasserstein RL, Lazar NA. The ASA Statement on p-Values: Context, Process, and Purpose. Am Stat. 2016;70(2):129-33.
  2. Singh Chawla D. Statistics experts urge scientists to rethink p-value. Spectrum. 2019 March 25. [cited 2020 Aug 18]. Available from: https://www.spectrumnews.org/news/statistics-experts-urge-scientists-rethink-p-value/
  3. Kennedy-Shaffer L. Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing. Am Stat. 2019;73(sup1):82-90.
  4. Betensky RA. The p-Value Requires Context, Not a Threshold. Am Stat. 2019;73(sup1):115-117.
  5. Ioannidis JPA. What Have We (Not) Learnt from Millions of Scientific Papers with P Values. Am Stat. 2019;73(sup1):20-25.
  6. Halsey LG. The reign of the p-value is over: what alternative analyses could we employ to fill the power vacuum? Biol Lett. 2019;15:1-8.
  7. Hubbard R, Haig BD, Parsa RA. The Limited Role of Formal Statistical Inference in Scientific Inference. Am Stat. 2019;73(sup1):91-98.