Inferring from P-values

What is this about?

Inferring from P-values is considered a conventional scientific procedure [1]. However, this statistical method is frequently misused, resulting in the publication of false positive results [2], which is one of the reasons why the American Statistical Association (ASA) released a policy statement on P-values. The statement remains the subject of ongoing discussion in the scientific community.

  1. Kennedy-Shaffer L. Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing. Am Stat. 2019;73(sup1):82-90.
  2. Ioannidis JPA. What Have We (Not) Learnt from Millions of Scientific Papers with P Values. Am Stat. 2019;73(sup1):20-25.

Why is this important?

In the early 20th century, the concept of a P-value was introduced, along with a decision rule stating that if p<0.05, the null hypothesis should be rejected [1]. In other words, when a P-value is less than 0.05, the results are regarded as “statistically significant” [2].
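
As a concrete illustration of this decision rule, the sketch below computes a p-value for a simulated two-sample comparison and applies the conventional 0.05 threshold. It is a minimal example assuming Python with NumPy and SciPy; the t-test, group names and simulated data are illustrative choices, not taken from the sources cited here.

```python
# Minimal sketch of the conventional decision rule described above: compute a
# p-value for a two-sample comparison and compare it to the 0.05 threshold.
# The simulated data and variable names are illustrative, not from the sources.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
group_a = rng.normal(loc=0.0, scale=1.0, size=30)  # simulated "control" measurements
group_b = rng.normal(loc=0.5, scale=1.0, size=30)  # simulated "treatment" measurements

t_stat, p_value = stats.ttest_ind(group_a, group_b)

# The early-20th-century convention: reject the null hypothesis when p < 0.05.
if p_value < 0.05:
    print(f"p = {p_value:.3f}: 'statistically significant' under the 0.05 convention")
else:
    print(f"p = {p_value:.3f}: not 'statistically significant' under the 0.05 convention")
```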

Journals widely encourage inferring from P-values as a basis for publication, which puts researchers under considerable pressure to publish “statistically significant” results [2]. According to recent findings, 96% of abstracts and full-text articles in the biomedical literature from 1990 to 2015 presented p<0.05, a figure considered “too good to be true” and indicative of selective reporting [3].

Developments in decision theory, information theory, mathematical modelling and computing in the second half of the 20th century shed a completely different light on the use of P-values and statistical inference in general [1]. By 2016, mounting criticisms of the use and interpretation of P-values prompted the ASA to publish a policy statement [4].

  1. Kennedy-Shaffer L. Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing. Am Stat. 2019;73(sup1):82-90.
  2. Singh Chawla D. Statistics experts urge scientists to rethink p-value. Spectrum. 2019 March 25. [cited 2020 Aug 18]. Available from: https://www.spectrumnews.org/news/statistics-experts-urge-scientists-rethink-p-value/.
  3. Ioannidis JPA. What Have We (Not) Learnt from Millions of Scientific Papers with P Values. Am Stat. 2019;73(sup1):20-25.
  4. Wasserstein RL, Lazar NA. The ASA Statement on p-Values: Context, Process, and Purpose. Am Stat. 2016;70(2):129-33.

For whom is this important?

What are the best practices?

The ASA statement on P-values gives instructions on the correct use of P-values, with the goal of improving interpretation in quantitative science. The overall conclusion of the ASA is that scientific inferences should not be based exclusively on a P-value threshold, because a P-value by itself does not provide substantial evidence regarding a model or hypothesis, nor does it measure the size of an effect or determine the importance of a result. Researchers should use P-values within a proper context, because otherwise their use can lead to selective reporting [1]. Good scientific inference requires the full and transparent reporting of data and methods [1]. There are other methods that researchers can use with or instead of P-values, which mostly focus on estimation rather than testing; these include confidence, credibility or prediction intervals, Bayesian methods, decision-theoretic modelling and false discovery rates [1].
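
As a hedged illustration of the estimation-focused alternatives mentioned above, the sketch below reports an effect estimate with a 95% confidence interval rather than only a significance verdict. The simulated data, the two-group design and the 95% level are assumptions made for the example, not prescriptions from the ASA statement.

```python
# Hedged sketch of one estimation-focused alternative: report an effect
# estimate with a 95% confidence interval rather than only a p-value.
# The simulated data, group names and 95% level are assumptions for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=2)
control = rng.normal(loc=0.0, scale=1.0, size=40)
treatment = rng.normal(loc=0.4, scale=1.0, size=40)

effect = treatment.mean() - control.mean()                 # point estimate of the effect
se = np.sqrt(control.var(ddof=1) / control.size
             + treatment.var(ddof=1) / treatment.size)     # standard error of the difference
df = control.size + treatment.size - 2                     # simple df; Welch-Satterthwaite would be more precise
t_crit = stats.t.ppf(0.975, df)                            # two-sided 95% critical value

ci_low, ci_high = effect - t_crit * se, effect + t_crit * se
print(f"effect estimate = {effect:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```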

Since its release in 2016, the ASA statement has been cited about 1,700 times and downloaded nearly 300,000 times. In 2017, the ASA organized a symposium on statistical methods, which resulted in 43 articles on the topic of the responsible use of P-values [2]. Statisticians and scientists are currently considering “a world beyond p<0.05” [3], suggesting a wide spectrum of solutions and possibilities. One solution involves changing the P-value threshold for statistical significance from 0.05 to 0.005 [3][4]. By contrast, others argue that reproducibility of results and pre-registration are the best means for preventing selection bias [5]. Others still recommend including more information when reporting P-values, such as the researcher’s confidence in the P-value or their assessment of the likelihood that a statistically significant finding is, in fact, a false positive result [6].
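
The last suggestion can be illustrated with a simple worked calculation: under assumed values for the prior probability that a tested hypothesis is true, the significance threshold and the statistical power, one can estimate what share of “statistically significant” findings would be false positives. The sketch below is a generic Bayesian-style calculation under those assumptions, not necessarily the specific method proposed in [6].

```python
# Worked illustration of estimating how often a "statistically significant"
# finding is a false positive. Generic calculation under assumed inputs; the
# prior, alpha and power values below are assumptions, not from the sources.
def false_positive_share(prior_true, alpha, power):
    """Expected proportion of significant results that are false positives."""
    false_pos = alpha * (1 - prior_true)   # true nulls that still reach p < alpha
    true_pos = power * prior_true          # real effects detected at p < alpha
    return false_pos / (false_pos + true_pos)

# Example: 10% of tested hypotheses are true, alpha = 0.05, power = 0.8.
print(f"{false_positive_share(prior_true=0.10, alpha=0.05, power=0.80):.0%}")
# -> roughly 36% of "significant" findings would be false positives here.
```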

These critiques, initiatives and recommendations call not only for further academic discussion, but also for significant reforms in statistics education [7].

  1. Wasserstein RL, Lazar NA. The ASA Statement on p-Values: Context, Process, and Purpose. Am Stat. 2016;70(2):129-33.
  2. Singh Chawla D. Statistics experts urge scientists to rethink p-value. Spectrum. 2019 March 25. [cited 2020 Aug 18]. Available from: https://www.spectrumnews.org/news/statistics-experts-urge-scientists-rethink-p-value/.
  3. Kennedy-Shaffer L. Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing. Am Stat. 2019;73(sup1):82-90.
  4. Betensky RA. The p-Value Requires Context, Not a Threshold. Am Stat. 2019;73(sup1):115-117.
  5. Ioannidis JPA. What Have We (Not) Learnt from Millions of Scientific Papers with P Values. Am Stat. 2019;73(sup1):20-25.
  6. Halsey LG. The reign of the p-value is over: what alternative analyses could we employ to fill the power vacuum? Biol Lett. 2019;15:1-8.
  7. Hubbard R, Haig BD, Parsa RA. The Limited Role of Formal Statistical Inference in Scientific Inference. Am Stat. 2019;73(sup1):91-98.

In Detail

Part of the problem lies with scholarly journals, which are prone to publishing only positive results. Changes in publishing policies and fees, especially in the era of digital, publicly available databases and journals, could create a climate that favours publishing negative results. Pre-registration of trials and other research can only solve the problem if complete results are published once the study is finished.