Inferring from P-values

What is this about?

Inferring from P-values is a conventional scientific procedure ^[1]. However, the method is frequently misused, which leads to the publication of false positive results ^[2]. This misuse is one of the reasons why the American Statistical Association (ASA) released a policy statement on P-values. The scientific community continues to debate that statement.

Why is this important?

In the early 20th century, the concept of a P-value was introduced, along with a decision rule stating that the null hypothesis should be rejected if p<0.05 ^[3]. In other words, results with a P-value below 0.05 are regarded as “statistically significant” ^[4].
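The decision rule can be sketched in a few lines. A P-value is the probability, assuming the null hypothesis is true, of observing data at least as extreme as the data actually observed. The example below is a minimal sketch under stated assumptions: it uses a two-sided z-test on normally distributed data with a known standard deviation, a simplification chosen here for illustration and not a method taken from the cited sources. The function names and simulation settings are hypothetical.

```python
import random
from statistics import NormalDist, mean


def two_sided_p_value(sample, mu0=0.0, sigma=1.0):
    """P-value of a two-sided z-test of H0: mean == mu0, with known sigma."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def is_significant(p, alpha=0.05):
    """The conventional decision rule: reject H0 when p < alpha."""
    return p < alpha


def false_positive_rate(n_experiments=2000, n=30, alpha=0.05, seed=0):
    """Share of simulated experiments declared significant when H0 is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Every sample is drawn with true mean 0, so H0 holds in each experiment.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if is_significant(two_sided_p_value(sample), alpha):
            hits += 1
    return hits / n_experiments


if __name__ == "__main__":
    print(false_positive_rate())
```

Under these assumptions, roughly one in twenty experiments crosses the threshold even though no real effect exists, which is what the 0.05 cut-off implies by construction.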

Journals widely encourage inference from P-values in published work, which puts researchers under considerable pressure to produce “statistically significant” results ^[4]. According to recent findings, 96% of abstracts and full-text articles in the biomedical literature from 1990 to 2015 presented p<0.05, a proportion that is considered “too good to be true” and that points to selective reporting ^[5].
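One way selective reporting can inflate the share of significant results is by running several tests and reporting only those that cross the threshold. The sketch below is an illustration under assumptions that do not come from the cited sources: all tests are independent and every null hypothesis is true. The function name is hypothetical.

```python
def chance_of_any_significant(n_tests, alpha=0.05):
    """Probability that at least one of n_tests independent tests gives p < alpha
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for n_tests in (1, 5, 20):
        print(n_tests, round(chance_of_any_significant(n_tests), 3))
```

Under these assumptions, twenty independent tests give about a 64% chance of at least one “significant” result, even though nothing real is being detected.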

Developments in decision theory, information theory, mathematical modelling and computing in the second half of the 20th century shed a completely different light on the use of P-values and on statistical inference in general ^[3]. By 2016, mounting criticism of the use and interpretation of P-values prompted the ASA to publish a policy statement ^[6].

For whom is this important?

Researchers, PhD students, universities, journals, policy makers

What are the best practices?

The ASA statement on P-values gives guidance on their correct use, with the goal of improving interpretation in quantitative science. Its overall conclusion is that scientific inferences should not be based exclusively on whether a P-value passes a threshold. A P-value by itself does not provide substantial evidence regarding a model or hypothesis, does not measure the size of an effect, and does not determine the importance of a result. Researchers should interpret P-values within a proper context, because using them without that context can lead to selective reporting ^[3]. Good scientific inference requires the full and transparent reporting of data and methods ^[3]. Researchers can use other methods with or instead of P-values, most of which focus on estimation as opposed to testing. These include confidence, credibility or prediction intervals, Bayesian methods, decision-theoretic modeling and false discovery rates ^[3].
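Two of the alternatives listed above can be illustrated briefly: reporting an estimate with a confidence interval and an effect size, and controlling the false discovery rate across many tests. The following is a minimal sketch, not a recommended analysis pipeline. It assumes a normal approximation for the interval (reasonable only for larger samples), uses Cohen's d as the effect size, and uses the Benjamini-Hochberg procedure for the false discovery rate. These specific choices, the function names and the example P-values are illustrative assumptions, not taken from the ASA statement.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def summarize_difference(a, b, level=0.95):
    """Return (difference in means, confidence interval, Cohen's d) for two samples."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)          # sample variances (n - 1 denominator)
    diff = mean(a) - mean(b)
    se = sqrt(va / na + vb / nb)               # standard error of the difference
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    ci = (diff - z * se, diff + z * se)        # normal-approximation interval
    pooled_sd = sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return diff, ci, diff / pooled_sd


def benjamini_hochberg(p_values, q=0.05):
    """Flag which hypotheses are rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose P-value satisfies p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            max_rank = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        rejected[i] = rank <= max_rank
    return rejected


if __name__ == "__main__":
    # Hypothetical P-values from ten tests.
    p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(sum(p < 0.05 for p in p_values))    # count below the plain threshold
    print(sum(benjamini_hochberg(p_values)))  # count kept after FDR control
```

In this hypothetical set, five P-values fall below 0.05, but only two survive false discovery rate control, which shows how the same numbers can support different conclusions depending on the context in which they are read.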

Since its release in 2016, the ASA statement has been cited about 1,700 times and downloaded nearly 300,000 times. In 2017, the ASA organized a symposium on statistical methods, which resulted in 43 articles on the responsible use of P-values ^[7]. Statisticians and scientists are currently considering “a world beyond p<0.05” ^[4] and have suggested a wide spectrum of solutions. One proposal is to lower the P-value threshold for statistical significance from 0.05 to 0.005 ^[4] ^[8]. By contrast, others argue that reproducibility of results and pre-registration are the best means of preventing selection bias ^[9]. Others still recommend including more information when reporting P-values, such as the researcher’s confidence in the P-value or their assessment of the likelihood that a statistically significant finding is, in fact, a false positive result ^[10].
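The likelihood that a significant finding is a false positive depends on more than the threshold. A rough sketch of the arithmetic follows, assuming a simple two-outcome model in which a tested hypothesis is either true or false. The prior probability of a real effect (0.1) and the statistical power (0.8) are hypothetical values chosen for illustration, and holding power fixed while lowering the threshold would in practice require larger samples.

```python
def false_positive_risk(alpha, power, prior):
    """Share of significant results that are false positives.

    alpha: significance threshold (chance of a significant result under H0)
    power: chance of a significant result when the effect is real
    prior: assumed share of tested hypotheses where the effect is real
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    for alpha in (0.05, 0.005):
        print(alpha, round(false_positive_risk(alpha, power=0.8, prior=0.1), 3))
```

With these assumed inputs, about 36% of significant findings would be false positives at a threshold of 0.05, against about 5% at 0.005. Different priors or power give different figures, which is one reason some authors prefer to report such an assessment alongside the P-value.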

Critiques, initiatives and recommendations require not only further academic discussion, but also significant educational reforms in statistics ^[11].

Natalie Evans, Andrijana Perković Paloš, Fran Šaler contributed to this theme. Latest contribution was Feb 09, 2023

[1] Kennedy-Shaffer L. Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing. Am Stat. 2019;73(sup1):82-90.
[2] Ioannidis JPA. What Have We (Not) Learnt from Millions of Scientific Papers with P Values? Am Stat. 2019;73(sup1):20-25.
[3] Kennedy-Shaffer L. Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing. Am Stat. 2019;73(sup1):82-90.
[4] Singh Chawla D. Statistics experts urge scientists to rethink p-value. Spectrum. 2019 March 25. [cited 2020 Aug 18]. Available from: https://www.spectrumnews.org/news/statistics-experts-urge-scientists-rethink-p-value/
[5] Ioannidis JPA. What Have We (Not) Learnt from Millions of Scientific Papers with P Values? Am Stat. 2019;73(sup1):20-25.
[6] Wasserstein RL, Lazar NA. The ASA Statement on p-Values: Context, Process, and Purpose. Am Stat. 2016;70(2):129-33.
[7] Singh Chawla D. Statistics experts urge scientists to rethink p-value. Spectrum. 2019 March 25. [cited 2020 Aug 18]. Available from: https://www.spectrumnews.org/news/statistics-experts-urge-scientists-rethink-p-value/
[8] Betensky RA. The p-Value Requires Context, Not a Threshold. Am Stat. 2019;73(sup1):115-117.
[9] Ioannidis JPA. What Have We (Not) Learnt from Millions of Scientific Papers with P Values? Am Stat. 2019;73(sup1):20-25.
[10] Halsey LG. The reign of the p-value is over: what alternative analyses could we employ to fill the power vacuum? Biol Lett. 2019;15:1-8.
[11] Hubbard R, Haig BD, Parsa RA. The Limited Role of Formal Statistical Inference in Scientific Inference. Am Stat. 2019;73(sup1):91-98.
