P-value hacking

From The Embassy of Good Science

What is this about?

P-value hacking, also known as data dredging, data fishing, data snooping or data butchery, is the misuse of data analysis to find patterns that can be presented as statistically significant when in reality there is no underlying effect.[1][2] In other words, p-hacking means running statistical tests on a data set until a statistically significant result appears. It can be done in several ways, for example by stopping data collection as soon as P<0.05 is reached, by analyzing many outcomes but reporting only those with P<0.05, or by adding covariates or excluding participants after seeing how these choices affect the result.

  1. Head ML, Holman L, Lanfear R, Kahn AT, Jennions MD. The extent and consequences of p-hacking in science. PLoS Biol. 2015;13(3):e1002106.
  2. Norman G. Data dredging, salami-slicing, and other successful strategies to ensure rejection: twelve tips on how to not get your paper published. Adv Health Sci Educ Theory Pract. 2014 Mar;19(1):1-5. doi: 10.1007/s10459-014-9494-8.
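
The effect of analyzing many outcomes and reporting only the significant ones can be illustrated with a small simulation. The following is a minimal sketch in Python, not taken from the cited sources. It assumes two groups drawn from the same normal distribution (so there is no true effect), independent outcomes, a z-test with a known standard deviation of 1, and hypothetical function names and sample sizes chosen only for illustration.

```python
import random
from statistics import NormalDist, mean


def two_sided_p(group_a, group_b, sd=1.0):
    """Two-sample z-test p-value, assuming a known common standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    z = (mean(group_a) - mean(group_b)) / (sd * (1 / n_a + 1 / n_b) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_outcomes, n_studies=2000, n_per_group=20,
                        alpha=0.05, seed=0):
    """Share of null studies that find at least one outcome with p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = []
        for _ in range(n_outcomes):
            # Both groups come from the same distribution: no real effect.
            a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(two_sided_p(a, b))
        # The "p-hacked" study reports only its smallest p-value.
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


one_outcome = false_positive_rate(1)
ten_outcomes = false_positive_rate(10)
print(f"1 outcome tested:   {one_outcome:.2f} of null studies look significant")
print(f"10 outcomes tested: {ten_outcomes:.2f} of null studies look significant")
```

With a single pre-specified outcome, the share of false positives should stay close to the nominal 5%. With ten independent outcomes and selective reporting, theory gives 1 - 0.95^10, or roughly 40%, and the simulated share should land near that value.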

Why is this important?

Unfortunately, journals that are considered high quality (those with high impact factors) predominantly publish statistically significant results. Researchers want to publish in such journals because doing so matters for their academic prestige and careers.[1] This creates pressure that can lead to P-value hacking. P-value hacking produces false positive results, which can get published and then have a negative impact on future research in the field, on secondary research such as systematic reviews, and on human knowledge in general.[2]

  1. Ekmekci PE. An increasing problem in publication ethics: Publication bias and editors' role in avoiding it. Med Health Care Philos. 2017;20(2):171-8.
  2. Dwan K, Altman DG, Arnaiz JA, Bloom J, Chan AW, Cronin E, et al. Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PLoS One. 2008;3(8):e3081.

For whom is this important?

What are the best practices?

It’s difficult to address the issue of P-value hacking, especially since there are few incentives to replicate research. However, some steps can be taken to prevent it. Cross-validation, or out-of-sample testing, is a statistical method in which the data are split into two sets. The first set is used for exploratory analysis, to develop new models or hypotheses, and the other, independent set is then used to verify them.[1] A number of statistical methods are also available to control the inflated false positive rate that comes from running many tests, such as the Bonferroni correction, Scheffé's method and false discovery rate control. Many journals now ask for raw data to be published, or are moving to the registered report format. That is a publication process in which journals accept papers based on theoretical justification and methodology only, before the results are known.[2]

  1. Berk R, Brown L, Zhao L. Statistical Inference After Model Selection. Journal of Quantitative Criminology. 2010;26(2):217-36.
  2. Simons DJ, Holcombe AO, Spellman BA. An Introduction to Registered Replication Reports at Perspectives on Psychological Science. Perspect Psychol Sci. 2014;9(5):552-5.
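
Two of the corrections mentioned above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the p-values are made up for the example, and the false discovery rate is controlled here with the Benjamini-Hochberg procedure, which is one common choice and is not named in the cited sources.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for false discovery rate control."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


# Made-up p-values from five hypothetical tests on the same data set.
p_values = [0.035, 0.001, 0.2, 0.012, 0.045]

uncorrected = [p < 0.05 for p in p_values]
print("uncorrected:       ", uncorrected)
print("Bonferroni:        ", bonferroni(p_values))
print("Benjamini-Hochberg:", benjamini_hochberg(p_values))
```

Without correction, four of the five tests fall below 0.05. Bonferroni keeps only the smallest p-value, and Benjamini-Hochberg keeps the two smallest. Bonferroni guards against any false positive at all and is therefore stricter, while false discovery rate control tolerates a small expected share of false positives among the reported findings.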
