Hypothesizing after the results are known (HARKing)

From The Embassy of Good Science

What is this about?

Hypothesizing after the results are known (HARKing) refers to the practice of presenting unexpected findings as a priori hypotheses or failing to report empirically unsupported hypotheses that were derived a priori and guided the research. In other words, research reports suffer from HARKing if they include one or more post hoc hypotheses (that is, hypotheses developed after the results of the data analysis are known) that are misrepresented as a priori (that is, as developed prior to the data analysis) or if they exclude one or more a priori hypotheses that were empirically disconfirmed. Consequently, HARKed reports misrepresent the ratio of empirically confirmed to disconfirmed a priori hypotheses by elevating exploratory findings to a priori expectations and suppressing a priori expectations unsupported by the data at hand. HARKing thus deceives readers by misportraying the research process: it falsely describes hypothesis-generating exploratory research as hypothesis-testing confirmatory research, or it fails to report hypotheses that could not be corroborated.

This theme page describes the practice of HARKing and its detrimental consequences for research in more depth, briefly explains how initiatives such as preregistration aim to reduce HARKing, and differentiates pure HARKing from transparent forms of HARKing that are not necessarily detrimental to the research endeavor.

Why is this important?

In recent years, several academic disciplines, perhaps most strongly psychology, have witnessed a replication crisis that casts doubt on the validity of seemingly well-established findings and theories. The prevalence of HARKing, evidenced by empirical studies of research misconduct and research misbehavior, is considered one of several driving factors behind the replication crisis. There are two reasons for this. First, data that were used to generate a hypothesis cannot be used to test that same hypothesis in any meaningful way. Portraying an empirically inspired post hoc hypothesis as a priori violates the falsification principle that is crucial for hypothesis-driven (that is, confirmatory) empirical research. Second, suppressing unsupported a priori hypotheses, and thereby failing to report null effects, throws away opportunities to cast doubt on the validity of hypotheses derived from extant knowledge. Consequently, when researchers HARK, multiple disconfirmations of a hypothesis may go unnoticed because they are never reported.
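The first reason can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not an analysis taken from any of the studies cited on this page: every outcome is pure noise (no true effect exists), each simulated study measures ten outcomes in two groups of 30, and a z-test with known unit variance stands in for whatever test a real study would use. The function name simulate_harking and all parameter values are hypothetical. A researcher who tests only the first, pre-specified outcome should obtain a false positive in about 5% of studies, whereas a researcher who picks whichever outcome looks best and presents it as the a priori hypothesis should obtain one in roughly 1 - 0.95^10, or about 40%, of studies.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_harking(n_studies=2000, n_outcomes=10, n_per_group=30,
                     alpha=0.05, seed=0):
    """Return (preregistered_rate, harked_rate) of false positives.

    All data are drawn from the same distribution, so every
    "significant" result is a Type I error by construction.
    """
    rng = random.Random(seed)
    # Standard error of a difference in means with known unit variance.
    se = math.sqrt(2 / n_per_group)
    prereg_hits = 0
    harked_hits = 0
    for _ in range(n_studies):
        p_values = []
        for _ in range(n_outcomes):
            treated = [rng.gauss(0, 1) for _ in range(n_per_group)]
            control = [rng.gauss(0, 1) for _ in range(n_per_group)]
            diff = sum(treated) / n_per_group - sum(control) / n_per_group
            p_values.append(two_sided_p(diff / se))
        # Confirmatory: only the outcome chosen before seeing the data counts.
        if p_values[0] < alpha:
            prereg_hits += 1
        # HARKing: the best-looking outcome is reported as if it were a priori.
        if min(p_values) < alpha:
            harked_hits += 1
    return prereg_hits / n_studies, harked_hits / n_studies


prereg_rate, harked_rate = simulate_harking()
print(f"pre-specified outcome: {prereg_rate:.3f}")
print(f"best outcome, HARKed:  {harked_rate:.3f}")
```

The two printed proportions differ only in when the hypothesis was chosen; the data and the test are identical, which is why the same data cannot both suggest and confirm a hypothesis.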

While pure HARKing is unquestionably a detrimental research practice, the same is not necessarily true for transparent forms of HARKing. Transparent HARKing (THARKing) occurs when researchers develop post hoc hypotheses (that is, hypothesize after the results are known), but do so transparently and based on theory. If done transparently and inspired not only by results but also by theory, post hoc hypothesizing does not misportray the research process because exploratory findings are clearly labeled as such.

What are the best practices?

On the systemic level, HARKing can be prevented by changing researcher assessment and promoting the preregistration of studies, ideally in a form involving reviewed preregistration with guaranteed publication if the accepted protocol is followed.

Individual researchers should make post hoc hypotheses transparent. Doing so avoids deceiving readers and allows researchers to reap the benefits of exploratory studies without misrepresenting them as following a hypothetico-deductive model.

In Detail

The term HARKing was coined in a seminal article by Kerr[1] and is usually used synonymously with accommodational hypothesizing[2] and presenting post hoc hypotheses as a priori (PPHA).[3] Kerr identified twelve potential costs of HARKing:

1. Translating Type I errors into hard-to-eradicate theory.

2. Propounding theories that cannot (pending replication) pass Popper’s disconfirmability test.

3. Disguising post hoc explanations as a priori explanations (when the former tend also to be more ad hoc, and consequently, less useful).

4. Not communicating valuable information about what did not work.

5. Taking unjustified statistical licence.

6. Presenting an inaccurate model of science to students.

7. Encouraging “fudging” in other grey areas.

8. Making us less receptive to serendipitous findings.

9. Encouraging adoption of narrow, context-bound new theory.

10. Encouraging retention of too-broad, disconfirmable old theory.

11. Inhibiting identification of plausible alternative hypotheses.

12. Implicitly violating basic ethical principles.

While Kerr’s article was initially not widely cited, this changed in the wake of the replication crisis and of empirical studies into the prevalence and drivers of detrimental research practices and research misconduct. Worryingly, the surge of interest in HARKing showed that it is indeed a rather prevalent practice. Various studies on the prevalence of detrimental research practices found that a sizeable proportion of researchers (up to 58% in one study) from different disciplines (most notably psychology) have engaged in HARKing in the past.[4]

To identify measures to reduce HARKing, it is necessary to understand its causes. A key driving factor of HARKing is most likely publication bias: it is much more difficult to publish negative findings than positive findings, and confirmatory research that seemingly follows a hypothetico-deductive model is generally valued more highly than exploratory research, at least in most fields of research. At the same time, the number of publications is still one of the most important metrics commonly used in researcher evaluation. As a result, researchers have an incentive to publish as much as possible, while the publication system rewards analyses that (seemingly) yield positive findings derived from hypothesis-testing research.

One pathway to reduce HARKing is thus to change the incentives for researchers by, for example, evaluating the quality rather than the quantity of publications and recognizing the value of replication studies. The latter would also be facilitated by a comprehensive move towards open science and a recognition of the value of open science practices. Another pathway to reduce HARKing is preregistration, because it helps tie the hands of researchers before the data analysis. If researchers decide to preregister a study, they submit a time-stamped paper describing the rationale of their study, the experimental and analytical methods they will use, and their hypotheses. This document cannot be changed at a later stage, so HARKing would be easily detectable: it would show up as inconsistencies between the registered document and the line of argument in the published report. If the preregistered study is reviewed, publication is guaranteed if the registered protocol is followed, regardless of the results. Consequently, preregistration and changes in the incentive system are potentially mutually reinforcing.

However, it is worth noting that it is in principle possible to preregister studies after the results are known (PARKing) and thereby reap the reputational benefits coming with what seems to be a commitment to methodological rigor without actually following the practice.[5]
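The tamper-evidence idea behind preregistration can be sketched in a few lines. This is an illustration only, under the assumption that a preregistration is stored as a simple dictionary; real registries have their own formats and time-stamping procedures, and none of the names below (fingerprint, compare_hypotheses, the example hypotheses) come from an actual registry. A cryptographic hash makes later changes to the registered document detectable, and comparing the registered hypotheses with the reported ones flags both undisclosed additions and silent omissions. Such a check says nothing about when the data were analyzed, so it cannot detect PARKing.

```python
import hashlib
import json


def fingerprint(prereg):
    """Return a SHA-256 hex digest of a preregistration dictionary.

    Keys are sorted so that the digest does not depend on key order.
    """
    canonical = json.dumps(prereg, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_hypotheses(preregistered, reported):
    """List hypotheses added after registration and hypotheses dropped."""
    pre, rep = set(preregistered), set(reported)
    return {
        "added_post_hoc": sorted(rep - pre),  # candidates for undisclosed HARKing
        "dropped": sorted(pre - rep),         # a priori hypotheses left unreported
    }


# Hypothetical preregistration, invented for this example.
prereg = {
    "hypotheses": ["H1: drug reduces symptoms", "H2: effect grows with dose"],
    "analysis": "two-group comparison, alpha = 0.05",
}
registered_digest = fingerprint(prereg)

# Hypotheses as they appear in the (hypothetical) published report.
reported = [
    "H1: drug reduces symptoms",
    "H3: effect is moderated by age and gender",
]
report = compare_hypotheses(prereg["hypotheses"], reported)
print(registered_digest)
print(report)
```

In this example H3 would need to be labeled as post hoc and H2 would need to be reported as unsupported for the article to be transparent.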

Although pure HARKing is unquestionably a detrimental research practice because it misportrays the research process, tends to bias results and ultimately deceives readers, the same cannot necessarily be said about other forms of post hoc hypothesizing. Using the fictional example of a group of epidemiologists conducting a drug trial to cure a new life-threatening disease, Hollenbeck and Wright argue that HARKing is not detrimental to science if it is done transparently and informed by theory, a practice they call THARKing (transparently hypothesizing after the results are known).[6] In their example, the epidemiologists initially find no effect of the tested drug, but know of cases where it apparently worked. Discussing these cases, they recognize that all cured patients they know of are female, yet a reanalysis of the data by gender does not yield a statistically significant result, even though the effect size for women is larger than for men. They continue discussing whether gender could be an important factor and, drawing on their implicit theoretical knowledge, develop the hypothesis that estrogen levels (which peak at certain ages) might be a crucial moderating variable. A reanalysis of their data corroborates their hypothesis. They publish an article summarizing their study, noting in the discussion section that the age-by-gender interaction was the result of an exploratory analysis conducted after the main effects turned out not to be statistically significant. Other research teams replicate their study, and eventually a meta-analysis confirms their findings. Hollenbeck and Wright argue that THARKing, unlike secretly hypothesizing after the results are known (SHARKing, or pure HARKing), is justifiable if readers are transparently informed in the discussion section of an article that a hypothesis is post hoc rather than a priori (in their view, the introduction should include only a priori hypotheses).

In general, pure HARKing is a detrimental research practice and hampers scientific progress. It can be disincentivized by changes in the research system, such as changes in researcher assessment and increasing preregistration of studies. Transparent post hoc hypothesizing, by contrast, seems justifiable if the exploratory nature of results is clearly stated.


References

[1] Kerr, N. (1998). HARKing: Hypothesizing After the Results are Known. Personality and Social Psychology Review, 2(3), 196-217. doi:10.1207/s15327957pspr0203_4

[2] Hitchcock, C., & Sober, E. (2004). Prediction versus Accommodation and the Risk of Overfitting. The British Journal for the Philosophy of Science, 55(1), 1–34. http://www.jstor.org/stable/3541832

[3] Leung, K. (2011). Presenting Post Hoc Hypotheses as A Priori: Ethical and Theoretical Issues. Management and Organization Review, 7(3), 471-479. doi: 10.1111/j.1740-8784.2011

[4] An overview of different studies on the prevalence of HARKing can be found in Table 1 in Rubin, M. (2017). When does HARKing hurt? Identifying when different types of undisclosed post hoc hypothesizing harm scientific progress. Review of General Psychology, 21, 308-320. doi: 10.1037/gpr0000128

[5] Yamada, Y. (2018). How to Crack Pre-registration: Toward Transparent and Open Science. Frontiers in Psychology, 9:1831. doi: 10.3389/fpsyg.2018.01831

[6] Hollenbeck, J. R., & Wright, P. M. (2017). Harking, Sharking, and Tharking: Making the Case for Post Hoc Analysis of Scientific Data. Journal of Management, 43(1), 5-18. doi: 10.1177/0149206316679487