Improving reproducibility in science
What is this about?
Within iRISE, our work focuses on stakeholder engagement and the prioritisation of interventions to improve reproducibility. Building on a comprehensive scoping review (WP2), we used a structured Delphi consultation to reach cross-disciplinary consensus on which practices and tools should be adopted directly and which require adaptation before implementation. The Delphi method involves iterative rounds of expert consultation in which participants review anonymised group feedback and refine their responses until consensus is reached. This approach ensures transparency, inclusivity, and community alignment in setting priorities.

Here on The Embassy of Good Science, we share the outputs of that Delphi process as part of a living, community-informed knowledge base. The results comprise two priority lists: prioritised interventions and prioritised reproducibility measures. By making these results openly available, we aim to support ongoing dialogue, encourage community contribution, and facilitate the uptake, adaptation, and continuous improvement of practices that strengthen research reproducibility across disciplines and stakeholder groups.
For whom is this important?
These priorities are relevant to the stakeholder groups engaged in the Delphi panel, including researchers, editors, publishers, funders, and policymakers, as well as anyone working to strengthen research reproducibility across disciplines.
What are the best practices?
Items were rated on a 10-point Likert scale. Scores of 8–10 were classified as high priority, and consensus was defined a priori as at least 70% of panellists assigning a score within this high-priority range.
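To make the a priori consensus rule concrete, here is a minimal Python sketch. It is illustrative only; the function name and the example ratings are hypothetical and not part of the iRISE protocol:

```python
def reaches_consensus(scores, band=(8, 10), threshold=0.70):
    """Return True if at least `threshold` of panellists scored within `band`.

    `scores` is a list of integer ratings on the 10-point Likert scale.
    """
    in_band = sum(band[0] <= s <= band[1] for s in scores)
    return in_band / len(scores) >= threshold

# Hypothetical panel of 10: 8 of 10 ratings (80%) fall in the 8-10 band,
# so the item would be classified as a high-priority consensus item.
print(reaches_consensus([9, 8, 10, 8, 9, 7, 8, 10, 9, 5]))  # True
```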
The prioritised reproducibility measures, with their panel scores in parentheses, are:
- Methodological quality (9.03)
- Reporting quality (9.00)
- Code and data availability and re-use (8.66)
- Computational reproducibility (8.52)
- Transparency of research plan (8.47)
- Reproducible workflow practices (8.34)
- Trial registration (8.21)
- Materials availability and re-use (8.16)
The prioritised interventions are:
- Data management training (8.52)
- Data quality checks/feedback (8.33)
- Statistical training (8.33)
- Data sharing policy/guideline (8.22)
- Protocol/trial registration (8.15)
- Reproducible code/analysis training (8.11)
In Detail
Round 1
In the first round, panellists scored the reproducibility measures and interventions on a 10-point Likert scale. Items that at least 70% of panellists scored 8–10 were added to the priority list, while items that at least 70% scored 1–3 were discarded. Panellists could also comment on their scores. The measures and interventions scored 4–7 with 70% agreement were re-rated in the second round.
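The round-one triage logic can likewise be sketched in Python. This is an illustration under stated assumptions, not the project's actual analysis code: all ratings shown are hypothetical, and the handling of items that reach no 70% agreement in any band is not specified in the source, so they are kept in a separate bucket here.

```python
def triage_round1(items, threshold=0.70):
    """Classify items after round 1 of the Delphi.

    `items` maps an item name to the list of 1-10 Likert ratings it
    received. Returns a dict of outcome -> list of item names.
    """
    def share(scores, lo, hi):
        """Fraction of panellists whose score falls in [lo, hi]."""
        return sum(lo <= s <= hi for s in scores) / len(scores)

    outcomes = {"priority": [], "discarded": [], "round2": [], "unresolved": []}
    for name, scores in items.items():
        if share(scores, 8, 10) >= threshold:
            outcomes["priority"].append(name)    # high-priority consensus
        elif share(scores, 1, 3) >= threshold:
            outcomes["discarded"].append(name)   # consensus to discard
        elif share(scores, 4, 7) >= threshold:
            outcomes["round2"].append(name)      # re-rated in round 2
        else:
            # Not covered by the protocol text as summarised above.
            outcomes["unresolved"].append(name)
    return outcomes

# Hypothetical ratings from a 10-person panel:
example = {
    "Statistical training": [9, 8, 8, 10, 9, 8, 7, 8, 9, 6],
    "Hypothetical item A":  [2, 3, 1, 2, 3, 2, 1, 3, 2, 6],
    "Hypothetical item B":  [5, 6, 4, 7, 5, 6, 5, 4, 7, 9],
}
print(triage_round1(example))
# {'priority': ['Statistical training'], 'discarded': ['Hypothetical item A'],
#  'round2': ['Hypothetical item B'], 'unresolved': []}
```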
Round 2
The panellists reviewed the reproducibility measures and interventions that scored 4–7 with 70% agreement. The rankings and anonymised comments from the first round were shared to help participants reassess their scores.
Final Round
The final round consisted of an online meeting with eight selected panellists (two researchers, two editors, two publishers, one funder, and one policymaker). During this session, participants revisited the highest-scoring interventions that had not reached consensus in previous rounds. The panel was then asked to review the ranking order of the two prioritised lists. After the final round, there were no changes to the prioritised lists.
