“Reproducible research” is a hot topic

I have long been interested in the question of reproducible research and, as a manuscript author, reviewer and, more recently, editor, I have tried to make sure that no key information was missing and that methods were described in full detail and were, of course, valid.

Although the problem has always existed, I think that in recent years papers and reports with badly described methods have become more frequent. There are, I think, many reasons for this: 1) the pressure to publish quickly and frequently as a condition for career advancement, 2) the overload on reviewers’ workloads and the pressure from journals to have manuscript reviews submitted within a few days, 3) the ever stricter rules of journals about the maximum number of “free” pages, and 4) the practice of some journals of publishing methods at the end of the paper or in a smaller typeface, implying that methods are unimportant to most readers and irrelevant to understanding the results described (which is a false premise).

The consequence of accepting incomplete and vague methods descriptions is that reviewers frequently have to base their recommendations to editors on assumptions about the quality of the work done. This not only lets some papers through the “quality control filter” that should never have been published (e.g. papers later requiring retraction), but also creates a bias in favour of the “big-name labs”, as it is human nature, when confronted with incomplete information, to trust familiar people more readily than complete strangers. It has also made the publication of manipulated, “selected” and even fabricated results easier, which, although not a frequent practice, has led to some resounding cases that damage the overall credibility of scientific research.

In the last couple of years awareness of these problems has increased markedly. This has led to changes in rules and procedures both at funding agencies and at some big-name journals like Nature and Science. Broadly, the changes have been: 1) a requirement of open access to data, in many cases even raw data, 2) tighter checking of statistical analyses (both Nature and Science have established special statistics review boards), so that in addition to the normal reviewing process manuscripts are also reviewed by statisticians, and 3) in some cases journals requesting validation of experimental results by independent labs before accepting manuscripts for publication.

Next spring I will start teaching a new course related to this question, not so much from a theoretical or philosophical point of view, but with the intention of training M.Sc. and Ph.D. students in ways of doing data analyses that provide a faithful and nicely typeset record of the procedure used, at a modest additional effort compared to the “usual” ad hoc analyses done through menu- and dialogue-box-based interfaces to software.
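To make the idea concrete, here is a minimal sketch of what such a “faithful record” can look like: a single script that holds the data, the computation and the reported numbers, and that also prints the software environment, so the whole chain can be re-run and checked instead of being reconstructed from prose. The language (Python) and the inline numbers are purely illustrative assumptions on my part, not the tools or data of the course itself.

```python
# Minimal sketch of a scripted, self-documenting analysis (illustrative only;
# the language, data and report format here are hypothetical, not those of the course).
import sys
import platform
import statistics

# In real work the raw data would live in a version-controlled file;
# a tiny inline sample keeps this sketch self-contained and runnable.
control = [4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14]
treatment = [4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69]

mean_diff = statistics.mean(treatment) - statistics.mean(control)

# The output records the computing environment next to the result, so that
# a reader can reproduce the numbers exactly rather than trust a summary.
print(f"Python {platform.python_version()} on {sys.platform}")
print(f"n = {len(control)} observations per group")
print(f"difference of means = {mean_diff:.3f}")
```

Running the script regenerates both the result and its context in one step; the same principle extends to full reports in which text, code and typeset output are produced together.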

Before that, this autumn I will give three or four seminar talks on how the available statistical methods have evolved over the last 35–40 years and what the consequences of advances in computers and computing have been for everyday researchers. I will also argue that some traditions ingrained in the reporting of biological research are the consequence of practical limitations that no longer exist, and that consequently there is no reason to continue these traditions.
