Unreproducible results in research papers are a growing problem for academia and the publishing industry, but a cancer researcher at the University of Sydney is hoping to change that with a new, semi-automated fact checker.
Whether they stem from an honest mistake in transcribing results or from something more sinister like research fraud, errors in research papers can not only waste time but also impose a huge cost on labs and institutions.
The ‘Seek & Blastn’ program, co-developed by Professor Jennifer Byrne from the University of Sydney and Dr Cyril Labbé from the University of Grenoble Alpes, aims to tackle the problem in the field of biomedicine by identifying DNA and RNA constructs used to target genes and comparing them with a database of known genes.
Byrne likened the use of the DNA and RNA constructs, known as nucleotide sequence reagents, to ingredients in a recipe.
“Doing an experiment with wrong reagents either means that you cook something different from what you thought you were cooking, or what you cook is a failure,” Byrne said.
But the stakes are a bit higher when the cake you’re baking costs thousands of dollars and supports an international effort to fight cancer.
“Unfortunately with experiments, failures are not always as obvious as they are in the kitchen. And here we are dealing with fundamental genetic research, and other researchers are using these failures as building blocks for their own work.”
An analysis of 155 research papers using the new fact-checker, combined with manual verification, found that 25 percent of them contained sequence errors.
While the sample was of a group already under suspicion for containing errors and doesn’t represent a baseline for all research in the field, the number is still worryingly high.
“That’s quite a lot of wrong sequences in a small group of papers and there will be many more out there, unfortunately, given that nucleotide sequence reagents have been described in literally hundreds of thousands of biomedical publications,” Byrne said.
Seek & Blastn was able to identify both ‘identity errors’ (sequences that were completely incorrect) and typographic errors (spelling mistakes that, while probably accidental, are still problematic).
Examples of errors found by Seek & Blastn include:
- Sequence reagents that were supposed to target a human gene but didn’t target anything, meaning experiments might fail without researchers knowing why
- Control reagents that weren’t supposed to target anything but were found to target human genes, meaning researchers couldn’t accurately compare data
- Sequence reagents that targeted the wrong gene, resulting in data that had nothing to do with the system being studied
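The three error types listed above can be illustrated with a minimal Python sketch. It is not how Seek & Blastn itself works: the gene names and sequences are made up, the function names are hypothetical, and a simple exact substring match stands in for a real sequence search against a database of known genes. It also ignores details such as reverse-complement strands.

```python
from typing import Optional

# Hypothetical toy "database": invented gene names and invented sequences.
TOY_GENES = {
    "GENE_A": "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG",
    "GENE_B": "ATGTTTCCGGAAACCTTGGCAATCGGATCCTTAAGCTGA",
}


def find_target(reagent: str) -> Optional[str]:
    """Return the toy gene whose sequence contains the reagent, if any.

    Exact substring matching is a simplification; a real tool would use
    a proper alignment search that tolerates mismatches.
    """
    # Normalise case and treat RNA (U) as DNA (T) for comparison.
    normalised = reagent.upper().replace("U", "T")
    for gene, sequence in TOY_GENES.items():
        if normalised in sequence:
            return gene
    return None


def check_reagent(reagent: str, claimed_target: Optional[str]) -> str:
    """Compare what a paper claims a reagent targets with what it matches.

    claimed_target=None means the paper describes the reagent as a
    non-targeting control.
    """
    actual = find_target(reagent)
    if claimed_target is None:
        # A control should not match any gene.
        return "ok" if actual is None else "control_targets_gene"
    if actual is None:
        return "no_target"
    if actual != claimed_target:
        return "wrong_target"
    return "ok"


if __name__ == "__main__":
    print(check_reagent("CCATTGTAATGG", "GENE_A"))
    print(check_reagent("CCATTGTAATGG", "GENE_B"))
    print(check_reagent("TTTTTTTTTTTT", "GENE_A"))
    print(check_reagent("CCGGAAACCTTGG", None))
```

In this sketch, anything other than "ok" would be a flag for a human to review, mirroring the semi-automated approach in which the tool's output is combined with manual verification.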
Byrne and Labbé suggest that sequence identity errors in these examples are a particular hallmark of research fraud, and the semi-automated fact-checker could be deployed to help weed it out.
“Our hope is that tools like Seek & Blastn will prospectively deter publications that describe incorrect nucleotide sequence reagents and may flag existing publications so that their conclusions can be re-evaluated.”
A 2016 survey published in the journal Nature found that more than 70 percent of researchers have tried and failed to reproduce another scientist’s experiments, and more than half have failed to reproduce their own.