Scientific Integrity: Can We Rely on the Published Scientific Literature?

Ever since the internet became the primary means of disseminating scientific research, scientific publishing has lurched from crisis to crisis, with seemingly increasing rates of fraud, plagiarism, retraction, and selective publication. There are many reasons behind this, including the pressure on researchers to publish and the increasing ease with which publishers and academics can identify plagiarism. At a time when the ACS has created its statement “Scientific insight and integrity in public policy,” it seemed appropriate to organize a symposium on scientific integrity, with a particular focus on the degree to which we can rely on the published scientific literature. This symposium is one of a series on related topics and follows the symposium on data reproducibility organized by Martin Hicks at the spring ACS National Meeting in Denver.

The first speaker of the morning was Chris Leonard from QScience, who was speaking in his capacity as a member of the Committee on Publication Ethics (COPE). His talk was entitled “Integrity, ethics and trust in scientific research literature.” He explained that the new publishing landscape is often defined in terms of technology and the “bells and whistles” that can improve many aspects of a manuscript, but it also offers the potential of a new era of ethics, integrity, and reproducibility, especially in peer review and emerging publication areas such as datasets. The price is increased vigilance at the authoring and reviewing stages, but the cost of not paying it is incalculable. Chris explained COPE’s role in assisting publishers, editors, and authors in this task.

Our second speaker was Chris Proctor from British American Tobacco (BAT), who spoke on “Policy making at the American Chemical Society: developing a statement on scientific integrity.” His presentation examined, from the perspective of a member of the writing committee, how the American Chemical Society created its statement “Scientific insight and integrity in public policy.” Acting with integrity is absolutely critical to the scientific process, so much so that this concept has been embedded in government policies around the world. Leading scientific membership associations have in turn created their own policies to support these initiatives and their members in best practice. The second half of his talk focused on the importance of integrity in controversial areas of science, in particular the case of tobacco harm reduction.

The last speaker before the interval was Martin Hicks of the Beilstein Institute, whose subject was “publishability.” He explained that publishing scientific papers serves several functions: registration, certification, dissemination, inspiring innovation, and archiving. The scientific community is then expected, over time, to verify the ideas and results and in the end uncover the objective truth. It is assumed that authors make their best efforts to ensure that their submissions are correct, and that the scientific community has the time and resources to concern itself with review, reproduction, and validation. Nowadays, most people can, in principle, obtain access to the individual articles they need, but the amount of information has increased so much that scientists cannot keep up with all the publications in their own area of expertise beyond skimming the tables of contents. To maximize their reputation, scientists are expected to publish a certain number of papers per year in journals having a certain minimum Impact Factor. Having a significant new idea is no longer sufficient: the numbers are what matter, behavior adapts to the system, and output is adjusted accordingly. It seems likely that this is also contributing to the increase in plagiarism and other unethical behavior. Martin’s presentation discussed the effects of this increasingly problematic “publish or perish” paradigm and was illustrated with experiences gained in publishing the journals of the Beilstein Institute.

After the interval, Na Qin of Michigan State University gave a librarian’s perspective with a presentation entitled “What is the role of peer review in protecting the integrity of scientific research?” Scientific misconduct, such as plagiarism, data manipulation, data fabrication, or duplicate publication, occurs with distressing frequency in the scientific community. The peer review system has long served as a self-regulation mechanism to maintain standards of quality, improve performance, and provide credibility, but fraudulent and flawed research has been published even in peer-reviewed journals, and the number of articles retracted for fraud or error has risen dramatically in the last decade. The presentation considered questions such as “Is detecting scientific misconduct or errors a primary goal of peer review?” and “What is the role of peer review in ensuring the responsible conduct of scientific research?” and addressed the difficulties and limitations of anonymous peer review in detecting irresponsible conduct in scientific research. These include the inherently conflicting demands of peer review, expertise and objectivity, and its limited capacity to expose or minimize legal or ethical issues. Na’s intention was to start a conversation on whether peer review is, by its nature, ill-equipped to detect scientific misconduct. The practice of science involves its own self-correction, which the peer review system does not replace. Understanding this can reduce the cost to researchers who would otherwise try to use or replicate fabricated results.

Next, Rajeev Voleti of ScienceOpen, a last-minute replacement for Stephanie Dawson, presented Stephanie’s talk entitled “An open, network-based answer to the reproducibility crisis: the ScienceOpen peer review concept.” Spectacular failures of the anonymous peer review system, even in highly prestigious journals, paired with research demonstrating extremely low levels of reproducibility in landmark studies, have called the present system of scientific quality assurance into question. To create a more effective, more transparent, and fairer system that begins to address the question of reproducibility, ScienceOpen was developed as a networking and publishing platform. A researcher network forms the basis for public post-publication peer review and, as a transparent network approach, provides more rigorous quality control than two anonymous referees.

Articles submitted to ScienceOpen are published rapidly after an editorial check, followed by an open peer review process. A unique versioning concept allows researchers to continue to improve their published work, based on comments and reviews by scientists in the field. Papers are not marked as approved because information on the reproducibility of experiments comes later than the first expert check, and thus the status of a paper may change. An article published on ScienceOpen is also placed within the wider context of all Open Access publications in its field, as ScienceOpen aggregates content from a variety of sources, opening them up to discussion with the same tools for commenting, sharing, and discovery. With this holistic concept, ScienceOpen provides high-quality open access publishing services, while redefining publishing as one element in a whole suite of communication tools available to the researcher. Scholarly publishing is not an end in itself, but the beginning of a dialogue to move the whole field forward.

The last speaker of the morning was my co-chair and co-organizer of the symposium, Judith Currano, chemistry librarian at the University of Pennsylvania, who discussed “Managing new threats to the integrity of the scientific literature.” This paper, co-authored by a professor who edits an online journal, framed the challenges that scientists at all levels face as a result of the highly variable quality of the scientific literature, caused by a deluge of new open access online journals, many from previously unknown publishers with highly variable standards of peer review. The problems are so pervasive that even papers submitted to well-established, legitimate journals may include citations to questionable or even frankly plagiarized sources. Judith suggested ways in which science librarians can work with students and researchers to increase their awareness of these new threats to the integrity of the scientific literature, and to improve their ability to evaluate the reliability of journals and individual articles. Traditional rules of thumb for assessing the reliability of scientific publications (peer review, publication in a journal with an established Thomson Reuters Impact Factor, and a credible publisher) are more challenging to apply given the highly variable quality of many of the new open access journals, the appearance of new publishers, and the introduction of new impact metrics, some of which are interesting and useful, but others of which are based on citation patterns found in poorly described data sets or nonselective databases of articles. The authors suggested that the instruction of research students in the responsible conduct of research be extended to include ways to evaluate the reliability of scientific information.

After the lunch break, Cesar Berrios of Faculty of 1000 gave a talk entitled “Towards a more reproducible corpus of scientific literature.” Several recent reports and high-profile retractions have added to a growing chorus of scientists and laypeople clamoring for a restructuring of the system needed to ensure reproducibility in science, but the problem is a complex one for which there is no single solution. Contributing factors include poor training in proper experimental design, an increased emphasis on making outlandish statements, and an overreliance on publishing papers in peer-reviewed journals with high Impact Factors for purposes of career progression and tenure. The availability and accessibility of all the underlying data necessary to reproduce a study has been identified as integral to solving these issues, yet traditional journals often have limited space available for each paper. Furthermore, there are numerous technical obstacles to making datasets truly accessible. These issues combine to create a scientific culture in which sharing and publishing data ends up low on a researcher’s list of priorities, impeding further progress towards reproducible research. F1000Research is addressing some of these challenges. It has implemented several initiatives to provide methods and tools to capture the production of scientific data, and to establish this as an important output of research activity in itself. All F1000Research articles include the underlying data to enable others to attempt to reproduce the findings, and even to reuse the data. Authors are also offered the option to publish data-only papers that include just the data, together with a detailed description of the protocol used to generate it. In addition, all articles are openly peer reviewed post-publication, and previous versions of each article are archived. Cesar described how F1000’s data policy and transparency in the peer review process allow reviewers and readers to scrutinize the data underlying the conclusions carefully, and to follow the full provenance of each paper, ultimately leading to a more trustworthy corpus of scientific literature.

The next speaker, James Solyst of Swedish Match, described “Extraordinary public access to scientific evidence in the FDA modified risk tobacco product process.” Section 911 of the Federal Food, Drug, and Cosmetic Act, covering Modified Risk Tobacco Products (MRTPs), provides a process for a company to submit scientific evidence demonstrating that a product is of lower risk (modified risk) than another tobacco product. An MRTP application must demonstrate that by switching from one product (cigarettes, for example) to another (Swedish snus, for example), a user reduces his or her individual risk, and that the switch benefits the health of the overall population. Reducing harm is generally accepted as a good thing, but tobacco use is widely viewed as something to be eliminated. In addition, the tobacco industry has a troubled history with respect to scientific integrity. Thus, the MRTP process is filled with difficult public health issues, and the FDA, which implements the Tobacco Control Act, has had to manage a process that ensures scientific integrity and is consistent with public health goals. One way this has been accomplished is through extraordinary transparency: specifically, by making all information in an MRTP application publicly available (except for confidential business information). This is not the case with other FDA product applications. James reviewed a current MRTP application for Swedish snus as a case study.

The next two talks, given before the afternoon interval, discussed fraud and integrity in crystallographic journals and databases. Sean Conway (International Union of Crystallography, IUCr) presented “Validation and fraud in small-molecule crystallography.” Publishing in crystallography is underpinned by a wealth of structural data. In IUCr journals, submitted data are rigorously checked for correctness and consistency. Even so, the journals have experienced cases of fraud in small-molecule reports. In-house validation software is constantly evolving to guard against a growing variety of egregious errors and fraudulent practices.

Ian Bruno of the Cambridge Crystallographic Data Centre (CCDC) discussed “Scientific integrity: a crystallographic perspective.” The Cambridge Structural Database (CSD) contains over 750,000 experimental determinations of small-molecule crystal structures, the majority of which are made available by researchers to support the science published in journal articles. The crystallographic community has, over the years, developed tools that support evaluation of the scientific integrity of crystal structure data, and some publishers and journals pay particular attention to this during the peer review process. Once structures are published, the CCDC undertakes further scientific processing of the data before including the structures in the CSD. This presentation offered a perspective on scientific integrity based on crystal structure data collected over the last half century, and on experiences encountered during this time. It also looked at the role that domain-specific data centers such as the CCDC can play now, and in the future, in helping to ensure trust in the results of scientific research.

The last three presentations considered the important roles that publishers play. Ray Boucher from Wiley described “The ways publishers help, maintain, and support responsible research.” At all stages of the publishing process, including pre-publication and post-publication, the publisher is engaged in helping to maintain the integrity of the scientific record. The talk covered areas where the publisher is involved, such as publishing workshops and how the next generation is trained; plagiarism detection and other pre-publication software packages for specific communities; maintaining the quality of peer review; ethics guidelines; bodies such as the Committee on Publication Ethics (COPE); how issues are dealt with; and an analysis of the process and practice of retractions. The talk illustrated how the publisher supports and promotes the publication of responsible research, and how interaction with the community is key to this process.

The second speaker in this group, Guido Hermann of Thieme, gave a talk entitled “Integrity, trust and reproducibility: how scientific publishers can contribute.” Thieme publishes scientific information in various formats: journals, reference works, encyclopedias, monographs and textbooks. Scientists have to rely on the validity of the published information and Guido described how Thieme addresses this issue, the internal procedures and mechanisms to safeguard the quality of their publications, and Thieme’s experiences with fraud and plagiarism. The talk considered questions such as “How do we engage our authors, editors, advisors and readers in this process?” and “Are there differences between original research articles, reference works and textbooks?” Guido presented background information, and highlighted some of Thieme’s key findings and best practices. 

Finally, Richard Kidd of the Royal Society of Chemistry (RSC) presented “The write stuff: scientific integrity and publishing,” considering the responsibilities of a publisher in addressing questions of scientific integrity, and how these are changing. He described the RSC’s principles and practices, and how the RSC works with its community worldwide to evolve its approaches. He also considered how the increasing push towards the availability of original data makes validation easier, and what “reproducible” exactly means.

William Town, Symposium Organizer