Open@VT

Open Access, Open Data, and Open Educational Resources


Dr. Malte Elson on Peer Review in Science

Few areas in scholarly publishing are undergoing the kind of examination and change that peer review is. Healthy debates continue on different models of peer review, on incentivizing peer reviewers, and on various shades of open peer review, among many other issues. Recently, the second annual Peer Review Week was held, with several webinars available to view.

Since peer review is currently such a dynamic topic, the University Libraries and the Department of Communication are especially pleased to host a talk about peer review in science by Dr. Malte Elson of Ruhr University Bochum. Dr. Elson is a behavioral psychologist with a strong interest in meta-science issues. Dr. Elson has created some innovative outreach projects related to open science, including FlexibleMeasures.com, a site that aggregates flexible and unstandardized uses of outcome measures in research, and JournalReviewer.org (in collaboration with Dr. James Ivory in Virginia Tech’s Department of Communication), a site that aggregates information about journal peer review processes. He is also a co-founder of the Society for Improvement of Psychological Science, which held its first annual conference in Charlottesville in June. Details and a description of his talk, which is open to the public, are below. Please join us! (For faculty desiring NLI credit, please register.)

Wednesday, October 12, 2016, 4:00 pm
Newman Library 207A

Is Peer Review a Good Quality Management System for Science?

Through peer review, the gold standard of quality assessment in scientific publishing, peers have reciprocal influence on academic career trajectories, and on the production and dissemination of knowledge. Considering its importance, it can be quite sobering to assess how little is known about peer review’s effectiveness. Other than being a widely used cachet of credibility, there appears to be a lack of precision in the description of its aims and purpose, and how they can be best achieved.

Conversely, what we do know about peer review suggests that it does not work well: Across disciplines, there is little agreement between reviewers on the quality of submissions. Theoretical fallacies and grievous methodological issues in submissions are frequently not identified. Further, there are characteristics other than scientific merit that can increase the chance of a recommendation to publish, e.g. conformity of results to popular paradigms, or statistical significance.

This talk proposes an empirical approach to peer review, aimed at making evaluation procedures in scientific publishing evidence-based. It will outline ideas for a programmatic study of all parties (authors, reviewers, editors) and materials (manuscripts, evaluations, review systems) involved to ensure that peer review becomes a fair process, rooted in science, to assess and improve research quality.

Removing the Journal Impact Factor from Faculty Evaluation

One barrier to open access publishing that receives a thorough debunking on an almost-daily basis, yet refuses to go away, is the journal impact factor (JIF). Unfortunately, the JIF continues to be used in the evaluation of research (and researchers) by university committees (in hiring, or the tenure and promotion process) as well as by grant reviewers. This is a barrier to open access, in most cases, because the most prestigious journals (those with the highest JIF) often do not have any open access publication options. There are exceptions like PLOS Biology, but in general the focus on prestige by evaluators is slowing down badly needed changes in scholarly communication. It’s also a barrier because many open access journals are newer and tend to have a lower JIF, if they have one at all (three years of data must be available, so an innovative journal like PeerJ won’t have a JIF until June 2015).

But even if the JIF weren’t posing a barrier to open scholarly communication, it would still be a poor metric to use in research evaluation. Here are a few of the reasons:

  • The JIF measures journals, not articles. It was never intended for use in evaluating researchers or their articles.
  • Because the JIF counts citations to a journal in the previous year and divides by the number of articles published in the two years prior to that, there is absolutely no relationship between a journal’s current JIF and a newly published article. For example, a journal’s 2014 JIF measures citations in 2013 to articles published in 2012 and 2011.
  • The distribution of citations to journal articles is highly skewed: one article may be cited hundreds of times while others are cited rarely or not at all. By using the mean as a descriptor of such a skewed distribution, the JIF is a poster child for statistical illiteracy.
  • The JIF is not reproducible. It is a proprietary measure owned by Thomson Reuters (Journal Citation Reports or JCR), and the details of its calculation are a black box. In particular, it is not known whether non-peer reviewed content in journals is counted as articles, and/or whether citations from that content are counted.
  • Citations can occur for a variety of reasons, and may or may not be an indication of quality.
  • The JIF only counts citations to an article in its first two years, so journals select articles that will cause an immediate buzz (though there is also a 5-year JIF). Meanwhile, solid research that might accumulate citations more slowly is rejected.
  • Citation rates vary greatly between disciplines. The JIF serves as a disincentive to do interdisciplinary research.
  • Reliance on citations misses the broader impact of research on society.
  • The JIF can be gamed by journal editors who suggest or require that authors cite other articles in that journal. In addition, review and methods articles are privileged since they are more frequently cited. The quest for citations is also why journals are reluctant to publish negative results, even though those are important for science.
  • The JIF covers only about a third of existing journals, and is heavily STEM-centric.
  • The prestige of publishing in a high-JIF journal encourages bad science. Retraction rates have been correlated with the JIF. Researchers are incentivized to produce sexy but unreliable research, and to tweak the results (e.g., p-hacking).
  • Journal prestige is a cause of rising journal prices. If a journal is prestigious in its discipline, it can charge what libraries can afford to pay rather than a price reflecting the work required to produce it (high-JIF journals are also more expensive to run because they spend so much time and effort rejecting papers that are then published elsewhere). Journal prices have outpaced the consumer price index for decades.
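To make the arithmetic in the list above concrete, here is a small sketch of the JIF calculation using entirely invented citation counts (real JIF inputs are proprietary to JCR and not reproducible). It also illustrates the skewed-distribution point: a single heavily cited article can produce a respectable-looking mean while the typical article in the journal is never cited at all.

```python
# Toy illustration of the JIF arithmetic described above.
# All numbers are invented for illustration only.
from statistics import median

# Hypothetical citations received in 2013 by each of the 20 articles
# a journal published in 2011-2012 (a typically skewed distribution:
# one blockbuster, a few modest performers, many uncited articles).
citations = [210, 14, 6, 3, 2, 2, 1, 1, 1,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# JIF = citations this year to the previous two years' articles,
# divided by the number of citable items from those two years.
jif = sum(citations) / len(citations)

print(f"JIF (mean citations per article): {jif:.3f}")
print(f"Median citations per article:     {median(citations)}")
```

Here the journal would report a JIF of 12.000 (to three decimal places), even though the median article received zero citations — the mean says almost nothing about any individual article, which is the core statistical objection above.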

So how is the JIF used in libraries, where it was intended for use in journal evaluation? Here at Virginia Tech, it’s not used at all in journal selection, and is only one of many considerations in journal cancellation. It’s ironic that the JIF has so little use in libraries while becoming so influential in research evaluation. Why should libraries care about this? To some extent, because “our” little metric somehow escaped and is now inflicting damage across academia. More importantly, we often encounter faculty who don’t fully understand what the JIF is (only that it should be pursued), and, as mentioned at the beginning, the focus on JIF is a real barrier as we advocate to faculty for the open dissemination of research.

Why is the JIF so appealing? Convenience no doubt plays a major role: just grab the number from JCR (or from the journal itself, since many institutions can’t afford JCR). After all, it’s a quantitative “factor” that measures “impact” (to three decimal places, so you know it’s scientific!). And if the JIF weren’t problematic enough, it’s now being incorporated into world university rankings.

How can we address this problem? One attempt to address the misuse of the JIF is the San Francisco Declaration on Research Assessment (DORA), which gives this general recommendation:

  • Do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions.

DORA continues by giving specific recommendations for funding agencies, institutions, publishers, organizations that supply metrics, and researchers.
For institutions, DORA has two recommendations:

  • Be explicit about the criteria used to reach hiring, tenure, and promotion decisions, clearly highlighting, especially for early-stage investigators, that the scientific content of a paper is much more important than publication metrics or the identity of the journal in which it was published.
  • For the purposes of research assessment, consider the value and impact of all research outputs (including datasets and software) in addition to research publications, and consider a broad range of impact measures including qualitative indicators of research impact, such as influence on policy and practice.

Virginia Tech should consider signing DORA and making that known, as University College London just has. But more importantly, it should also become official university policy, making it clear to all faculty that the JIF should not be used in hiring, tenure, or promotion.

For those interested in exploring further, here are just a few of the most recent commentaries I’ve come across (and to discuss further, consider my upcoming NLI session):

Nine Reasons Why Impact Factors Fail And Using Them May Harm Science (Jeroen Bosman)
Why We Should Avoid Using the Impact Factor to Assess Research and Researchers (Jan Erik Frantsvåg)
If Only Access Were Our Only Infrastructure Problem (Bjorn Brembs, slides 4-23)
Misrepresenting Science Is Almost As Bad As Fraud (Randy Schekman)
Deep Impact: Unintended Consequences of Journal Rank (Bjorn Brembs, Katherine Button, Marcus Munafo)
The Impact Factor Game (PLOS Medicine editors)
Everybody Already Knows Journal Rank is Bunk (Bjorn Brembs)
Sick of Impact Factors and Coda (Stephen Curry)
Excess Success for Psychology Articles in the Journal Science (Gregory Francis, Jay Tanzman, William J. Matthews)
Assess Based on Research Merit, Not Journal Label (David Kent)
Do Not Resuscitate: The Journal Impact Factor Declared Dead (Brendan Crabb)
High Impact Journals May Not Help Careers: Study (Times Higher Education)
Choosing Real-World Impact Over Impact Factor (Sam Wineburg)
Journal Impact Factors: How Much Should We Care? (Henry L. Roediger, III)
How to Leave Your Impact Factor (Gary Gorman)
The Big IF: Is Journal Impact Factor Really So Important? (Open Science)
End Robo-Research Assessment (Barbara Fister)

(I’m sure there are many, many more I missed!)