Tracks Positive and Negative Citations for COVID-19 Literature

Adding perspective

Citation counts are conventionally seen by researchers as measures of influence. But a high citation count isn’t necessarily a good sign, says Elizabeth Suelzer, a reference librarian at the Medical College of Wisconsin Libraries in Milwaukee.

Former physician Andrew Wakefield’s infamous retracted 1998 study that claimed a link between autism and vaccines is highly cited, she notes, but most of those citations are negative.

Without a thorough citation analysis, “it would be hard to tell why the article was so highly cited”, Suelzer explains. That, she says, is why a tool such as Scite.ai could be helpful. Other tools that add context to citations include the Retraction Watch plug-in for the Zotero reference-management software, which flags retracted articles.

Josh Nicholson, co-founder and chief executive of Scite.ai, first recognized the need for such a tool in 2012. Nicholson was pursuing a Ph.D. in cell biology at Virginia Polytechnic Institute and State University in Blacksburg when he read a Nature commentary on scientific reproducibility that was making waves [3].

In it, a researcher formerly at the biotechnology company Amgen in Thousand Oaks, California, revealed that scientists there had been unable to reproduce the findings of 47 out of 53 ‘landmark’ cancer studies.

That spurred Nicholson and biologist Yuri Lazebnik, then at Yale University in New Haven, Connecticut, to propose a new citation metric to indicate whether a given study or its conclusions have been verified by subsequent reports [4]. The pair launched Scite.ai in April last year.

At the heart of Scite.ai is a machine-learning algorithm that scans research articles to identify which papers they cite, and to determine whether they support, contradict, or simply mention those papers.
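
As a rough illustration of the kind of summary such a classification makes possible, the sketch below tallies citation labels for a single paper. It does not reproduce Scite.ai's classifier, which has not been made public: the labels are hand-written placeholders, and `tally_citations` and the sample data are hypothetical names chosen for this example.

```python
from collections import Counter

# The three categories described in the article.
LABELS = ("supporting", "contradicting", "mentioning")


def tally_citations(labels):
    """Count how many citing statements fall into each category."""
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    # Always report all three categories, even when a count is zero.
    return {label: counts.get(label, 0) for label in LABELS}


# Hypothetical, hand-labelled citations of one paper (not real data).
sample = ["mentioning", "supporting", "mentioning", "contradicting", "mentioning"]
print(tally_citations(sample))
```

A breakdown like this, rather than a single total, is what would let a reader see at a glance whether a highly cited paper is mostly supported, mostly disputed, or merely mentioned.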

The algorithm mines the text of articles from publisher partners, including the Rockefeller University Press in New York City and Wiley in Hoboken, New Jersey.

Scite.ai has also had preliminary conversations with Springer Nature in Heidelberg, Germany, which publishes Nature, Nicholson says. According to Nicholson, eight out of every ten papers flagged by the tool as supporting or contradicting a study are correctly categorized.
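
Nicholson's figure reads like what is usually called precision: of the citations the tool flags as supporting or contradicting, the share that are flagged correctly. A minimal sketch of that arithmetic follows; `precision` is a hypothetical helper name, and the inputs are simply the "eight out of ten" from the text, not measured data.

```python
def precision(correct, flagged):
    """Fraction of flagged items that were categorized correctly."""
    if flagged <= 0:
        raise ValueError("flagged must be positive")
    return correct / flagged


# "Eight out of every ten" flagged citations categorized correctly.
print(precision(8, 10))  # 0.8
```

Note that this figure says nothing about how many supporting or contradicting citations the tool misses by labelling them as mentions.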

Although Scite.ai’s machine-learning algorithm has not been made public, Giovanni Colavizza, an AI scientist at the University of Amsterdam who is currently a visiting researcher at the Alan Turing Institute in London, says that, from what he can tell, “their results are sound and precise”.

“Most citations are classified as ‘mentions’, because the classifier is trained to be cautious, which is reasonable, too,” says Colavizza, who is a user of the platform and whose team has analyzed data from the start-up in the past.

James Heathers, a data scientist at Northeastern University in Boston, Massachusetts, likes the way that, for each paper, Scite.ai shows snippets of the other articles in which that paper’s citations appear, saving him from having to look up each referring paper and hunt for this context.

“Every time I’m exploring a complicated topic from scratch, I’m using this,” Heathers says of Scite.ai. “The sentiment analysis seems to work really well,” he adds, referring to how Scite.ai categorizes positive an