By Daniel Garisto
In the years following World War II, a flood of research created a crisis for scientific communication. To keep up with the deluge, physicists began mailing unpublished manuscripts across the country and around the world. These preprints speedily brought research to physicists hungry for news, outstripping traditional publications by months. By the time Paul Ginsparg launched hep-th@xxx.lanl.gov (now known as arXiv.org) in 1991, paper preprints had been entrenched in the culture of physics for decades.
The success of arXiv, which now holds 1.5 million preprints, is well known to physicists, mathematicians, and computer scientists who rely on it. But similar efforts in other fields foundered. Life sciences repository Nature Precedings quietly shut down after six years and only about 2,000 preprints; the Chemistry Preprint Server barely got off the ground. In many fields, journal editors refused to publish papers posted as preprints.
Recently, however, the tide has begun to shift. Since 2013, dozens of preprint servers in fields such as biology, chemistry, and sociology have popped up and garnered tens of thousands of submissions.
In 2017, the National Institutes of Health allowed the inclusion of preprints in grant proposals. In May, the Nature family of journals announced that it would move from allowing preprints to encouraging them, now allowing researchers to speak to the media about preprints of submitted manuscripts. And on June 25, Cold Spring Harbor Laboratory (CSHL), Yale University, and the journal BMJ launched a new server for medical preprints, medRxiv.
For most physicists, scholarship without preprints is foreign. But working with preprints can be just as alien to scientists in other disciplines. “I still encounter people who don't know what a preprint is,” said Jessica Polka, a biochemist and executive director of ASAPbio, a nonprofit focused on faster, transparent publishing in the life sciences.
Fields other than physics have some catching up to do. Only one to two percent of the nearly 30 million peer-reviewed articles in PubMed, a life science aggregator maintained by NIH, initially appeared as preprints. By contrast, as of 2009, over 95 percent of articles published in peer-reviewed high-energy physics (HEP) journals also appeared on arXiv.org. This discrepancy didn’t always exist: at one time, fields like psychology and biology also used preprints.
Throughout the 1960s, multiple organizations across various disciplines attempted to scale up the private, informal sharing of preprints that had grown in previous decades. In 1961, the American Psychological Association began a short-lived experiment in preprint exchange to solve the issue of publishing lag. The association found that “those who need preprints most—young scientists, workers at small institutions, and researchers in less developed countries—are frequently not the recipients.”
At the same time, the NIH formed its Information Exchange Groups (IEGs), which ballooned from 32 biologists in 1961 to 3,600 in 1966. But in 1967, the IEGs were abruptly terminated. A 1966 Nature editorial enumerated a list of growing problems, concluding that “the experiment was plainly on the point of getting out of hand.”
While the IEGs certainly faced organizational problems like the cost of mail ($400,000 then) and confidentiality, their biggest impediment came from journals like Nature itself. Wary of the IEGs’ popularity, editors of biochemical journals were happy to recognize the value of IEG memoranda—so long as it was clear that the material was never to be published. Two years later, this anti-preprint stance spread across scientific publishing thanks to what became known as the Ingelfinger rule (see APS News, November 2012).
During the early 1960s, preprints, especially in HEP, had proliferated out of control. In 1965, theoretical physicist Michael Moravcsik proposed an analogous “Physics Information Exchange” (PIE) to tame and centralize the chaos. As with the IEGs, journal editors objected: Physical Review editors like Sam Goudsmit and Simon Pasternack voiced opposition. In an acerbic editorial entitled “Communication Problems,” Goudsmit mocked preprints:
"The next step might be to equip theorists with portable recorders so that all their statements about physics, including those uttered in their sleep, would be preserved on tape. The contents of the tapes would be transmitted electronically to interested colleagues via a distribution center. Hopefully, such a system might result in such chaos as to make priority assignments impossible, and the great advances in theoretical physics would become anonymous, just like the great achievements in the art of ancient Egypt."
Rita Taylor helped run Preprints in Particles and Fields (PPF) at SLAC for many years.
But thanks to funding from the US Atomic Energy Commission, the PIE launched as a trial run. Distributed weekly, the PIE cut costs by giving only a list of preprints, as opposed to providing the full document. Its success led to the SLAC-based Preprints in Particles and Fields (PPF). Until 1993, hundreds of physicists paid a subscription fee to get a weekly listing of preprints, delivered by airmail. To appease journal editors, PPF also contained a list of “anti-preprints,” which were preprints that had been published.
By the mid-1980s, networks like BITNET and DECnet connected physicists across the US and Europe, allowing them to access bibliographic information in databases like SPIRES-HEP. Math typesetting software like TeX and the development of email made it possible to share electronic preprints.
In 1989, Joanne Cohn, a physicist then at the Institute for Advanced Study, began distributing TeX files of string theory papers via email. By August of 1991, the email list had grown to 180 physicists—an unwieldy number for Cohn to individually respond to requests for papers. As Cohn recounts, a young physicist then at Los Alamos National Laboratory offered to automate the list, and arXiv was born.
“Day one, something happened, day two, something happened. Day three, Ed Witten posted a paper,” said Cornell University physicist Paul Ginsparg, founder of arXiv.org. “That was when the entire community joined.”
For physicists, it became indispensable. “It was this one-stop-shopping daily information feed. If it's not there, then it may as well not exist,” Ginsparg said. “I still don't know if there's anyone that's using it quite like the high energy physicists were using it already in the early ’90s.”
Eager to capitalize on the phenomenon, the APS participated in “e-print” workshops and even launched its own ill-fated “e-print” server in 1996, which closed down within a few years. More successful was the groundbreaking decision by the APS in 1997 to amend its copyright rules, formally allowing e-prints. This reversal from anti-preprint attitudes of the 1960s was a testament to the cultural changes the past decades had wrought on physics and the inescapable power of the internet.
Jim Till, a biophysicist at the University of Toronto, credits the adoption of arXiv to the fact that “HEP physicists have been members of a well-defined and highly interactive community of voracious readers, with a pre-existing hard-copy preprint habit, a standardized text formatting system (TeX), and a generally high degree of computer literacy.”
In the decades that followed, as scientific publishing transitioned to the digital age, other preprint servers popped up. But few, if any, have replicated arXiv’s success.
One of the most promising attempts was “E-Biomed,” which then-NIH director Harold Varmus proposed in 1999. But after four months of opposition from journals, it was dead. The project lived on in PubMed Central, which archives peer-reviewed open-access articles in the life sciences, but no preprints.
“I shamelessly reused the same comment roughly every two years for over a decade: that it's thrilling that biologists are finally entering the latter half of the 20th century, better late than never,” said Ginsparg. “And, of course I could reuse it because it never actually happened.”
A number of fears and concerns link these failed ventures. In fields such as biomedicine, researchers are often wary that unrefereed papers could have serious public health implications. Though preprint advocates believe the concerns are overstated, the worries remain and have led to an extremely careful rollout of medRxiv, which claims to have instituted stringent acceptance criteria.
“There’s a huge variety in how rigorous peer review is and papers people really want to get published probably will anyway,” said Polka. “I'm not sure eliminating preprints is going to fix that problem.”
While physicists consider a preprint a stamp of priority, many scientists in other disciplines worry that posting a preprint will cause them to be scooped. There is little evidence of such scooping so far, but the fears persist.
“If you're doing something off of a big data set that anybody has access to, and you put a preprint out, it's not totally clear that somebody is going to respect that,” said Elizabeth Berman, a sociologist at the University at Albany and a member of the SocArXiv steering committee.
August 12, 1983 issue of PPF mailed out from SLAC. Each issue contained a list of that week's preprints.
SocArXiv was established in 2017 and hosts over 3,000 papers, but it is still not widely used by the sociology community. That is despite the fact that adjacent fields use the Social Science Research Network (SSRN), and economists have long posted preprints to the National Bureau of Economic Research and, since 1997, to the online repository RePEc. Academic communities unfamiliar with preprints and rarely exposed to preprint culture didn’t develop it; those that accepted preprints never looked back.
With a few exceptions, most preprint servers have appeared within the past few years. One of the first of this new wave, bioRxiv, was founded in 2013 with the support of CSHL and now receives over 2,000 preprints every month. Then, an explosion of servers followed: ChemRxiv, engrXiv, SocArXiv, LawArXiv, SportRxiv, PaleorXiv, and regional hubs, like AfricArXiv and IndiaRxiv. Thanks to easily available software from the Center for Open Science, building the infrastructure for a preprint server is no longer an obstacle.
Looking at bioRxiv as a case study, Till points to a number of important reasons for its growth, including its backing by CSHL, the concomitant rise of quantitative biology submissions to arXiv, and a cultural shift in biology toward openness and transparency.
Still, bioRxiv and other new preprint servers are far from having the complete community buy-in that arXiv achieved. “I would say it's far from being successful at this point,” said biologist John Inglis, the co-founder of bioRxiv. “But it clearly has momentum.”
At various times in the history of preprints, their advocates and detractors have predicted that preprints would spell the end for traditional academic journals. Both sides have been repeatedly proven wrong on this matter: preprints have managed to largely coexist with traditional journals, which still fulfill the important task of quality control and curation. As they continue to expand in other fields, preprints may soon find themselves as much a fixture of publishing as they have become in physics.
The author is a science writer based in Bellport, New York.
©1995 - 2021, AMERICAN PHYSICAL SOCIETY
APS encourages the redistribution of the materials included in this newspaper provided that attribution to the source is noted and the materials are not truncated or changed.