Let’s move to the cloud and refactor the code together.
In the increasingly fast-paced world of academic research, arXiv is a very important preprint platform. Like Wikipedia, it is run as a nonprofit.
On Thursday local time, Cornell Tech announced the good news that arXiv has received a huge donation.
As a nonprofit database, arXiv is free and accessible to all and has long relied on donations. Cornell Tech announced that the Simons Foundation and the National Science Foundation (NSF) have awarded grants totaling more than $10 million to support arXiv.
According to reports, the funding will enable the research repository, which holds more than 2 million papers, to move to the cloud and modernize its code to ensure reliability, fault tolerance and accessibility.
Before long, PDFs on arXiv should load faster, and we may even be able to read papers directly in the browser.
"I am deeply grateful to the Simons Foundation and the National Science Foundation for their tremendous support," said Greg Morrisett, the Jack and Rilla Neafsey Dean and Vice Provost of Cornell Tech. "This investment ensures that the arXiv service continues to scale, serve a broader audience, and better serve the scientific community."
Ramin Zabih, professor of computer science at Cornell Tech, said: "By modernizing the code base and transitioning to the cloud, we are strengthening arXiv's infrastructure and ensuring it continues to be a source of innovation in sharing scholarly publications."
arXiv (pronounced "archive") was founded in 1991 by Dr. Paul Ginsparg, then a physicist at Los Alamos National Laboratory, who wanted to catalog about 100 research papers. As papers poured in, he tried to solve the problem with the help of a computer program, which he reportedly learned how to write "by attending machine learning seminars for more than a decade."
Ginsparg is now a professor of physics and information science at Cornell University.
The platform is now maintained and operated by Cornell University Libraries. It is a huge repository of academic papers, holding published work alongside large numbers of preprints that have not yet undergone peer review or are not intended for publication in refereed journals.
Currently, arXiv covers multiple fields across the natural and social sciences, including physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and economics. As of the end of 2022, arXiv had received more than 2.2 million submissions.
Because of arXiv's broad appeal, researchers in many fields "publish" their latest results on the platform before the work is accepted by an academic conference or journal. The benefits are generally thought to be staking a claim and gaining publicity: authors can establish priority for their ideas while also raising the visibility of the work and their own influence. At the same time, this practice greatly speeds up the dissemination of information in the academic community.
In contrast, even in today's fast-paced AI field, a conference or journal paper takes months to go from submission to publication, and sometimes new research directions have already emerged in the meantime.
As a result, arXiv has gradually become the preferred place to "submit" in many academic fields, such as mathematics and computer science. Today, browsing arXiv frequently has become a habit for many scholars. In the field of artificial intelligence, many papers accepted at top conferences such as NeurIPS, CVPR, and AAAI are posted on arXiv in advance to gain exposure. On the other hand, because arXiv lacks a very effective screening mechanism, there are also many lower-quality papers on the platform. This may be something arXiv tries to address after receiving the new funding.
Most of the time, acceptance by a major journal or conference is an important criterion for judging the quality of a paper. Since the emergence of arXiv, however, many papers that conferences rejected have gone on to exert an important influence in academia and receive many citations. In the field of artificial intelligence, it is easy to list well-known papers that appeared on arXiv but were rejected by top AI conferences, such as the work on YOLO, Transformer-XL, and Dropout.
YOLO is a well-known object detection algorithm in computer vision, and its paper has been cited more than 40,000 times. However, it was originally rejected by NIPS before being submitted to CVPR 2016, where it was accepted.
In 2012, Geoffrey Hinton, who later won the Turing Award, proposed Dropout with his co-authors in the paper "Improving neural networks by preventing co-adaptation of feature detectors". In the same year, the emergence of AlexNet opened a new era of deep learning. AlexNet used Dropout to significantly reduce overfitting, and the technique played a key role in AlexNet's victory in the ILSVRC 2012 competition. It can be argued that without Dropout, the rapid development of deep learning might have been delayed by several years.
However, this paper was rejected by NIPS 2012 and is still in preprint status on arXiv.
What will arXiv, a platform that carries the hopes of cutting-edge science, become in the future?
Cornell University said the next phase of arXiv development will include hiring additional software developers to support the modernization effort. At the same time, computer science faculty will use the NSF funding to develop new search and recommendation technologies, which are intended to support arXiv's large user community and will be backed by state-of-the-art privacy guarantees. In addition, arXiv will provide better access for the visually impaired by generating HTML versions of content as well as PDFs.
The $10 million in funding will significantly increase arXiv’s capabilities. By comparison, arXiv spent a total of $2.42 million in 2021.
After the news of the donation was released, people applauded and looked forward to the future evolution of the preprint platform.
Reference content:
https://news.cornell.edu/stories/2023/10/research-repository-arxiv-receives-10m-upgrades
https://news.ycombinator.com/item?id=37949656
https://medium.com/nautilus-magazine/what-counts-as-science-76ebd1f5d403