Editor | ScienceAI
A year ago, Llion Jones, the last of the eight authors of Google’s Transformer paper to leave the company, set out on his own and co-founded the artificial intelligence company Sakana AI with former Google researcher David Ha. Sakana AI says it aims to create a new kind of foundation model based on nature-inspired intelligence.
Now, Sakana AI has delivered its first results.
Sakana AI has announced the launch of AI Scientist, which it describes as the world’s first AI system for automated scientific research and open-ended discovery.
From generating ideas, writing code, running experiments and summarizing results, to writing entire papers and conducting peer review, AI Scientist ushers in a new era of AI-driven scientific research and accelerated discovery.
In principle, it can repeat the scientific research process continuously, developing ideas iteratively in an open-ended manner, just as human scientists do.
The researchers demonstrated its versatility by applying it to three different subfields of machine learning: diffusion modeling, Transformer-based language modeling, and learning dynamics.
Each idea is implemented and developed into a complete paper at a cost of less than $15 per paper. To evaluate the generated papers, the researchers designed and validated an automated reviewer that achieves near-human performance in scoring papers.
As judged by this automated reviewer, AI Scientist can write papers that exceed the acceptance threshold of a top machine learning conference.
The launch of AI Scientist marks an important step towards realizing the full potential of artificial intelligence in scientific research. By automating the discovery process and integrating AI-driven review systems, it opens the door to endless possibilities for innovation and problem-solving in the most challenging fields of science and technology.
The research, titled "The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery", was posted to the preprint platform arXiv on August 12.
Paper link: https://arxiv.org/abs/2408.06292
One of the grand challenges of artificial intelligence is to develop agents that can conduct scientific research and discover new knowledge. While cutting-edge models have already been used as aids to human scientists, for example for brainstorming ideas, writing code, or performing prediction tasks, they still carry out only a small part of the scientific process.
In their latest research, scientists at Sakana AI propose the first comprehensive framework for fully automated scientific discovery, enabling cutting-edge large language models (LLMs) to conduct research independently and communicate their findings.
AI Scientist can generate novel research ideas, write code, run experiments, visualize results, describe its findings by writing a full scientific paper, and then run a simulated review process for evaluation.
About AI Scientist
AI Scientist has three main stages: (1) idea generation, (2) experimental iteration, (3) paper writing. Once the paper is written, the researchers introduce and validate an LLM-generated review to assess the quality of the resulting paper.
Illustration: Conceptual illustration of AI Scientist, an end-to-end LLM-driven scientific discovery process. (Source: Paper)
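The three stages and the closing review step can be pictured as a simple loop. The Python sketch below is illustrative only: the function names (`generate_idea`, `run_experiments`, `write_paper`, `review_paper`), the `Paper` structure and the template label are hypothetical placeholders, not the API of the AI-Scientist repository, and each stage body is a stub standing in for LLM calls and real training runs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Paper:
    """Hypothetical container for one generated paper."""
    idea: str
    results: Dict[str, float] = field(default_factory=dict)
    text: str = ""
    review_score: Optional[float] = None


def generate_idea(template: str, archive: List[str]) -> str:
    # Stub for stage 1: a real system would prompt an LLM with the
    # code template and the archive of ideas proposed so far.
    return f"{template}-idea-{len(archive) + 1}"


def run_experiments(idea: str) -> Dict[str, float]:
    # Stub for stage 2: a real system would edit the template code,
    # run it, and collect metrics. The value here is a placeholder.
    return {"metric": 0.0}


def write_paper(idea: str, results: Dict[str, float]) -> str:
    # Stub for stage 3: a real system would fill in a LaTeX template.
    return f"Title: {idea}\nResults: {results}"


def review_paper(text: str) -> float:
    # Stub for the automated review: returns a placeholder score.
    return 0.0


def research_loop(template: str, n_ideas: int) -> List[Paper]:
    """Run idea generation, experiments, writing and review n_ideas times."""
    archive: List[str] = []
    papers: List[Paper] = []
    for _ in range(n_ideas):
        idea = generate_idea(template, archive)
        archive.append(idea)  # later ideas can see earlier ones
        results = run_experiments(idea)
        text = write_paper(idea, results)
        paper = Paper(idea=idea, results=results, text=text)
        paper.review_score = review_paper(text)
        papers.append(paper)
    return papers


if __name__ == "__main__":
    for p in research_loop("shakespeare-transformer", 3):
        print(p.idea, p.review_score)
```

Keeping an archive of earlier ideas is one way to make the loop open-ended, since each new idea can build on what came before; how the real system does this is not detailed in this article.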
The researchers provide AI Scientist with a starting code template that reproduces a lightweight baseline training run of a popular model or benchmark. For example, this might be code to train a small transformer on the works of Shakespeare, a classic proof-of-concept training run in natural language processing that can be completed in minutes.
AI Scientist is then free to explore any possible research direction. The template also includes a LaTeX folder containing style files and section headers, as well as simple plotting code. Generally, each run starts with a representative small-scale experiment relevant to the topic area.
The researchers explained: "Focusing on small-scale experiments is not a fundamental limitation of our method, but is simply a matter of computational efficiency and the computational limitations of our equipment."
Why is writing a paper important?
Given that the researchers' overall goal is to automate scientific discovery, why would they want AI Scientist to write papers like human scientists? Previous AI systems such as FunSearch and GNoME, for example, have produced impressive scientific discoveries in restricted fields, but they do not write papers.
The team believes it is crucial for AI Scientist to write scientific papers to disseminate its findings, for the following reasons. First, writing papers gives humans a highly interpretable way to benefit from what has been learned. Second, reviewing written papers within the framework of existing machine learning conferences makes it possible to standardize evaluation. Third, since the birth of modern science, scientific papers have been the main medium for disseminating research results.
Because a paper can use natural language and include plots and code, it can flexibly describe any type of scientific study and finding. Almost any other conceivable format is tied to a particular kind of data or type of science. Until a superior alternative emerges (or is perhaps invented by AI), the team believes that training AI Scientist to write scientific papers is critical to its integration into the wider scientific community.
Illustration: Preview of the "Adaptive Dual-Scale Denoising" paper, generated entirely autonomously by AI Scientist. (Source: Paper)
About the cost
The framework here is flexible enough to efficiently conduct research in various subfields of machine learning, including transformer-based language modeling, neural network learning dynamics, and diffusion modeling. The system is highly cost-effective, costing approximately $15 per paper, and produces conference-relevant papers, highlighting its ability to democratize research (increase its accessibility) and accelerate scientific progress.
The researchers’ preliminary qualitative analysis of AI Scientist suggests that the resulting papers can be broadly informative and novel, or at least contain ideas worthy of future research.
The actual amount of compute the team allocated to AI Scientist for experiments is also very small by current standards. Notably, most of the researchers' experiments, which generated hundreds of papers in a week, were run on a single 8×NVIDIA H100 node. If the search and filtering were scaled up substantially, higher-quality papers might be produced.
In this project, most of the cost of running AI Scientist came from LLM API calls for coding and paper writing. In comparison, the cost of running the LLM reviewer and the computational expense of conducting the experiments were negligible, owing to constraints the team imposed to keep overall costs down.
Of course, this cost breakdown may change in the future if AI Scientist is applied to other scientific fields or used for larger-scale computational experiments.
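As a rough back-of-the-envelope illustration of the figures above, the short Python sketch below treats the quoted figure of about $15 per paper as a flat per-paper cost. The function names are hypothetical, the paper count and budget in the usage lines are made-up inputs rather than figures from the research, and real costs would vary with the model and the length of each run.

```python
def total_cost_usd(n_papers: int, cost_per_paper_usd: float = 15.0) -> float:
    """Estimate total spend, assuming every paper costs the same flat amount."""
    if n_papers < 0:
        raise ValueError("n_papers must be non-negative")
    return n_papers * cost_per_paper_usd


def papers_within_budget(budget_usd: float, cost_per_paper_usd: float = 15.0) -> int:
    """Number of whole papers that fit in a budget at a flat per-paper cost."""
    if budget_usd < 0 or cost_per_paper_usd <= 0:
        raise ValueError("budget must be non-negative and cost must be positive")
    return int(budget_usd // cost_per_paper_usd)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the arithmetic.
    print(total_cost_usd(100))           # 100 papers at $15 each
    print(papers_within_budget(1000.0))  # whole papers affordable with $1,000
```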
Open vs. Closed Models
To quantitatively evaluate and improve the generated papers, the researchers first created and validated an automated paper reviewer. The results show that, although there is still considerable room for improvement, LLMs are able to produce fairly accurate reviews and achieve results comparable to humans on various metrics.
Illustration: Violin plots showing the distribution of scores that the AI Scientist reviewer assigned to generated papers across three areas and four base models. (Source: Paper)
Applying this reviewer to papers generated by AI Scientist allows the researchers to scale paper evaluation beyond manual inspection. They found that Sonnet 3.5 consistently produced the best papers, some of which even scored above the acceptance threshold of a standard machine learning conference, as judged by the automated reviewer.
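The thresholding step described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name, the paper labels, the scores and the threshold value are all hypothetical and are not taken from the research, and treating a score equal to the threshold as passing is a choice made for this sketch.

```python
from typing import Dict, List


def papers_above_threshold(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the names of papers whose reviewer score meets the threshold.

    A score equal to the threshold counts as passing (an assumption of
    this sketch). Names are returned in alphabetical order.
    """
    return sorted(name for name, score in scores.items() if score >= threshold)


if __name__ == "__main__":
    # Hypothetical reviewer scores, for illustration only.
    example_scores = {"paper_a": 3.0, "paper_b": 6.0, "paper_c": 5.5}
    print(papers_above_threshold(example_scores, threshold=6.0))
```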
However, the team sees no reason to expect a single model such as Sonnet 3.5 to maintain its lead. The researchers believe that all cutting-edge LLMs, including open models, will continue to improve, and that competition among LLMs will drive both their commoditization and their capabilities.
Illustration: Evaluating AI Scientist’s paper review process on ICLR 2022 OpenReview data using GPT-4o. (Source: Paper)
In this project, the researchers studied a variety of proprietary LLMs, including GPT-4o and Sonnet, and also explored open models such as DeepSeek and Llama-3. Open models were found to offer significant advantages, such as lower cost, guaranteed availability, greater transparency, and greater flexibility, albeit with slightly lower output quality.
In the future, the researchers aim to use the proposed discovery process to produce self-improving artificial intelligence in closed-loop systems using open models.
Future Directions
Immediate improvements to AI Scientist may include integrating vision capabilities to better handle charts and figures, incorporating human feedback and interaction to improve its output, and enabling AI Scientist to retrieve new data and models from the Internet to automatically expand the scope of its experiments, provided it is safe to do so.
Additionally, AI Scientist could follow up on its best ideas and even work directly on its own code in a self-referential way. In fact, most of the code for the project was written by Aider. Extending the framework to other scientific fields could further broaden its impact, paving the way for a new era of automated scientific discovery.
Crucially, future work should address reliability and hallucination issues, possibly through deeper automated validation of reported results. This can be achieved by directly linking the code and experiments, or by seeing if an automated verifier can independently reproduce the results.
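One simple form such an automated check could take is to rerun an experiment and compare the reproduced number with the one reported in the paper. The Python sketch below is only an illustration of that idea, not the researchers' method: `verify_reported_result`, the `rerun` callback and the 5% relative tolerance are hypothetical choices made for this example.

```python
import math
from typing import Callable


def verify_reported_result(
    reported: float,
    rerun: Callable[[], float],
    rel_tol: float = 0.05,
) -> bool:
    """Rerun an experiment and check the result against the reported value.

    `rerun` is any zero-argument callable that re-executes the experiment
    and returns the metric of interest. The default 5% relative tolerance
    is an arbitrary choice for this sketch.
    """
    reproduced = rerun()
    return math.isclose(reported, reproduced, rel_tol=rel_tol)


if __name__ == "__main__":
    # Hypothetical values: the lambdas stand in for real experiment runs.
    print(verify_reported_result(0.90, lambda: 0.91))  # close to the reported value
    print(verify_reported_result(0.90, lambda: 0.50))  # far from the reported value
```

A relative tolerance is used rather than exact equality because repeated training runs rarely produce identical numbers; a production verifier would likely need several reruns and a statistical test.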
Epilogue
AI Scientist marks the beginning of a new era of scientific discovery in machine learning: it brings the transformative advantages of AI agents to the entire research process of AI itself, and brings scientists closer to a world in which unlimited, affordable creativity and innovation can be unleashed on the world's most challenging problems.
Ultimately, "We envision a scientific ecosystem entirely powered by AI, including not just AI-driven researchers but also reviewers, area chairs, and entire conferences. However, we do not believe that the role of human scientists will weaken. As we adapt to new technologies and move up the food chain, the role of scientists will change," the researchers said in the paper.
While the current iteration of AI Scientist demonstrates a strong ability to innovate on top of proven ideas such as diffusion modeling or Transformers, it remains an open question whether such systems will ultimately be able to come up with truly paradigm-shifting ideas.
Will future versions of AI Scientist be able to propose ideas as impactful as diffusion modeling, or come up with the next Transformer architecture? Will machines eventually be able to invent concepts as fundamental as artificial neural networks or information theory?
"We believe that AI Scientist will be an excellent partner for human scientists, but only time will tell."
GitHub open source address: http://github.com/SakanaAI/AI-Scientist
Reference content:
http://sakana.ai/ai-scientist/
https://x.com/SakanaAILabs/status/1823178623513239992
https://mp.weixin.qq.com/s/-jjXBJAkdMEyl2JhRgwdaA