


OpenAI CEO says: Expanding scale is not the only path to progress, and the era of giant AI models may be coming to an end
April 18 news: OpenAI's chatbot ChatGPT is powerful enough to have sparked enormous interest and investment in artificial intelligence. However, the company's CEO, Sam Altman, believes the research strategy that produced it has run its course, and that future AI progress will require new ideas.
In recent years, OpenAI has made an impressive series of advances in language processing by scaling existing machine learning algorithms to previously unimaginable sizes. Its latest project, GPT-4, was reportedly trained on trillions of words of text using thousands of powerful computer chips, at a cost of more than $100 million.
However, Altman said that future advances in AI will no longer come from making models ever larger. "I think we're at the end of an era," he said at an MIT event. "In this era, models got bigger and bigger. Now, we'll make them better in other ways."
Altman's comments represent an unexpected turn in the race to develop and deploy new AI algorithms. Since ChatGPT launched in November, Microsoft has used the underlying technology to add a chatbot to its Bing search engine, and Google has launched a competitor called Bard. Many people are eager to try these new chatbots for work and personal tasks.
Meanwhile, a number of well-funded startups, including Anthropic, AI21, Cohere, and Character.AI, are pouring resources into building ever-larger algorithms in an effort to catch up with OpenAI. The initial version of ChatGPT was built on GPT-3, but users now also have access to a more powerful version backed by GPT-4.
Altman's statement also suggests that GPT-4 may be the last major achievement to emerge from OpenAI's strategy of making models bigger and feeding them more data. He did not, however, reveal what research strategies or techniques might replace the current approach. In the paper describing GPT-4, OpenAI says its estimates show diminishing returns from scaling up model size. Altman added that there are also physical limits to how many data centers the company can build and how quickly it can build them.
Nick Frosst, a Cohere co-founder who previously worked on artificial intelligence at Google, said Altman is right that "continuously increasing the size of the model is not a plan that works without limit." He believes that progress on GPT-4 and other transformer-type machine learning models (editor's note: GPT is short for Generative Pre-trained Transformer, i.e., a generative pre-trained model built on the transformer architecture) is no longer just a matter of scale.
Frosst added: "There are many ways to make transformers better and more useful, and many of them do not involve adding parameters to the model. New model designs or architectures, and further tuning based on human feedback, are directions many researchers are already exploring."
Each version of OpenAI's family of language models is an artificial neural network, software loosely inspired by the way neurons interact in the brain, which after training can predict the words that should follow a given string of text.
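To make that prediction step concrete, here is a minimal sketch of next-word prediction. It assumes the Hugging Face transformers library and the openly released GPT-2 weights, since OpenAI's newer models are reachable only through a paid API; it illustrates the general idea rather than OpenAI's actual stack.

```python
# Minimal next-token prediction sketch (assumes the Hugging Face
# `transformers` library and the public GPT-2 weights; illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The era of giant AI models may be"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, sequence, vocabulary)

# The scores at the last position rank every vocabulary entry as a
# candidate next token; the model's prediction is the top-scoring one.
next_token_id = int(logits[0, -1].argmax())
print(prompt + tokenizer.decode([next_token_id]))
```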
In 2019, OpenAI released GPT-2, an early model in the series, with up to 1.5 billion parameters; a parameter count is a measure of the number of adjustable connections between a network's artificial neurons. That was a very large number at the time, thanks in part to OpenAI researchers' discovery that scaling up made the model more coherent.
In 2020, OpenAI followed with GPT-3, a much larger model with 175 billion parameters. GPT-3's broad ability to generate poetry, emails, and other text convinced other companies and research institutions that they could scale their own AI models to similar or even greater sizes.
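For a sense of what these parameter counts mean in practice, the snippet below counts the parameters of a publicly downloadable model. It assumes the Hugging Face transformers library and the smallest public GPT-2 release; GPT-3 and GPT-4 weights are not publicly available, so the larger figures above come from OpenAI's own disclosures.

```python
# Counting a model's adjustable parameters directly (assumes the Hugging
# Face `transformers` library and the smallest public GPT-2 release).
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")
# Prints roughly 124 million for this variant; the largest GPT-2 release
# had 1.5 billion parameters, and GPT-3 has 175 billion.
```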
After ChatGPT debuted in November last year, meme makers and technology pundits speculated that GPT-4, when it arrived, would be a vastly more complex model with many more parameters. However, when OpenAI finally announced the new model, the company did not reveal how big it is, perhaps because size is no longer the only thing that matters. At the MIT event, Altman was asked whether training GPT-4 cost $100 million; he replied: "More than that."
Although OpenAI is keeping GPT-4's scale and inner workings secret, it likely no longer relies solely on scale for its performance gains. One possibility is that the company used a method called reinforcement learning from human feedback to enhance ChatGPT's capabilities: humans judge the quality of the model's answers, and those judgments are used to guide the model toward answers more likely to be rated as high quality.
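OpenAI has described this technique for earlier models, but the exact pipeline behind GPT-4 is not public. The sketch below illustrates just one core piece, training a reward model on pairwise human preferences; the network, data, and scale are all hypothetical stand-ins for illustration.

```python
# Illustrative reward-model step from RLHF-style training. Everything
# here (network size, embeddings, batch) is a hypothetical stand-in;
# OpenAI has not published GPT-4's training details.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRewardModel(nn.Module):
    """Maps an answer representation to a scalar quality score."""
    def __init__(self, dim: int = 16):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, answer: torch.Tensor) -> torch.Tensor:
        return self.score(answer).squeeze(-1)

reward_model = TinyRewardModel()
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Stand-ins for representations of two answers to the same prompt,
# where a human labeler preferred the first over the second.
chosen = torch.randn(4, 16)
rejected = torch.randn(4, 16)

# Pairwise (Bradley-Terry) loss: push preferred answers' scores above
# rejected ones'. The trained reward model is then used, via
# reinforcement learning, to steer the chatbot toward answers humans
# are likely to rate highly.
loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
loss.backward()
optimizer.step()
```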
GPT-4's extraordinary capabilities have alarmed many experts and sparked debate over AI's potential to transform the economy, as well as concerns that it could spread disinformation and eliminate jobs. A number of entrepreneurs and AI experts, including Tesla CEO Elon Musk, recently signed an open letter calling for a six-month pause on the development of models more powerful than GPT-4.
At the MIT event, Altman confirmed that his company is not currently developing GPT-5. He added: "An earlier version of the open letter claimed that OpenAI was training GPT-5. In fact, we are not, and we won't be for some time."