The new model that challenges OpenAI is now available for free, with 40% of the computing power and performance approaching GPT-4

WBOY
Release: 2024-03-08 15:46:15

On Thursday, American AI startup Inflection AI officially released its new-generation large language model, Inflection-2.5.

According to reports, Inflection-2.5 combines powerful LLM technology with Inflection’s signature “empathy fine-tuning,” pairing high emotional intelligence with high IQ. It can retrieve factual information from the internet, and its performance is comparable to that of leading large models such as GPT-4 and Gemini.

Inflection-2.5 is now available for free to all Pi users on PC and in the iOS and Android apps. In a simple test, Heart of the Machine found that a gap with GPT-4 remains, but the model is still worth a try. Interested users can try it for themselves.

Link: https://pi.ai/talk

Notably, Inflection-2.5 achieves performance close to GPT-4’s while using only 40% of the compute that was used to train GPT-4.

Inflection AI says the new model makes significant progress in areas such as coding and mathematics, and that these advances translate into concrete improvements on key industry benchmarks, keeping Pi at the technological frontier. Pi also integrates world-class real-time web search, so users have access to high-quality breaking news and up-to-date information.

Inflection-2.5 vs GPT-4

Inflection-1 was trained with about 4% of the FLOPs used to train GPT-4, and across a range of “IQ-oriented” tasks its average performance was about 72% of GPT-4’s. Inflection-2.5 now reaches more than 94% of GPT-4’s average performance while using only 40% of GPT-4’s training FLOPs. As shown in the figure below, Inflection-2.5 improves significantly across the board, with the largest gains in STEM domain knowledge.
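
To make the arithmetic behind these figures explicit, here is a minimal Python sketch. It uses only the percentages quoted above, treating “about 4%,” “about 72%,” and “over 94%” as the round values 4, 72, and 94. The dictionary layout and the `compare` helper are illustrative assumptions, not anything published by Inflection AI.

```python
# Figures quoted in the article, each as a percentage of GPT-4's value.
# "about 4%", "about 72%" and "over 94%" are rounded here, so treat the
# outputs as rough estimates rather than measured results.
QUOTED = {
    "Inflection-1": {"train_flops_pct": 4, "avg_perf_pct": 72},
    "Inflection-2.5": {"train_flops_pct": 40, "avg_perf_pct": 94},
}


def compare(old: dict, new: dict) -> dict:
    """Compare two models whose figures are percentages of GPT-4's."""
    gain = new["avg_perf_pct"] - old["avg_perf_pct"]
    return {
        # How many times more training compute the newer model used.
        "compute_multiplier": new["train_flops_pct"] / old["train_flops_pct"],
        # Improvement in percentage points of GPT-4's average score.
        "gain_points": gain,
        # Share of the older model's remaining gap to GPT-4 that was closed.
        "gap_closed": gain / (100 - old["avg_perf_pct"]),
    }


summary = compare(QUOTED["Inflection-1"], QUOTED["Inflection-2.5"])
print(f"Training compute: {summary['compute_multiplier']:.0f}x Inflection-1")
print(f"Average performance: +{summary['gain_points']} points relative to GPT-4")
print(f"Gap to GPT-4 closed: {summary['gap_closed']:.0%}")
```

Read this way, the quoted numbers imply roughly ten times the training compute of Inflection-1 for a gain of 22 percentage points, which closes a little under four fifths of Inflection-1’s gap to GPT-4.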

The results of Inflection-2.5 on two different STEM exams - the Hungarian Mathematics Examination and the Physics Graduate Record Examination (GRE) - are as follows:

As shown in the table below, the study also evaluated Inflection-2.5 on the MMLU benchmark and GPQA Diamond benchmark. The MMLU benchmark covers 57 disciplines in STEM, humanities, social sciences, and more, effectively testing an LLM’s comprehensive knowledge capabilities, while the GPQA Diamond benchmark is an extremely difficult expert-level benchmark.

On the BIG-Bench-Hard benchmark, Inflection-2.5 improves on Inflection-1 by more than 10% and is comparable to GPT-4. The BIG-Bench-Hard benchmark mainly covers problems that are difficult for large language models to solve.

The research team also evaluated the model on the MT-Bench benchmark. However, it found that a large portion (nearly 25%) of the examples in the Reasoning, Mathematics, and Coding categories had incorrect reference solutions or flawed premises. The team therefore corrected these examples and reran the evaluation; the results are shown in the following table:

Evaluation results on the GSM8k and MATH benchmarks show that Inflection-2.5 improves significantly on Inflection-1 in math and coding capabilities:

To further test Inflection-2.5’s coding ability, the research team ran evaluations on two coding benchmarks, MBPP and HumanEval; the results are shown in the following table:

The research team also evaluated Inflection-2.5 on HellaSwag and ARC-C, two common-sense and science benchmarks. Judging from the results below, Inflection-2.5 performs strongly on both.

All of the evaluations above were run on the model that now powers Pi. Note, however, that the user experience may differ slightly because of web retrieval (the benchmarks above do not use web retrieval), the structure of the few-shot prompts, and other production factors.

Overall, Inflection-2.5 retains Pi’s “heart-centered” character and extremely high safety standards while becoming a more comprehensive and useful model.

Competition among large language models has recently grown fierce. Among the many technology companies involved, Mistral AI (Mistral Large) and Anthropic (Claude 3) stand out, with new models that approach the capabilities of GPT-4 and Gemini Ultra. Inflection-2.5, which appeared yesterday, seems to be joining the top tier.

A star startup in Silicon Valley, Inflection AI has an impressive pedigree. It was founded in 2022 by three co-founders: Mustafa Suleyman, a co-founder of DeepMind; Reid Hoffman, a co-founder of LinkedIn; and Karen Simonyan, a former DeepMind chief scientist.

In June last year, Inflection AI announced a US$1.3 billion financing round led by Microsoft, Nvidia, Reid Hoffman, Bill Gates, and former Google CEO Eric Schmidt. Inflection AI is currently the fourth-largest generative AI startup in the world.

source:51cto.com