Wenxin 4.0 performed well in the SuperBench evaluation, leading in many indicators

Apr 23, 2024 pm 01:37 PM
Wenxinyiyan api call

In March 2024, the Basic Model Research Center of Tsinghua University released its "SuperBench Large Model Comprehensive Capability Evaluation Report", a comprehensive evaluation of 14 influential models from China and abroad.

In this report, the strong performance of Wenxin 4.0 has attracted widespread attention. Its overall performance is close to that of the top international models, and it is steadily narrowing the gap with the world's leading models, making it the leading domestic model.

In the evaluation of human alignment ability, Wenxin 4.0 ranked first among domestic models. It also came out on top in the evaluations of Chinese reasoning and Chinese language ability, with a clear advantage over the other models. In the Chinese understanding evaluation in particular, Wenxin 4.0 scored 0.41 points higher than the second-placed GLM-4, demonstrating its strength in Chinese language processing.

In the mathematics evaluation under semantic understanding, Wenxin 4.0 and the Claude-3 model tied for first place worldwide, while the well-known GPT-4 series models followed closely behind in fourth and fifth place. The scores of the other models are mostly concentrated around 55 points, a significant gap behind the leading group.

In the evaluation of reading comprehension ability, Wenxin 4.0 also stood out. It surpassed GPT-4 Turbo, Claude-3, and GLM-4 to achieve the highest score.

In the security evaluation, the area enterprises are most concerned about, Wenxin 4.0 also performed very well. It scored 89.1 points, surpassing the GPT-4 series models and Claude-3 to rank first, while Claude-3 ranked only fourth in this evaluation.

The report also mentioned that since Wenxinyiyan made its public debut on March 16, 2023, its user base has grown rapidly and now exceeds 200 million users. Its API is also heavily used, with more than 200 million calls per day.
