
AI can prove 82% of the problems in mathematical databases. The new SOTA has been achieved, and it is still based on Transformer.

Apr 10, 2023 am 08:51 AM


Scientists have lately been keen on giving AI math lessons.

Now the Facebook team has joined in as well, proposing a new model that fully automates theorem proving and significantly outperforms the previous SOTA.

As mathematical theorems become more complex, proving them by human effort alone only gets harder.

Using computers to prove mathematical theorems has therefore become a research focus.

OpenAI previously proposed GPT-f, a model specialized for this task, which can prove 56% of the problems in Metamath.

The method proposed this time raises that figure to 82.6%.

The researchers also say the method takes less time and, compared with GPT-f, cuts compute consumption to one-tenth.

Could AI finally win its battle with mathematics this time?

Still a Transformer

The method proposed in the paper is an online training procedure based on Transformer.

It can be roughly divided into three steps:

First, pre-training on the mathematical proof libraries;

Second, fine-tuning the policy model on a supervised dataset;

Third, online training of the policy model and the judgment model.

Specifically, a search algorithm lets the model learn from existing mathematical proof libraries and then generalize to prove more problems.

Three proof environments are used: Metamath, Lean, and a proof environment developed by the team.

Put simply, these proof libraries translate ordinary mathematical language into a form similar to a programming language.
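As a small illustration of what "a form similar to a programming language" can look like, here is a sketch in Lean 4 syntax. The theorem name `add_comm_example` is made up for this example and does not come from the paper; `Nat.add_comm` is a lemma from Lean's core library.

```lean
-- Informal statement: for all natural numbers a and b, a + b = b + a.
-- `add_comm_example` is a hypothetical name chosen for this illustration.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  -- A single tactic step closes the goal using an existing library lemma.
  exact Nat.add_comm a b
```

The statement after the colon plays the role of the goal, and the line after `by` is a tactic, which is the kind of step an automated prover has to generate.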


Metamath’s main library is set.mm, which contains about 38,000 proofs based on ZFC set theory.

Lean is an interactive theorem prover from Microsoft Research, better known for efforts to build AI that can compete in the IMO. Its mathematical library aims to cover all of undergraduate mathematics, with these theorems formally proved in Lean.

The main goal of this research is to build a prover that automatically generates a suitable sequence of tactics to prove a problem.

To this end, the researchers proposed HyperTree Proof Search (HTPS), a proof search algorithm over hypergraphs inspired by MCTS.

MCTS stands for Monte Carlo Tree Search, which is often used to solve game-tree problems and became widely known through AlphaGo.

It works by randomly sampling the search space to find promising actions and then expanding the search tree based on those actions.
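To make the MCTS loop more concrete, here is a minimal sketch of plain MCTS with UCB1 selection on a toy problem. It is not the search used in the paper: the toy task (pick 4 bits, reward is the fraction of ones), the constant `DEPTH`, and all function names are assumptions made for illustration.

```python
import math
import random

DEPTH = 4  # hypothetical toy task: choose 4 bits, reward = fraction of ones


class Node:
    def __init__(self, state, parent=None):
        self.state = state        # tuple of bits chosen so far
        self.parent = parent
        self.children = {}        # action -> Node
        self.visits = 0
        self.total_reward = 0.0

    def is_terminal(self):
        return len(self.state) == DEPTH

    def untried_actions(self):
        if self.is_terminal():
            return []
        return [a for a in (0, 1) if a not in self.children]


def ucb1(child, parent_visits, c=1.4):
    # Mean reward plus an exploration bonus for rarely visited children.
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(state, rng):
    # Finish the episode with uniformly random actions.
    state = list(state)
    while len(state) < DEPTH:
        state.append(rng.choice((0, 1)))
    return sum(state) / DEPTH


def mcts(iterations=1000, seed=0):
    rng = random.Random(seed)
    root = Node(())
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.is_terminal() and not node.untried_actions():
            node = max(node.children.values(),
                       key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one unexplored child, if any.
        untried = node.untried_actions()
        if untried:
            action = rng.choice(untried)
            child = Node(node.state + (action,), parent=node)
            node.children[action] = child
            node = child
        # 3. Simulation: random playout from the current node.
        reward = rollout(node.state, rng)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return root


root = mcts()
# The most visited root action is the one the search considers best.
best_action = max(root.children, key=lambda a: root.children[a].visits)
print(best_action)
```

The four numbered comments correspond to the usual MCTS phases; the random rollout is the "random sampling" step, and the visit counts gathered during backpropagation decide which action is finally preferred.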

The idea adopted in this study is similar.

The proof search starts from the goal g, searches downward for tactics, and gradually grows into a hypergraph.

When an empty set appears under a branch, no subgoals remain there, which means a proof has been found for that branch.

Finally, during backpropagation, the node values and the total number of operations in the hypertree are recorded.
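The structure described above can be sketched as a small hypergraph in which each goal has several candidate tactics and each tactic leaves a set of subgoals. The sketch only shows how an empty set of subgoals closes a branch and how a full proof is read off; it leaves out the learned models and the search statistics. The goal and tactic names, the `graph` dictionary, and `extract_proof` are all hypothetical.

```python
# Hypothetical toy hypergraph: goal -> {tactic: [subgoals left by the tactic]}.
graph = {
    "g":  {"t1": ["g1", "g2"], "t2": ["g3"]},
    "g1": {"t3": []},        # empty set: t3 closes g1 outright
    "g2": {"t4": ["g4"]},
    "g3": {},                # no tactic known for g3 yet
    "g4": {"t5": []},        # empty set: t5 closes g4 outright
}


def extract_proof(goal, graph, seen=frozenset()):
    """Return a nested (goal, tactic, subproofs) tuple, or None if unproved.

    A goal is proved when at least one tactic has all of its subgoals
    proved; a tactic with no subgoals proves its goal immediately.
    """
    if goal in seen:         # guard against cyclic arguments
        return None
    seen = seen | {goal}
    for tactic, subgoals in graph[goal].items():
        subproofs = [extract_proof(sub, graph, seen) for sub in subgoals]
        if all(p is not None for p in subproofs):
            return (goal, tactic, subproofs)
    return None


proof = extract_proof("g", graph)
print(proof)
```

Here tactic `t2` fails because `g3` has no known tactic, while `t1` succeeds because both of its subgoals eventually reach an empty set.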


At this stage, the researchers rely on a policy model and a judgment (critic) model.

Tactics are sampled from the policy model, while the judgment model evaluates how likely the current goal is to lead to a proof.

The entire search algorithm is guided by these two models.

Both are Transformer models, and they share weights.

Next comes the online training stage.

In this process, the controller sends statements to asynchronous HTPS provers and collects training and proving data.

The provers then send training samples to the distributed trainers and periodically synchronize their copies of the models.


Experimental results

In testing, the researchers compared HTPS with GPT-f.

The latter is a mathematical theorem reasoning model previously proposed by OpenAI, also based on Transformer.

The results show that after online training the model can prove 82% of the problems in Metamath, far exceeding GPT-f's previous record of 56.5%.


In the Lean library, the model can prove 43% of the theorems, which is 38% higher than the previous SOTA. It has also proved IMO problems.


But it is not perfect yet.

For example, on one problem it did not find the simplest solution. The researchers said this was because of errors in the annotations.


One More Thing

The proof of the four-color theorem is one of the best-known examples of using computers to prove mathematical problems.

The four-color theorem is often described as one of the three major problems of modern mathematics. It states that any map can be colored with only four colors so that countries sharing a border have different colors.

Because proving this theorem requires an enormous amount of computation, no one managed a complete proof for more than 100 years after it was proposed.

It was not until 1976, after 1,200 hours and 10 billion judgments on two computers at the University of Illinois, that it was finally proved that four colors suffice for any map. The result caused a sensation across the mathematical community.

In addition, as mathematical problems become more complex, it becomes harder for humans to check whether a theorem is correct.

Recently, the AI community has gradually turned its attention to mathematical problems.

In 2020, OpenAI launched GPT-f, a model for mathematical theorem reasoning that can be used for automated theorem proving.

This method can complete 56.5% of the proofs in the test set, exceeding the then-SOTA model MetaGen-IL by more than 30%.

In the same year, Microsoft's Lean was also reported to tackle IMO problems, which means AI can work on problems it has never seen before.

Last year, after OpenAI added a verifier to GPT-3, its performance on math problems was significantly better than with the earlier fine-tuning approach, reaching 90% of the level of primary school students.

In January this year, a joint study from MIT, Harvard, Columbia University, and the University of Waterloo showed that their proposed model can do university-level mathematics.

In short, scientists are working hard to turn AI from a lopsided student into one that is good at both the humanities and the sciences.
