


AI doesn't really learn! New research reveals a way to decipher the black box of artificial intelligence
Artificial Intelligence (AI) has been developing rapidly, but to humans, powerful models are a "black box."
We don’t understand the inner workings of the model and the process by which it reaches its conclusions.
Recently, however, Professor Jürgen Bajorath, a cheminformatics expert at the University of Bonn, and his team made a major breakthrough.
They have designed a technique that reveals how some artificial intelligence systems used in drug research operate.
Their study shows that the artificial intelligence models examined predict drug effectiveness primarily by recalling existing data, rather than by learning specific chemical interactions.
In other words, these predictions are largely pieced together from memorized data, and the models do not actually learn the underlying chemistry!
Their research results were recently published in the journal Nature Machine Intelligence.
Paper address: https://www.nature.com/articles/s42256-023-00756-9
In the field of medicine, researchers are feverishly searching for effective active substances to fight disease: which drug molecules are the most effective?
Typically, these effective molecules (compounds) dock onto proteins, which act as enzymes or receptors that trigger specific physiological chains of action.
In special cases, certain molecules are also responsible for blocking adverse reactions in the body, such as excessive inflammatory responses.
The number of possible compounds is huge, and finding the one that works is like looking for a needle in a haystack.
So researchers first use AI models to predict which molecules will best dock and bind strongly to their respective target proteins. These drug candidates are then screened in more detail in experimental studies.
As artificial intelligence has developed, drug discovery research has increasingly adopted AI-related technologies.
For example, graph neural networks (GNNs) are suited to predicting how strongly a given molecule binds to a target protein.
A graph consists of nodes representing objects and edges representing relationships between nodes. In the graph representation of a protein-ligand complex, the edges of the graph connect protein or ligand nodes, representing the structure of a substance, or the interaction between a protein and a ligand.
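The graph representation described above can be sketched in a few lines of Python. This is only a toy illustration: the node names and the `Edge` class are hypothetical and are not taken from the paper. It shows how each edge can be labeled as a ligand edge, a protein edge, or a protein-ligand interaction edge, depending on which molecules its endpoints belong to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str  # "ligand", "protein", or "interaction"


# Hypothetical toy nodes: node name -> molecule the node belongs to.
nodes = {"L1": "ligand", "L2": "ligand", "P1": "protein", "P2": "protein"}


def edge_kind(a, b):
    """Classify an edge by the molecules its two endpoints belong to."""
    if nodes[a] == nodes[b]:
        return nodes[a]  # both ends inside the same molecule
    return "interaction"  # one end in the ligand, one in the protein


pairs = [("L1", "L2"), ("P1", "P2"), ("L2", "P1")]
edges = [Edge(a, b, edge_kind(a, b)) for a, b in pairs]

for edge in edges:
    print(edge.source, edge.target, edge.kind)
```

The edge labels are what later makes it possible to ask which kind of edge a model relies on most.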
GNN models use protein-ligand interaction graphs extracted from X-ray structures to predict ligand affinities.
Professor Jürgen Bajorath said that the GNN model is like a black box to us, and we have no way of knowing how it derives its predictions.
Professor Jürgen Bajorath works at the LIMES Institute of the University of Bonn, the Bonn-Aachen International Center for Information Technology, and the Lamarr Institute for Machine Learning and Artificial Intelligence.
How does artificial intelligence work?
Researchers from the Chemical Informatics Department of the University of Bonn, together with colleagues from the Sapienza University of Rome, analyzed in detail whether graph neural networks really learn the interactions between proteins and ligands.
The researchers analyzed a total of six different GNN architectures using their specially developed "EdgeSHAPer" method.
The EdgeSHAPer program can determine whether the GNN has learned the most important interactions between compounds and proteins, or made predictions through other means.
The scientists trained six GNNs using graphs extracted from the structures of protein-ligand complexes - where the compound's mode of action and the strength of its binding to the target protein are known.
They then tested the trained GNNs on other compounds and used EdgeSHAPer to analyze how the GNNs produced their predictions.
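As its name suggests, EdgeSHAPer attributes a prediction to individual edges using Shapley values. The sketch below illustrates the general idea of Monte Carlo Shapley estimation over edges. It is not the authors' implementation: `toy_model`, its weights, and the edge names are hypothetical stand-ins for a trained GNN and a real graph.

```python
import random


def toy_model(active_edges):
    """Hypothetical stand-in for a GNN: scores a graph from its active edges."""
    weights = {"lig_a": 2.0, "lig_b": 1.0, "prot_a": 0.5, "inter_a": 0.25}
    return sum(weights[e] for e in active_edges)


def edge_shapley(model, edges, n_samples=200, seed=0):
    """Estimate each edge's Shapley value by sampling random edge orderings."""
    rng = random.Random(seed)
    totals = {e: 0.0 for e in edges}
    for _ in range(n_samples):
        order = list(edges)
        rng.shuffle(order)
        active = set()
        previous = model(active)
        for e in order:
            # Marginal contribution of adding edge e to the edges seen so far.
            active.add(e)
            current = model(active)
            totals[e] += current - previous
            previous = current
    return {e: total / n_samples for e, total in totals.items()}


edges = ["lig_a", "lig_b", "prot_a", "inter_a"]
values = edge_shapley(toy_model, edges)
ranking = sorted(values, key=values.get, reverse=True)
print(ranking)
```

Because the toy model is purely additive, each edge's estimated value equals its weight; with a real GNN the contributions depend on which other edges are present, which is why many orderings are sampled.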
“If GNNs behave as expected, they need to learn the interactions between compounds and target proteins and make predictions by prioritizing specific interactions.”
However, according to the research team’s analysis, the six GNNs basically failed to do this. Most GNNs only learn some protein-drug interactions, focusing mainly on ligands.
Figure: experimental results for the six GNNs. The color-coded bars show the average proportion of protein, ligand, and interaction edges among the top 25 edges of each prediction, as determined with EdgeSHAPer.
The interactions, shown in green, are what the models should be learning, yet their proportion across the experiments is not high, while the orange bars representing ligand edges account for the largest proportion.
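The kind of summary shown in the figure can be reproduced in miniature as follows. This is a sketch under assumptions: the importance scores are made up, and the function simply ranks edges by the score it is given and reports the share of each edge type among the top k.

```python
from collections import Counter


def top_edge_proportions(edge_scores, k=25):
    """Share of protein, ligand, and interaction edges among the k top-scoring edges.

    edge_scores is a list of (edge_kind, importance) pairs.
    """
    top = sorted(edge_scores, key=lambda item: item[1], reverse=True)[:k]
    if not top:
        return {"protein": 0.0, "ligand": 0.0, "interaction": 0.0}
    counts = Counter(kind for kind, _ in top)
    return {
        kind: counts[kind] / len(top)
        for kind in ("protein", "ligand", "interaction")
    }


# Hypothetical importance scores for five edges of one prediction.
scores = [
    ("ligand", 0.9),
    ("ligand", 0.8),
    ("interaction", 0.5),
    ("protein", 0.4),
    ("protein", 0.1),
]
proportions = top_edge_proportions(scores, k=4)
print(proportions)
```

In this made-up example, ligand edges make up half of the top four edges, mirroring the ligand-heavy pattern the study reports.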
To predict the binding strength of a molecule to a target protein, the models primarily "remember" chemically similar molecules they encountered during training, together with their binding data, regardless of the target protein. These remembered chemical similarities essentially determine the prediction.
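To make "remembering" concrete, here is a minimal sketch of a predictor that does nothing but recall the most similar training ligand. It is an illustration, not the method used in the study: the structural features and affinity values are invented, and Tanimoto similarity on feature sets is assumed as the similarity measure. Note that the target protein never enters the calculation.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_by_memory(query, training):
    """Return the affinity of the training ligand most similar to the query."""
    _, affinity = max(training, key=lambda item: tanimoto(query, item[0]))
    return affinity


# Hypothetical training ligands: (feature set, measured affinity).
training = [
    (frozenset({"ring", "amide", "fluoro"}), 8.1),
    (frozenset({"ring", "ester"}), 5.3),
]
query = frozenset({"ring", "amide", "chloro"})
prediction = predict_by_memory(query, training)
print(prediction)
```

A model that behaves like this can score well whenever test compounds resemble training compounds, without having learned anything about protein-ligand interactions.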
This is reminiscent of the "Clever Hans effect": like the horse that appeared to be able to count but in fact inferred the expected answer from subtle cues in its companion's facial expressions and gestures.
This may mean that the so-called "learning ability" of GNNs is untenable and that the models' predictions are largely overrated, because predictions of the same quality can be made using chemical knowledge and simpler methods.
However, another phenomenon was also found in the study: as the potency of the test compound increases, the model tends to learn more interactions.
Perhaps by modifying the representation and training techniques, these GNNs can be further improved in the desired direction. However, the assumption that physical quantities can be learned from molecular graphs should generally be treated with caution.
"Artificial intelligence is not black magic."