Beyond the Nobel Prize? For the first time in the biological world, 'ChatGPT' has synthesized a new protein from scratch, and it has been published in the Nature sub-journal!

Apr 13, 2023, 09:43 AM
Model language learning

The application of artificial intelligence has greatly accelerated research on protein engineering.

Recently, a fledgling startup in Berkeley, California, once again made amazing progress.

Using ProGen, a deep learning language model for protein engineering similar to ChatGPT, scientists have for the first time synthesized proteins designed by an AI.

These proteins differ substantially from known ones, with sequence identity to natural proteins as low as 31.4%, yet they are as effective as natural proteins.

Now, this work has been officially published in the Nature sub-journal.

Paper address: https://www.nature.com/articles/s41587-022-01618-2

This experiment also shows that although natural language processing was developed for reading and writing language text, it can also learn some basic principles of biology.

Technology comparable to the Nobel Prize

The researchers said that this new technology may become more powerful than directed evolution, the Nobel Prize-winning protein design technology.

"It will revitalize the 50-year-old field of protein engineering by accelerating the development of new proteins that can be used in virtually everything from therapeutics to degrading plastics."

The company is called Profluent. It was founded by the former head of Salesforce AI research and has received US$9 million in start-up funding, which is being used to establish an integrated wet lab and recruit machine learning scientists and biologists.

In the past, it was very laborious to mine proteins from nature or to adapt proteins to a required function. Profluent's goal is to make this process effortless.

They did it.

Profluent founder and CEO Ali Madani

Madani said in an interview that Profluent has designed multiple families of proteins. These proteins function like exemplar natural proteins and are highly active enzymes.

The task is very difficult and was done in a zero-shot manner: no multiple rounds of optimization were performed, and no wet-lab data was provided at all.

The resulting proteins are highly active, of a kind that would usually take hundreds of years to evolve.

ProGen based on language model

As a kind of deep neural network, a conditional language model can not only generate semantically and grammatically correct, novel, and diverse natural language text, but can also use input control tags to guide style, topic, and more.

Similarly, the researchers developed ProGen, the subject of this article: a conditional protein language model with 1.2 billion parameters.

Specifically, ProGen is based on the Transformer architecture. It models the interactions between residues through a self-attention mechanism and can generate diverse artificial protein sequences across protein families, guided by input control tags.
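To make the idea of control tags concrete, here is a minimal Python sketch of how the input to a conditional protein language model might be assembled: tag tokens are placed in front of the amino acid tokens, so everything the model generates afterwards is conditioned on them. The tag names (`lysozyme_family_A`, `lysozyme_family_B`), the `<eos>` token, and the vocabulary layout are illustrative assumptions, not ProGen's actual tokenizer.

```python
# Hypothetical sketch: prepend control-tag tokens to an amino acid sequence.
# Tag names and vocabulary layout are assumptions made for illustration.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_vocab(control_tags):
    """Map every token (end marker, control tags, residues) to an integer id."""
    tokens = ["<eos>"] + [f"<{tag}>" for tag in control_tags] + list(AMINO_ACIDS)
    return {token: index for index, token in enumerate(tokens)}


def encode(control_tags, sequence, vocab):
    """Return token ids: control tags first, then residues, then <eos>."""
    ids = [vocab[f"<{tag}>"] for tag in control_tags]
    ids += [vocab[residue] for residue in sequence]
    ids.append(vocab["<eos>"])
    return ids


vocab = build_vocab(["lysozyme_family_A", "lysozyme_family_B"])
ids = encode(["lysozyme_family_A"], "MKV", vocab)
print(ids)  # [1, 13, 11, 20, 0]
```

At generation time, a model trained on such inputs would be given only the tag tokens as a prompt and asked to continue with residues one at a time.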

Generating artificial proteins using conditional language models

To create the model, the researchers fed it the amino acid sequences of 280 million different proteins and let it "digest" them for several weeks.

They then fine-tuned the model using 56,000 sequences from five lysozyme families and information about these proteins.

ProGen's algorithm is similar to that of GPT-3.5, the model behind ChatGPT. It learns the ordering rules of amino acids in proteins and how they relate to protein structure and function.

Soon, the model generated a million sequences.

The researchers selected 100 for testing based on their similarity to natural protein sequences and the naturalness of their amino acid "syntax" and "semantics."
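The article does not spell out how "naturalness" was scored. Purely as an illustration, the sketch below uses a toy bigram model with add-one smoothing (a stand-in for a large Transformer, not ProGen's method) to score candidates by their average per-residue log-likelihood and keep the top `k`. All sequences and function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def train_bigram(sequences):
    """Count residue-to-next-residue transitions in the training sequences."""
    pair_counts = Counter()
    prev_counts = Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    return pair_counts, prev_counts


def avg_log_likelihood(seq, model):
    """Average log-probability per transition, with add-one smoothing."""
    pair_counts, prev_counts = model
    total = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        prob = (pair_counts[(prev, nxt)] + 1) / (prev_counts[prev] + len(AMINO_ACIDS))
        total += math.log(prob)
    return total / max(len(seq) - 1, 1)


def select_top(candidates, model, k):
    """Keep the k candidates the model considers most 'natural'."""
    ranked = sorted(candidates, key=lambda s: avg_log_likelihood(s, model), reverse=True)
    return ranked[:k]


# Toy data (hypothetical): three "natural" sequences and three candidates.
natural = ["MKVLAA", "MKVLGA", "MKILAA"]
model = train_bigram(natural)
best = select_top(["MKVLAA", "WWWWWW", "MKILGA"], model, k=2)
print(best)  # ['MKVLAA', 'MKILGA']
```

The candidate made only of tryptophan is dropped because none of its transitions were seen in training, which is the same intuition as preferring sequences with natural-looking "syntax".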

Of these, 66 produced chemical reactions similar to those of the natural proteins that destroy bacteria in egg white and saliva.

In other words, these new proteins generated by AI can also kill bacteria.

The artificial proteins generated are diverse and well expressed in experimental systems

Going a step further, the researchers selected the five proteins that reacted most strongly and added them to samples of E. coli.

Two of these artificial enzymes were able to break down the bacterial cell wall.

Compared with hen egg white lysozyme (HEWL), their activity was comparable.

The researchers then used X-rays for imaging.

Although the amino acid sequences of the artificial enzymes differ from existing proteins by up to 30%, and the two are only 18% identical to each other, their shapes are not that different from those of natural proteins, and their functions are comparable.

Applicability of conditional language modeling to other protein systems

For a highly evolved natural protein, a single small mutation may be enough to stop it from working.

But in another round of screening, the researchers found that even AI-generated enzymes whose sequences were only 31.4% identical to any known protein still showed considerable activity and a similar structure.
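Figures such as 31.4% are percent sequence identity values. As a minimal sketch of that calculation, the Python function below assumes the two sequences have already been aligned to the same length, with `-` marking gaps, and counts matches over the positions where neither sequence has a gap. Other conventions for the denominator exist, the alignment step itself is not shown, and the example sequences are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over gap-free aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


print(percent_identity("MKVLA", "MKILA"))  # 80.0
print(percent_identity("MK-LA", "MKVLA"))  # 100.0
```

To report how far a designed protein is from anything known, one would take the maximum of this value over all natural sequences it is aligned against.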

Protein design, entering a new era

As you can see, the way ProGen works is very similar to ChatGPT.

By learning from massive amounts of data, ChatGPT can take MBA and bar exams and write college papers.

ProGen, in turn, learned to generate new proteins by learning the grammar of how amino acids combine in 280 million existing proteins.

In the interview, Madani said, "Just as ChatGPT learns human languages such as English, we are learning the language of biology and proteins."

"Artificially designed proteins perform much better than proteins inspired by evolutionary processes," said James Fraser, a co-author of the paper and professor of bioengineering and therapeutic sciences at the UCSF School of Pharmacy.

"Language models are learning aspects of evolution, but it is different from the normal evolutionary process. We now have the ability to tune the generation of these properties to obtain specific effects. For example, an enzyme that is incredibly thermostable, or prefers acidic environments, or doesn't interact with other proteins."

Salesforce Research developed ProGen back in 2020. It builds on a natural language processing technique that was originally developed to generate English text.

From previous work, researchers know that artificial intelligence systems can teach themselves grammar and word meanings, as well as other basic rules that make writing organized.

"When you train sequence-based models with large amounts of data, they are very powerful at learning structures and rules," said Dr. Nikhil Naik, director of artificial intelligence research at Salesforce Research and senior author of the paper. "They learn which words can appear together and how to combine them."

"Now that we have demonstrated ProGen's ability to generate new proteins and released it publicly, everyone can build on our research."

Lysozyme is a very small protein, with up to about 300 amino acids.

But with 20 possible amino acids, there are 20^300 possible combinations.

This is more than the number of all humans who have ever lived, multiplied by the number of grains of sand on Earth, multiplied by the number of atoms in the universe.
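The size of this number can be checked with a few lines of Python. The sketch below counts sequences of exactly 300 residues over a 20-letter alphabet and compares the result with the product mentioned above. The three population figures are rough, commonly quoted order-of-magnitude estimates introduced here as assumptions; they do not come from the article.

```python
import math

SEQUENCE_LENGTH = 300  # residues, the upper end quoted for lysozyme
ALPHABET_SIZE = 20     # standard amino acids

combinations = ALPHABET_SIZE ** SEQUENCE_LENGTH
num_digits = len(str(combinations))
log10_combinations = SEQUENCE_LENGTH * math.log10(ALPHABET_SIZE)

# Rough order-of-magnitude estimates (assumptions, not from the article).
HUMANS_EVER = 10 ** 11
SAND_GRAINS = 10 ** 19
ATOMS_IN_UNIVERSE = 10 ** 80
product = HUMANS_EVER * SAND_GRAINS * ATOMS_IN_UNIVERSE  # 10**110

print(num_digits)                     # 391
print(round(log10_combinations, 1))   # 390.3
print(combinations > product)         # True
```

Under these assumptions, 20^300 is about 10^390, which exceeds the product of roughly 10^110 by some 280 orders of magnitude.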

Given the near-infinite possibilities, it's truly remarkable that ProGen was able to design effective enzymes so easily.

"The ability to generate functional proteins from scratch, right out of the box, shows that we are entering a new era of protein design," said Dr. Ali Madani, founder of Profluent Bio and former research scientist at Salesforce Research.

"This is a versatile new tool available to all protein engineers, and we look forward to seeing it applied to therapeutics."

At the same time, researchers continue to improve ProGen, trying to break through more limitations and challenges.

One of them is its heavy reliance on data.

"We have explored ways to improve sequence design by adding structure-based information," Naik said. "We are also looking at how to improve the model's generation capabilities when you don't have much data on a particular protein family or domain."

It is worth noting that other startups are trying similar technologies, such as Cradle and Generate Biomedicines (from the biotechnology incubator Flagship Pioneering), but their studies have not yet been peer-reviewed.
