Li Feifei takes stock of the top ten AI highlights of the year: nuclear fusion, ChatGPT, and AlphaFold are on the list

Apr 18, 2023 am 10:49 AM

The explosion of artificial intelligence is distorting our sense of time.

Can you believe that Stable Diffusion is only 4 months old and ChatGPT has been around for less than a month?

To put it vividly: blink, and you will miss a brand-new industry.

In the AI field in 2022, large-scale generative models sprang up like mushrooms after rain, changing the landscape of the entire AI industry.

Moreover, these models are rapidly moving out of the laboratory and being applied in reality.

For example, LLM technology has inspired two emerging fields: decision-making agents (games, robots, etc.) and AI4Science.

Jim Fan, a student of Li Feifei, summarized the top ten AI highlights of 2022. Let's turn back the clock and see what breakthroughs the year brought.

1. Text-image generation

DALLE-2 is the first large-scale diffusion model to generate realistic, high-resolution images from arbitrary captions.

It launched an artistic revolution in AI, spawning many new applications, startups, and ways of thinking.

But DALLE-2 is protected behind the walls of OpenAI and is not open source.

After OpenAI, Stability AI, Runway, and the CompVis group at LMU took a heroic step and trained their own Internet-scale text2image model based on the "latent diffusion" algorithm. They called the model Stable Diffusion and open-sourced the code and weights.
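
To make the diffusion idea concrete, here is a minimal sketch of the forward noising step that diffusion models are trained to reverse, in the standard DDPM formulation. It is not Stable Diffusion's actual code: the schedule values, the function names, and the four-number "latent" are all illustrative assumptions. In a latent diffusion model, the noise is added to a compressed encoding of the image rather than to raw pixels.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, linearly spaced (illustrative values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
latent = [1.0, -0.5, 0.25, 0.0]  # hypothetical stand-in for an encoded image
slightly_noisy = add_noise(latent, 10, alpha_bars, rng)
mostly_noise = add_noise(latent, 999, alpha_bars, rng)
```

As t grows, alpha_bar_t shrinks toward zero, so the sample at the last step is almost pure noise. Generation runs the process in reverse, guided by the text prompt.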

The openness of Stable Diffusion has proved to be a game changer.

Now, many startups and research labs are creating new applications based on Stable Diffusion, and Stable Diffusion itself is continuously improved by the open source community.

Recently, Stable Diffusion has reached v2.1 and can run on a single GPU.

In addition, there are two text2image models from Google AI this year. Google AI has released neither the models nor an API, but the papers still offer many interesting insights.

Imagen

https://imagen.research.google

Parti

https://parti.research.google

It is a Transformer model that does not use diffusion.

2. Text-text generation

As everyone knows, I am talking about ChatGPT!

This is the only app in history to gain 1 million users in 5 days.

ChatGPT has also sparked a great deal of human creativity.

This list collects useful and imaginative ideas for using ChatGPT: https://github.com/f/awesome-chat

Both ChatGPT and GPT-3.5 use a new technology called RLHF ("Reinforcement Learning from Human Feedback").
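
One ingredient of RLHF, as described in OpenAI's InstructGPT paper, is a reward model trained on human comparisons between two responses. The sketch below shows the pairwise loss commonly used for that step. The function name and the example scores are hypothetical, and the source does not say how ChatGPT's own training is set up.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise ranking loss: -log(sigmoid(r_chosen - r_rejected))."""
    margin = reward_chosen - reward_rejected
    # Equivalent to log(1 + exp(-margin)), written to stay numerically
    # stable when |margin| is large.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical reward-model scores for two responses to the same prompt.
agrees_with_human = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_human = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
undecided = preference_loss(reward_chosen=0.0, reward_rejected=0.0)
```

The loss is small when the response that humans preferred already scores higher, and it grows when the ranking is wrong. The trained reward model then supplies the signal that the reinforcement learning step optimizes against.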

This also means that prompt engineering may soon disappear.

The popularity of ChatGPT has spawned a wave of new startups and competitors, such as Jasper Chat, YouChat, Replit’s Ghostwriter chat, and perplexity_ai.

These competitors provide such intuitive search methods that even Google executives are starting to sweat!

3. Text-Robot Model

How do we give GPT arms and legs so it can clean your messy kitchen?

Unlike NLP, robot models need to interact with the physical world.

This year, large pre-trained Transformers finally began to tackle the hardest problems in robotics!

VIMA

In October, my colleagues and I created a "robot GPT": a Transformer named VIMA.

It accepts any mix of text, images, and video as a prompt and outputs control actions for a robot arm.

Our model is called VIMA ("VisuoMotor Attention") and is completely open source.

A single agent can now handle visual goal reaching, one-shot imitation from video, novel concept grounding, visual constraints, and more, and it scales well with model capacity and data.

RT-1

Following a similar path to VIMA, researchers from Google AI released RT-1, a robot Transformer trained on 700 tasks and 130K human demonstrations.

This data was collected over 17 months by 13 robots, a literal army of steel!
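
A quick back-of-the-envelope calculation gives a feel for the collection rate behind those figures. The even split across robots and the 30-day month are simplifying assumptions, not details from the RT-1 paper.

```python
demonstrations = 130_000
robots = 13
months = 17

# Assumes the work was spread evenly across robots and months.
per_robot = demonstrations / robots            # 10000.0
per_robot_per_month = per_robot / months       # about 588.2
per_robot_per_day = per_robot_per_month / 30   # about 19.6, assuming 30-day months

print(f"{per_robot:.0f} demonstrations per robot")
print(f"{per_robot_per_month:.1f} per robot per month")
print(f"{per_robot_per_day:.1f} per robot per day")
```

That works out to roughly twenty human demonstrations per robot per day, sustained for nearly a year and a half.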

4. Text-Video

Essentially, a video is a series of images strung together over time, which creates the illusion of movement.

If we can do text2image, why not add a timeline to it for some extra fun?
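
The "images plus a timeline" view can be sketched as a simple data structure. The class and helper names below are hypothetical and purely illustrative; real text-to-video models work on tensors, not nested lists.

```python
from dataclasses import dataclass


@dataclass
class Video:
    frames: list  # each frame is a 2D grid of pixel values
    fps: float    # frames shown per second

    def duration_seconds(self):
        return len(self.frames) / self.fps

    def frame_at(self, seconds):
        # Map a timestamp to the frame shown at that moment,
        # clamping to the last frame past the end of the clip.
        index = min(int(seconds * self.fps), len(self.frames) - 1)
        return self.frames[index]


def solid_frame(value, height=2, width=3):
    """A tiny placeholder image filled with one pixel value."""
    return [[value] * width for _ in range(height)]


# 48 frames at 24 frames per second make a 2-second clip.
clip = Video(frames=[solid_frame(v) for v in range(48)], fps=24.0)
```

A text-to-image model only has to produce one such frame; a text-to-video model has to produce the whole sequence and keep consecutive frames consistent with each other.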

Currently, there are three major works in the text-to-video field, but none of them are open source.

Make-A-Video

The first is Meta AI's Make-A-Video, which generates video from text without requiring paired text-video data.

You can sign up for trial access here: https://makeavevideo.studio

Paper link: https://arxiv.org/abs/2209.14792

Imagen Video

Google AI’s Imagen Video: It uses a diffusion model to generate high-definition video, based on the Imagen static image generator.

Demo: http://imagen.research.google/video/

Paper link: https://arxiv.org/abs/2210.02303

Phenaki

Phenaki from Google AI: Generating variable-length videos from open-domain text descriptions.

Demo: https://phenaki.video

Paper link: https://arxiv.org/abs/2210.02399

5. Text-3D Modeling

From designing innovative products to creating fantastic visual effects in movies and games, 3D modeling is becoming the next blue ocean for text-to-X generative models.

Remarkably, many promising 3D generative models appeared in 2022. Here, Fan lists three.

DreamFusion

The first to appear is DreamFusion jointly developed by the Google AI research team and UC Berkeley.

Paper link: https://arxiv.org/pdf/2209.14988.pdf

This model performs text-to-3D synthesis using a 2D text-to-image diffusion model.

Based on the NeRF algorithm, DreamFusion can generate 3D models from given text.
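
NeRF represents a scene as a field of density and color, and renders each pixel by compositing samples along a camera ray. The sketch below shows that compositing step for a single grayscale ray. The function name and sample values are hypothetical, and it says nothing about how DreamFusion itself is implemented beyond its reliance on NeRF.

```python
import math


def render_ray(densities, colors, deltas):
    """Composite samples along one ray, front to back.

    weight_i = T_i * (1 - exp(-sigma_i * delta_i)), where T_i is the
    fraction of light that survives all samples in front of sample i.
    """
    transmittance = 1.0
    pixel = 0.0
    weights = []
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        pixel += weight * color
        transmittance *= 1.0 - alpha
    return pixel, weights


# Two samples of empty space, then an opaque surface, then something behind it.
pixel, weights = render_ray(
    densities=[0.0, 0.0, 1e6, 1e6],
    colors=[0.2, 0.2, 0.9, 0.1],
    deltas=[0.1, 0.1, 0.1, 0.1],
)
```

Empty samples contribute nothing, the first opaque sample takes almost all of the weight, and anything behind it is hidden. Because every step is differentiable, a loss computed on rendered 2D images can be pushed back into the 3D representation.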

The model can be viewed from any angle, relit under any lighting, and composited into any 3D environment.

Magic3D

The second result is two projects of the NVIDIA AI team, named GET3D and Magic3D.

GET3D paper link: https://nv-tlabs.github.io/GET3D/assets/paper.pdf

Magic3D paper link: https://arxiv.org/pdf/2211.10440.pdf

Trained using only 2D images, GET3D can generate 3D graphics with high-fidelity textures and complex geometric details.

This model allows users to instantly import their shapes into 3D renderers and game engines for subsequent editing.

Magic3D is similar to DreamFusion: it uses a text-to-image model to generate 2D images, which are then optimized into a volumetric NeRF (neural radiance field), and it refines the coarse model generated at low resolution into a fine model at high resolution.

The resulting Magic3D method can generate 3D objects faster than DreamFusion, according to the NVIDIA AI team.

Point-E

After DALL-E 2, launched at the beginning of the year, amazed everyone with its artistry, OpenAI released its latest generative model, "POINT-E", on Tuesday. It can generate 3D models directly from text.

Paper link: https://arxiv.org/pdf/2212.08751.pdf

While competitors such as Google's DreamFusion need multiple GPUs working for hours, POINT-E can generate a 3D model in minutes on a single GPU.

In testing, POINT-E could output a 3D result within seconds of receiving a prompt. The output also supports custom editing, saving, and other functions.

6. AI that can play "Minecraft"

"Minecraft" is an excellent game for testing the general intelligence of AI. First, it is an infinitely open-ended sandbox game that gives free rein to the player's creativity.

Secondly, the game has a player base of 140 million, which is twice the total population of the UK. With such a huge user base, there is an endless supply of game data for AI learning.

So, can AI use its imagination as much as humans can?

Jim Fan and colleagues developed "MineDojo", the first "Minecraft"-playing AI that can solve many tasks from natural-language prompts.

Paper link: https://arxiv.org/pdf/2206.08853.pdf

Fan's ultimate goal is to build an "embodied ChatGPT". Currently, the MineDojo platform is completely open source.

At the same time, Jeff Clune's team announced a model called Video Pre-Training (VPT), which can directly output keyboard and mouse movements.

Paper link: https://arxiv.org/pdf/2206.11795.pdf

VPT has a broader perspective, but it is not conditioned on language. In this respect, MineDojo and VPT complement each other.

7. AI Diplomat

CICERO, launched by Meta AI, is the first artificial intelligence agent to achieve human-level performance in the game "Diplomacy".

Paper link: https://www.science.org/doi/10.1126/science.ade9097

"Diplomacy" is a classic seven-player strategy game that can be described as a combination of the board game Risk, the card game poker, and the TV show Survivor. The game requires extensive natural-language negotiation to cooperate and compete with human players.

The emergence of CICERO shows that artificial intelligence can now persuade others and bluff.

DeepMind has also announced that it is developing its own Diplomacy-playing AI agent. So, what will happen if CICERO uses this AI model?

8. Audio-Text Model

Whisper is a large-scale open source speech recognition model released by OpenAI. It approaches human-level robustness and accuracy in English speech recognition.

Paper link: https://arxiv.org/pdf/2212.04356.pdf

Whisper was trained on 680,000 hours of audio data collected from the Internet. OpenAI emphasizes that Whisper's speech recognition capabilities have reached human level.
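
To put 680,000 hours in perspective, a short calculation converts it into years of continuous audio. The 365-day year is a simplifying assumption.

```python
training_hours = 680_000
hours_per_year = 24 * 365  # ignoring leap years

years_of_audio = training_hours / hours_per_year  # about 77.6
print(f"{years_of_audio:.1f} years of nonstop audio")
```

That is roughly a human lifetime of listening without a single pause.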

OpenAI has open-sourced Whisper. Is the goal to unlock more text tokens for training the much-anticipated GPT-4?

9. Nuclear fusion

DeepMind and the Swiss Federal Institute of Technology in Lausanne (EPFL) jointly developed the first deep reinforcement learning system for nuclear fusion, which can keep fusion plasma stable inside a tokamak.

Paper link: https://www.nature.com/articles/s41586-021-04301-9

This month, the U.S. Department of Energy announced a huge breakthrough: for the first time, humanity has achieved a net energy gain from a nuclear fusion reaction!

Humans have never reached this milestone before. Within our lifetimes, we may become a fusion civilization!

10. Transformer applied in biology

In 2021, AlphaFold kicked off the era of predicting 3D protein structures with Transformer-based models.

In July, DeepMind announced “Protein Universe”—expanding AlphaFold’s protein database to 200 million structures!

In addition, the NVIDIA AI research team has also expanded the BioNeMo large-scale language model framework to help biotechnology companies and researchers generate, predict and understand biomolecule data.

Video explanation: https://www.youtube.com/watch?v=PWcNlRI00jo&t=4399s

The above is Jim Fan's inventory of the top ten AI highlights of 2022. Of course, Fan also said that countless other exciting works have contributed to the advancement of artificial intelligence.

Every paper is a brick in the AI building, and all efforts should be celebrated.

However, Fan also emphasized at the end that as artificial intelligence systems become more and more powerful, we must be aware of potential dangers and risks and take measures to mitigate them.

Whether through careful training design, appropriate supervision, or new safeguards, the safety and ethics of artificial intelligence have become a topic that more and more AI experts are discussing.

There is no doubt that 2022 was an amazing year, full of miracles. What world-shaking breakthroughs will the next year bring? We are watching with you.

References:

https://twitter.com/drjimfan/status/1607746957753057280?s=46&t=OVM_4zdRW2rQwqLohMdPpw
