Quickly learn the key technical points of the InstructGPT paper: follow Li Mu to master the technology behind ChatGPT

王林

Apr 24, 2023 pm 04:04 PM

chatgpt paper

After ChatGPT took off, many technically minded readers began asking the same question: are there learning materials that explain the principles behind ChatGPT systematically? The question is hard to answer because OpenAI has not released a paper on ChatGPT.

However, OpenAI's blog post on ChatGPT tells us that ChatGPT uses the same method as its sibling model, InstructGPT. The differences are that InstructGPT is fine-tuned from GPT-3 while ChatGPT is fine-tuned from GPT-3.5, and that the two differ somewhat in how their data was collected.

Blog link: https://openai.com/blog/chatgpt/

The InstructGPT paper was released in March 2022, but OpenAI had already published a related blog post in January (see "What to do with GPT-3 nonsense? OpenAI: We re-trained it, the new version is more 'obedient'"). At that time, OpenAI stated that InstructGPT used reinforcement learning from human feedback (RLHF) to fine-tune GPT-3 so that the model's output better matches human preferences. The same approach carried over into the training of ChatGPT.

Paper link: https://arxiv.org/pdf/2203.02155.pdf

Because InstructGPT and ChatGPT have so much in common, a thorough understanding of the InstructGPT paper is of great benefit to anyone who wants to work in the direction of ChatGPT. This is why we highly recommend Li Mu's lecture.

Course address: https://jmq.xet.tech/s/2lec6b

Dr. Li Mu is a senior principal scientist at Amazon. He previously co-authored "Dive into Deep Learning" (known in Chinese as "Hands-on Deep Learning") with Aston Zhang and others. Over the past two years he has been sharing AI knowledge through videos and has produced close-reading sessions on dozens of papers. Many viewers have made a habit of reading papers closely along with him.

Dr. Li Mu's account on Bilibili is "Learn AI from Li Mu".

His walkthrough of InstructGPT runs 67 minutes in total and largely follows the order of the paper itself.

Readers of the ChatGPT blog post know that its technical principles can essentially be summarized in one diagram. A nearly identical diagram had already appeared in the InstructGPT paper (there are subtle differences between the two). While going through the abstract and introduction of the paper, Li Mu explains the three steps in the diagram in detail.

Technical schematic from the ChatGPT blog.

Technical schematic from the InstructGPT paper.
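The second step in the diagram depends on comparison data: labelers rank several model outputs for the same prompt, and each ranking is then expanded into pairwise comparisons for training the reward model. The following is a minimal sketch of that expansion, not the authors' code. The helper name `ranking_to_pairs` and the response labels are hypothetical, and the ranking is assumed to contain no ties.

```python
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Expand a best-to-worst ranking into (preferred, rejected) pairs.

    `ranked_responses` is assumed to be ordered from most to least
    preferred, with no ties. K items yield K * (K - 1) / 2 pairs.
    """
    # combinations() preserves input order, so the first element of each
    # pair always ranks higher than the second.
    return list(combinations(ranked_responses, 2))


# Hypothetical ranking of four outputs for one prompt (best first).
pairs = ranking_to_pairs(["D", "C", "A", "B"])
print(len(pairs))  # 6
print(pairs[0])    # ('D', 'C')
```

A single ranking of K outputs therefore produces many training comparisons at once, which is one reason ranking is an efficient way to use labeler time.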

In Chapter 3 of the paper, the InstructGPT authors first describe how they collected their data, and Li Mu reads through this part in detail. It is especially valuable from an engineering standpoint. As Li Mu puts it, if you have never done this kind of work (data labeling and so on) and need to hire people to label data for you, look at the appendix, which contains many templates that can be used directly. The authors even describe what the UI of their labeling website looks like, which is worth learning from.

Next, Li Mu focuses on the three models described in Chapter 3 (see Section 3.5, Models): the SFT (supervised fine-tuning) model, the RM (reward modeling) model, and the RL (reinforcement learning) model, including details such as the parameters and objective functions involved.
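To make the objective functions more concrete, here is a rough sketch of the two quantities at the core of the RM and RL stages as the paper defines them. The reward model is trained with a pairwise ranking loss, the mean of -log σ(r_w - r_l) over (preferred, rejected) pairs. The RL stage maximizes the reward model's score minus a penalty β·log(π_RL(y|x) / π_SFT(y|x)) that keeps the policy close to the SFT model. The function names and the plain-float inputs are assumptions for illustration. The sketch leaves out the PPO update itself and the paper's additional pretraining-mix term.

```python
import math


def pairwise_ranking_loss(reward_pairs):
    """Mean of -log(sigmoid(r_w - r_l)) over (preferred, rejected) rewards."""
    losses = []
    for r_w, r_l in reward_pairs:
        d = r_w - r_l
        # -log(sigmoid(d)) equals softplus(-d); this form avoids overflow
        # when d is a large negative number.
        losses.append(max(-d, 0.0) + math.log1p(math.exp(-abs(d))))
    return sum(losses) / len(losses)


def kl_penalized_reward(rm_score, logp_rl, logp_sft, beta):
    """Reward-model score minus beta * log(pi_RL(y|x) / pi_SFT(y|x)).

    `logp_rl` and `logp_sft` are the log-probabilities that the RL policy
    and the SFT model assign to the same sampled response.
    """
    return rm_score - beta * (logp_rl - logp_sft)


# Hypothetical reward-model scores for two comparison pairs.
print(pairwise_ranking_loss([(1.2, 0.3), (0.1, -0.4)]))
# Hypothetical score and log-probabilities for one sampled response.
print(kl_penalized_reward(rm_score=1.0, logp_rl=-1.0, logp_sft=-2.0, beta=0.5))
```

The loss shrinks as the reward model scores the preferred response further above the rejected one, and the penalty grows as the RL policy assigns the response a much higher probability than the SFT model does.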

Finally, Li Mu concludes that, technically speaking, InstructGPT is a very practical piece of work. It offers a recipe: given a large language model, how to use a modest amount of labeled data to quickly improve its performance in a domain you care about until it is usable in practice. It therefore gives anyone who wants to build products on generative models a workable approach.

Of course, as Dr. Li Mu says, scientific research advances step by step, and InstructGPT builds on earlier work, so anyone who wants to understand ChatGPT thoroughly will inevitably have to go back and read more papers. In earlier sessions, Li Mu also walked through the GPT, GPT-2, and GPT-3 papers in detail:

Course address: https://jmq.xet.tech/s/2lec6b
