What is the instruction learning behind ChatGPT? PSU publishes its first comprehensive review of 'Instructional Learning'

Apr 07, 2023 pm 07:51 PM

Task semantics can be represented by a set of input-output examples or by a textual instruction. Traditional machine learning methods for natural language processing (NLP) rely mainly on the availability of large-scale, task-specific sets of labeled examples.

But two problems arise. First, collecting task-specific labeled examples does not suit scenarios where the task is too complex or too expensive to annotate, or where the system must handle new tasks immediately. Second, it is not user-friendly: end users may prefer to describe the task before using the system rather than supply a set of examples.

As a result, the community has become increasingly interested in a new supervision paradigm for NLP: learning from task instructions. Despite impressive progress, the community still faces some common issues.

This article attempts to summarize the current research on instruction learning from the following aspects:

(1) What is a task instruction, and what types of instructions exist?

(2) How to model instructions?

(3) What factors affect and explain the execution of instructions?

(4) What challenges remain in instruction learning?

To our knowledge, this is the first comprehensive survey of textual instructions.

Paper address: https://arxiv.org/pdf/2303.10475v2.pdf

1 Introduction

One of the goals of artificial intelligence is to build systems that can understand and solve new tasks in general. Labeled examples, the mainstream task representation, are unlikely to be widely available and may not exist at all. So are there other task representations that can contribute to task understanding? Task instructions provide another dimension of supervision for expressing task semantics, and an instruction often contains more abstract and comprehensive knowledge of the target task than a single labeled example.

Instruction learning is inspired by how humans typically learn new tasks. For example, a child can solve a new mathematical task well by learning from its instructions and a few examples. This new learning paradigm has recently attracted considerable attention from the machine learning and NLP communities.

As shown in Figure 1, when task instructions are available, systems can be built quickly to handle new tasks, especially when task-specific annotations are scarce.

When it comes to task instructions, most of us first associate the concept with prompts: short templates that reformat a new input into a language modeling problem so that a pre-trained language model (PLM) can be queried for a response. Although prompts are ubiquitous in text classification, machine translation, and other tasks, they are just a special case of instructions. This article provides a comprehensive and broader view of instruction-driven NLP research. Specifically, we try to answer the following questions:

  • What are task instructions and what types of instructions exist?
  • Given a task instruction, how can it be encoded to help complete the target task?
  • What factors (such as model size and number of tasks) affect the performance of instruction-driven systems, and how can better instructions be designed?
  • What applications can instruction learning bring?
  • What challenges exist in instruction learning, and what are the future directions?

To our knowledge, this is the first paper to survey learning from textual instructions. Compared with existing surveys that focus on instructions in a specific context, such as prompts, input-output demonstrations, or reasoning, we provide a broader perspective that connects the different research in this field in an organized way. We hope this article presents a better story of instruction learning and attracts more colleagues to this challenging artificial intelligence problem. We have also published a corresponding reading list for this survey.

2 Basic knowledge

For instruction learning, the goal is to drive the system to produce the output for a given input by following instructions. A dataset therefore consists of three elements:

Input (X): the input of the instance; it can be a single piece of text (as in sentiment classification) or a group of texts (as in textual entailment, question answering, etc.).

Output (Y): the output of the instance; in a classification problem it can be one or more predefined labels; in a text generation task it can be any open-form text.

Template (T): a text template that tries to express the meaning of the task on its own, or that acts as a bridge between X and Y. T on its own may not yet be a complete instruction.

3 What is a task instruction?

Various types of textual instructions have been used in previous zero-shot and few-shot NLP tasks, such as prompts, Amazon Mechanical Turk (MTurk) instructions, instructions supplemented by demonstrations, and chain-of-thought explanations. Different instructions were originally designed for different goals (e.g., MTurk instructions were created to be understood by human annotators, while prompts were created to steer PLMs). In this section, as shown in Figure 2, we first summarize these instructions into three categories that use different combinations of T, X, and Y: entailment-oriented, PLM-oriented, and human-oriented instructions.

3.1 I = T + Y: Entailment-oriented instructions

A traditional solution for classification tasks is to convert the target labels into indices and let the model decide which index an input belongs to. This paradigm focuses on encoding the semantics of the input while losing the semantics of the labels. So that a system can recognize new labels without relying on a large number of labeled examples, Yin et al. propose building a hypothesis for each label; deriving the truth value of a label then becomes determining the truth value of its hypothesis. As shown in Table 1, this approach builds the instruction I by combining the template T with a label Y, so that each target label Y is explained by its own hypothesis. Because this paradigm naturally fits the format of textual entailment (TE), where the task input and the instruction can be viewed as premise and hypothesis respectively, these instructions are called "entailment-oriented instructions."

The entailment-oriented instruction learning method has the following four advantages:

(1) It preserves label semantics, so that input encoding and output encoding receive equal attention when modeling the input-output relationship;

(2) It yields a unified reasoning process, textual entailment, for handling a variety of NLP problems;

(3) It creates the opportunity to leverage indirect supervision from existing TE datasets, so that a pre-trained TE model can be expected to work on these target tasks without task-specific fine-tuning;

(4) It extends the original closed-set label classification problem to an open-domain, open-form label recognition problem with few or even zero examples for each class.

Therefore, it is widely used in various few-shot and zero-shot classification tasks, such as classifying topics, sentiments, stances, entity types, and entity relations.
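
The construction of I from T and Y can be made concrete with a small sketch. Everything named below is an assumption for illustration only: the template string, the label set, and in particular `toy_entailment_score`, which is a crude word-overlap stand-in for a real pre-trained TE model and is not the method of Yin et al.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntailmentPair:
    premise: str     # the task input X
    hypothesis: str  # the instruction I, built from template T and label Y
    label: str       # the candidate label Y that the hypothesis explains


def build_pairs(x, template, labels):
    """Create one premise-hypothesis pair per candidate label."""
    return [EntailmentPair(x, template.format(label=y), y) for y in labels]


def toy_entailment_score(pair):
    """Hypothetical stand-in for a TE model: share of hypothesis words found in the premise."""
    premise_words = set(pair.premise.lower().split())
    hypothesis_words = pair.hypothesis.lower().rstrip(".").split()
    hits = sum(word in premise_words for word in hypothesis_words)
    return hits / len(hypothesis_words)


def classify(x, template, labels, scorer=toy_entailment_score):
    """Pick the label whose hypothesis is scored as most entailed by the input."""
    return max(build_pairs(x, template, labels), key=scorer).label


# Assumed template and labels, chosen only for this example.
TEMPLATE = "This text is about {label}."
LABELS = ["sports", "politics", "health"]

text = "The team won the final match and sports fans celebrated all night."
pairs = build_pairs(text, TEMPLATE, LABELS)
predicted = classify(text, TEMPLATE, LABELS)
```

Because the labels only enter through the hypothesis text, a new label can be added by extending `LABELS`, with no change to the scoring function; in practice `scorer` would be replaced by a model trained on TE data.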

3.2 I = T + X: PLM-oriented instructions (e.g., prompts)

Prompts are representative of PLM-oriented instructions. A prompt is usually a short statement prepended to the task input (prefix prompt) or a cloze question template (cloze prompt). It is mainly used to query intermediate responses, which can be further converted into final answers, from pre-trained language models (PLMs).

Because prompted input matches the pre-training objective of the PLM (for example, cloze-style input matches the masked language modeling objective), it helps remove the dependence on traditional supervised fine-tuning and greatly reduces the cost of manual annotation. As a result, prompt learning has achieved impressive results on a large number of few-shot and zero-shot NLP tasks, such as question answering, machine translation, sentiment analysis, textual entailment, and named entity recognition.
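
As a minimal sketch of the two prompt shapes described above, the helpers below only build strings; no model is called. The template wording, the `[MASK]` token, and the word-to-label mapping (often called a verbalizer, a term not used in this article) are assumptions made for the example.

```python
from typing import Optional


def prefix_prompt(template: str, x: str) -> str:
    """Prefix prompt: a short statement placed in front of the task input."""
    return f"{template} {x}"


def cloze_prompt(template: str, x: str, mask_token: str = "[MASK]") -> str:
    """Cloze prompt: the input is embedded in a fill-in-the-blank template."""
    return template.format(x=x, mask=mask_token)


# Assumed mapping from the word a model fills in (the intermediate response)
# to the final task answer.
WORD_TO_LABEL = {"great": "positive", "terrible": "negative"}


def to_label(filled_word: str) -> Optional[str]:
    """Convert an intermediate response into a final answer, if it is known."""
    return WORD_TO_LABEL.get(filled_word.lower())


translation_query = prefix_prompt("Translate English to French:", "Good morning.")
sentiment_query = cloze_prompt("{x} It was {mask}.", "I loved this movie.")
```

A masked language model would be asked to fill the masked slot in `sentiment_query`, and `to_label` would then turn the filled word into the sentiment label.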

3.3 Human-oriented instructions

Human-oriented instructions essentially refer to the instructions used for crowdsourcing on human annotation platforms (such as Amazon MTurk instructions). Unlike PLM-oriented instructions, human-oriented instructions are usually human-readable, descriptive, paragraph-style, task-specific text consisting of a task title, category, definition, things to avoid, and so on. Human-oriented instructions are therefore more user-friendly and can ideally be applied to almost any complex NLP task.
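
To illustrate the paragraph-style structure just described, here is a small sketch of how such an instruction might be stored and rendered in front of a task input. The field names follow the components listed above; the `render` layout, the `Input:`/`Output:` markers, and the example task are assumptions, not a format prescribed by the survey.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnnotationInstruction:
    title: str
    category: str
    definition: str
    things_to_avoid: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Lay the instruction out as readable, paragraph-style text."""
        lines = [
            f"Title: {self.title}",
            f"Category: {self.category}",
            f"Definition: {self.definition}",
        ]
        if self.things_to_avoid:
            lines.append("Things to avoid:")
            lines.extend(f"- {item}" for item in self.things_to_avoid)
        return "\n".join(lines)


def build_model_input(instruction: AnnotationInstruction, x: str) -> str:
    """Concatenate the rendered instruction with the task input X."""
    return f"{instruction.render()}\n\nInput: {x}\nOutput:"


# Hypothetical task used only for this example.
instruction = AnnotationInstruction(
    title="Question generation",
    category="Text generation",
    definition="Write a question that can be answered from the given passage.",
    things_to_avoid=["Questions that can be answered without reading the passage."],
)
model_input = build_model_input(instruction, "The Nile flows north into the Mediterranean.")
```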

4 How to model instructions?

In this section, we summarize several of the most popular modeling strategies for instruction learning. Overall, the paper introduces four modeling schemes. For early machine learning-based systems, (1) semantic parser-based strategies were a common way to encode instructions. With the emergence of neural networks and pre-trained language models, (2) prompt template-based and (3) prefix instruction-based instruction learning models became the two favored paradigms. Recently, (4) hypernetwork-based methods have also attracted growing interest.

5 Application

5.1 Human-computer interaction

Textual instructions can naturally be regarded as a method of human-computer interaction. Much previous work has used natural language instructions to "instruct" computers to perform a variety of real-world tasks.

For non-NLP (multimodal) tasks, most work focuses on environment-grounded language learning, that is, driving an agent to associate natural language instructions with the environment and react accordingly, for example by selecting mentioned objects from images or videos, following navigation instructions, drawing the corresponding traces on a map, playing football or card games based on given rules, generating real-time sports broadcasts, controlling software, and querying external databases. At the same time, instructions are also widely used to communicate with systems that solve NLP tasks, such as following instructions for manipulating strings, classifying emails based on a given explanation, and text-to-code generation.

In recent years, a growing body of research has designed the human-computer communication process in an iterative and modular manner. For example, Li et al. built a system to help users with daily tasks (e.g., ordering coffee or requesting an Uber). Thanks to a user-friendly graphical interface, the system can iteratively ask questions about the task, and users can keep refining their instructions to avoid unclear descriptions or vague concepts. Similarly, Dwivedi-Yu et al. proposed a benchmark for iteratively guiding a PLM to improve a text, where each iteration uses only a short instruction with a precise purpose (e.g., "simplify text" or "make text neutral"). In addition, Chakrabarty et al. built a collaborative poetry writing system in which users can start with an ambiguous instruction (e.g., "Write a poem about cakes") and then, by observing the model's intermediate output, gradually refine the instruction with more details (e.g., "Contains the word 'chocolate'"). Meanwhile, Mishra and Nouri proposed a biography generation system that gradually collects the necessary personal information from the user (by asking questions that guide the user in a conversational setting) and ultimately generates a paragraph-style biography. Because non-expert users have difficulty writing complete instructions in one attempt, adopting an iterative and modular design paradigm in instruction-based artificial intelligence systems can guide users to enrich task instructions step by step, which effectively reduces the cognitive burden on users and makes the system more user-oriented. The paper highlights the importance of this branch of work given its practical value.

5.2 Data and feature augmentation

Task instructions are considered an indirect source of supervision, and they sometimes contain superficial and arbitrary rules. These rules are also called labeling functions and can be applied directly to annotation (e.g., the sentence "a very fair price" is positive in sentiment because "the word price is directly preceded by fair"). Some existing work therefore uses instructions as distant supervision to perform data or feature augmentation. For example, Srivastava et al. use semantic parsers to convert natural language explanations into logical forms and apply them to all instances in the dataset to generate additional binary features. Wang et al. used label explanations to automatically annotate a raw corpus and trained a classifier on the resulting noisy data. Beyond direct augmentation, Su et al. further used task instructions to enrich model representations and achieve strong cross-task generalization. Specifically, they trained an embedding model (a single encoder) on diverse instruction datasets with contrastive learning and then used the model to generate instruction-based, task-specific representations for unseen downstream tasks.
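
The "fair price" rule quoted above can be written out as a labeling function. This is a hand-written sketch under stated assumptions: the tokenization is a simple whitespace split with punctuation stripped, the three-sentence corpus is invented for the example, and the function is coded by hand rather than produced by a semantic parser as in Srivastava et al.

```python
from typing import List, Optional


def fair_precedes_price(sentence: str) -> bool:
    """Labeling function: True if the word 'price' is directly preceded by 'fair'."""
    tokens = [token.strip(".,!?;:").lower() for token in sentence.split()]
    return any(
        previous == "fair" and current == "price"
        for previous, current in zip(tokens, tokens[1:])
    )


def weak_label(sentence: str) -> Optional[str]:
    """Noisy annotation: 'positive' when the rule fires, otherwise abstain."""
    return "positive" if fair_precedes_price(sentence) else None


# Invented corpus, used only to show the rule being applied to every instance.
corpus: List[str] = [
    "a very fair price",
    "The price was not fair at all.",
    "Fair price, slow service.",
]

# One additional binary feature per instance (feature augmentation).
features = [int(fair_precedes_price(sentence)) for sentence in corpus]

# Automatically assigned noisy labels (data augmentation).
labels = [weak_label(sentence) for sentence in corpus]
```

The third sentence shows why such labels are noisy: the rule fires even though the sentence as a whole is mixed in sentiment.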

5.3 General language models

According to the definition of artificial general intelligence (AGI), a "general model" is usually a system that can perform different tasks and scale in changing environments, far beyond its creators' original expectations. Specific to the NLP domain, a general language model should be an excellent multi-task assistant that can proficiently handle a variety of real-world NLP tasks and different languages in a completely zero-shot or few-shot manner. Since much existing work demonstrates the surprising cross-task generalization ability gained from using instructions, instructions are likely to be a breakthrough toward this ultimate goal.

It is worth noting that two recent prominent applications of instructions, InstructGPT and ChatGPT, also indicate a big step toward building general language models. However, unlike other works that mainly adopt instruction learning, ChatGPT also adopts other components, such as reinforcement learning from human feedback (RLHF). While the answer to "which component contributes more to ChatGPT's excellent results" remains unclear and requires further investigation, we introduce some recent work to highlight the critical role of instruction learning. For example, Chung et al. conducted extensive experiments to evaluate the alignment of PaLM with human preferences. They found that, even without any human feedback, instruction fine-tuning significantly reduced the toxicity of PaLM's open-ended generation, such as gender and occupational bias. In addition, some other work has used creative instructions alone, rather than human feedback, and achieved notable cross-task results. Although ChatGPT still has many unsatisfactory aspects and is still far from a general language model, we hope that the goal of AGI can continue to be advanced through the adoption and development of more powerful techniques, including instruction learning.
