Human-computer interactive dialogue driven by large models

Apr 11, 2023 pm 07:27 PM

Introduction: Dialogue technology is one of the core capabilities of digital human interaction. Starting from Baidu's PLATO-related R&D and applications, this talk discusses the impact of large models on dialogue systems and some of the opportunities they open up for digital humans. The title of the talk is: Human-computer interactive dialogue driven by large models.

The talk covers the following points:

  • Overview of the dialogue system
  • Baidu PLATO and related technologies
  • Applications, challenges and prospects of large dialogue models

1. Overview of the dialogue system

1. Overview of the dialogue system

In daily life, we often come into contact with task-oriented dialogue systems, for example when asking a mobile assistant to set an alarm or a smart speaker to play a song. The technology for this kind of vertical, domain-specific dialogue is relatively mature, and the system design is usually modular, with modules for dialogue understanding, dialogue management, and natural language generation.

The general flow of a traditional task-oriented dialogue is as follows: the user inputs a sentence; the natural language understanding module parses it into an intent and slot-value pairs, where the slots are predefined; the dialogue management module tracks the state across multiple turns and interacts with external databases to decide the system action; and the dialogue generation module then produces the reply that is returned to the user.
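
The modular flow described above can be sketched in a few lines. This is only a toy illustration, not how any production assistant works: the intents, regular expressions, action names and reply templates are all hypothetical, the understanding step would normally be a trained model rather than pattern matching, and the external database lookup is omitted.

```python
import re

# Hypothetical intents and slot patterns (illustration only).
INTENT_PATTERNS = {
    "set_alarm": re.compile(r"\balarm\b.*?\b(?P<time>\d{1,2}(:\d{2})?\s?(am|pm))", re.I),
    "play_song": re.compile(r"\bplay\s+(?P<song>.+)", re.I),
}

def understand(utterance):
    """NLU: map an utterance to an intent and predefined slot-value pairs."""
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            slots = {k: v.strip() for k, v in match.groupdict().items() if v}
            return intent, slots
    return "unknown", {}

def decide(state, intent, slots):
    """Dialogue management: track the multi-turn state and choose a system action."""
    state.setdefault("history", []).append((intent, slots))
    if intent == "set_alarm":
        return "confirm_alarm", slots
    if intent == "play_song":
        return "start_playback", slots
    return "ask_clarification", {}

def generate(action, slots):
    """NLG: render the chosen action as a reply using fixed templates."""
    templates = {
        "confirm_alarm": "Alarm set for {time}.",
        "start_playback": "Playing {song}.",
        "ask_clarification": "Sorry, could you rephrase that?",
    }
    return templates[action].format(**slots)

def respond(state, utterance):
    """Run one turn through understanding, management and generation."""
    intent, slots = understand(utterance)
    action, args = decide(state, intent, slots)
    return generate(action, args)

state = {}
print(respond(state, "Set an alarm for 7:30 am"))  # Alarm set for 7:30 am.
print(respond(state, "Play Yesterday"))            # Playing Yesterday.
```

Because every slot has to be declared in advance, a pipeline like this only covers the domains it was designed for, which is the limitation that open-domain systems try to remove.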

In recent years, a lot of research has been done on open-domain dialogue, where the system can chat about any topic without being restricted to a particular field. Representative works include Google Meena, Meta Blender, and Baidu PLATO. Unlike traditional modular dialogue systems, these end-to-end systems generate a reply directly from the dialogue context.

2. End-to-end dialogue generation - new opportunities for dialogue systems

An end-to-end dialogue system can be built on an RNN, LSTM or Transformer. The network architecture mainly consists of two parts: an encoder and a decoder.

The encoder encodes the dialogue context into a vector, capturing the content of the dialogue.

The decoder generates the reply based on the dialogue vector and the previous hidden state. The training corpus is mainly human-human dialogue; comments extracted from public social media forums (Weibo, Tieba, Twitter, etc.) can serve as approximate dialogue data. The training objective is mainly to minimize the negative log-likelihood.
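
For reference, the negative log-likelihood objective can be written out as follows. The notation is an assumption introduced here, not taken from the talk: c is the dialogue context, r = (r_1, ..., r_T) is the target reply, and θ denotes the model parameters.

```latex
\mathcal{L}_{\mathrm{NLL}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(r_t \mid c, r_{<t}\right)
```

Each term is the log-probability the decoder assigns to the next reply token given the context and the tokens generated so far.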

3. Challenges facing open domain dialogue

Large models trained on large corpora can already produce fairly coherent responses, but many problems remain.

The first problem is that the content is relatively empty and lacks information. The model's replies are brief and have no substance, which easily reduces the user's willingness to keep chatting.

Another problem is knowledge abuse: some of the details in the model's replies are wrong or fabricated.

2. Baidu PLATO

Baidu PLATO has done some technical exploration on the above two types of problems.

To address empty content, a pre-trained dialogue generation technique based on discrete latent variables is proposed, which generates open-domain replies that are both reasonable and diverse. To address knowledge abuse, a weakly supervised dialogue generation model that incorporates knowledge is proposed, which alleviates the problem to a certain extent and improves dialogue richness and knowledge accuracy.

1. Open domain dialogue "one-to-many" problem

Why do dialogue models produce "safe replies" with empty content?

Essentially, open-domain dialogue is a one-to-many problem. For a given dialogue context there are usually many reasonable replies: people with different backgrounds and experiences, in different scenarios, may respond differently. Neural network training, however, usually assumes a one-to-one mapping, so what the model learns is the average of these replies, such as "very good" or "hahaha", which are safe but uninformative.

2. PLATO-1 latent space dialogue generation model

PLATO-1 proposes modeling the one-to-many relationship in dialogue with discrete latent variables.

This involves two tasks: mapping the dialogue context and response to a latent variable (the latent action), and learning to generate the reply conditioned on that latent variable. PLATO uses the same network to model both tasks jointly. It first estimates the distribution over latent variables, samples a latent variable through Gumbel-Softmax, and then learns to generate the response. Diverse responses can then be generated by sampling different latent variables.
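
As a rough illustration of the sampling step, the sketch below draws a relaxed one-hot sample over a handful of latent categories with the Gumbel-Softmax trick. It uses only the standard library; the logits, the number of categories and the temperature are made-up values, and this is not PLATO's actual implementation.

```python
import math
import random

def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over len(logits) categories."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for four latent values, as the network might estimate them.
rng = random.Random(0)
weights = gumbel_softmax([1.0, 0.5, 0.1, -0.3], temperature=0.5, rng=rng)
latent_index = max(range(len(weights)), key=weights.__getitem__)
```

Lower temperatures push the sample closer to a hard one-hot choice, while the softmax keeps the operation differentiable during training. At inference time, conditioning the decoder on different latent indices is what yields different replies.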

The example shows that selecting different latent variables produces different replies. All of these replies follow from the preceding context and are of good quality, appropriate and informative.

3. PLATO-2: a general dialogue model based on curriculum learning

PLATO-2 scales up PLATO-1. The model reaches 1.6 billion parameters; the pre-training corpus contains 1.2 billion Chinese dialogue samples and 700 million English samples; and training is based on curriculum learning. What is curriculum learning? Simply put, learn the easy things first and the hard things afterwards.

In addition, PLATO-2 keeps the unified network design, PrefixLM, which learns dialogue understanding and reply generation at the same time. Curriculum-based training is efficient, and training a single unified network is cost-effective.
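
One common way to realise a prefix language model is through the attention mask: context tokens attend to each other in both directions (understanding), while reply tokens attend to the full context and only to earlier reply tokens (generation). The sketch below builds such a mask in plain Python. It shows the general prefix-LM idea under that assumption and makes no claim about PLATO-2's internal details.

```python
def prefix_lm_mask(context_len, response_len):
    """mask[i][j] is True when position i may attend to position j."""
    total = context_len + response_len
    mask = []
    for i in range(total):
        row = []
        for j in range(total):
            if j < context_len:
                # The whole context is visible to every position (bidirectional).
                row.append(True)
            else:
                # A reply token is visible only to itself and later reply tokens.
                row.append(j <= i)
        mask.append(row)
    return mask

for row in prefix_lm_mask(context_len=2, response_len=3):
    print("".join("x" if allowed else "." for allowed in row))
# xx...
# xx...
# xxx..
# xxxx.
# xxxxx
```

With a single mask like this, one Transformer stack can play the role of both encoder and decoder, which is why the design is described as unified.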

In PLATO-2, the first stage trains simplified, general reply generation, and the second stage trains diversified reply generation; latent variables are added at this stage. The second stage also introduces training for dialogue coherence evaluation. Compared with the common approach of ranking by generation probability, coherence evaluation effectively improves the quality of reply selection.
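
The difference between the two selection strategies can be shown with a toy example. Everything here is hypothetical: the candidate replies, their log-probabilities and their coherence scores are invented numbers, and the coherence score stands in for the output of a trained evaluation model.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    avg_log_prob: float  # length-normalised generation log-probability
    coherence: float     # score in [0, 1] from a hypothetical coherence model

def pick_by_probability(candidates):
    """Baseline: choose the reply the generator finds most likely."""
    return max(candidates, key=lambda c: c.avg_log_prob)

def pick_by_coherence(candidates):
    """Alternative: choose the reply judged most coherent with the context."""
    return max(candidates, key=lambda c: c.coherence)

# Invented numbers, for illustration only.
candidates = [
    Candidate("Hahaha.", -0.4, 0.35),
    Candidate("I prefer hiking; the trails near the lake are great in autumn.", -1.1, 0.92),
    Candidate("Very good.", -0.5, 0.30),
]

print(pick_by_probability(candidates).text)
print(pick_by_coherence(candidates).text)
```

Short generic replies tend to receive high generation probability, so ranking by probability alone favours exactly the safe replies that latent variables were introduced to avoid.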

Can PLATO-2 serve as a general dialogue framework? The dialogue field is roughly divided into three categories: task-oriented dialogue, knowledge-grounded dialogue and open-domain chat. Pre-training a separate model for each type of dialogue system is too expensive, and PLATO-2's curriculum learning mechanism can help it become a general dialogue framework. Task-oriented dialogue is relatively focused, so the one-to-one mapping model from the first stage of curriculum learning fits it well. Knowledge-grounded dialogue and open-domain chat both involve one-to-many situations: in knowledge-grounded dialogue, different pieces of knowledge can be used to reply to the user, and in open-domain chat the reply can go in different directions. The second-stage model can therefore be applied to knowledge-grounded dialogue and chat systems.

4. PLATO-2 in DSTC-9

To verify this capability, PLATO-2 took part in DSTC, an international competition in the dialogue field that comprehensively covers the various dialogue areas. With a unified technical framework, PLATO-2 won 5 championships across 6 tasks, a first in the history of DSTC.

5. PLATO-XL: the first Chinese and English dialogue generation model at the ten-billion-parameter scale

What happens if we keep increasing the parameter scale of the PLATO model? In September 2021, we launched PLATO-XL, the world's first Chinese and English dialogue generation model at the ten-billion-parameter scale.

PLATO-XL was compared with several common commercial products in both Chinese and English, evaluated from angles such as coherence, richness and engagingness. PLATO's results are far ahead.

The WeChat public account "Baidu PLATO" is connected to the PLATO-XL model, so anyone can try it out.

The PLATO models have grown from 100 million to one billion to ten billion parameters. At the billion scale, conversations are already quite fluent, and at the ten-billion scale the model's logical ability improves significantly.

6. Knowledge abuse problem

Large models all suffer from knowledge abuse. How can it be solved? Consider how we humans deal with questions we do not know the answer to: we might look them up in a search engine. Can this approach of consulting external knowledge be used in the model?

Incorporating external knowledge to assist reply generation is a promising direction for alleviating knowledge abuse. However, large-scale dialogue corpora contain only the dialogue context and the reply; there is no way to know which piece of external knowledge a given sample corresponds to. In other words, label information for knowledge selection is missing.

7. PostKS knowledge selection based on posterior guidance

PostKS is one of the representative works in knowledge-grounded dialogue. It proposes knowledge selection guided by the posterior: during training, the prior knowledge distribution is pushed towards the posterior knowledge distribution.

At inference time there is no posterior information, so the model has to select knowledge with the prior when generating a reply. This creates an inconsistency between training and inference: training relies on the posterior, but inference can only rely on the prior.
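
A small numerical sketch may help make the training signal concrete. It assumes the prior is pulled towards the posterior with a KL-divergence term over a fixed set of knowledge candidates; the two distributions below are invented, and the function is a plain-Python stand-in rather than PostKS code.

```python
import math

def kl_divergence(posterior, prior, eps=1e-12):
    """KL(posterior || prior) for two discrete distributions over the same candidates."""
    return sum(
        q * math.log((q + eps) / (p + eps))
        for q, p in zip(posterior, prior)
        if q > 0
    )

# Invented distributions over three knowledge candidates.
posterior = [0.80, 0.15, 0.05]  # sees context and target reply (training only)
prior = [0.40, 0.35, 0.25]      # sees the context only

kl_loss = kl_divergence(posterior, prior)

# At inference time only the prior is available for choosing knowledge.
selected_at_inference = max(range(len(prior)), key=prior.__getitem__)
```

Minimising this term during training moves the prior towards the posterior, but whatever gap remains is exactly the train/inference mismatch described above.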

8. PLATO-KAG unsupervised knowledge dialogue based on joint optimization

PLATO-KAG is an unsupervised model that jointly models knowledge selection and reply generation. The top-k pieces of knowledge are selected according to the prior and passed to the generation model for end-to-end joint training. If a piece of knowledge is selected accurately, it helps a great deal in generating the target reply and the generation probability is relatively high, so joint optimization encourages this selection and the use of the given knowledge. If a piece of knowledge is poorly selected, it does not help in generating the target reply and the generation probability is relatively low, so joint optimization suppresses this selection and learns to ignore the given knowledge. In this way, both knowledge selection and reply generation are optimized.
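
The encourage/suppress behaviour can be reproduced with a short calculation. The sketch assumes the joint objective is the negative log of a marginal likelihood over the top-k knowledge pieces, each weighted by a softmax over its selection score; that formulation, the function names and all numbers are assumptions made for illustration.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def joint_nll(selection_scores, gen_log_probs):
    """Negative log marginal likelihood and its gradient w.r.t. the selection scores.

    selection_scores: prior selection scores of the top-k knowledge pieces
    gen_log_probs:    log p(target reply | context, knowledge_i) for each piece
    """
    prior = softmax(selection_scores)
    joint = [p * math.exp(lp) for p, lp in zip(prior, gen_log_probs)]
    marginal = sum(joint)
    nll = -math.log(marginal)
    # d(nll)/d(score_i) = prior_i - joint_i / marginal
    grads = [p - j / marginal for p, j in zip(prior, joint)]
    return nll, grads

# Invented example: the first knowledge piece explains the target reply best.
scores = [0.0, 0.0, 0.0]
gen_log_probs = [math.log(0.5), math.log(0.1), math.log(0.1)]
nll, grads = joint_nll(scores, gen_log_probs)
```

In this setup the gradient for the helpful piece is negative, so a gradient-descent step raises its selection score, while the unhelpful pieces get positive gradients and are pushed down. No knowledge-selection labels are needed, only the target reply.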

9. PLATO Comprehensive Knowledge Enhancement Dialogue

Judging from how humans learn, we also memorize a great deal of knowledge in our own brains. PLATO has tried comprehensive knowledge enhancement, combining the external use of knowledge with knowledge internalization. On the one hand, it uses external general unstructured knowledge and persona knowledge; on the other hand, it internalizes a large amount of question-answering knowledge into the model parameters through pre-training. After this comprehensive knowledge enhancement, the knowledge error rate in general dialogue dropped from 30% to 17%, persona consistency rose from 7.1% to 80%, and question-answering accuracy rose from 3.2% to 90%. The improvement is very clear.

The picture below compares the effects after comprehensive knowledge enhancement.

It is worth noting that although the results improved significantly, the problem of knowledge abuse has only been alleviated, not completely solved. Even when the model is scaled up to hundreds of billions of parameters, knowledge abuse still exists.

Several points still deserve continued effort. The first is when to trigger external knowledge, that is, when to look up external knowledge and when to rely on internalized knowledge, which affects the fluency and engagingness of the conversation. The second is the accuracy of knowledge selection, which involves retrieval technology: the Chinese knowledge corpus contains billions of entries, and accurately retrieving the right knowledge for a given conversation is not easy. The third is the soundness and fidelity of knowledge use: sometimes the model fails to understand the knowledge accurately, or mixes pieces up and stitches together an inaccurate reply.

3. Implementation, challenges and prospects of large-scale dialogue models

The sections above introduced some of the techniques behind PLATO dialogue, such as scaling up to large models, adding discrete latent variables to improve the richness of dialogue, and introducing external knowledge without supervision to alleviate knowledge abuse. So what are the practical applications in production?

1. Deployed applications

PLATO provides open-domain chat capabilities in multiple scenarios, such as smart speakers, virtual humans and community chat.

On the left is the digital human Du Xiaoxiao. Search for "Du Xiaoxiao" in the Baidu app, or simply enter "Hello", to summon her. Chatting with the digital human makes searching more convenient and gives efficient access to answers and information. On the right is a virtual human in Baidu Input Method, who is both good-looking and good at chatting.

2. Challenges encountered in deployment

In deployment, the first challenge is inference performance; the figure lists performance data for the 1.6-billion-parameter PLATO. Operator fusion reduced the number of operators by 98%, and model inference time dropped from 1.2 s on the original V100 to under 300 ms on an A10 card. Optimizing computation precision reduced GPU memory usage by 40%, and switching the inference card from V100 to A10 lowered costs. Architecture optimization and platform migration were also carried out to reduce link overhead.

The second challenge is dialogue safety. Harmful speech, political sensitivity, regional discrimination, privacy and many other aspects require great attention. PLATO deeply cleans the corpus to delete unsafe samples, and after deployment uses a safety classification model to remove unsafe candidate responses. At the same time, a keyword table is maintained and adversarial training is added to find and close remaining gaps and improve safety.
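
The post-deployment filtering step can be pictured as two checks applied to every candidate reply: a keyword table and a classifier score. The sketch below is a hypothetical stand-in; the keyword, the threshold and the toy classifier are invented, and a real system would call a trained safety model.

```python
def contains_blocked_keyword(text, blocked_keywords):
    """Keyword table check; keywords are expected to be lower-case."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in blocked_keywords)

def filter_safe(candidates, unsafe_probability, blocked_keywords, threshold=0.5):
    """Keep only candidates that pass both the keyword table and the classifier."""
    safe = []
    for text in candidates:
        if contains_blocked_keyword(text, blocked_keywords):
            continue
        if unsafe_probability(text) >= threshold:
            continue
        safe.append(text)
    return safe

def toy_unsafe_probability(text):
    """Hypothetical stand-in for a trained safety classifier."""
    return 0.9 if "hate" in text.lower() else 0.1

blocked = {"home address"}  # hypothetical keyword table entry
replies = [
    "I love rainy days.",
    "I hate people from that town.",
    "Tell me your home address.",
]
print(filter_safe(replies, toy_unsafe_probability, blocked))
# ['I love rainy days.']
```

Keeping the keyword table separate from the classifier lets newly discovered problems be patched immediately, before the classifier is retrained.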

3. Outlook

In the past, people thought that open-domain chat was a [...]. With the development of large models in recent years, significant progress has been made in the dialogue field. Models can now generate coherent, fluent, rich and cross-domain dialogue, but there is still a lot of room for improvement in aspects such as emotion, persona, personality and critical thinking.

The road is long and difficult, but those who keep walking will arrive; if we keep going, the future is promising. I also hope that colleagues in the dialogue field can work together to reach the peak of human-computer dialogue.

5. Q&A session

Q: How is the effectiveness of the dialogue evaluated?

A: Currently there is no automatic metric for dialogue systems that agrees well with human evaluation, so human evaluation is still the gold standard. During development, you can iterate using perplexity as a reference. For the final comprehensive evaluation, you still need to ask a large number of crowdsourced workers to interact with the different systems and rate them manually on several metrics. Evaluation metrics also change as the technology develops. For example, once fluency is no longer a problem, metrics such as safety and knowledge accuracy can be added to evaluate more advanced abilities.
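
For readers unfamiliar with perplexity, it is the exponential of the average negative log-likelihood per token, so lower is better. A minimal sketch, assuming per-token natural-log probabilities are already available from the model (the values below are invented):

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood per token (natural log)."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# A model that assigns probability 0.25 to every reference token
# is as uncertain as a uniform choice among four options.
uniform_four = perplexity([math.log(0.25)] * 4)
```

Perplexity is cheap to compute on held-out dialogue data, which makes it convenient during development, but it measures how well the model predicts the reference replies rather than how good its own replies feel to a person.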
