


'Social Master' GPT-4! It can read expressions and infer what people are thinking
Imagine you are at a vibrant cocktail party filled with lively conversations and the clink of glasses.
You are a relaxed observer, happily tucked away in a corner. Yet even away from the center of the party, you can easily work out the social relationships between different people, understand what is going on, and even decipher overt and covert social messages by reading people's verbal and nonverbal cues.
What if an LLM could reproduce this level of social skill? That is exactly the question Koko Mind explores.
Just open a video, and the model starts to analyze the characters' expressions and draw conclusions about their emotions.
You can then ask questions in the prompt box on the right to have the AI dig further into the social undercurrents in the video.
(To be honest, even some people find this difficult.)
Koko Mind contains 150 complex multi-party social interactions, each with free-text questions and answers.
To ensure data diversity and scalability and avoid data contamination, all social interactions, questions and answers are generated by GPT-4 and subsequently verified by human experts.
The data comes from three different sources:
- GPT-4-only: This subset consists solely of interactions that GPT-4 created from prompts.
- Based on movies: To avoid data contamination, this part of the data is based on various scenes extracted from movies released after 2022. GPT-4 was responsible for shaping these scenes, adding its own elements while retaining the core essence.
- Based on ToMi: This subset builds on the simulated dataset ToMi, which involves moving physical objects to different places, a classic test of theory of mind in psychology. These social interactions were then modified and expanded by GPT-4.
The proportions of the three data sources are as follows:
[Figure: proportions of the three data sources]
For each social interaction, the researchers ask various questions that probe the following aspects of social understanding:
- Theory of Mind: Questions that assess understanding of other people's mental states and perspectives.
- Social Norms: Questions designed to identify social values and norms in a situation.
- Emotion Recognition: Questions aimed at identifying and understanding emotional elements in context.
- Social Relationships: Questions that focus on interpersonal dynamics and relationships.
- Counterfactual Questions: Hypothetical queries designed to explore alternative outcomes or possibilities.
- Social Advice: Questions that ask for advice or a suggested action relevant to a specific situation.
Following AlpacaEval, the researchers used text-davinci-003 as the reference for evaluating different models.
They also tested a setting in which the nonverbal cues in brackets (e.g., nervously drinking coffee) were removed from the context.
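The cue-removal step can be illustrated with a minimal Python sketch. It assumes that nonverbal cues appear as asides in round brackets within each line of dialogue; the function name `strip_nonverbal_cues` and the sample dialogue are hypothetical and are not taken from the researchers' code or data.

```python
import re

# Assumption: a nonverbal cue is a non-nested aside in round brackets,
# e.g. "(nervously drinking coffee)". Leading spaces/tabs are removed with it.
_CUE = re.compile(r"[ \t]*\([^()]*\)")


def strip_nonverbal_cues(context: str) -> str:
    """Return the context with bracketed asides removed, line by line."""
    cleaned_lines = []
    for line in context.splitlines():
        line = _CUE.sub("", line)
        # Collapse any doubled spaces left behind and trim the line ends.
        line = re.sub(r"[ \t]{2,}", " ", line).strip()
        cleaned_lines.append(line)
    return "\n".join(cleaned_lines)


if __name__ == "__main__":
    sample = "Anna: (nervously drinking coffee) I'm fine, really. (glances at the door)"
    print(strip_nonverbal_cues(sample))
```

Working line by line keeps the speaker turns intact, so the same dialogue can be evaluated with and without the cues.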
Some interesting findings:
- Of the two models, GPT-4 shows greater certainty and confidence than Claude in identifying winning models.
- Claude outperforms GPT-4 when the context has no non-verbal cues and the interaction is either generated entirely by GPT-4 or based on movies.
- When the context contains non-verbal cues, GPT-4 always outperforms Claude.
(One possible explanation is that GPT-4 is a multi-modal model that can better understand additional non-verbal information.)
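For readers unfamiliar with AlpacaEval-style comparisons, the headline number is a win rate: how often a judge prefers a candidate model's answer over the reference model's answer to the same question. The sketch below is only an illustration of that arithmetic. The labels `"candidate"`, `"reference"` and `"tie"`, the function name `win_rate`, and the choice to count a tie as half a win are all assumptions, not the AlpacaEval API or the researchers' code.

```python
from collections import Counter

VALID_LABELS = {"candidate", "reference", "tie"}


def win_rate(preferences: list[str]) -> float:
    """Share of comparisons won by the candidate model.

    Each entry records which answer the judge preferred for one question.
    Assumption: a tie counts as half a win.
    """
    if not preferences:
        raise ValueError("need at least one comparison")
    counts = Counter(preferences)
    unknown = set(counts) - VALID_LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return (counts["candidate"] + 0.5 * counts["tie"]) / len(preferences)


if __name__ == "__main__":
    # Hypothetical judge decisions for four questions.
    decisions = ["candidate", "candidate", "reference", "tie"]
    print(win_rate(decisions))  # 0.625
```

A win rate above 0.5 means the judge preferred the candidate over the reference more often than not.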
In their blog post, the researchers present tables that show the performance of each model.
[Table: performance of each model]
The results, while exciting in many ways, also have certain limitations. First, Koko Mind is relatively small, which may limit the broad applicability and comprehensiveness of the researchers' conclusions.
Second, all interactions in Koko Mind are generated by GPT-4 and require manual verification, which makes the dataset difficult to expand.
Third, although Koko Mind provides human-verified answers in the dataset, the researchers did not use these answers as references during evaluation. Because the answers were generated by GPT-4, they may be biased towards GPT-4.
Future research could focus on how to evaluate models against human-validated, machine-generated reference answers.
Despite these limitations, the researchers still regard Koko Mind as a springboard for future research on social intelligence, multi-modal language models, and related topics.