Author | Curly Hair
Editor | Zhang Jie
Proofreading | Tsukiyama Orange
Have you ever used one of the smart speakers that became popular in the past few years? Is it now sitting somewhere in the house gathering dust?
Now, large AI models are expected to "inject soul" into smart speakers, upgrading them from a not-so-intelligent "artificial intelligence" to a truly intelligent one.
Last month, Tmall Genie released a new product, the IN Sugar 3 Pro. Notably, this model is connected to Alibaba's Tongyi Qianwen large model and adds AIGC capabilities, which are currently in closed beta testing.
Although a little late, "Number One AI Player" recently obtained closed-beta access. I was fortunate enough to try the device for a while and found a lot of interesting things. Here is my review of the Tmall Genie AIGC demo (including video). Welcome to watch!
Let’s first experience the changes before and after the upgrade of Tmall Genie through the following four use cases in daily life.
1. Schedule reminder
Before the AIGC upgrade: it didn't understand the request.
Me: Help me remember the meeting at 11 o'clock tomorrow morning.
IN Sugar 3 Pro: What you need to record is the meeting at 11 o'clock tomorrow morning?
Me: When the time comes, when the time comes (repeated here) remind me.
IN Sugar 3 Pro: What you need to record is to remind me when the time comes?
Me: I said there will be a meeting at 11 o'clock tomorrow morning, please remind me then.
IN Sugar 3 Pro: I didn't hear you clearly, so I'll leave for now.
Me: Set an alarm for 11 o'clock tomorrow morning to remind me of the meeting.
IN Sugar 3 Pro: I have set an alarm for you at 11 a.m. tomorrow.
It seems that with the pre-upgrade IN Sugar 3 Pro, you need to phrase things more simply.
After the AIGC upgrade: I feel like I have a personal assistant.
Me: I have a meeting at 5 p.m., please remind me then.
IN Sugar 3 Pro: Wait a moment. No problem, I will stay here quietly until 5 o'clock today to remind you of the meeting.
(At 5 p.m., the alarm rang: "It's 5 p.m., you have a reminder to have a meeting.")
2. Dietary suggestions
Before the AIGC upgrade: it seems to have ideas of its own.
Me: What should I eat tonight?
IN Sugar 3 Pro: You can try Lanzhou Ramen.
Me: I don't want to eat Lanzhou Ramen.
IN Sugar 3 Pro: But I just like to eat noodles.
After the AIGC upgrade: I have a nutrition consultant.
Me: Recommend something to eat for lunch.
IN Sugar 3 Pro: It is recommended to eat broccoli and fish for lunch. Both are anti-aging and anti-cancer foods. They are also the best source of protein and are rich in antioxidants, which can help keep the body in a healthy state. In addition, you can choose some vegetables, such as spinach, leeks, celery, cabbage, lettuce and chrysanthemum. Choosing 2-3 of them to eat at noon can provide non-protein calories for the protein in the meat to improve the weight loss effect. In addition, you can choose some whole grains to provide non-protein calories for the protein in the meat to improve weight loss.
Me: A dish suitable for lunch in summer.
IN Sugar 3 Pro: For lunch in summer, you can choose lighter dishes, such as cold cucumbers, cold beans, cold kelp and broccoli. You can also choose some fruits, such as watermelon, peaches, lemons, strawberries, etc., and put them into dishes to eliminate dryness, break dullness, and make you cool, comfortable and energetic! In addition, you can also choose some vegetarian dishes, such as braised chicken wings, mung bean polenta, steamed fish, stir-fried meat with broccoli, etc. You can choose according to your own taste.
3. Movie and TV recommendations
Before the AIGC upgrade: it needs specific instructions before it can recommend film and television content.
Me: Are there any good variety shows recently?
IN Sugar 3 Pro: I'm sorry, I don't understand what you mean.
Me: Search for the latest popular variety shows.
IN Sugar 3 Pro: I'm sorry, I don't understand what you mean.
After the AIGC upgrade: it understands natural-language questions and recommends content based on multiple conditions.
Me: Recommend some classic movies.
IN Sugar 3 Pro: 1. "The Godfather" 2. "Titanic" 3. "Roman Holiday" 4. "Farewell My Concubine" 5. "Once Upon a Time in America" 6. "American Psycho" 7. "The Godfather Part 2: The Empire" 8. "Titanic 3D" 9. "The Godfather Part 3: Black Orgy" 10. "A Beautiful Mind"
Me: Are there any comedy movies?
IN Sugar 3 Pro: Let me think for a moment. Of course, here are some classic comedy movies: 1. "The Richest Man in Xihong City" 2. "Mermaid" 3. "Crazy Stone" 4. "Three Idiots Make Trouble in Bollywood" 5. "The King of Comedy" 6. "Simon and Grace" 7. "Forrest Gump" 8. "Heartbeat" 9. "Westward Journey" 10. "The Comedy King 2: Moonlight Treasure Box"
Okay, okay, now that's more like it.
4. Tell a joke
Before the AIGC upgrade: it plays pre-recorded audio, and the jokes are rather dated.
Me: Tell a joke.
IN Sugar 3 Pro: Master, welcome to listen to the joke. Hi, and welcome to Genie's daily joke section. (Cheerful background music plays, and the host starts telling a joke.) One day, Li Dacong was beating his child at home, so hard that the child cried. When his wife saw it, she rushed over to stop him: "What's the matter with you? How can you beat a child?" Dacong said: "This brat is actually competing with others at school." His wife said: "That's a good thing, he's motivated. What's wrong with that?" Dacong said: "Do you know what he and his classmates are competing over? They are competing to see whose father dies first! Who do you think I should hit if I don't hit him!" (Machine-voice laughter: hahahahahaha)
Okay, this is hard to comment on.
After the AIGC upgrade: it generates cheesy pickup lines on the spot.
Me: Do you know any cheesy pickup lines?
IN Sugar 3 Pro: Let me think. Of course, and they are very sweet! For example: "You are my cutest little baby, and I am willing to give you my whole world."
Me: Then tell me a funny cheesy pickup line.
IN Sugar 3 Pro: Of course! For example: "You are the sweetest cream cake in my heart, every bite is like kissing you."
Although it's a bit greasy, I can tell it has something going for it...
According to the official introduction of its core scenarios, the upgraded Tmall Genie can hold at least the following kinds of conversations: generating personalized stories, generating personalized playlists, generating recipe recommendations, and so on.
For example, telling children a story (made up on the spot):
Children: Tmall Genie, can you tell a story about an elf?
Tmall Genie: Okay. When everything revives in spring, an elf comes out of the cabin to look for fresh wild fruits...
Children: What does the elf look like?
Tmall Genie: It has a pair of rabbit ears and big eyes...
Children: Are they eyes like a deer's?
Tmall Genie: Yes, he has a pair of deer-like eyes and likes to go to the forest to pick wild fruits...
As you can see, after the upgrade to the large model, IN Sugar 3 Pro is quite "personable" and can hold a continuous dialogue.
Just say "Tmall Genie" once to wake it up, and then you can have a multi-turn conversation. It can remember and understand the context, making the chat smoother and longer-lasting. Of course, you can also interrupt the conversation at any time and it will immediately give a new answer.
According to reports, IN Sugar 3 Pro involves multimodal capabilities and technologies. With Tongyi as the foundation model, the entire technical system is built around Tmall Genie's multi-voice interaction scenarios and includes speech-related technologies such as acoustics, signal processing, speech recognition and text-to-speech.
According to the official operating guidelines, the AIGC capabilities of IN Sugar 3 Pro are mainly reflected in the following five aspects:
1. Knowledge mining
In my tests, Tmall Genie could fluently introduce the basic information of the book "The Three-Body Problem" and answer complex questions, such as whether Ye Wenjie belongs to the Redemptionist or the Adventist faction, and what the dark forest theory means.
2. Emotional understanding
Tmall Genie can understand people's emotions to a certain extent and give appropriate suggestions.
For example, when I said that I had lost 20 games in a row in Honor of Kings, it expressed regret, suggested that I take a break and reset, and advised me not to scold my opponents. It was like an understanding, emotionally stable friend.
3. Anthropomorphic characteristics
After being connected to Tongyi Qianwen, Tmall Genie has become more humanlike, and its knowledge, emotions, personality, and memory have all improved greatly.
For example, this Tmall Genie character is named Pyla. He says that he usually likes watching movies, listening to music, and reading books, and that his favorite light music comes from Bandari.
4. Writing assistance
Its writing is said to be at about the level of a middle school student for now, which makes it better suited to helping primary school students with their writing.
For example, I asked it to write a prose piece on the theme of "spring", and the result was quite polished.
5. Creative inspiration
For example, if I want to draw a picture with a Children's Day theme, I can ask Tmall Genie directly what it could draw, keep refining the picture based on its answers, and finally have it add matching copy to the picture.
More skills and ways to play have yet to be explored. With the help of AI, the new generation of Tmall Genie has indeed become more intelligent. However, the Tongyi large model is still constantly learning and iterating, and it still has flaws and limitations. During my testing, I also found some areas that need improvement.
For example, because it is so responsive in continuous conversation, Tmall Genie has become "a chatty friend who jumps into any conversation". Once it has been woken up, any sound it hears within 20 seconds (even if it is not addressed to the device) is easily recognized as a command, and it responds eagerly and cannot stop at all... At that point you can only tell it to stop replying, which gets a bit tiresome if it happens too often.
Of course, this is actually also its strength. Previously, when Tmall Genie introduced "How to train birds to distinguish birds", the team mentioned that they used Tmall Genie's cat ear algorithm in dialogue training: "Cats' ears are particularly sensitive, and they are also relatively sensitive in distinguishing errors. A cat's ears can rotate independently, so when sounds come from different places, its ears can rotate immediately to locate the different sound sources accurately."
In addition, the Tongyi Qianwen large model is said to have internet access, but in my tests Tmall Genie could not retrieve real-time information very well. For example, when asked to check the latest news about Mayday's concert, it replied, "Master, let's talk about something else." When asked to search for Huawei's latest mobile phone, it introduced the HUAWEI Mate 20, although the latest model is the Mate 50.
Another issue is the "AI hallucination" that large language models find hard to avoid: the model may generate wrong or made-up facts. These answers sound plausible but are in fact wrong. For example, when I asked Tmall Genie what happened in 2022, it actually listed the demise of the Qin Dynasty and the conflict between Chu and Han.
This also reminds us to stay vigilant. No matter how smart AI is, it can make mistakes. And although the upgrade of AI smart speakers has taken its first step, it will still take time to mature.
Just recently, 100PoisonMpts, the industry's first open-source Chinese data set for large language model governance, was released, and more than a dozen well-known experts and scholars became the first batch of annotation engineers for "100 bottles of poison for AI". Each annotator asked 100 tricky questions designed to induce biased and discriminatory answers, then annotated the large model's answers, completing a round of attack and defense in which AI is "poisoned" and then "detoxified".
It is understood that the first batch of domain data focuses on the goals of AI anti-discrimination, empathy, and deliberative expression, and covers dimensions such as jurisprudence, psychology, children's education, accessibility, trivia, intimacy, and environmental justice. The 100PoisonMpts data set is expected to open its first batch of question-and-answer data in June.
AI smart devices are on the rise
In the era of the AI explosion, large models have revived dust-covered smart speakers, and AI smart devices are on the rise.
The uploader "Tree" took part in the closed beta. Judging from her closed-beta video, after the AIGC upgrade the Tmall Genie smart glasses support intelligent voice interaction: they can accompany users during exercise in real time, give dietary suggestions, and chat about character strategy while the user is playing a game, which makes them the equivalent of an AIGC assistant.
Public information shows that the technical core of Tmall Genie is the AliGenie intelligent interaction system. According to the Shanghai Securities News, in April this year Tmall Genie began testing its first terminal experience upgrade based on Alibaba's Qianwen large model, and it continues to explore the direction of personalized large models. Large models coupled with mobile acoustic technology are expected to be widely used in wearable devices such as headsets and helmets.
At present, Alibaba has not yet clarified its upgrade policy for subsequent smart terminal systems. There is no specific timetable for when the Tmall Genie smart glasses will gain large-model capabilities and be fully launched on the market.
As for why speakers and glasses, rather than other hardware, were the first to be connected to Alibaba's large model, the company gave this official response:
The Sugar line has the largest shipment volume in the entire smart speaker market, and it is also the most iconic product line since Tmall Genie was founded. It has a strong user base, so using it for AIGC testing fits both the building of new technical capabilities and the perception of consumer needs.
Smart glasses can show the imaginative potential of portable terminals and offer intelligent interaction at any time. This is also an attempt to move personalized large models out of the home and into a more open interactive space. Tmall Genie smart glasses feature patented bone conduction technology, which does, to a certain extent, ensure a mobile "private sound field".
It is not only Tmall Genie: other major manufacturers are also upgrading their existing products or launching new AI products. AIGC applications are moving from software to hardware.
Baidu announced in February this year that Xiaodu would integrate the full capabilities of Wenxin Yiyan (ERNIE Bot) to create an artificial intelligence model for smart device scenarios, "Xiaodu Lingji", and apply it to all Xiaodu products. Recently, Xiaodu launched a product called "Watch AI", positioned as a chat AI and AI life assistant for watches. Next, Baidu will also release the first smartphone integrating Xiaodu's AI technology capabilities.
Xiaomi also mentioned in its first-quarter financial report this year that it will put large AI model capabilities into practice, with Xiao Ai as a typical scenario.
"Large model technology can use more data and computing power to train models, thereby improving the accuracy and response speed of speech recognition, further improving the user experience and the smart, convenient services of smart speakers, and thus attracting more users," Wu Gaobin, vice president of the Integration Committee of the China Communications Industry Association, said in an interview with the media. As a popular product in the smart home field, smart speakers face certain challenges, he said, but with the continuous upgrading of large model technology and the expansion of application scenarios, they are expected to become popular again.
PS. Player friends, do you plan to buy an AI version of a smart speaker?