On the afternoon of November 10, He Zhongjiang, General Manager of China Telecom Artificial Intelligence, explained the products and ideas behind the general large model at the Artificial Intelligence and Data Industry Development Cooperation Forum.
He Zhongjiang first shared his views on general artificial intelligence. In his view, general artificial intelligence means being able to see, listen, and think like a human. Seeing requires vision technology, and listening requires voice technology. Once visual and voice information has been gathered into the brain, the brain processes it, makes judgments, and provides ideas for decision-making. The general large model plays the role of that brain. Today's massive data, advanced algorithms, and solid computing power will also drive the development of large models at scale.
After setting out these basic views, He Zhongjiang gave a detailed account of the China Telecom Xingchen (Star) semantic large model and the China Telecom Xingchen (Star) multi-modal large model. The semantic large model is the core of general artificial intelligence. According to He Zhongjiang, it has stronger capabilities and can mitigate hallucinations across multiple rounds of dialogue, reducing the "hallucination rate" by 40%. In the future, the semantic large model can empower external 2B and 2G (business and government) services, improving quality and efficiency and optimizing the user experience; internally, it can be applied across the company to improve the efficiency of production and collaboration and to support a richer set of applications. He Zhongjiang also revealed that China Telecom's AI team will take part in open-source efforts. It will open source a model with tens of billions of parameters before the end of this year and a model with hundreds of billions of parameters in April next year. All of the underlying code will be open sourced.
Introducing China Telecom's Xingchen multi-modal large model, He Zhongjiang said that China Telecom has trained it on more than 1.2 billion image-text pairs, using a mixed-precision strategy to significantly improve GPU efficiency and speed up inference by 4.5 times. The multi-modal large model will serve as the foundational capability base for the next generation of digital humans.
Comparing the Wanhao intelligent customer service voice with Supernatural TTS 1.0, He Zhongjiang said that China Telecom Xingchen Voice Large Model 1.0 achieves naturalness comparable to a real person and generates suitable speech as a real-time stream; its first-packet response time is under 50 milliseconds; and it supports voice conversion and customization from extremely small amounts of data, making it better, faster, and more flexible. He also revealed that Supernatural Speech Synthesis 2.0 will be released in mid-2024.
China Telecom HR is built on the China Telecom Xingchen (Star) multi-modal large model and uses basic digital avatars to demonstrate functions such as freely mixing and matching makeup and accessories, and personalized generation and customization. He Zhongjiang said that as large-model technology keeps getting stronger and its knowledge keeps growing richer, digital humans in virtual space and robots in the real world will have an ever greater impact on how people work, do business, and live, and that the era of artificial intelligence is about to truly arrive.
Operator Finance (official WeChat public account: yyscjrd) is a mainstream financial website that comprehensively covers technology, finance, securities, automobiles, real estate, food, medicine, daily chemicals, wine, and other consumer products.