Kuaishou's 'Keling' takes off: overseas AI circles are stunned, and access to the "Chinese Sora" is hard to come by

Jun 21, 2024 am 01:13 AM
Kuaishou, industry, text-to-video

Just one year on, AI-generated "eating noodles" footage has become this natural and smooth, and netizens around the world are stunned.

The generated video on the right comes from Keling (Kling), the text-to-video model just launched by Kuaishou.

It is not a preview or a mere demo reel, but a product-level application that is open for testing, and anyone can apply. Moreover, Keling supports generating videos up to 2 minutes long at 1080p and 30fps, aiming for a "one-click conversion" from idea to publishable work. (Official website: https://kling.kuaishou.com/)

The earliest users are already hooked:

Image source: https://x.com/op7418/status/1799047146089619589


The user discussion groups are capped at 500 people each and fill up quickly. By now the chat is flooded with "tql" (Chinese internet slang for "too awesome"):

Overseas users who have not yet gotten access can only fret and post "please" on social media:

It is no exaggeration to say that a Keling invitation is now very hard to come by:

The news spread to Silicon Valley venture capital circles and triggered a heated discussion.

Emad Mostaque, former CEO of Stability AI, said: "China's AI technology has its own advantages."

YC's CEO also reposted a demo generated by Keling. More vivid and realistic than Sora:
Prompt: A person taking a big bite of a hamburger

                      Video address: https://x.com/AngryTomtweets/status/1799787209651859910

Anyone who follows AI must have seen plenty of Keling-generated works in recent days. This site also applied as early as possible and obtained trial access.

Next, we might as well try it out and analyze the reasons why Keling is so popular.

China's first product-level text-to-video application

Maybe you still remember this once very popular "Balloon Man" video. Three creators spent nearly two weeks using Sora to create this stunning 1 minute and 21 second short video. However, Patrick Cederberg, who was in charge of post-production, confessed to many problems in the process, such as the color of the balloon changing every time it was generated, some flaws appearing in the footage, etc.

Result generated by Sora. Full video: https://youtu.be/9oryIMNVtto?si=F6oDzvrhzfVcQGeh
For earlier video generation models, generating more than one minute of content in one go is indeed difficult, especially when every element on screen has to stay consistent.
Fu Sheng, Chairman and CEO of Cheetah Mobile and Chairman of Orion Star, released a "Balloon Man" video he made with Keling and said it took only "tens of minutes" to create a short film with excellent continuity, realism, and clarity.
During internal testing, we also came across tutorials and demo documents put together spontaneously by a community of professional creators, including hundreds of generated works as well as guidance on which dimensions to test.
Interested readers please click: https://waytoagi.feishu.cn/wiki/GevKwyEt1i4SUVk0q2JcqQFtnRd
The following 2-minute public welfare short film, "A Place Far Far Away", was also generated entirely by Keling. Can you tell?
In "Zootopia Racing Competition" by the creator @AIGC Thirteen, these 20 seconds involve fast-moving race cars (large motions), animals driving vehicles (a concept combination that tests imagination), and other difficult elements. Judging from the results, Keling handles them very well:
                      Source: Keling creator @AIGC Thirteen
Another very interesting case is "How to Open the Holidays", created by @八级Mechanics. This 56-second short video contains 23 shots and took 3 hours in total to produce. With dubbing added on top of Keling's generated results, the humor comes through immediately:

After seeing these, we should realize that the influence of video generation technology, as represented by Keling, goes far beyond simple creation. The technology is being adopted at an accelerating pace across research fields and industries, with transformative potential for tasks ranging from automatic content generation to complex decision-making.
Which industries will be changed first?

Traditional game development is often limited by pre-rendered environments and scripted events. Once video generation models are integrated into games, the way games are developed, played, and experienced will change, opening up new possibilities for storytelling, interactivity, and immersion. For game developers, one of the most intuitive approaches is to generate customized visual effects and even character actions based on the user's narrative. In the demo below, we can see that users can create an unparalleled experience with the help of Keling:


Source: https://x.com/dustinhollywood/status/1800056286215553444

                                                                                                  Another industry that will be disrupted is film and television production. Traditional filmmaking is an arduous and expensive process that often requires years of effort, extensive equipment, and financial investment. The emergence of video generation technology heralds a new "democratization era" in film production, and the dream of autonomously generating personal film and television works from simple text input is becoming a reality.

Now, what we use Keling to generate is a 5-second single-shot clip. As technology continues to evolve, the length of video that users can generate at a time will also increase. For example, in the future we may be able to generate longer video content at once to maintain the coherence and enjoyment of story scenes. The camera techniques may be more advanced, such as continuous long shots.

The silhouette work below once again makes the point: AI's understanding of art and its aesthetic sense are in no way inferior to those of humans.

Prompt: “A dancer’s silhouette transitions seamlessly through different dance styles, from hip-hop to ballet, in one continuous shot”


Image source: https://x.com/dustinhollywood/status/1799970059957555210

It has also fully grasped the style of science fiction movies:
Source: Keling creator @狗儿李

AI can also inject inspiration into the production of luxury blockbusters:
Take a look at this "Honey" commercial generated by Keling. In simulating a close-up of pouring honey, the AI is no less capable than a professional camera crew:

What technologies are behind Keling?
OpenAI's brief technical report does not give enough detail about how Sora was developed, but the official website of the Keling model discloses more reference information, mainly covering data preparation, model architecture, training, and optimization strategy.

Data preparation
Drawing on Kuaishou's years of experience in video technology, the Keling model team has built a complete labeling system. It characterizes the quality of video data along dimensions such as basic video quality, aesthetics, and naturalness, and defines a variety of customized label features for each dimension in order to filter the training data or adjust its distribution.
Training a text-to-video model requires videos paired with text descriptions. To meet this need, the Keling model team developed its own video description model, which generates accurate, detailed, and structured video descriptions and significantly improves how well the video generation model follows text instructions.

Model Architecture
After the high-quality annotation data is prepared, how does the Keling large model obtain the ability to simulate the characteristics and concept combinations of the physical world?
In the overall architecture design, Keling adopts the currently popular Diffusion Transformer (DiT). Traditional diffusion models mainly utilize convolutional U-Net containing downsampling and upsampling blocks as the denoising network backbone. But some studies have shown that the U-Net architecture is not critical for good performance of diffusion models. By adopting a more flexible Transformer architecture, diffusion models can use more training data and larger model parameters. DiT is one of the representative works under this research idea.
Over the past few months, an industry consensus has formed that the success of video generation models ultimately comes down to scaling laws. This consensus builds on a finding of the DiT paper: with a Transformer, the model can be scaled up steadily, and as training compute increases (longer training, a larger model, or both), performance increases accordingly.
This means that for video generation models, as long as more computing power and more data are used to scale up, the generation quality will continue to improve.
Keling can turn users' text prompts into concrete images, including fictional scenes that would never appear in the real world, because of its deep understanding of text-video semantics and the capabilities of the Diffusion Transformer architecture. Driven by the modeling power of its self-developed architecture and of scaling laws, Keling can simulate the physical characteristics of the real world well and generate videos that obey physical laws.

At the same time, thanks to the team's self-developed 3D VAE network, the model can generate movie-quality video at 1080p resolution, presenting both vast, majestic scenes and delicate close-ups vividly. In natural scenes, the light changes smoothly. Tester: @shanshan

Of course, a video generation model must also account for another factor: video is visual content with a time dimension, and incoherent content distracts viewers and greatly compromises the experience.

To make motion in the picture more plausible, the Keling model adopts a 3D spatio-temporal joint attention mechanism to better model complex spatio-temporal motion, so it can generate video with large motions that still conform to the laws of motion.
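To make the idea of joint spatio-temporal attention more concrete, here is a minimal sketch. It is not Keling's implementation, which has not been published in detail. It assumes a tiny latent video of 4 frames with 3x3 tokens each (hypothetical sizes), and contrasts joint attention over all tokens with a factorized scheme (spatial attention inside a frame, plus temporal attention at a fixed position) that is used here purely as a point of comparison. The function names are illustrative.

```python
from itertools import product


def token_grid(frames, height, width):
    """List the (t, y, x) coordinate of every latent token in a small video."""
    return list(product(range(frames), range(height), range(width)))


def can_attend(a, b, joint):
    """Return True if token a can attend to token b within a single layer."""
    if joint:
        return True  # joint 3D attention: all tokens share one sequence
    same_frame = a[0] == b[0]        # spatial attention, inside one frame
    same_position = a[1:] == b[1:]   # temporal attention, same (y, x) across frames
    return same_frame or same_position


def count_pairs(tokens, joint):
    """Count (query, key) pairs scored by attention.

    For the factorized case this sums the spatial pass and the temporal pass,
    so a token paired with itself is counted once in each pass.
    """
    if joint:
        return len(tokens) ** 2
    spatial = sum(1 for a in tokens for b in tokens if a[0] == b[0])
    temporal = sum(1 for a in tokens for b in tokens if a[1:] == b[1:])
    return spatial + temporal


# Hypothetical toy sizes, chosen only to keep the numbers small.
tokens = token_grid(frames=4, height=3, width=3)
first, last = tokens[0], tokens[-1]  # (0, 0, 0) and (3, 2, 2)

print(count_pairs(tokens, joint=True))    # 1296
print(count_pairs(tokens, joint=False))   # 468
print(can_attend(first, last, joint=True), can_attend(first, last, joint=False))
```

In the joint case, a token in the first frame can relate directly to a token at a different position in the last frame, which is the kind of link an object moving across the screen needs. The price is a much larger number of attention pairs.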

Training and optimization strategy

If you have tested it yourself, you will find that Keling supports outputting multiple video aspect ratios for the same content during the inference process. This is because Keling adopts a variable resolution training strategy to meet the needs of using video materials in richer scenes.
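The source does not say how the variable-resolution training is organized. One common way to train on clips with mixed aspect ratios is to sort them into aspect-ratio "buckets" so that each batch contains only one shape. The sketch below illustrates that general idea only; the bucket list, clip names, and function names are all hypothetical and are not taken from Keling.

```python
from collections import defaultdict

# Hypothetical target aspect ratios (width / height); not taken from Keling.
BUCKETS = {"16:9": 16 / 9, "1:1": 1.0, "9:16": 9 / 16}


def nearest_bucket(width, height):
    """Pick the bucket whose aspect ratio is closest to the clip's ratio."""
    ratio = width / height
    return min(BUCKETS, key=lambda name: abs(BUCKETS[name] - ratio))


def group_by_bucket(clips):
    """Group clip names by bucket so every batch can share a single shape."""
    groups = defaultdict(list)
    for name, width, height in clips:
        groups[nearest_bucket(width, height)].append(name)
    return dict(groups)


# Made-up example clips: (name, width, height).
clips = [
    ("a", 1920, 1080),
    ("b", 720, 720),
    ("c", 1080, 1920),
    ("d", 1280, 720),
]

print(group_by_bucket(clips))
# {'16:9': ['a', 'd'], '1:1': ['b'], '9:16': ['c']}
```

A model trained on batches like these sees landscape, square, and portrait material, which fits the observation that the same content can be output in several aspect ratios at inference time.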

At the same time, thanks to efficient training infrastructure, extreme inference optimization and scalable infrastructure, the Keling model can generate videos up to 2 minutes long with a frame rate of 30fps.
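To give a sense of scale, the short calculation below works out how many frames and raw pixel values such a clip contains. It assumes that 1080p means 1920x1080 with three color channels, which the source does not spell out. The function name is illustrative.

```python
def clip_stats(seconds, fps=30, width=1920, height=1080, channels=3):
    """Return (frame count, raw pixel values) for an uncompressed clip.

    The 1920x1080 frame size and 3 channels are assumptions for "1080p".
    """
    frames = seconds * fps
    values = frames * width * height * channels
    return frames, values


print(clip_stats(120))  # (3600, 22394880000)  2-minute clip
print(clip_stats(5))    # (150, 933120000)     5-second clip
```

A 2-minute clip therefore has 24 times as many frames as a 5-second one, all of which have to stay consistent with each other.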

Video generation is no longer a game of "catching up with OpenAI"

2024 has been called the breakout year for video generation technology, but before Keling we had not seen any Sora-level product that people could actually use, and it is still unknown when Sora itself will be available.

In a sense, Keling is the first real "Chinese version of Sora" and brings this technology to a stage where it is usable, easy to use, and practical.

As Fu Sheng said: "This may be the best text-to-video product you can use in the world today." Anyone who has tried Keling in person will understand that this is no exaggeration.

Fu Sheng’s video also gave another point of view: "In turn, it also shows that Sora is not a technical breakthrough, but a product breakthrough."

Just a few months ago, Sora raised the technical bar for the entire video generation field with 60-second continuous video, high-definition image quality, and coherent camera and subject movement, setting off a wave of competition in text-to-video.

We originally thought video generation would turn into a technological chase between China and overseas, just as text models did last year. However, the release of Keling shows that China's exploration of large text-to-video models has reached a new height and has taken a substantial lead in productization. We may not need to play the "catching up with OpenAI" game again.

Some people have judged that China is surpassing the United States in the field of artificial intelligence.

The birth of Keling may mean the beginning of a new era. In the era of generative AI, generating and editing videos may be as easy as using photos on mobile phones today, and the barrier between imagination and reality will be completely broken.
Due to its popularity, more than 50,000 people are currently queuing to test Keling. If you are interested in what AI-generated video can do, you may wish to follow the "Keling AI Video Account" first to see more high-quality cases.
