Is it possible that the era of AI-generated short dramas is really coming?
The demos released recently by various video generation AIs have been dazzling. From meme play and longer clips to attention to real physical logic, the stream of AI creativity seems endless, and every model is measured against Sora. Now one of them has suddenly pulled ahead with "film-level" results:
From realistic light and shadow effects:
Source: https://x.com/i/status/1806383419661730197
To richly imaginative scenes with every element in place:
Who would have expected that, in the eyes of AI, Batman could leave the Joker unable to straighten up?
Source: https://x.com/blizaine/status/1806383419661730197
Some people are already using this capability for complex tasks. With video generation AI, music generation AI, and some work in Photoshop and After Effects, a complete music video can be created.
Source: https://twitter.com/Arata_Fukoe/status/1809840865063629292
Ask netizens what they think of these results, and they ask back: "What does Hollywood think?"
The videos this AI generates are smooth and precise, and they have attracted a large number of likes. A closer look turns up many short videos made with it on social networks. According to netizens' summaries, the new AI's main advantage is that it is less likely to go off track when generating large movements. Another example is a video of a running centaur:
The generative AI behind these videos is Kuaishou's large model "Keling" (Kling). It began to take off across the global internet a few weeks ago, when access was so sought after that a spot was described as "hard to come by." That's right: this is not a demo released as slides first, but a product-level application opened to users from the start. Keling AI has now launched a web version focused on simplicity and ease of use. According to the latest data, the number of users applying for Keling AI has approached 700,000, making it the hottest video generation model on the internet.
Several upgrades in a month: Keling AI's rapid evolution
This year is the first year of generative AI. As early as February, OpenAI's Sora pushed the competition into video generation, but domestic technology companies were the first to ship it as a product.
Since its official debut on June 6, Kuaishou's Keling AI, the first large domestic model to spark heated discussion in overseas AI circles, has gone through three iterations in just one month. From text-to-video at the very beginning to support for image-to-video, video continuation, and multiple size options two weeks later, Keling AI has become steadily more capable and complete, and many video generation needs have been met almost without users noticing.
Just last weekend, at the World Artificial Intelligence Conference (WAIC 2024), Keling AI received its third major upgrade. A series of new functions greatly improved the texture, aesthetics, and playability of video generation, bringing another leap in the creative experience. Gai Kun, senior vice president of Kuaishou and head of its main site business and community science line, introduced the three highlights of this upgrade: a high-definition version, first and last frame control, and camera movement control.
First is the high-definition version. After the upgrade, the quality of the generated videos is a qualitative leap over the previous model.
Thanks to the higher spatio-temporal resolution used in training, Keling AI has also improved greatly in generated detail, composition, camera movement aesthetics, and light and shadow.
From the comparison of image quality below, we can clearly see the difference between Keling AI’s previous models and the latest models.
Second, Keling AI has added a practical and highly requested "First and Last Frame Control" function for image-to-video generation, making videos whose first and last frames echo each other a reality.
By supplying custom start-frame and end-frame images, users can precisely control smooth transitions between shots in different clips and achieve effects such as a single continuous take. Judging from actual results, the movements are natural and smooth and the image quality holds up as well. The function gives users a more intuitive and convenient editing experience and meets the need for personalized image-to-video generation.
For example, generate a video from the following two pictures:
The effect is like this:
Finally, Keling AI adds camera movement control and automatic master camera movement functions. In video, combining more camera moves captures more of the scene and strengthens overall expression.
Keling AI provides six classic preset camera controls (Roll, Tilt, Pan, Vertical, Horizontal, and Zoom), offering a wide range of choices for different scenes. Users can also set positive or negative values for these movements to make them stronger or gentler, or to reverse their direction. In addition, master camera movements help produce eye-catching footage with a fully cinematic feel.
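To make the idea of signed camera parameters concrete, here is a minimal Python sketch of how such a setting could be represented. It is an illustration only, not Keling AI's interface: the `CameraMove` class, its field names, and the value range of -10 to 10 are all assumptions. Only the six control names and the notion that the sign sets direction while the magnitude sets intensity come from the description above.

```python
from dataclasses import dataclass

# The six preset camera controls named in the article.
CAMERA_CONTROLS = ("roll", "tilt", "pan", "vertical", "horizontal", "zoom")


@dataclass(frozen=True)
class CameraMove:
    """Hypothetical container for one camera control setting (not Keling AI's API)."""

    control: str
    value: float  # sign = direction, magnitude = intensity; the range is an assumption

    def __post_init__(self) -> None:
        if self.control not in CAMERA_CONTROLS:
            raise ValueError(f"unknown camera control: {self.control!r}")
        if not -10.0 <= self.value <= 10.0:
            raise ValueError("value must lie within the assumed range [-10, 10]")

    def reversed(self) -> "CameraMove":
        """Return the same movement in the opposite direction."""
        return CameraMove(self.control, -self.value)

    def describe(self) -> str:
        """Summarize the movement in words."""
        if self.value == 0:
            return f"{self.control}: no movement"
        direction = "positive" if self.value > 0 else "negative"
        return f"{self.control}: {direction} direction, intensity {abs(self.value):g}"


if __name__ == "__main__":
    zoom = CameraMove("zoom", 4)
    print(zoom.describe())
    print(zoom.reversed().describe())
```

Keeping the sign and the magnitude in a single number mirrors the "positive and negative parameters" wording: reversing a movement is just a negation, and a gentler movement is a smaller absolute value.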
It can be seen that with the addition of these new features, Keling AI has made visible improvements in video clarity, aesthetic performance, and content customization control.
In addition, the Keling AI web version, now officially open to users, brings together text-to-image, text-to-video, and video editing capabilities that will be supported in the near future, making it a one-stop visual content creation platform that can be used immediately after release.
The newly added "First and Last Frame Control" and "Camera Movement Control" functions are currently available on the web version. Anyone who wants to try them can apply now!
Keling AI web version address: klingai.kuaishou.com
It is no exaggeration to describe Keling AI's upgrade as "full of sincerity". Of course, it would not be possible without Kuaishou's continuous innovation and breakthroughs in video generation capabilities and technology.
Behind "movie-level" AI generation, it is all technology
Compared with image generation, which is already very mature, video generation is a more complex task. In practical applications it must deal with realism, coherent action, smooth pictures, accurate detail, consistency of scenes, characters, and lighting, physical accuracy, time constraints, and many other challenges.
How well these challenges are handled directly determines how practical and easy to use the model is. The upgraded Keling AI has clearly changed dramatically in these respects. In summary, Keling AI has seven major capability highlights.
Wan Pengfei, head of Kuaishou's Visual Generation and Interaction Center, analyzed these capabilities one by one. Together they form Keling AI's core competitiveness in video quality, image-to-video generation, motion generation, generation duration, physical laws, instruction following, and video controllability, and they make Keling AI the all-round model it is today. Wan Pengfei also looked ahead, saying that video generation quality is improving very quickly and is gradually approaching that of graphics rendering and camera footage, which will bring new opportunities to the broader video industry.
This upgrade brings further evolution in three of these capabilities: movie-level high-definition picture generation, leading image-to-video effects, and excellent video generation controllability.
Among them, the movie-level high-definition picture generation capability can present magnificent natural scenery, the movements and expressions of people or animals, and other scenes both grand and subtle with high fidelity and vividness, giving the footage a blockbuster feel.
The leading image-to-video capability animates static images and turns them into vivid 5-second short videos. Paired with different text inputs, it makes image-to-video generation more creative and lets users get "whatever you want".
For example, convert an image of a puppy swimming into a video:
The effect is like this:
Excellent video generation controllability puts more sophisticated video creation in the hands of the user. Beyond the camera movement control in this release, Keling AI will in future also offer controllable adjustment in more areas, such as matching voice to facial movement, preserving character identity, and steering how the picture and layout evolve through simple stroke prompts. Model training for these functions has been completed, and they will go online soon.
At the same time, Keling AI has been further upgraded in its other four major capabilities: motion generation, generation duration, physical laws, and instruction following.
First, Keling AI can generate large yet plausible motion. By modeling complex spatiotemporal motion, it can generate larger-amplitude movements that still obey the laws of motion.
Thanks to more thorough model training this time, the motion Keling AI generates is more flexible overall and supports a larger range of movement without losing plausibility. The kitten's turning and walking postures shown below are all natural, reasonable, and consistent with physical reality.
Second is minute-level long video generation. Minute-level duration has become an important metric for evaluating a video generation model, and it requires more efficient multi-shot processing, longer storytelling, and more consistent extension of motion.
Currently, Keling AI can generate videos several minutes long at 1080p and 30 fps. A video continuation function that follows user instructions is also open: a single continuation extends the video by 4 to 5 seconds, multiple continuations are supported, and a video of up to 3 minutes can be generated. During continuation, users can specify the direction in which the story develops, which makes the function easy to use.
After this upgrade, joint in-depth optimization at the algorithm and engineering levels has increased the length of a single generated video from 5 seconds to 10 seconds, the longest among products open to users. This allows a more complete storyline and gives users more creative room.
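The figures above imply a simple piece of arithmetic: how many continuations it takes to grow one clip to the 3-minute limit. The Python sketch below works this out under two assumptions that the article does not state explicitly: that the 3-minute limit refers to total video length including the initial clip, and that every continuation adds the same number of seconds. The function name is illustrative.

```python
import math


def continuations_needed(initial_s: float, target_s: float, step_s: float) -> int:
    """Return how many continuations of step_s seconds grow a clip of
    initial_s seconds to at least target_s seconds."""
    if step_s <= 0:
        raise ValueError("step_s must be positive")
    # No continuation is needed when the clip already reaches the target.
    return max(0, math.ceil((target_s - initial_s) / step_s))


if __name__ == "__main__":
    target = 3 * 60  # the 3-minute limit, in seconds
    for initial in (5, 10):      # single-clip length before and after the upgrade
        for step in (4, 5):      # a continuation adds 4 to 5 seconds
            n = continuations_needed(initial, target, step)
            print(f"{initial} s clip, {step} s per continuation: {n} continuations")
```

Under these assumptions, reaching 3 minutes takes roughly 34 to 44 continuations, so the longer 10-second base clip saves only about one step; its main benefit is a more complete single shot.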
Third, Keling AI can simulate complex characteristics of the physical world. Since Sora, video generation models have paid great attention to generating videos that obey physical laws, because this determines the upper limit of a model's capabilities.
Keling AI has been able to accurately model and simulate real-world properties since its release, making the generated videos close to reality, such as bathing a kitten.
Now, with the support of more complete model training, Keling AI’s modeling and simulation capabilities for interactive physical laws have reached a new level.
Fourth, Keling AI is very strong at combining concepts and following instructions. Technically, through a deep understanding of cross-modal semantics from text to video, Keling AI can readily turn users' rich imagination into concrete video images, such as a volcano in a coffee cup.
The upgraded Keling AI adopts better text data and encoding schemes, which improves its responsiveness to user prompts and yields better visual rendering.
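All of these capabilities derive from Keling AI's accumulated technology and original innovation on several fronts: its video generation technical route (a DiT architecture), model design (latent-space encoding and decoding, temporal information modeling, text expansion and encoding, and so on), data assurance (a multi-dimensional labeling system, video description models, and so on), computing efficiency (distributed training clusters, staged training strategies, and so on), and capability extension (temporal extension of video, controllable multimodal input, and so on).
It is fair to say that today's Keling AI is technically advanced and reliable. No wonder it was sought after as soon as it launched.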
In the era of generative AI, Kuaishou is ready
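Over the past year or so, the entire large-model field has been very busy. Last year the talk was about developing base models; this year everyone is talking about applications. With WAIC held over the past few days, we have seen a new wave of debate between the "model camp" and the "application camp."
What is Kuaishou doing in this wave?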
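First, it builds a complete system. From the underlying IDC computing centers, network architecture, and AI platform, through the core foundation models in the middle layer, to the exploration of various applications at the application layer, Kuaishou has researched, developed, and deployed the full stack in-house. Speaking about this system, Zhang Di, Kuaishou vice president and head of its large model team, said he believes that firm investment in independent R&D brings a "technology snowball" effect and a significant cost advantage in the long run. A major advantage for Kuaishou is the large number of AI application scenarios at the upper layer, which provide many opportunities to put large models into practice.
The base model determines the upper limit of AI capability, while commercial applications allow new technology to be applied step by step with continuous feedback, creating a virtuous cycle. Last year Kuaishou introduced its "KwaiYi" large model, which grew rapidly from an initial 13B parameters to 175B and gained a multimodal version. After several iterations, the KwaiYi model began to play a role in Kuaishou's internal material creation, AI interaction, content production, and other scenarios. In June this year, daily consumption of KwaiYi-based AIGC marketing materials exceeded 20 million items.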
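With the base model in place, Kuaishou has gradually developed its own differentiated capabilities in more scenarios.
Specifically, in text-to-image generation, Kuaishou's "Ketu" has strong semantic understanding and instruction-following capabilities and has become one of the industry's top models. Thanks to extensive work on innovative text representation and image data curation, Ketu can render images with camera-level texture after reinforcement learning training, and its aesthetics have been aligned with general human standards.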
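In video generation, "Keling AI" has set off a new round of competition in the global video generation field. It supports text-to-video and image-to-video generation, offers rich editing capabilities, and leads the industry in the controllability, texture, aesthetics, and motion plausibility of its output. Kuaishou's engineers continue to optimize the engineering and algorithms in an effort to keep lowering the threshold for video generation AI.
On that point, optimizing new technology is one of the key challenges generative AI currently faces. As a national-level short video application, Kuaishou has the advantage of many AI application scenarios, which bring settings and opportunities for deployment. In putting the technology into practice, Kuaishou has reached a series of milestones.
In the app's comment section, Kuaishou's conversational model application "AI Xiaokuai" can understand video content and interact with users. Since testing began, it has accumulated more than 10 million followers. In e-commerce live-streaming rooms, text-to-image AI, video generation models …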
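From content creation and understanding to recommendation and other levels, and from individuals to e-commerce, Kuaishou's generative AI capabilities now fully cover its main businesses and continue to drive the development of the Kuaishou ecosystem.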
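Finally, a new experiment has begun. At WAIC, Kuaishou announced that its first AIGC short drama, "The Strange Mirror of Mountains and Seas: Cutting Through the Waves", will be officially released this month.
The drama was produced with in-depth technical support from Keling AI and uses a cyber style to recreate the classic ancient mythological world of mountains and seas. Judging from the trailer, scenes from mountains to seas and from forests to skies all show striking visual effects. In the past, effects like these might have required a professional special effects team; now, visual generation AI can deliver a stunning visual experience.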
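Indeed, half a year ago we were still imagining the future, and now AI has begun making films in earnest.
In the current wave of large models, nothing demonstrates technical capability better than large-scale deployment.
And Kuaishou's all-round practice confirms once again that AI productivity is changing our lives without our noticing.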